Judge’s AI-Video Rejection Evokes Broader Tech, Evidence Issues

April 9, 2024, 3:01 PM UTC

A Washington state judge’s exclusion of AI-enhanced video evidence puts a novel twist on contentious questions about which technologies belong in front of juries.

The video—offered by a man accused of shooting five people, killing three, outside a Seattle bar—relied on an artificial intelligence model that hadn’t been validated and could lead to confusion, King County Superior Court Judge Leroy McCullough ruled March 29. The model altered pixels with opaque methods and hadn’t been peer reviewed, which could have led to “a time-consuming trial within a trial” over its reliability, McCullough said.

The ruling comes as AI’s capabilities and applications continue to proliferate, raising questions about where to draw boundaries around the role AI-generated evidence can play in court. And while AI technology is new, concerns about the ability to inspect, vet, and challenge it mirror ones legal professionals say persist for other technologies, often without satisfactory answers.

“That kind of black-box process can’t be explained in court,” criminal law professor Andrew Guthrie Ferguson of American University said of large language models. “That should be troubling for courts and judges.”

In the Washington case it was the defendant, Joshua Puloka, who sought to introduce the evidence to support his claim of self-defense, but it’s more often prosecutors pushing courts—often successfully—to admit evidence generated or enhanced by technology, criminal law professor Rebecca Wexler of the University of California at Berkeley said. Attention to AI “is sorely needed,” she said, but courts also fail to vet other technologies, and vendors often rely on contract law and trade secrets to evade transparency and scrutiny.

“It’s an opportunity to fix general problems that apply to AI technology and also have applied to forensic software for a long time,” said Wexler, who has testified before Congress on AI-based evidence. “There’s nothing about AI specifically that we should blanketly exclude it. But we should hold AI-makers to high standards to make sure the tools are fair and accountable.”

Ferguson supported McCullough’s decision and agreed it’s historically been prosecutors putting problematic technological tools in front of juries.

“For decades we’ve seen bad forensic science come into criminal court and be used for prosecution,” he said. “The same potential applies to AI as evidence, and the judge is being cautious, which is probably what they should have done in the past.”

Imprecise Standards

Today’s standards for what testimony and expert methodology can be used as evidence stem from the US Supreme Court’s 1993 decision in Daubert v. Merrell Dow Pharmaceuticals Inc. Relevant factors under that precedent include whether a theory or technique can be and has been tested, is subject to peer review, has a known or potential error rate, and has acceptance in the scientific community.

But courts have no definition for what counts as “peer review,” Wexler said, and that imprecision gives judges broad discretion, which often favors prosecutors seeking to admit evidence hinging on technology.

Aside from being open to interpretation, Ferguson said, the rules are often dated. He said he tells students, “most rules of evidence were written in an era they can’t even comprehend,” before the digital age, leading to “tension applying this old law to technology.”

Judges have too often been willing to accept prosecutors’ assertions that a technique is reliably used in law enforcement, and to accept vendors’ trade secret claims that deny defendants access, criminal law professor Brandon Garrett of Duke University said.

He noted that in December the Federal Rules of Evidence on expert testimony were amended for the first time since the 2000 codification of Daubert. To ensure judges adequately exercise their gatekeeping function, the amended language clarified that the party introducing the evidence bears the burden of demonstrating its likely relevance and reliability. It also clarified that judges shouldn’t leave it to a jury to assess methodology.

Ferguson pointed to forensic bite-mark evidence as an area where courts often admitted evidence that purported to be more definitive than science allowed. Complex DNA samples, containing a mix of multiple unknown parties’ genetic materials, are another example.

In one capital murder case in Virginia a prosecutor’s expert testified there was a 1-in-1.1 billion chance elements in the DNA mix didn’t belong to defendant Leon Winston. But years later, criminology professor William C. Thompson of UC Irvine argued in a paper that the analytical methodology was flawed and the chance was closer to 1 in 2.

“Maybe we’ve learned our lesson that jumping into untested criminal science in criminal cases can lead to bad results,” Ferguson said.

Wexler, the co-director of the Berkeley Center for Law and Technology, testified before the Senate Judiciary Committee in January, arguing lawmakers should require AI tools used in court to be available for auditing by independent researchers. She cited an email from crime-scene DNA analysis company Cybergenetics that said the company “does not provide research licenses.” Congress also should bar using trade-secret privileges to block access to relevant criminal evidence, she said.

“Vendors of some forensic software systems that are consistently used in criminal prosecutions flat-out refuse to provide research licenses to independent experts seeking to conduct independent quality assurance and validation studies,” Wexler told Bloomberg Law. “Private forensic systems we use to put people behind bars or even take their lives are using contract law to block peer review of their products.”

‘Black Box’

Despite Judge McCullough’s caution, Wexler said, “judges are not going to be able to keep AI out of the evidence system for long”—which she says isn’t inherently bad. As in other fields, AI has vast potential to assist courts in administering justice. She said it’s “going to be built into everything,” including pattern and facial recognition.

“We want the best, most fair, most accurate technologies in the criminal system. We want both sides to have access to those technologies,” Wexler said.

That means understanding how large language models and other AI tools operate, what they can do, and what their limitations are, legal professionals said. Ferguson noted that the same prompt can lead a model like ChatGPT to produce different answers, and the popular chatbot operates without indicating the specific steps it took or what information factored into any given output.

Garrett said the “black box” problem leaves attorneys at a loss to challenge an expert witness. “How can the defense meaningfully cross-examine if the witness doesn’t know how the software works?” he asked.

Wexler told the Senate committee that at least six people have been wrongfully arrested or jailed because of mistaken hits from AI facial recognition software, and that a “state-of-the-art” AI system for estimating a person’s height and weight from a photo performed worse than people with no training. Meanwhile, 90% of police deployments initiated by Chicago’s AI gunshot detection system turned up no corroborating evidence of gunfire, yet those deployments led to use of force against at least 82 mostly Black and Latino men as well as three false imprisonments, she said, citing data compiled by the MacArthur Justice Center.

“AI’s going to present new problems, the next generation of new problems of whether we can trust and test the evidence in a way that we feel is going to be reliable enough to give to jurors to be able to believe it and use it,” Ferguson said.

To contact the reporter on this story: Kyle Jahner in Raleigh, N.C. at kjahner@bloomberglaw.com

To contact the editors responsible for this story: Adam M. Taylor at ataylor@bloombergindustry.com; James Arkin at jarkin@bloombergindustry.com
