Regular audits would build trust, confidence in AI

From self-driving cars to employment screening and chatbots, we live in a time of great uncertainty about the reliability and risks of artificial intelligence. How do we achieve a market for trustworthy AI products?

In many ways, the current AI moment is similar to the period just after the invention of indoor plumbing and running water, when people weren’t exactly sure that what was coming out of their taps was safe. Thanks to advances in technology and regular testing, users gradually began to trust the system. Now people can generally turn on the tap and be confident that their water is safe to drink.

Artificial intelligence is still at the point where you have to “test the water,” and one way to build confidence in the system is by conducting audits, according to a Cornell computer scientist and communication scholar.

“Over time, because we built systems of inspection and of safety testing, it’s now possible for your local water authority to say, yes, we are very confident: You can drink the water, and that means we don’t have to think about it,” said J. Nathan Matias, assistant professor of communication in the College of Agriculture and Life Sciences, and director of the Citizens and Technology (CAT) Lab.

“And that’s really the point of auditing,” Matias said. “We want to get AI to the point where it’s useful and reliable enough that you don’t have to think about it.”

Matias is a co-author of “Auditing AI” (The MIT Press, 2026), which offers AI users from all walks of life an introduction to AI evaluation and a gentle nudge toward answering the question: How do I know if the AI I’m using is doing what I expect it to do, and not causing harm?

The book weaves real-life stories from the distant and not-too-distant past – American Airlines’ foray into automated flight booking in the 1960s, for example – together with explanations and ideas for how to make audits as ordinary as regular water testing or yearly automobile inspections.

“We hope that by sharing stories and insights, we can help our readers apply general lessons about technology evaluation to whatever AI technology they’re facing, wherever they are in relation to the technology,” he said.

Matias talked to the Chronicle about the book.

Question: Early in the book, you write: “The deft auditor begins with the mindset of a skeptic.” Shouldn’t the deft AI user employ this mantra, too?

Answer: I think that’s right – a good mantra for life in general is “trust but verify.” And the book is really about that verify piece; especially with AI, individuals are not always so good at catching a problem. And on top of that, sometimes the problems of AI don’t show up for just one person. They emerge as a pattern of decisions or actions that plays out across an entire organization, and that’s where auditing and analyzing the cumulative outputs of an AI system is especially valuable.

Q: Is it realistic to see the audit as a maintenance tool for AI, akin to yearly automobile inspections?

A: One of the things we’ve seen is that the behavior of AI systems does change over time. Some things you can test once and say, “Great, I know it’s going to work fine forever.” But take the mechanical condition of your car: It changes over time, and you need to get it inspected regularly. AI systems can diverge from what they were initially trained on, and one of the important techniques in auditing is to do regular or continuous monitoring. We talk about that in the book, because we want people to have confidence that they can use it reliably, both now and into the future.

Q: Who is well-positioned to conduct these audits?

A: We are on the verge of a whole new industry of people whose job is to ensure that AI is working for people. Some of those entrepreneurs are already out there – in nonprofits, in small companies, in large consultancies. I hope that some of my current students at Cornell will be able to go on to create these jobs, and figure out how to make AI work for society.

I think the people who are best positioned to do these kinds of jobs have a few characteristics. One, they understand people’s needs and organizations’ needs. You can’t evaluate a system unless you understand the human and organizational requirements. I also think you need an understanding of technology and statistics. The third part is knowing how to build trustworthy evidence, and that’s a combination of technical skills and communication skills. An AI audit only does its job if it’s reliable technically, and if people believe it.

Q: Is there a balance that has to be struck between designing useful audits while encouraging innovation? Do we necessarily have to accept some “normal accidents”?

A: I’m not an economist, but my understanding is that industries gain a competitive advantage when they’re able to show that their products are more reliable and safer than others. And auditing is about ensuring that the market has the information we need to make judgments about which products are the best ones.

For example, we’ve seen that the U.S. drug industry is influential, powerful and successful around the world partly because of the Food and Drug Administration’s testing. Because of that evaluation, the world trusts drugs developed in the United States.

So my view is that while AI evaluation might slow some innovators down a little, it also prevents really bad products from getting out there and harming public trust in AI. In the long term, people will trust and work with AI if they think it works and is safe. That’s what these evaluations are designed to do.
