Humans are bad at making complex decisions. AI can call them out

When a list of pros and cons won’t cut it, a new decision-making tool developed by Cornell researchers can use artificial intelligence to help make difficult decisions.

But there’s a twist: Instead of checking AI’s result, AI is checking you.

Created in the lab of Abe Davis, assistant professor of computer science in the Cornell Ann S. Bowers College of Computing and Information Science, the tool is designed to help users rank a set of choices – such as job applicants, graduate schools, even Oscar candidates. While a human makes the final decisions, the tool can use AI and optimization to make the process more efficient, explainable and fair.

“Using technology to make decisions for us is often fraught,” Davis said. “Part of what makes this work exciting is that instead, we’re using technology to help us make better decisions.”

Chao Zhang, a doctoral student in the field of information science and first author on the new study, presented “Interactive Explainable Ranking” on April 16 at the Association for Computing Machinery CHI Conference on Human Factors in Computing Systems, where it received a Best Paper Award.

Davis had the idea for the tool while attempting to evaluate hundreds of creative, open-ended projects turned in by his computer graphics students each year. Even with a clear set of grading criteria and multiple trained teaching assistants (TAs) evaluating every submission, his team of TAs struggled to ensure perfectly consistent grading standards.

“This really bothered me,” he said. “How do we build a better evaluation process that also scales?”

The underlying issue, Davis said, is a tension between consistency and bias. Humans are much better at making consistent decisions when they directly compare options, as opposed to rating multiple options subjectively. For example, ask someone if one light is brighter than another, and the answer is easy. But ask them to rate the brightness of each light on a scale of 1 to 10, and answers could vary wildly.

On the other hand, that consistency can sometimes come from unconscious bias, which is what the tool is designed to uncover.

“We ask users to describe what they value by weighing different criteria used for ranking, then we find where the values and rankings contradict,” Davis said. “If there are contradictions, the user can change their ranking or try to justify it with new criteria, but either way, they are forced to provide a clear and consistent explanation for their choices.”

Here’s how the tool works: Imagine someone is deciding which car to buy. First, the user rates the importance of several criteria – say, cost, reliability and fuel efficiency. Then the tool asks the user to choose between several pairs of cars to capture their preferences, using AI to determine which questions to ask and in what order.

If there’s a mismatch between the rankings based solely on the user’s stated values and which cars the user actually prefers, the tool will highlight those inconsistencies. The user can then adjust the importance of each criterion to correct the mismatch, or the tool can predict whether a factor is missing.

Perhaps the user unconsciously selected red cars over better options with a different paint job. In that case, the tool can show the user evidence of this bias, so they can either adjust their ranking or add color as an additional criterion. The end result is an optimal and fully explainable top choice.
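The article does not describe the researchers’ actual algorithm, but the core idea – comparing a ranking implied by stated criterion weights against a user’s pairwise choices and flagging contradictions – can be sketched in a few lines. Everything below (the car names, criterion scores, and weights) is illustrative, not from the study:

```python
# Illustrative sketch of contradiction detection between stated values
# and pairwise choices. Not the researchers' actual method.

# Hypothetical cars, each scored per criterion (higher is better).
cars = {
    "red_sedan":  {"cost": 0.6, "reliability": 0.5, "fuel": 0.7},
    "gray_hatch": {"cost": 0.8, "reliability": 0.9, "fuel": 0.8},
    "blue_coupe": {"cost": 0.5, "reliability": 0.7, "fuel": 0.6},
}

# The user's stated importance of each criterion.
weights = {"cost": 0.5, "reliability": 0.3, "fuel": 0.2}

def score(car):
    """Weighted-sum score implied by the user's stated values."""
    return sum(weights[c] * cars[car][c] for c in weights)

# Pairwise choices the user actually made: (preferred, rejected).
choices = [("gray_hatch", "blue_coupe"), ("red_sedan", "gray_hatch")]

def contradictions(choices):
    """Pairs where the chosen car scores lower than the rejected one."""
    return [(a, b) for a, b in choices if score(a) < score(b)]

for a, b in contradictions(choices):
    print(f"You picked {a} over {b}, but your stated values rank {b} higher.")
```

Here the user preferred the red sedan even though their stated weights rank the gray hatchback higher, so the tool would surface that choice and ask the user to either revise their weights or name the missing factor (perhaps color).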

Users can also turn off the AI function entirely for sensitive applications where using AI may be inappropriate.

“One of the most important parts of this project is not to use AI to make decisions for us, but to use AI to help us think through what we want,” Zhang said.

Zhang and Davis tested the tool in two case studies. First, they asked four participants to rank a series of short films. The individuals reported that the tool helped them move from making intuitive or emotional judgments about the films to applying specific criteria.

In the second experiment, they asked four TAs to rank 10 student projects from a previous computer graphics course. The rankings ultimately agreed with the students’ assigned grades, and were highly consistent among the four TAs, suggesting the tool yields accurate, repeatable assessments.

Davis now uses the decision-making tool, which is publicly available, to grade projects in his current class – with the AI function turned off.

“It’s for decisions where the stakes are high,” he said, “and the value of making a better decision is worth the extra rigor.”

Patricia Waldron is a writer for the Cornell Ann S. Bowers College of Computing and Information Science.
