Model estimates groups most affected by intimate partner violence

By Patricia Waldron
Cornell Ann S. Bowers College of Computing and Information Science

June 24, 2024

Intimate partner violence is notoriously underreported and correctly diagnosed at hospitals only around a quarter of the time, but a new method provides a more realistic picture of which groups of women are most affected, even when their cases go unrecorded.

PURPLE, an algorithm developed by researchers at Cornell and the Massachusetts Institute of Technology, estimates how often underreported health conditions occur in different demographic groups. Using hospital data, the researchers showed that PURPLE identifies which groups of women are most likely to experience intimate partner violence more accurately than methods that do not correct for underreporting.

The new method was developed by Divya Shanmugam, formerly a doctoral student at MIT who will join Cornell Tech as a postdoctoral researcher this fall, and Emma Pierson, the Andrew H. and Ann R. Tisch Assistant Professor of computer science at the Jacobs Technion-Cornell Institute at Cornell Tech and in the Cornell Ann S. Bowers College of Computing and Information Science. They describe their approach in “Quantifying Disparities in Intimate Partner Violence: a Machine Learning Method to Correct for Underreporting,” published May 15 in the journal npj Women’s Health.

“Often we care about how commonly a disease occurs in one population versus another, because it can help us target resources to the groups who need it most,” Pierson said. “The challenge is, many diseases are underdiagnosed. Underreporting is intimately bound up with societal inequality, because often it tends to affect groups more if they have worse access to health services.”

Shanmugam became interested in intimate partner violence after Pierson recommended the book “No Visible Bruises: What We Don’t Know About Domestic Violence Can Kill Us” by Rachel Louise Snyder. Shanmugam realized that the pervasive issue of underreporting was something statistical methods could help address. The result was PURPLE (Positive Unlabeled Relative PrevaLence Estimator), a machine learning technique that estimates the relative prevalence of a condition when the true numbers of affected people in different groups are unknown.
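To see why relative prevalence is hard to read off raw records, consider a toy calculation. It is only an illustration with made-up numbers, not the PURPLE algorithm: the group names, prevalences, diagnosis rates and the helper `relative_prevalence` are all hypothetical. It assumes two groups in which true cases are diagnosed at different rates.

```python
# Toy illustration only: every number and name here is made up,
# and this is NOT the PURPLE algorithm itself.

def relative_prevalence(rate_a: float, rate_b: float) -> float:
    """Ratio of the prevalence in group A to the prevalence in group B."""
    return rate_a / rate_b

# Hypothetical true prevalence of a condition in two groups.
true_prev = {"group_a": 0.04, "group_b": 0.02}

# Hypothetical probability that a true case is actually diagnosed.
# Group A is diagnosed less often, for example because of worse access to care.
diagnosis_rate = {"group_a": 0.10, "group_b": 0.30}

# What the records show: only diagnosed cases are observed.
observed_prev = {g: true_prev[g] * diagnosis_rate[g] for g in true_prev}

true_ratio = relative_prevalence(true_prev["group_a"], true_prev["group_b"])
observed_ratio = relative_prevalence(
    observed_prev["group_a"], observed_prev["group_b"]
)

print(f"true relative prevalence:     {true_ratio:.2f}")
print(f"observed relative prevalence: {observed_ratio:.2f}")
```

In this made-up case, group A has twice the true prevalence of group B, yet the recorded diagnoses make it look as though group A is less affected. A method that ignores underreporting would report only the second ratio.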

The researchers applied PURPLE to two real-life datasets: one with 293,297 emergency department visits to a hospital in the Boston area, and a second with 33.1 million emergency department visits to hospitals nationwide. PURPLE used demographic data along with recorded diagnoses of intimate partner violence and associated symptoms, such as a broken wrist or bruising, which can indicate the condition even when the patient was never diagnosed with it.

“These broad datasets, describing millions of emergency department visits, can produce relative prevalences that are misleading using only the observed diagnoses,” Shanmugam said. “PURPLE’s adjustments can bring us closer to the truth.”

PURPLE indicated that patients who are nonwhite, not legally married, on Medicaid or who live in lower-income or metropolitan areas are all more likely to experience intimate partner violence. These findings match previous results in the literature, which supports the plausibility of PURPLE’s estimates.

The results also show that correcting for underreporting is important to produce accurate estimates. Without this correction, the hospital datasets do not show a straightforward relationship between income level and rates of victimization. But PURPLE clearly shows that rates of violence are higher for women in lower income brackets, a finding that agrees with the literature.

Next, the researchers hope to see PURPLE applied to other often-underreported women’s health issues, such as endometriosis or polycystic ovarian syndrome.

“There’s still a lot more work to be done to measure the extent to which these outcomes are underdiagnosed, and I think PURPLE could be one tool to help answer that question,” Shanmugam said.

The new technique also has potential applications beyond health conditions. PURPLE could be used to reveal the relative prevalence of underreported police misconduct across precincts or the amounts of hate speech directed at different demographic groups.

Kaihua Hou, a doctoral student at the University of California, Berkeley, contributed to the study. Pierson also has an appointment with Weill Cornell Medicine.

Patricia Waldron is a writer for the Cornell Ann S. Bowers College of Computing and Information Science.
