Cornell team, EPA to partner on emissions big data project

A team from associate professor Max Zhang’s lab will work with the Environmental Protection Agency (EPA) over the next year on a machine learning model designed to predict fossil fuel emissions. The project was a winning entry in the EPA-sponsored EmPOWER Air Data Challenge.

Zhang directs the Energy and the Environment Research Laboratory at the Sibley School of Mechanical and Aerospace Engineering. His collaborators on the EPA project are Ye Jiang and Vignesh Rao, students in the Cornell Engineering master’s program in computer science; and doctoral student Jeff Sward.

The project, “Predicting the Environmental Performance of Power Plants Using Machine Learning,” will apply the machine learning model to air pollution monitoring data from the EPA’s Clean Air Markets Division (CAMD) to predict emissions rates from fossil fuel-burning power plants. The group will also develop the model further so that it can identify anomalies in CAMD data and thereby improve data quality.

According to the EPA, most of the fossil fuel-fired, electricity-generating units in the U.S. submit quarterly reports on their hourly nitrogen oxide, sulfur dioxide and carbon dioxide emissions, along with operating parameters such as hourly heat input and gross electricity generation.
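As a rough illustration of what flagging anomalies in hourly emissions data can involve, the Python sketch below computes an emission rate (carbon dioxide per unit of heat input) for each hour and flags hours whose rate sits far from the typical value. It uses a simple robust-statistics check (median and median absolute deviation), not the Cornell team's machine learning model, which the article does not describe. The field names, sample values and threshold are all hypothetical.

```python
from statistics import median

# Hypothetical hourly records; field names and values are illustrative only
# and do not come from the EPA's CAMD data.
sample_hours = [
    {"hour": 0, "heat_input_mmbtu": 1000.0, "co2_tons": 59.0},
    {"hour": 1, "heat_input_mmbtu": 1010.0, "co2_tons": 59.8},
    {"hour": 2, "heat_input_mmbtu": 990.0, "co2_tons": 58.4},
    {"hour": 3, "heat_input_mmbtu": 1005.0, "co2_tons": 59.5},
    {"hour": 4, "heat_input_mmbtu": 995.0, "co2_tons": 58.9},
    {"hour": 5, "heat_input_mmbtu": 1000.0, "co2_tons": 90.0},  # suspicious reading
    {"hour": 6, "heat_input_mmbtu": 0.0, "co2_tons": 0.0},      # unit not operating
]


def emission_rates(records):
    """Return (hour, tons of CO2 per MMBtu) for hours with nonzero heat input."""
    return [
        (r["hour"], r["co2_tons"] / r["heat_input_mmbtu"])
        for r in records
        if r["heat_input_mmbtu"] > 0
    ]


def flag_anomalies(records, threshold=3.5):
    """Flag hours whose emission rate is far from the median rate.

    Uses a modified z-score based on the median absolute deviation (MAD).
    The threshold of 3.5 is an assumed, commonly used default.
    """
    rates = emission_rates(records)
    values = [rate for _, rate in rates]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, so nothing can be ranked as unusual
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
    return [hour for hour, v in rates if 0.6745 * abs(v - med) / mad > threshold]


if __name__ == "__main__":
    print(flag_anomalies(sample_hours))
```

With the made-up sample above, only hour 5 is flagged, and the hour with zero heat input is skipped rather than treated as an error. A production tool would need to account for plant type, fuel and operating conditions, which is where a learned model could help.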

“Electronic data auditing involves a massive data set,” Zhang said. “Any power plant larger than 25 megawatts in the United States, including generators at Cornell, is equipped with the continuous emission monitoring system (CEMS). To maintain the quality of the data is very important for the environment, and we will develop a better data-driven tool to improve the quality of the CEMS data.”

Zhang said his lab initially developed the machine learning model for emissions as part of a project funded by the New York State Energy Research and Development Authority to evaluate the impact of clean energy initiatives in the state. “The timing was perfect to compete for this EmPOWER Air Data Challenge,” he said, “because we had some results already.”

The EPA and the Sibley School have signed a Memorandum of Understanding on the project, to be completed by June 2020. One of the agency’s goals is to advance power sector-related research and knowledge.

The Cornell team will work with, and receive direct support from, EPA staff, experts and peers in the research community, who will assist with CAMD tools and data analysis.

The EPA selected three winning EmPOWER projects for the 2019-20 academic year, including Zhang’s. The other two are a Georgia Institute of Technology project on data engagement in the classroom, and a joint project of the University of California, Berkeley, and the University of Oregon that applies satellite data on pollution to estimate health impacts from a decline in coal emissions.

Zhang sees Cornell’s EmPOWER project as a good opportunity for students to gain practical experience working with a government agency on environmental issues.

“I have a number of computer science students working in my lab this academic year. They are all very motivated to make a real-world impact,” he said. “I’d welcome computer science and engineering students to look at what we’re doing and join us for socially relevant research projects.”
