Machine learning bests humans in whale call detection
By Pat Leonard
In a watershed “human vs. machine” moment – think Deep Blue vs. Garry Kasparov – a machine learning algorithm has detected tricky blue whale calls in sound recordings with greater accuracy and speed than human experts.
The algorithm, made possible by tools developed by the K. Lisa Yang Center for Conservation Bioacoustics at Cornell, was described in a study published Aug. 24 in the journal Remote Sensing in Ecology and Conservation.
“In a set of test data, the algorithm found about 90% of blue whale D-calls and the human experts [found] just over 70% of the calls,” said Brian Miller of the Australian Antarctic Division, the study’s lead author. “Machine learning was also better at detecting very quiet sounds. It took about 10 hours of human effort to identify the calls. It took the ML detector 30 seconds – 1,200 times faster. And it doesn’t get tired.”
The sound data can help scientists better understand whale behavior and trends, and aid in conservation efforts.
Unlike the predictable pattern of much longer blue whale “songs,” D-calls are thought to be social calls made by both male and female whales on feeding grounds. The “D” in blue whale D-calls could easily stand for “difficult.” The calls vary across populations, from animal to animal, from season to season and from year to year.
That’s where Koogu comes in. Shyam Madhusudhana at the Yang Center created this toolbox and named it with a word in Kannada, his native language, that translates to the nouns “call” or “cry,” or the verb “to call.” (Kannada, spoken in the state of Karnataka in south India, is one of the world’s oldest languages, predating English and Hindi.)
Koogu streamlines the complicated coding needed to implement the various facets of a bioacoustics machine learning pipeline. The result is a model trained to recognize occurrences of blue whale D-calls in hundreds of thousands of hours of sound data collected by underwater recorders.
“Koogu isn’t a model by itself, but it’s a toolkit that helps bioacousticians train machine learning-driven detection/classification models with only a few lines of code,” said study co-author Madhusudhana. “I have built in lots of customizations at every step of the process so that it can be easily adapted to suit different projects’ needs. The real strength of Koogu comes from the higher emphasis it places on exploiting nuances in signal processing that most other ML practitioners tend to overlook.”
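For readers curious about what such a pipeline involves, the sketch below is a minimal, generic illustration, not Koogu’s actual interface. It assumes Python with NumPy, SciPy and TensorFlow, and it uses synthetic stand-in audio to show the kind of steps a toolkit like Koogu streamlines: converting audio clips to spectrograms and training a small convolutional network to score each clip for the presence of a call.

```python
# Generic sketch of a spectrogram-based call detector -- NOT Koogu's API.
# Uses synthetic stand-in audio; a real study would use labeled hydrophone recordings.
import numpy as np
from scipy.signal import chirp, spectrogram
import tensorflow as tf

SAMPLE_RATE = 250   # Hz; a low sampling rate typical of baleen-whale monitoring (assumed here)
CLIP_SECONDS = 8    # length of each audio clip fed to the model

def clip_to_spectrogram(clip, fs=SAMPLE_RATE):
    """Convert a 1-D audio clip to a log-magnitude spectrogram (frequency x time)."""
    _, _, sxx = spectrogram(clip, fs=fs, nperseg=128, noverlap=96)
    return np.log1p(sxx).astype(np.float32)

def synthetic_clip(with_call):
    """Stand-in audio: background noise, optionally plus a crude downswept tone."""
    t = np.linspace(0, CLIP_SECONDS, CLIP_SECONDS * SAMPLE_RATE, endpoint=False)
    clip = np.random.normal(0.0, 1.0, t.shape)
    if with_call:
        clip += 0.5 * chirp(t, f0=70, t1=CLIP_SECONDS, f1=30)
    return clip

# Build a tiny labeled set: even indices = noise only, odd indices = noise + "call".
X = np.stack([clip_to_spectrogram(synthetic_clip(i % 2 == 1)) for i in range(200)])
y = np.array([i % 2 for i in range(200)], dtype=np.float32)
X = X[..., np.newaxis]  # add a channel axis for the convolutional layers

# Small convolutional classifier: outputs the probability that a clip contains a call.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=X.shape[1:]),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=16, verbose=0)

# Score a new clip for the presence of a call.
test_spec = clip_to_spectrogram(synthetic_clip(True))[np.newaxis, ..., np.newaxis]
print(f"detection score: {model.predict(test_spec, verbose=0)[0, 0]:.2f}")
```

In an actual survey, the synthetic clips would be replaced by labeled recordings from underwater instruments, and the trained model would then be run across the full archive of sound data.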
Koogu has also been used to create ML-based detectors for katydid sounds, gunshots from elephant poachers, North Atlantic right whale vocalizations and more. It has been released as an open-source product so researchers around the world can adapt it for other bioacoustics studies.
The goal is to help other scientists use sound data to better understand animal behavior and population trends – information that can ultimately be used to create better conservation plans for creatures at risk from climate change and other factors.
This research was conducted by scientists from the Australian Antarctic Division, Australia; Curtin University, Centre for Marine Science and Technology, Australia; and the K. Lisa Yang Center for Conservation Bioacoustics, part of the Cornell Lab of Ornithology.
Pat Leonard is a writer for the Cornell Lab of Ornithology.