For decades, making sense of dolphin sounds such as clicks, whistles, and burst pulses has been a scientific challenge. But what if we could not only listen, but also uncover the patterns in their complex communication and generate lifelike responses?
On National Dolphin Day, Google, together with researchers at Georgia Tech and the Wild Dolphin Project (WDP), announced DolphinGemma, a foundational AI model trained to learn the structure of dolphin vocalizations and generate novel dolphin-like sound sequences. It marks a significant step toward interspecies communication and expands the potential for connection between humans and the ocean world.
Decades of Dolphin Social Research
Understanding a species requires deep context, and WDP has provided it. Since 1985, WDP has run the world's longest underwater dolphin research project, studying Atlantic spotted dolphins in the Bahamas. Using non-invasive methods, the team records video and audio paired with the identity and observed behavior of each individual dolphin.
Some distinctive sound types WDP has linked to behavior:
- Signature whistles (unique "names") that mothers use to call their calves
- Burst-pulse "squawks" during conflicts
- Click "buzzes" during courtship or when chasing sharks
Introducing DolphinGemma
DolphinGemma is built on Google's SoundStream audio tokenizer, which converts dolphin sounds into discrete tokens, and a roughly 400-million-parameter model small enough to run directly on Pixel phones. It ingests sequences of natural dolphin sounds, identifies structure in them, and predicts the likely next sound in a sequence, much as a language model predicts the next word.
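To make that analogy concrete, here is a minimal, illustrative sketch of the pattern described above: a small decoder-only transformer predicting the next token in a sequence of discrete audio tokens. Everything in it (the `TinyAudioLM` class, the vocabulary and context sizes) is an assumption for illustration, not DolphinGemma's actual architecture or API.

```python
# Illustrative sketch only: next-token prediction over discrete audio tokens,
# the general pattern DolphinGemma is described as using. All sizes and names
# here are assumptions, not the real model.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed size of a SoundStream-style audio codebook
CONTEXT_LEN = 256   # assumed maximum number of audio tokens per sequence

class TinyAudioLM(nn.Module):
    """A small decoder-only transformer over discrete audio tokens."""
    def __init__(self, vocab=VOCAB_SIZE, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(CONTEXT_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):  # tokens: (batch, seq)
        seq = tokens.shape[1]
        x = self.embed(tokens) + self.pos(torch.arange(seq, device=tokens.device))
        # Causal mask so each position only attends to earlier sounds.
        mask = nn.Transformer.generate_square_subsequent_mask(seq).to(tokens.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)      # logits over the next audio token

model = TinyAudioLM()
tokens = torch.randint(0, VOCAB_SIZE, (1, 32))  # stand-in for tokenized clicks/whistles
logits = model(tokens)
next_token = logits[0, -1].argmax().item()      # predicted next sound token
print(next_token)
```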
DolphinGemma has been used in the field, helping to detect repeating patterns, sound clusters, and potential meanings. Researchers have also used synthesized sounds attached to dolphins’ favorite objects to build a “shared vocabulary” for interactive communication.
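As one illustration of what "finding repeating patterns" can mean in practice, the sketch below clusters fixed-length windows of tokenized audio and counts which clusters recur. This is a generic analysis pattern under assumed data, not WDP's actual pipeline.

```python
# Hedged illustration: surface recurring "sound motifs" by clustering
# non-overlapping windows of an audio-token stream. The token stream here
# is random stand-in data, not real dolphin recordings.
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
token_stream = rng.integers(0, 1024, size=5000)  # stand-in tokenized recording
WINDOW = 16
windows = np.lib.stride_tricks.sliding_window_view(token_stream, WINDOW)[::WINDOW]

kmeans = KMeans(n_clusters=20, n_init="auto", random_state=0).fit(windows)
counts = Counter(kmeans.labels_)
print(counts.most_common(3))  # the most frequent motifs in this recording
```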
Using Pixel Phones for Underwater Communication
WDP is also developing the CHAT (Cetacean Hearing Augmentation Telemetry) system in collaboration with Georgia Tech. CHAT associates synthetic whistles with objects the dolphins enjoy, such as sargassum or scarves, giving dolphins the chance to learn to mimic those whistles to make "requests."
How it works (a schematic sketch follows this list):
- Hear the mimic accurately amid ocean noise
- Identify which whistle was mimicked, in real time
- Notify the researcher via underwater bone-conduction headphones
- Let the researcher provide the correct object to the dolphin in response
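Below is a schematic sketch of that loop. Every name in it (the whistle labels, the matcher stub, the headset class) is a hypothetical stand-in used to show the control flow, not the actual CHAT implementation.

```python
# Schematic CHAT-style interaction step. The matcher is a random stub; the
# real system must pick a dolphin's imitation of a synthetic whistle out of
# ocean noise on-device.
import random

# Hypothetical mapping from synthetic whistle label to the object it denotes.
WHISTLE_TO_OBJECT = {"whistle_sargassum": "sargassum", "whistle_scarf": "scarf"}

def detect_mimic(audio_frame: bytes) -> str | None:
    """Stand-in for the on-device matcher; returns a whistle label or None."""
    return random.choice([None, "whistle_sargassum", "whistle_scarf"])

class BoneConductionHeadphones:
    """Stub for the researcher's underwater bone-conduction headset."""
    def announce(self, message: str) -> None:
        print(f"[headset] {message}")

def chat_step(audio_frame: bytes, headset: BoneConductionHeadphones) -> None:
    label = detect_mimic(audio_frame)
    if label in WHISTLE_TO_OBJECT:
        # Cue the researcher so they can hand the dolphin the matching object.
        headset.announce(f"dolphin requested: {WHISTLE_TO_OBJECT[label]}")

chat_step(b"\x00" * 1024, BoneConductionHeadphones())
```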
A Pixel 6 has handled real-time analysis of dolphin sounds in the current system. The next generation, built around a Pixel 9 and planned for summer 2025, will integrate speaker and microphone functions and run AI models and template matching simultaneously, improving the speed and smoothness of interactions.
Sharing DolphinGemma with the Research Community
Google plans to share DolphinGemma as an open model this summer. Although it was trained on Atlantic spotted dolphin vocalizations, researchers can fine-tune it for other species, such as bottlenose or spinner dolphins.
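For researchers adapting such a model to another species' recordings, the generic recipe would be ordinary next-token fine-tuning. The sketch below shows only that pattern, reusing the illustrative `TinyAudioLM` from earlier; DolphinGemma's real training interface is not described in the source, and the data here is a random stand-in.

```python
# Generic next-token fine-tuning step on stand-in "bottlenose" token data,
# reusing the illustrative TinyAudioLM sketch defined above. Not the actual
# DolphinGemma fine-tuning API.
import torch
import torch.nn.functional as F

model = TinyAudioLM()  # from the earlier sketch
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

bottlenose_tokens = torch.randint(0, VOCAB_SIZE, (8, 64))  # stand-in data
inputs, targets = bottlenose_tokens[:, :-1], bottlenose_tokens[:, 1:]

optimizer.zero_grad()
logits = model(inputs)
loss = F.cross_entropy(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()
optimizer.step()
print(float(loss))  # cross-entropy on the new species' sequences
```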
The combination of field research, engineering, and AI technology is opening new doors for humans to gain a deeper understanding of intelligent marine life.
Source: https://blog.google/technology/ai/dolphingemma/