Bonobos, great apes closely related to humans and chimpanzees, inhabit the forests of the Democratic Republic of the Congo. They are known for their diverse vocalizations, including peeps, hoots, yelps, grunts, and whistles. Recently, a team of researchers from Switzerland, led by evolutionary anthropologist Melissa Berthet of the University of Zurich, reported that these animals can combine basic calls into larger structures with more complex meanings. This form of communication shows non-trivial compositionality, an ability previously considered exclusive to humans.
In their research, Berthet and her team built a database of 700 recorded bonobo calls and analyzed them using techniques borrowed from distributional semantics, an approach that has been applied in efforts to reconstruct long-lost languages and scripts such as Etruscan and Rongorongo. Their findings offer a first look at what bonobo calls mean in natural settings.
The Importance of Context
The foundational principle of distributional semantics is that words appearing in similar contexts tend to have similar meanings. To decode an unfamiliar language, researchers first compile a large collection of words and convert each one into a vector, a mathematical representation that places it in a multidimensional semantic space. They also need contextual data describing the circumstances in which each word was used, and this data can be turned into vectors as well. When the two sets of vectors are analyzed together, words with similar meanings tend to cluster close to one another in that space. Berthet and her colleagues set out to apply this framework to bonobo vocalizations. The concept sounded simple, but putting it into practice proved challenging.
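As a minimal sketch of the idea described above, and not the researchers' actual pipeline, the Python below averages yes/no context annotations into one vector per call type and compares call types with cosine similarity. The call names come from the article; the four context features, the six observations, and the function names are invented for illustration.

```python
import math
from collections import defaultdict

# Hypothetical context questions, answered 1 (yes) or 0 (no) per recording.
FEATURES = ["feeding", "resting", "grooming", "neighboring_group"]

# Invented example data: (call type, answers in FEATURES order).
OBSERVATIONS = [
    ("peep", [1, 0, 0, 0]),
    ("peep", [1, 0, 1, 0]),
    ("yelp", [1, 0, 0, 0]),
    ("yelp", [1, 1, 0, 0]),
    ("whistle", [0, 0, 0, 1]),
    ("whistle", [0, 1, 0, 1]),
]


def mean_context_vectors(observations):
    """Average the context answers of all recordings of each call type."""
    sums = {}
    counts = defaultdict(int)
    for call_type, answers in observations:
        if call_type not in sums:
            sums[call_type] = [0.0] * len(answers)
        for i, value in enumerate(answers):
            sums[call_type][i] += value
        counts[call_type] += 1
    return {
        call_type: [total / counts[call_type] for total in totals]
        for call_type, totals in sums.items()
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(call_type, vectors):
    """Return the other call type whose context vector is closest."""
    target = vectors[call_type]
    candidates = [
        (other, cosine_similarity(target, vector))
        for other, vector in vectors.items()
        if other != call_type
    ]
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    vectors = mean_context_vectors(OBSERVATIONS)
    for name in vectors:
        neighbor, score = most_similar(name, vectors)
        print(f"{name}: closest is {neighbor} (cosine {score:.2f})")
```

In this toy data the peep and yelp vectors end up close together because both occur mostly during feeding, while the whistle sits apart. Real analyses use far more features and recordings, and typically more elaborate embedding methods than a plain average.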
“We established a research camp in the forest, rising early at 3:30 AM and trekking for one to two hours to access the bonobo nests. As they awoke, I would activate my microphone to capture as many vocalizations as possible throughout the day,” Berthet recounts. Each recorded call required meticulous annotation with a complex array of contextual parameters. Berthet’s protocol included an extensive questionnaire with various inquiries: Is there a neighboring group present? Are any predators nearby? Is the caller engaged in feeding, resting, or grooming? Is another individual approaching the caller? For each of the 700 recorded calls, answers to 300 such questions were required.
Source
arstechnica.com