

By Matt Phillips
Introduction
Picture a human being before the dawn of language. They are returning to camp one afternoon. Walking along the beach, they stop to listen to the sounds of waves. Maybe they’ve never stopped like this before, or maybe it’s the tenth or the hundredth time, but they decide the waves make a pleasing sound, one they’d like to imitate.
And so, they try producing something like fwwwos, bwwwos, fwosbwos. None of these is quite right and they know it. They keep trying.
Now, jump ahead thousands of years to Homer’s Iliad (traditionally dated to the eighth century BCE), where the priest Chryses, harshly dismissed by Agamemnon, goes “in silence along the shore of the loud-roaring (polufloisboio) sea.” Polu- is a common Greek and English prefix (as in polyglot). But -floisboio is not so common, and much more remarkable: It strikes the ear like, well, a crashing wave, or a lackluster imitation of one.
This noun from ancient Greek, polufloisbos (here, in the nominative case), is onomatopoeic: a word that somehow imitates or suggests the sound it references. We tend to like such words. Think of sizzle, hiss, or cuckoo.
But they’re not just fun to say. Philosophers from Plato to Kant have wondered whether they offer a clue to the origins of language. The truth, however, may not be so simple.
Scholars across the globe have been engaged for centuries in a lively, and sometimes quite funny, debate over the origins of human language. Important work in this field continues today, drawing on disciplines from anthropology to biology, from psychology to linguistics, and from philosophy to literature.
The issue is complicated. Spoken language doesn’t leave behind many fossils, so it’s difficult to refute any one theory. A polymath-like knowledge of many disciplines is required for scholars to make headway, especially today.
Dorit Bar-On is one of these scholars, a professor of philosophy at the University of Connecticut and director of the Expression, Communication, and the Origins of Meaning research group. In 2020, she received an NEH fellowship to complete a book on the origins of language, drawing on studies of minded creatures other than humans as a foundation for investigating the subject.
Bar-On became interested in linguistics through earlier philosophical work on self-knowledge and what is called “theory of mind,” a problem that plagues scholars in these debates. We’ll return to the theory of mind after surveying the spirited history of the language-origins debate.
So how did our ancestors go from, say, an errant cry of pain or pleasure to the robust, organized system of language we know today?
Early Western Theories

In the eighteenth century, German philosopher Johann Gottfried von Herder had a radical idea, one that ran contrary to the popular notion that language was a divine gift.
Herder’s proposition was that vocal imitation (mimicry of the natural environment) could be the spark that, over time, led to fully developed language. Because nearby groups of humans share similar environments, the meaning of these imitations could be intuitively understood among them.
The idea was dismissed because most words are not onomatopoeic. But Herder was actually suggesting a first step, what scholars call a protolanguage: a foundation from which non-onomatopoeic words could later develop.
Another radical theory from the time, which Herder disputed in his Essay on the Origin of Language, suggested human language was derived from cries of pain: “I cannot conceal my astonishment at the fact that philosophers . . . can have arrived at the idea that the origins of human language [are] to be found in . . . emotional cries,” he wrote. “All animals, even fish, express their feelings by sounds; but not even the most highly developed animals have so much as the beginning of true human speech.”
A harsh dismissal, perhaps, but both the onomatopoeic and cry-of-pain theories shared a willingness to consider language as evolved rather than divinely received. Pugnacious rhetoric would be right at home among linguistics scholars, especially when Oxford’s Friedrich Max Müller entered the ring a century later.
A powerful and respected linguist, Müller dismissed both theories out of hand as the “bow-wow” and “pooh-pooh” theories of language origin, establishing a long tradition of name-calling in linguistics scholarship: sometimes, as here, to criticize opponents, and sometimes bestowed by scholars on their own theories to avoid an inevitable nickname of someone else’s choosing.
An expert in Proto-Indo-European (a theorized common ancestor of many languages still spoken today, as well as of Latin and ancient Greek), Müller believed in a single origin point for all modern languages.
“Language is the Rubicon which divides man from beast,” wrote Müller, “and no animal will ever cross it. . . . The science of language will yet enable us . . . to draw a hard and fast line between man and brute.”
At the deepest roots of the linguistic tree, Müller thought we would find one shared language, the defining characteristic of the human soul bestowed by God. Müller’s theory was widely critiqued even by his contemporaries, who, in an act of poetic justice, dubbed it the “ding-dong” theory.
Though Müller, in attacking the so-called “bow-wow” and “pooh-pooh” theories, took aim at what he believed was the influence of Charles Darwin, it was not until later (in 1871, to be exact) that Darwin offered his own account of the origin of language in The Descent of Man.
Like Herder, Darwin envisioned a protolanguage, coinciding with an increase in intelligence, which modern science has confirmed through studies of hominid brain size. But Darwin’s protolanguage was, in contrast, musical and motivated by sex. In his vision, protolinguistic humans who could sing well were appealing to potential mates and frightening to rivals.
Alternative Protolanguages: Musical and Gestural

Today, most scholars accept the likelihood of a protolanguage and have adapted Darwin’s theory in interesting ways. We know from studies of modern languages that, historically, they can lose what is called tonality, where the pitch of musical notes plays a defining role in articulating meaning. And even those who listen casually to music know that songs can translate a depth of emotion (despair, joy, triumph) without words at all.
The Danish linguist Otto Jespersen, who developed a model for musical protolanguage, reflected on this in his 1922 book, Language: Its Nature, Development and Origin. “The mere joy in sonorous combinations . . . no doubt counts for very much,” he wrote.
Jespersen imagined human protolanguage as holistic. A group of musical notes or a short song might become tied to a particular act, like going on a hunt or shucking clams. Slowly, perhaps a specific song could evolve to signify not only being on the hunt but a desire to go hunting, as when food is in short supply.
Yet another theory advocates for gestural protolanguage as our intermediary step. Modern sign language (a fully developed communication system in ways simple gestures are not) provides some of the best evidence both for and against this theory. Sign shares all the same levels as spoken language, because it crafts nuance through movement and shape instead of tone and inflection. We know, then, that gestural communication is effective.
But if sign language shares virtually every advantage with spoken language, and the only difference is medium, why did we need to evolve spoken language at all? Maybe speech could have evolved to allow us to communicate under the cover of darkness. But gestural language has strengths as well, allowing people to talk about someone nearby without their hearing it. For every advantage of spoken language, there comes a disadvantage.
Perhaps emitting a sound was never an advantage at all and was selected for unintentionally.
British scientist Richard Paget conducted some amateur investigations into this subject, which led him to promote in his 1930 book a “mouth gesture” theory: “Originally man expressed his ideas by gesture, but as he gesticulated with his hands, his tongue, lips and jaw unconsciously followed suit in a ridiculous fashion, ‘understudying’ . . . the action of the hands.”
Paget’s idea was little believed even then. American psychologist E. L. Thorndike quipped in reply: “Personally, I do not believe that any human being before Sir Richard Paget ever made any considerable number of gestures with his mouth parts in sympathetic pantomime.”
Some scholars of the twentieth and twenty-first centuries have advocated, nonetheless, for an unconscious transition from gesture to speech, including an intermediary step where limited vocalizations (perhaps onomatopoeic or related to pain or pleasure) accompanied gesture.
The prevalence of both gesture and speech cautions against selecting one to the exclusion of the other. Studies have shown that significant effort is required to not gesture, even in circumstances where gesture is clearly unnecessary, like talking to a friend on the telephone. It seems possible that, while gestural communication served our progenitors well in most scenarios, there were still times when imitative speech was necessary.
One theory, proposed by American linguist Derek Bickerton, imagines an early human hunter coming across an animal far too large for him to kill alone. Returning to his camp, in desperation to signal that an enormous source of meat looms nearby, he mimics the beast’s cry. (Bees and ants are capable of doing something similar by producing pheromones.) In Darwinian terms, such a situation models environmental pressure, and it leads to one more question: Why communicate in the first place?
At its simplest, theory of mind is our ability to grasp that others have a mental state just as we do. It’s typically been seen as an either-or issue: You have full theory of mind, or you have none. Psychologists have created tests for measuring it in children, who, around the age of four, begin to demonstrate awareness of other minds. We need something like theory of mind to desire to speak in the first place, hence the problem it causes in origin-of-language debates.
What theory of mind in adults looks like is familiar enough, says Professor Bar-On.

“You go to a bar, and you have an empty glass in front of you, and you’re deliberately catching the eye of the barwoman, and you tap your empty glass,” Bar-On explains. “The barwoman is going to recognize that you’re drawing her attention to that because you want it filled, and because she recognizes that’s what you want, she’s going to do it. So there is a kind of mutual song-and-dance.
“Your communicative act essentially depends on your relying on what she will be able to infer about your intention. Okay, so we now have (depends on the analysis) three or four levels of intention. That’s called meta-representation; it’s a representation of somebody else’s representation.
“And that’s what you have to have before you can do anything like communicate using language. And here is my worry about this: Look at the structure of the thought that you have to have in order to engage in utterances with speaker meaning. It’s very much the structure of language. The thought is: I want her to recognize that I want her to understand what I’m doing. All this embedding, right?
“It raises the question: How could such thought arise before language? And aren’t we assuming that now we have a psychological Rubicon where before we had a language Rubicon? In order to cross the psychological Rubicon, you have to have this language-like thought. And then our puzzle is exactly the same: How could language-like thought arise where it didn’t exist before?”
Bar-On proposes an alternative way of looking at the problem, one that assumes theory of mind could evolve in parts. This approach draws on research showing that young children are still developing fuller theory of mind past age four, as well as studies of high-functioning persons with autism, who conventionally fail theory-of-mind tests but nevertheless have highly developed language.
If theory of mind can come in degrees and have various components (a theory that more psychologists espouse today), many of our problems are, if not solved, then certainly easier. We can imagine certain types of communication (both gestural and lexical, even musical) that don’t require such a sophisticated level of meta-representation. Mimicking a laugh to signal your own happiness does not require as deep a level of speaker meaning as signaling to get your beer refilled.
Perhaps, as human communication evolved, so too did mindedness; as one grew more complex, more capable of organizational structure, so did the other.
This both-and approach can guide how we think about lexical, musical, and gestural theories of language evolution. “Everybody at one point or another thought about the question, How did language come to be?” Bar-On says. “And so it’s not surprising that there have been all these different myths, all these just-so stories.”
After centuries of philosophical debate and scientific investigation, no one theory has succeeded in solving all the problems encompassing all the modes of communication needed to get through a day.
But Bar-On remains optimistic, including about past theories of bow-wowing and pooh-poohing our way to spoken language. “My hunch would be that they all have something to offer. . . . Each suggests one possible element in the toolkit of our ancestors.”
As we imagine the many distinct tasks performed by our progenitors on a daily basis (hunting large prey, instructing and caring for young, even just pausing to listen to waves), we can imagine equally many ways to gesture, sing, and mimic our way toward sharing those experiences with others. And though the fossils are hard to find, scholars like Bar-On continue charting new paths, weaving disciplines together, striking out on this mysterious Rubicon.
Originally published to the public domain by Humanities, the Magazine of the NEH 42:2 (Spring 2021).


