Combining Linguistics, Archaeology, and Ancient DNA Genetics to Understand Deep Human History




By Dr. Michael Dunn and Dr. Annemarie Verkerk / 03.29.2018
Dunn: Professor in Linguistics and Philology, Uppsala University
Verkerk: Postdoctoral Research Associate in Linguistics, Max Planck Institute for the Science of Human History

It’s difficult to understand what people mean when they say that a language is “old”. A person who was born a long time ago is old, but a language is recreated by its speakers every generation, and so it changes every generation.

It’s easier, though, to assign an age to a language family. By definition, a group of related languages ultimately descends from a common ancestor, and this common ancestor must have existed at some particular time. This ancestral language must have come from somewhere, too, of course, but we simply don’t have any linguistic evidence for what came before.

Until recently, estimates of the age of language families relied on the informed extrapolations of specialists. But modern computational methods in linguistics now let us infer the ages of language families more precisely. These new methods, for example, recently let us propose a new age for the Dravidian language family: 4,500 years.

Since the mid-19th century it has been recognised that most of the 462 languages of India belong to two main stocks: the Dravidian family and the Indo-European family. More than a billion people live in India. Of these, about 20% speak a Dravidian language, such as Telugu, Malayalam, Tamil and Kannada. Meanwhile, 75% speak an Indo-European one, including Hindi, Punjabi and Urdu.

Some languages in both these families have literary traditions going back more than 2,000 years; many others are unwritten. Dravidian is spoken predominantly in the southern end of the subcontinent; the Indo-European languages cover most of the north, and extend across Eurasia (English, Welsh, French, Russian, Greek and Persian are all Indo-European languages, too).

Map of the Dravidian languages in India, Pakistan, Afghanistan and Nepal adapted from Ethnologue. Kolipakam et al. 2018

How the speakers of these languages reached their current locations is an enduring mystery in human prehistory.


Of course, linguistics is not alone in making claims about prehistory. Archaeology and ancient DNA genetics in particular also have a lot to tell us about human populations in the past. But each of these disciplines gives only one perspective on history.

Linguistics tells us about the transmission of cultural traditions, but not who the people transmitting these cultures were. Genetics lets us track people through biological descent, but not whether these people belonged to what we might consider the same group over time. Archaeology describes snapshots of the material products of cultures in the past, but has only so much to say about what those cultures were, where they came from, and who they became.

Each discipline tells us only part of the story. And so the truest picture of prehistory comes from triangulating these independent lines of evidence.

To investigate the history of the Dravidian language family and its speakers, we therefore combined and compared relevant research from linguistics and archaeology.

The age we propose for Dravidian – 4,500 years – is consistent with archaeological hypotheses linking the dispersal of Dravidian with the South Indian Neolithic. This period began about 5,000 years ago, when archaeological evidence from the South Deccan plateau shows locally domesticated crops and animals, and thus probably a more sedentary lifestyle. This lifestyle spread and changed as the peoples associated with these crops and livestock moved south, east, and north.

It makes sense that the common linguistic ancestor of Dravidian was spoken at the same time as a dispersal of cultures and peoples who ultimately contributed to much of the Dravidian-speaking population of today. But having a plausible hypothesis is still a long way from proof.

Population genetics

Genetics, of course, can add further evidence. Genetic reconstruction of Indian population history has shown that most Indians today carry ancestry from two ancient populations, “Ancestral North Indians” (ANI) and “Ancestral South Indians” (ASI). ANI were related to Central Asians, Middle Easterners, Caucasians (that is, peoples of the Caucasus), and Europeans, while ASI, curiously, were not closely related to any population outside of the subcontinent. The strength of each component varies across Indian groups, with ANI ancestry being associated with Indo-European speakers and traditionally “higher” caste membership.

Further research has shown that the mixing of ANI and ASI ancestry is relatively recent, occurring 1,900–4,200 years ago. This period in Indian prehistory is marked by massive cultural and demographic change, including a shift from cultural norms under which intermarriage between different groups was common to norms under which it was restricted, as well as the introduction of the Indo-Aryan languages to the subcontinent.

ASI ancestry, the South Indian Neolithic, and the origins of Dravidian all date to the beginning of the mixing of ANI and ASI or earlier. The South Indian Neolithic is marked by ashmounds and the domestication of local crops, giving it a distinct character. This matches the characterisation of ASI as distinctly local and of Dravidian as “native” to the subcontinent, and so supports our finding that the language family is 4,500 years old.

Prospects from ancient DNA

Our study focused on dating the Dravidian language family as well as estimating how its major branches have developed over time. Something that we haven’t done, but which would be really interesting, is phylogeography, where not only the family history of languages is reconstructed, but also the locations of ancestral languages. This would allow researchers to assess more closely the link between the Dravidian homeland and the South Indian Neolithic.

Such a study could be combined with work on ancient DNA, extracted from remains that are thousands of years old. Ancient DNA from people who lived during the South Indian Neolithic could tell us about the origin of the peoples who developed agriculture in South India. Triangulation with phylogeographic analysis in linguistics could in turn tell us how likely it is that these people were Dravidian speakers.

Unfortunately, the current prospects for uncovering usable ancient DNA in India are dire because of the tropical climate. But since the techniques for extracting ancient DNA are still young, one can hope for exciting new links between disciplines in the future.

Originally published by The Conversation under a Creative Commons Attribution/No derivatives license.