The Indo-European Languages: A Summary

Since I have written several essays about the Indo-European language family, I can sum up what I have found so far. Interesting things happened in Europe long before the Greeks and the Romans entered the scene. One of the hypotheses presented in my writings is that culture is the product of genes plus ideas. Since most of our genes date back to the Stone Age, you could successfully claim that our culture indirectly has roots dating back to that era. However, one of the most ancient parts of the European cultural heritage still in use is language.

Modern English, Spanish, French, German, Dutch, Italian and Russian have roots back to the Late Neolithic and are all derived from a single mother language that is referred to as Proto-Indo-European. This is universally accepted, but where and when this language was initially spoken is more controversial. I belong to those who support the theory that PIE was originally spoken north of the Black Sea in what is today the Ukraine and southern Russia by 3500 BC, when the first expansion began with the introduction of wheeled vehicles. David W. Anthony writes about this subject in his book The Horse, the Wheel, and Language.

The PIE language which has been reconstructed by linguists over the past two centuries contains words for a technological package that according to archaeological evidence existed after 4000 BC. An early form may have existed just prior to this and a late form after 3000 BC, at which point PIE was breaking apart. Scholars J. P. Mallory and D. Q. Adams explain in The Oxford Introduction to Proto-Indo-European and the Proto-Indo-European World:

“[I]ndividual Indo-European groups are attested by c. 2000 BC. One might then place a notional date of c. 4500-2500 BC on Proto-Indo-European….By c. 2000 BC we have traces of Anatolian, and hence linguists are willing to place the emergence of Proto-Anatolian to c. 2500 BC or considerably earlier. We have already differentiated Indo-Aryan in the Mitanni treaty by c. 1500 BC so undifferentiated Proto-Indo-Iranian must be earlier, and dates on the order of 2500-2000 BC are often suggested. Mycenaean Greek, the language of the Linear B tablets, is known by c. 1300 BC if not somewhat earlier and is different enough from its Bronze Age contemporaries (Indo-Iranian or Anatolian) and from reconstructed PIE to predispose a linguist to place a date of c. 2000 BC or earlier for Proto-Greek itself.”

Greek, the Indo-European language of the palace-centered Bronze Age warrior kings who ruled at Mycenae and other strongholds, is attested in the mid-second millennium BC. The breakthrough in the decipherment of the Linear B tablets was made by the Englishmen Michael Ventris (1922-1956) and John Chadwick (1920-1998) in the early 1950s. Ventris was himself surprised to discover that the language in question was a very early form of Greek.

How was the Indo-European language family discovered? Similarities between European languages had been known for a long time, but systematic studies of them appeared gradually in early modern Europe. For instance, the scholar Joseph Scaliger constructed language groups based on their word for “god,” i.e. the Deus group (from Latin deus, with variations in the Romance languages), the Gott group (from Germanic god or Gott) and the Bog group (from Slavic bog). Suggestions of similarities between Indian and European languages began to be made by European visitors to India in the sixteenth century. Mallory and Adams again:

“Joseph Scaliger (1540-1609), French (later Dutch) Renaissance scholar and one of the founders of literary historical criticism, who incidentally also gave astronomers their Julian Day Count, could employ the way the various languages of Europe expressed the concept of 'god' to divide them into separate groups; in these we can see the seeds of the Romance, Germanic, and Slavic language groups. The problem was explaining the relationship between these different but transparently similar groups. The initial catalyst for this came at the end of the sixteenth century and not from a European language. By the late sixteenth century Jesuit missionaries had begun working in India – St Francis Xavier (1506-52) is credited with supplying Europe with its first example of Sanskrit, the classical language of ancient India, in a letter written in 1544 (he cited the invocation Om Srii naraina nama). Classically trained, the Jesuits wrote home that there was an uncanny resemblance between Sanskrit and the classical languages of Europe. By 1768 Gaston Cœurdoux (1691-1777) was presenting evidence to the French Academy that Sanskrit, Latin, and Greek were extraordinarily similar to one another and probably shared a common origin.”

The year which is usually seen as the birth of Indo-European studies is 1786, when the English scholar William Jones (1746-94) gave a speech to the Asiatic Society in Calcutta. Jones is said to have known thirteen languages well, and twenty-eight fairly well, at the time of his death. In 1783 he was appointed to the judgeship at the high court at Calcutta; he arrived the same year and was to stay there until his death. He was to transform the intellectual life of India when he founded the Asiatic Society of Bengal and the associated journal, Asiatick Researches, dedicated to the scientific study of the literature, history and philosophy of India. In 1786 he elaborated a theory of the common origins of most European languages and those of much of India, an intuition that marks the beginning of comparative-historical linguistics:

“The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothic [Germanic] and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit; and the Old Persian might be added to the same family, if this were the place for discussing any question concerning the antiquities of Persia.”

As Ibn Warraq says in Defending the West: A Critique of Edward Said’s Orientalism, “With his work on Indian chronology, and having created a solid framework for the understanding of India’s past, Jones, in effect, can be considered the father of Indian history. Jones’s translation of Sacontala (Shakuntala) had an enormous influence in Europe, inspiring Schiller, Novalis, Schlegel, and Goethe, who used its introductory scene as a model for the ‘Vorspiel auf dem Theater’ of Faust (1797). But even more remarkably, the collection, printing, and translations of Sanskrit texts by Jones and other Orientalists made available for the first time to Indians themselves aspects of their own civilization, changing forever their own self-image. Until now, these texts had only been accessible to a narrow coterie of Brahmins.”

By 1800 a preliminary model for this language family had been constructed. The English polymath Thomas Young (1773-1829) first used the term “Indo-European” in 1813. The great French philologist Jean-François Champollion (1790-1832) is correctly credited with having deciphered Egyptian hieroglyphs from the trilingual Rosetta Stone in 1822, but contributions were made by Young and others such as Johan David Åkerblad (1763-1819) from Sweden. In the early nineteenth century, progress was made by the German linguist Franz Bopp and the philologist Rasmus Rask from Denmark. The German linguist August Schleicher (1821-1868) in 1868 published the first artificial text composed in the suggested PIE language. On the naming and early growth of the family, Mallory and Adams write:

“The language family came to be known as Indo-Germanic (so named by Conrad Malte-Brun in 1810 as it extended from India in the east to Europe whose westernmost language, Icelandic, belonged to the Germanic group of languages) or Indo-European (Thomas Young in 1813). Where the relationship among language groups were relatively transparent, progress was rapid in the expansion of the numbers of languages assigned to the Indo-European family. Between the dates of the two early great comparative linguists, Rasmus Rask (1787-1832) and Franz Bopp (1791-1867), comparative grammars appeared that solidified the positions of Sanskrit, Iranian, Greek, Latin, Germanic, Baltic, Slavic, Albanian, and Celtic within the Indo-European family. Some entered easily while others initially proved more difficult. The Iranian languages, for example, were added when comparison between Iran’s ancient liturgical texts, the Avesta, was made with those in Sanskrit. The similarities between the two languages were so great that some thought that the Avestan language was merely a dialect of Sanskrit, but by 1826 Rask demonstrated conclusively that Avestan was co-ordinate with Sanskrit and not derived from it. He also showed that it was an earlier relative of the modern Persian language.”

A few Indo-European varieties still spoken today are not allocated to major IE sub-groups or branches. Greek has a long history, whereas Armenian dates from the first millennium AD. The earliest references to Albanian are found in the fourteenth century AD. The central part of the IE area is represented by little-known ancient languages such as Illyrian in the west Balkans, Dacian and Thracian in the east Balkans and Phrygian in Anatolia.

The sub-group parent known as Proto-Celtic may have been spoken by 1000 BC. Celtic-speaking groups spread across much of the European continent, except the far north, east and south, during the first millennium BC. As a consequence, Celtic place-names abound from Portugal to Poland. The name of the province of Galicia in northern Spain is definitely a Celtic one. The province of Galicia in southern Poland and western Ukraine may be so, too. The Celtic root Gal-, indicating “Land of the Gaels or Gauls,” occurs in Portugal (possibly), Galicia in Spain, Gallia (Gaul), Pays de Galles (Wales) and in distant Galatia in Asia Minor. The western Celts called themselves Cymry or “compatriots,” but were dubbed Welsh or “foreigners” by their Germanic-speaking Anglo-Saxon neighbors in medieval times.

The Insular Celtic languages in Britain and Ireland of the first centuries BC later evolved into Welsh, Cornish and Breton. Welsh developed a rich literary tradition during the Middle Ages and is still a living language, whereas Cornish became extinct by the end of the eighteenth century. Breton originated in Britain and was carried from there to Brittany from the fifth to seventh centuries AD, where it may have encountered surviving speakers of Gaulish Celtic. Irish gave rise to two further languages, Scots Gaelic and Manx, that were carried to their historical positions in the Early Middle Ages. From a linguistic standpoint, the most important of the Celtic languages are Old and Middle Irish due to their large textual output.

In the Mediterranean zones at the onset of literacy a number of languages are attested, but because of the Roman Empire these were replaced by Latin and its descendants. The Italic branch of the Indo-European tree includes several extinct ancient languages such as Oscan, spoken south of Rome, and Umbrian spoken north of Rome, but also the Romance group comprising modern languages that descend from Vulgar Latin: Spanish (Castilian), Catalan, Galician and Portuguese in the Iberian Peninsula and its former colonies in Latin America as well as French, Italian, Romanian and a number of smaller languages. The development of these languages is relatively easy to record. The earliest textual evidence for the various Romance languages begins with the ninth century for French, the tenth century for Spanish and Italian, the twelfth century for Portuguese, and the sixteenth century for Romanian. Author James Clackson states in his book Indo-European Linguistics: An Introduction:

“In the case of the Romance languages, we have the bonus of having records of Classical Latin, which is close enough to the spoken variety from which the Romance group evolves to be considered the sub-group parent. We can see in Latin the word-forms which will eventually evolve to become the shared vocabulary of Romance: aqua ‘water’ can be considered the earlier form ancestral to Italian acqua and Spanish agua; pater ‘father’ develops into Italian padre and Spanish padre. For the Romance group, we can unearth the phonological changes which words have undergone in the centuries between Roman times and the present. We can identify which words are borrowings and which stem from Latin.”

Modern-day Spain and Portugal have a complex linguistic history. The Iberian language, a non-Indo-European tongue, went extinct during the Roman period. Celtic was spoken in the north before the Roman conquest and Germanic afterward. Semitic tongues came with traders from Carthage in Antiquity and Arab invaders during the Middle Ages. The Spanish and Portuguese languages evolved from Vulgar Latin in medieval times.

The Basque people inhabit the Pyrenees in northern Spain and southwest France. Their tongue is a language isolate, with no known living or dead relatives. It contains words for knife, axe and other tools which carry the root meaning of “stone.” Most likely it represents the last surviving descendant of the languages that existed in some regions of Stone Age Europe.

The Proto-Germanic language is not directly attested, but was probably spoken in the first millennium before the Christian era, possibly in southern Scandinavia. Eastern Germanic is attested by a single language, Gothic, the language of the Visigoths who settled in the Balkans. The spread of Christianity encouraged literacy in many languages, including non-Indo-European ones such as Georgian in the Caucasus. As with Slavic and Armenian, the oldest extensive written document in Germanic is a Christian text, the Gothic translation of the New Testament of the Bible by the bishop and missionary Ulfilas, or Wulfila (ca. AD 311-382).

The runic alphabet, often known as futhark after its first six letters, is thought to have been modeled on the Latin and/or Etruscan alphabet somewhere in south-central Europe where Germanic-speakers encountered literate peoples. In Old Norse, rune meant “inscription” or “mystery.” Runes are attested in Central Europe, Britain, Scandinavia and later Iceland from the 1st century AD to the 1600s for specialized purposes such as short inscriptions on weapons and in the sphere of cult and magic, although some longer runic texts were made during the Middle Ages. Their use continued into the Christian era in certain parts of Scandinavia, but runes appear to have been associated with the pagan religion. Mallory and Adams state:

“The northern group of Germanic languages is the earliest attested because of runic inscriptions that date from c. AD 300 onwards. These present an image of Germanic so archaic that they reflect not only the state of proto-Northern Germanic but are close to the forms suggested for the ancestral language of the entire Germanic group. But the runic evidence is meagre and the major evidence for Northern Germanic is to be found in Old Norse. This comprises a vast literature, primarily centred on or composed in Iceland. The extent of Old Norse literature ensures that it is also regarded as an essential comparative component of the Germanic group. By c. 1000, Old Norse was dividing into regional east and west dialects and these later provided the modern Scandinavian languages. Out of the west dialect came Icelandic, Faeroese, and Norwegian and out of East Norse came Swedish and Danish. The main West Germanic languages were German, Frankish, Saxon, Dutch, Frisian, and English….Incidentally, the closest linguistic relative to English is Frisian followed by Dutch.”

Among the Indo-European branches we find the Slavic or Slavonic languages. In the prehistoric period the Baltic languages, which include modern Lithuanian and Latvian, were so closely related to the Slavic ones that we can speak of a Balto-Slavic proto-language. The Slavic languages in early medieval times expanded over territories previously occupied by speakers of Baltic languages. Slavic place-names abound further west than the current limits of Slavic speech, especially in Germany and Austria. From the sixth century AD, Slavic tribes pushed south and west into the world of the Eastern Roman/Byzantine Empire to settle in the Balkans and East Central Europe. The first written Slavic language, Old Church Slavonic, relates to the Byzantine Christianizing missions of Saints Cyril and Methodius in the ninth century, with Biblical translations directed at Slavic speakers in Moravia and Macedonia.

The prestige of Old Church Slavonic, closely associated with the rituals of the Orthodox Church, made it play a major role in the evolution of the Slavic tongues. The South Slavic languages, Macedonian, Bulgarian, Slovenian and Serbo-Croatian, are recognizable about 1000-1100 AD. The East Slavic ones, Russian, Ukrainian and Byelorussian, are closer to each other and developed after 1600 due to the influence of OCS. The West Slavic languages, Polish, Czech and Slovak, were less affected by OCS since they came under the dominion of the Roman Catholic Church, which used Latin. The origins of the Slavs are shrouded in uncertainty and national myth. As The Oxford Illustrated History of Prehistoric Europe says:

“Earlier notions of a Slav genesis within a limited area have now been generally abandoned, though they are still made to appear in some accounts fully formed from the Pripyet Marshes. More plausibly, the Slavs, the Sclaveni of Byzantine sources, were an amalgam of cultural groups based between the Dnestr and Dnepr in the east and the Vistula and Oder in the west during the late fifth and earlier sixth centuries. They certainly had links with the Baltic peoples to the north and various Germanic groups to the west. Their movement westward and southward was facilitated by the advance of Germanic peoples into the Danube lands. Within a short time of their recognition in our written sources, Slav settlers had entered Bohemia, passed from there down the Elbe valley, extended north into Poland and eastern Germany, and south into the Balkans by way of Bulgaria. Further expansion into western Europe seemed inevitable, but the Frankish advance east of the Rhine brought it to a stop. The Slavs did hold on to the northern Balkans, though their occupation of Greece was ended by the Byzantines in the ninth century.”

Prior to recent mass immigration to Western Europe and apart from invading Muslims, the vast majority of Europeans spoke some form of Indo-European language, but one smaller linguistic family did traditionally exist on that continent: the Uralic family, which currently has just over 20 million speakers in total, a tiny number compared to the billions of people now speaking an Indo-European language. This group is believed to have had its original homeland, or Urheimat, somewhere close to the Ural Mountains, where a few Uralic languages are still in use, and its parent language was perhaps spoken at about the same time as PIE.

Among those speaking Uralic languages are the Hungarians (Magyar) in East Central Europe; otherwise these languages are mainly concentrated in the Nordic and Baltic Sea region. Before 1800 the Hungarian linguist Sámuel Gyarmathi (1751-1830) managed to demonstrate that Hungarian was distantly related to Finnish, the national language of Finland. Much more closely related to Finnish is modern Estonian. Finally, we have the Uralic languages of the Sami peoples or “Lapps” who inhabit parts of northern Norway, Sweden, Finland and the Kola Peninsula in northwest Russia. This region is sometimes referred to as Lapland.

Two new branches were added to the Indo-European linguistic tree in the early twentieth century. The first one was Tocharian, once spoken in Central Asia and the far western border regions of China. The other was Anatolian, which includes Hittite and Luvian. The Hittites created a state in central Anatolia (present-day Turkey). Hittite is extensively documented through tablets from the mid-second millennium BC and was first suggested to be an Indo-European language by the Norwegian linguist Jørgen Alexander Knudtzon (1854-1917). The Czech linguist Bedřich Hrozný (1879-1952) deciphered the Hittite language some years later.

There are those who argue that through comparative studies, you can find traces of a Proto-Indo-European mythological universe distributed throughout the area of early Indo-European speech, obviously overlaid with later changes and influences from preexisting cultures. There could be elements of a PIE creation myth preserved in the traditions of the Celts, Germans, Slavs and Indo-Aryans. These traditions indicate a proto-myth whereby the universe is created from a primeval giant, such as the Norse Ymir or a “man” like the Vedic Purusa, who is sacrificed and dismembered. Each part of his anatomy provides a different element of nature: his flesh becomes the Earth, his hair grass, his bone yields stone, his blood water, his eyes the Sun, his mind the Moon, his brain the clouds, his breath the wind and his head becomes the heavens. Here are the authors J. P. Mallory and D. Q. Adams again:

“As to the identity of the sacrificer we have hints in a related sacrifice that serves as the foundation myth for the Indo-Iranians, Germans, and Romans (with a possible resonance in Celtic). Here we find two beings, twins, one known as ‘Man’ (with a lexical cognate between Germanic Mannus and Skt Manu) and his ‘Twin’ (Germanic Twisto, Skt Yama with a possible Latin cognate if Remus, the brother of Romulus, is derived from *Yemonos ‘twin’). In this myth ‘Man’, the ancestor of humankind, sacrifices his ‘Twin’. The two myths, creation and foundation of a people, find a lexical overlap in the Norse myth where the giant Ymir is cognate with Skt. Yama and also means ‘Twin’. The dismemberment of the primeval giant of the creation myth can be reversed to explain the origins of humans and we find various traditions that derive the various aspects of the human anatomy from the results of the original dismemberment, e.g. grass becomes hair, wind becomes breath. The creation myth is then essentially a sacrifice that brought about the different elements of the world.”

Through Indo-European comparative mythology, scholars have examined the narrative structure of the literary traditions of the various Indo-European speaking groups to reveal striking parallels between their different traditions. Much of the career of the Greek Odysseus is paralleled by distinct incidents in the lives of Arjuna in the Mahabharata, the Buddha in the earliest Buddhist texts in ancient India and Cú Chulainn in early Irish heroic literature. A few of these parallels are thought to be too close to have evolved purely by accident.


Iberian languages

Dear Fjordman: The pre-Latin Iberian languages were Indo-European, as many toponyms show: Segobriga, Burgos, Braga, Fraga and many more...

About this paragraph you are wrong:

"The Basque people inhabit the Pyrenees in northern Spain and southwest France. Their tongue is a language isolate, with no known living or dead relatives."

When the Carthaginian Hannibal raised an army to attack Rome, he enrolled soldiers from around Carthage, from the Balearic Islands and from Iberia, with Carthaginian officers, but the Berbers deserted when they learned they were to fight the Romans, and they appeared centuries later looting the Indo-European peoples around them.

So it is not an isolated language, since it is quite similar to North African Berber.

Basque v. Berber

@Tubal: You present an old and pseudo-scientific hypothesis as a fact, which is clearly misleading. See Wikipedia: "Basques as migrants from the North of Africa, more exactly from the Berber ethnic group. This is an old hypothesis based in pseudo-scientific comparison between Basque and Tamazight languages that is now widely discredited."


@Luc: I didn't say that the Basques were immigrants; I said they deserted Hannibal's army.

It is not only the comparison between the Berber languages and Basque that shows the similarity; genetic and blood-group comparisons also show that their relatives are around Algiers.

Since this statement has implications beyond science and history and enters politics, undermining the basis of Basque terrorism's claim that the Basques are the original people of Europe and the Indo-Europeans are the invaders, the controversy arises.

De Gaulle

Actually I understand that de Gaulle's family origins were never fully determined, and that he was either Flemish (Germanic) or Gaulish (Celtic). His surname could be derived from gaule or Galle as much as the Dutch Van de Walle. Gallia is a Latin derivation of Gaul...


Gallia can't be a Latin derivation of Gaul, as Gaul is a more recent word. What could be said is that Gallia is derived from a name that was also used by the people that we now call the Gauls. Here's what wiki says about "Gaul": "the word comes from Germanic *Walha-."

And here's what wiki says about Gallia and Gallus: "Latin Gallus might originally be from a Celtic ethnic or tribal name, perhaps borrowed into Latin during the early 5th century BC Celtic expansions into Italy. Its root may be the Common Celtic *galno, meaning 'power' or 'strength'. The Greek Galatai seems to be based on the same root, borrowed directly from the same hypothetical Celtic source which gave us Galli."

In modern Breton, the word for power is still 'gallout' (a Celtic word), and 'gall' means French (from the Latin Gallia).


" dubbed Welsh or “foreigners” "

By the way, "Gaul" has the same root as "Welsh". It comes from a Frankish word (*walha) that meant "foreigner" (to the Franks, it meant either Roman or Celt).
And Gallia is a Latin name. I suppose it means the land of the Celts.
By a coincidence, both names (Gaul and Gallia) are similar.

Charles de Gaulle

@ Armor.
That is correct. Btw: Charles de Gaulle was of Flemish origin. The original version of his name, "Van (de) Wale", was Flemish/Dutch and has the same meaning as "de Gaulle". The Dutch language is an offspring of Frankish.
Wallonia, the French-speaking southern part of Belgium, is a Frankish name, derived from *walha.
Wallonia and Wales have the same meaning: The land of the Gauls.

Liberalism/marxism is a mental disease

How can you think about racism when reading an article about languages of Europe 5000 years ago?

This is the effect of marxism on the people in the western world.

God, help us all!


"Racist" is a completely meaningless word that people use to intimidate others when they have no real arguments of their own. If you mean that different groups of people have different talents due to their different genetic profiles then the answer is yes, I believe so. That's what science tells us. One of the reasons why the Indo-European language family has continued to spread for more than five thousand years, long after the initial advantage provided by wheeled vehicles ceased to exist, is because of the high genetic intelligence of its European speakers.

Ideas partly have a genetic basis, but not fully so. It is also possible that while genes shape ideas, ideas may in certain situations shape genes. Studies of gene-culture coevolution constitute a major branch of evolutionary biology. The most extreme case would be white Westerners, who are currently wiping themselves out because of a cultural belief in anti-racism.


that culture is the product of genes plus ideas

That you are a racist is well known, but I can't figure out "ideas". It seems to me that ideas are triggered by a mix of intellect and culture, not the other way around...