
February 13, 2014

Human admixture common in human history (Hellenthal et al. 2014)

A string of recent papers has argued for admixture in human populations at time scales from the Middle Pleistocene to recent centuries. A new paper in Science makes a convincing case for extensive admixture in humans over the last few thousand years. The authors include the creators of the Chromopainter/fineStructure software; the new "Globetrotter" method appears to be a natural extension of that approach, which seemed to work wonderfully well except for the limitation of producing only a tree of the studied populations.

The paper has a companion website in which you can look up the admixture history of individual populations.

While reading this study, it is important to remember its limitations. Two are immediately obvious: (i) admixture events can only be detected for the last few thousand years, as the method depends on patterns of linkage disequilibrium, which decays exponentially with time due to recombination, and (ii) detection of admixture seems to depend on the presence of maximally differentiated populations from the edges of the human geographical range; for example, the Japanese appear unadmixed even though they are clearly of dual Jomon/Yayoi ancestry. On the other hand, the method does detect the admixture present in the San at a similar time scale.
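To see why the first limitation bites, here is a minimal sketch. It assumes the usual approximation that admixture LD between two markers separated by d Morgans shrinks by a factor of exp(-g * d) after g generations; the function name and the 1 cM spacing are illustrative choices, not taken from the paper.

```python
import math

def admixture_ld_remaining(generations, distance_morgans):
    """Fraction of the initial admixture LD left between two markers.

    Uses the approximation exp(-generations * distance); an assumption
    for illustration, not the paper's actual model.
    """
    return math.exp(-generations * distance_morgans)

# Markers 1 centimorgan apart (0.01 Morgans), an illustrative distance.
for g in (10, 50, 100, 200, 500):
    frac = admixture_ld_remaining(g, 0.01)
    print(f"{g:4d} generations: {frac:.3f} of the initial LD remains")
```

After a few hundred generations almost nothing is left at this spacing, so older events have to be read from ever shorter distances, where the signal is harder to separate from background LD.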

The case of Northwestern Europe appears especially striking as none of the populations from the region show evidence of admixture. This may be because the mixtures taking place there (e.g., between "Celts" and "Anglo-Saxons" in Great Britain) involved populations that were not strongly differentiated. Alternatively, population admixture history may have preceded the last few thousand years and is thus beyond the temporal scope of this method.

The Armenians are an exception to the rule that populations at the edges of the human range appear to be unadmixed: they appear to be the only * between the Atlantic and Pacific in Figure 2D (shown at the beginning of this post). The companion site lists their status as "uncertain".

Other results are more questionable. For example, the authors assert that Sardinians are an admixed population, with one side being "Egyptian-like" and the other "French-like". The ancient DNA evidence as it stands would rather indicate that Sardinians are the best approximation of Neolithic Europeans currently in existence, and so are more likely to (mostly) possess a gene pool that traces back ~8-9 thousand years in Europe. It would be quite a surprise if so many Europeans from 5 kya or earlier looked like modern Sardinians and ancient Sardinians didn't!

The analysis of Eastern Europe is particularly interesting as it documents three-way admixture (Northern/Southern/NE Asian) in most populations but two-way admixture (Northern/Southern) in Greeks, estimated at ~37%. The authors claim that this is related to the Slavs, which seems reasonable given the 1054 AD age estimate. On the other hand, according to the companion website, the southern element in Greeks is inferred to be Cypriot-like and it's far from clear that the pre-Slavic population of Greece was Cypriot-like or indeed represented by any of the populations in the authors' dataset.

The three-way admixture in much of eastern Europe is not particularly surprising, as history furnishes ample evidence for groups of steppe origin in the region during historical times. Some bequeathed both their language and name (e.g., Magyars), others only their name (e.g., Bulgarians), to the local Europeans, but records indicate a widespread presence of "eastern" groups in Europe from the time of the Huns to that of the Ottomans. A study of late Antique eastern Europeans from the Baltic to the Aegean may help better document how the twin phenomena of the eastern invasions and the spread of the Slavs shaped the present-day genetic diversity of the region.

I suspect that a few ancient samples will be far more informative for understanding the recent history of our species than the most sophisticated modeling of modern populations. Nonetheless, it's great to have a new method that maximizes what can be learned about the past from the messy palimpsest of the present.

Science 14 February 2014: Vol. 343 no. 6172 pp. 747-751 DOI: 10.1126/science.1243518

A Genetic Atlas of Human Admixture History

Garrett Hellenthal et al.

Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years. We identified events whose dates and participants suggest they describe genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.


September 26, 2013

Y chromosomes of Slavic minorities inhabiting Vojvodina, Serbia

From the paper:

Scrutiny of predicted haplogroups revealed high incidence of haplogroup R1a in both Northern Slavic minorities inhabiting Serbia (42.0% and 44.0% in Slovaks and Ruthenians, respectively), which was comparable to its prevalence in the two Northern Slavic reference populations (45.1% and 43.5% in Slovaks and Ukrainians, respectively), but considerably higher than the one observed in Southern Slavic Serbs (15.1%, Table S4).

Forensic Science International: Genetics Volume 8, Issue 1, January 2014, Pages 126–131

Northern Slavs from Serbia do not show a founder effect at autosomal and Y-chromosomal STRs and retain their paternal genetic heritage

Krzysztof Rębała et al.

Studies on Y-chromosomal markers revealed significant genetic differentiation between Southern and Northern (Western and Eastern) Slavic populations. The northern Serbian region of Vojvodina is inhabited by Southern Slavic Serbian majority and, inter alia, Western Slavic (Slovak) and Eastern Slavic (Ruthenian) minorities. In the study, 15 autosomal STR markers were analysed in unrelated Slovaks, Ruthenians and Serbs from northern Serbia and western Slovakia. Additionally, Slovak males from Serbia were genotyped for 17 Y-chromosomal STR loci. The results were compared to data available for other Slavic populations. Genetic distances for autosomal markers revealed homogeneity between Serbs from northern Serbia and Slovaks from western Slovakia and distinctiveness of Serbian Slovaks and Ruthenians. Y-STR variation showed a clear genetic departure of the Slovaks and Ruthenians inhabiting Vojvodina from their Serbian neighbours and genetic similarity to the Northern Slavic populations of Slovakia and Ukraine. Admixture estimates revealed negligible Serbian paternal ancestry in both Northern Slavic minorities of Vojvodina, providing evidence for their genetic isolation from the Serbian majority population. No reduction of genetic diversity at autosomal and Y-chromosomal markers was found, excluding genetic drift as a reason for differences observed at autosomal STRs. Analysis of molecular variance detected significant population stratification of autosomal and Y-chromosomal microsatellites in the three Slavic populations of northern Serbia, indicating necessity for separate databases used for estimations of frequencies of autosomal and Y-chromosomal STR profiles in forensic casework. Our results demonstrate that regarding Y-STR haplotypes, Serbian Slovaks and Ruthenians fit in the Eastern European metapopulation defined in the Y chromosome haplotype reference database.


August 11, 2013

Belorussian Y chromosomes and mtDNA

PLoS ONE 8(6): e66499. doi:10.1371/journal.pone.0066499

Uniparental Genetic Heritage of Belarusians: Encounter of Rare Middle Eastern Matrilineages with a Central European Mitochondrial DNA Pool

Alena Kushniarevich et al.

Ethnic Belarusians make up more than 80% of the nine and half million people inhabiting the Republic of Belarus. Belarusians together with Ukrainians and Russians represent the East Slavic linguistic group, largest both in numbers and territory, inhabiting East Europe alongside Baltic-, Finno-Permic- and Turkic-speaking people. Till date, only a limited number of low resolution genetic studies have been performed on this population. Therefore, with the phylogeographic analysis of 565 Y-chromosomes and 267 mitochondrial DNAs from six well covered geographic sub-regions of Belarus we strove to complement the existing genetic profile of eastern Europeans. Our results reveal that around 80% of the paternal Belarusian gene pool is composed of R1a, I2a and N1c Y-chromosome haplogroups – a profile which is very similar to the two other eastern European populations – Ukrainians and Russians. The maternal Belarusian gene pool encompasses a full range of West Eurasian haplogroups and agrees well with the genetic structure of central-east European populations. Our data attest that latitudinal gradients characterize the variation of the uniparentally transmitted gene pools of modern Belarusians. In particular, the Y-chromosome reflects movements of people in central-east Europe, starting probably as early as the beginning of the Holocene. Furthermore, the matrilineal legacy of Belarusians retains two rare mitochondrial DNA haplogroups, N1a3 and N3, whose phylogeographies were explored in detail after de novo sequencing of 20 and 13 complete mitogenomes, respectively, from all over Eurasia. Our phylogeographic analyses reveal that two mitochondrial DNA lineages, N3 and N1a3, both of Middle Eastern origin, might mark distinct events of matrilineal gene flow to Europe: during the mid-Holocene period and around the Pleistocene-Holocene transition, respectively.


May 08, 2013

The Geography of Recent Genetic Ancestry across Europe (Ralph and Coop 2013)

This paper first came out last July on the arXiv and went through four versions there before reaching its final form, which has now appeared in PLoS Biology. It's great that its early release allowed other people to read it without having to wait for the completion of the peer review process.

I think that this is a good model: journals have the right and obligation to subject papers to close scrutiny according to their own procedures, but this process ought not interfere with the early availability of research results or the ability of anyone other than the chosen reviewers to comment on new results.

PLoS Biol 11(5): e1001555. doi:10.1371/journal.pbio.1001555

The Geography of Recent Genetic Ancestry across Europe

Peter Ralph, Graham Coop

The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (in the Population Reference Sample [POPRES] dataset) to conduct one of the first surveys of recent genealogical ancestry over the past 3,000 years at a continental scale. We detected 1.9 million shared long genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 2–12 genetic common ancestors from the last 1,500 years, and upwards of 100 genetic ancestors from the previous 1,000 years. These numbers drop off exponentially with geographic distance, but since these genetic ancestors are a tiny fraction of common genealogical ancestors, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1,000 years. There is also substantial regional variation in the number of shared genetic ancestors. For example, there are especially high numbers of common ancestors shared between many eastern populations that date roughly to the migration period (which includes the Slavic and Hunnic expansions into that region). Some of the lowest levels of common ancestry are seen in the Italian and Iberian peninsulas, which may indicate different effects of historical population expansions in these areas and/or more stably structured populations. Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world.


March 11, 2013

Genomewide structure of populations from European Russia (Khrunin et al. 2013)

Notice:

The intermediate position of Estonians between Balts and Finns
The intermediate position of some Russian groups between Komi and the main body of Europeans.

PLoS ONE 8(3): e58552. doi:10.1371/journal.pone.0058552

A Genome-Wide Analysis of Populations from European Russia Reveals a New Pole of Genetic Diversity in Northern Europe

Andrey V. Khrunin et al.

Several studies examined the fine-scale structure of human genetic variation in Europe. However, the European sets analyzed represent mainly northern, western, central, and southern Europe. Here, we report an analysis of approximately 166,000 single nucleotide polymorphisms in populations from eastern (northeastern) Europe: four Russian populations from European Russia, and three populations from the northernmost Finno-Ugric ethnicities (Veps and two contrast groups of Komi people). These were compared with several reference European samples, including Finns, Estonians, Latvians, Poles, Czechs, Germans, and Italians. The results obtained demonstrated genetic heterogeneity of populations living in the region studied. Russians from the central part of European Russia (Tver, Murom, and Kursk) exhibited similarities with populations from central–eastern Europe, and were distant from Russian sample from the northern Russia (Mezen district, Archangelsk region). Komi samples, especially Izhemski Komi, were significantly different from all other populations studied. These can be considered as a second pole of genetic diversity in northern Europe (in addition to the pole, occupied by Finns), as they had a distinct ancestry component. Russians from Mezen and the Finnic-speaking Veps were positioned between the two poles, but differed from each other in the proportions of Komi and Finnic ancestries. In general, our data provides a more complete genetic map of Europe accounting for the diversity in its most eastern (northeastern) populations.


January 17, 2013

Complete mtDNA sequences and the history of Slavs

PLoS ONE 8(1): e54360. doi:10.1371/journal.pone.0054360

The History of Slavs Inferred from Complete Mitochondrial Genome Sequences

Marta Mielnik-Sikorska et al.

To shed more light on the processes leading to crystallization of a Slavic identity, we investigated variability of complete mitochondrial genomes belonging to haplogroups H5 and H6 (63 mtDNA genomes) from the populations of Eastern and Western Slavs, including new samples of Poles, Ukrainians and Czechs presented here. Molecular dating implies formation of H5 approximately 11.5–16 thousand years ago (kya) in the areas of southern Europe. Within ancient haplogroup H6, dated at around 15–28 kya, there is a subhaplogroup H6c, which probably survived the last glaciation in Europe and has undergone expansion only 3–4 kya, together with the ancestors of some European groups, including the Slavs, because H6c has been detected in Czechs, Poles and Slovaks. Detailed analysis of complete mtDNAs allowed us to identify a number of lineages that seem specific for Central and Eastern Europe (H5a1f, H5a2, H5a1r, H5a1s, H5b4, H5e1a, H5u1, some subbranches of H5a1a and H6a1a9). Some of them could possibly be traced back to at least ~4 kya, which indicates that some of the ancestors of today's Slavs (Poles, Czechs, Slovaks, Ukrainians and Russians) inhabited areas of Central and Eastern Europe much earlier than it was estimated on the basis of archaeological and historical data. We also sequenced entire mitochondrial genomes of several non-European lineages (A, C, D, G, L) found in contemporary populations of Poland and Ukraine. The analysis of these haplogroups confirms the presence of Siberian (C5c1, A8a1) and Ashkenazi-specific (L2a1l2a) mtDNA lineages in Slavic populations. Moreover, we were able to pinpoint some lineages which could possibly reflect the relatively recent contacts of Slavs with nomadic Altaic peoples (C4a1a, G2a, D5a2a1a1).


October 10, 2012

The Indo-European invasion of the Baltic

In some recent posts, I showed that South Asian populations (North Indian Brahmins, South Indian Brahmins) can be seen as mixtures of West Eurasian and South Indian populations, but also that West Eurasians (Bulgarians, Greeks, Armenians, and French) can be seen as mixtures of South Asian and Sardinian populations.

This may seem strange, but can be explained if we understand how f3-statistics and rolloff actually work. These methods do not require pure or unadmixed ancestral populations, but exploit allele frequency differences in the reference populations together with either (i) allele frequencies in the mixed population, in the case of f3-statistics, or (ii) admixture linkage disequilibrium in the mixed population, in the case of rolloff.

If a and b are allele frequencies in two ancestral populations A and B that mix, then:

The frequency of a will shift towards b if A experiences gene flow from B
The frequency of a will randomly shift if A experiences gene flow from an "outgroup" population
The frequency of a will shift towards b if A experiences gene flow from a third population that is geographically and genetically intermediate between A and B
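The way an f3-statistic reacts to mixture can be sketched as follows. This is a naive version, the plain average of (c - a)(c - b) over SNPs, and it leaves out the finite-sample bias correction that real software applies. The allele frequencies and the 40/60 mixing proportion are made up for illustration.

```python
def f3(target, ref_a, ref_b):
    """Naive f3(target; A, B): mean of (c - a) * (c - b) over SNPs.

    No finite-sample bias correction; a sketch, not a production estimator.
    """
    products = [(c - a) * (c - b) for c, a, b in zip(target, ref_a, ref_b)]
    return sum(products) / len(products)

# Made-up allele frequencies at five SNPs in two reference populations.
freq_a = [0.10, 0.80, 0.35, 0.60, 0.05]
freq_b = [0.70, 0.20, 0.40, 0.10, 0.55]

# A target that is an exact 40/60 mixture of A and B.
alpha = 0.4
mixed = [alpha * a + (1 - alpha) * b for a, b in zip(freq_a, freq_b)]

print(f3(mixed, freq_a, freq_b))  # negative: consistent with admixture
print(f3(freq_a, mixed, freq_b))  # positive: no admixture signal for A
```

For an exact mixture the first value works out to -alpha * (1 - alpha) times the mean squared frequency difference between A and B, which is why only the difference between the references matters and neither of them needs to be unadmixed.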

An application to the Europe-South Asia cline

I took the following set of populations, and calculated all 1,365 possible f3-statistics:

"FIN30" "Lithuanians" "Russian" "Pathan" "Balochi" "North_Kannadi" "Polish_D" "Russian_D" "Mixed_Slav_D" "Bulgarian_D" "Serb_D" "Ukrainian_D" "Belorussian" "Bulgarians_Y" "Ukranians_Y"

In the following table, I report the lowest Z-score for each target population (third column). So, for example, Polish_D can be seen as a mixture of Lithuanians and Balochi. Only negative scores are indicative of admixture, and I treat negative scores with Z less than -3 as significant.

Source1 Source2 Target f3 std.err Z SNPs
Lithuanians North_Kannadi FIN30 0.001606 0.000259 6.193 280043
Ukrainian_D Belorussian Lithuanians 0.00078 0.000299 2.614 268493
Lithuanians North_Kannadi Russian -0.002738 0.000248 -11.045 279965
North_Kannadi Polish_D Pathan -0.006959 0.000229 -30.344 280220
North_Kannadi Bulgarians_Y Balochi -0.003636 0.000246 -14.781 281604
Pathan Ukrainian_D North_Kannadi 0.033802 0.000623 54.237 271858
Lithuanians Balochi Polish_D -0.001171 0.000178 -6.581 279519
Lithuanians Pathan Russian_D -0.001829 0.000166 -11.026 280658
Lithuanians Pathan Mixed_Slav_D -0.001715 2e-04 -8.594 277635
Lithuanians Balochi Bulgarian_D -0.001247 0.000313 -3.979 272342
Lithuanians Balochi Serb_D -0.00091 0.000377 -2.416 270807
Lithuanians Balochi Ukrainian_D -0.002222 0.000358 -6.211 270399
Lithuanians Balochi Belorussian -0.000897 0.00027 -3.325 273076
Balochi Polish_D Bulgarians_Y -0.001198 0.000185 -6.481 279632
Lithuanians Balochi Ukranians_Y -0.001727 0.000187 -9.236 278677
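Two things in this section can be checked mechanically: the count of 1,365 statistics and the Z < -3 cutoff. The sketch below assumes each statistic is one target plus an unordered pair of references, and that the sixth field of each table row is the Z-score (it matches the fourth field divided by the fifth). The helper name is hypothetical.

```python
from itertools import combinations

populations = [
    "FIN30", "Lithuanians", "Russian", "Pathan", "Balochi",
    "North_Kannadi", "Polish_D", "Russian_D", "Mixed_Slav_D",
    "Bulgarian_D", "Serb_D", "Ukrainian_D", "Belorussian",
    "Bulgarians_Y", "Ukranians_Y",
]

# One target plus an unordered pair of references from the other 14.
triples = [
    (ref_a, ref_b, target)
    for target in populations
    for ref_a, ref_b in combinations(
        [p for p in populations if p != target], 2
    )
]
print(len(triples))  # 1365

# A few rows copied from the table above.
rows = """\
Lithuanians North_Kannadi FIN30 0.001606 0.000259 6.193 280043
Lithuanians North_Kannadi Russian -0.002738 0.000248 -11.045 279965
Lithuanians Balochi Serb_D -0.00091 0.000377 -2.416 270807
Lithuanians Balochi Belorussian -0.000897 0.00027 -3.325 273076
"""

def significant_targets(text, threshold=-3.0):
    """Return the targets (third field) whose Z-score is below threshold."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        target, z_score = fields[2], float(fields[5])
        if z_score < threshold:
            hits.append(target)
    return hits

print(significant_targets(rows))  # ['Russian', 'Belorussian']
```

Serb_D is negative but misses the cutoff, which is why Serbs are counted among the populations that appear unadmixed.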

It is clear that what I have described holds here: European populations appear to be mixtures of Lithuanians and South Asians; conversely, South Asian populations appear to be mixtures of Europeans and North Kannadi.

This does not mean that the populations that appear unadmixed (FIN30, Lithuanians, North_Kannadi, and Serbs) are in fact so, for at least two reasons:

The f3 statistic can confirm, but cannot reject, the presence of admixture; in particular, it fails to find real admixture in highly drifted populations
The f3 statistic exploits allele frequency correlations between populations: but the North Kannadi and Lithuanians/Finns occupy opposite ends of the studied cline, so their lack of a signal of admixture may be due to the non-existence of populations that are even more unadmixed than themselves.

In the case of South Indians, we are completely sure that this is the case. Reich et al. (2009) managed to show this not because there are any unadmixed Ancestral South Indians (ASI) left, but because they exploited the existence of the Onge, an isolated group from the Andaman Islands that was a sister group to the ASI. So, we can be fairly sure that southern Indians themselves have West Eurasian-like admixture, even those at the southern end of the West Eurasia-South India cline.

The problem is: there is no isolated group of unadmixed Europeans left in existence that might serve a similar proxy function as the Onge did for South Asians.

Enter Pickrell et al. (2012) to the rescue. In that paper, the authors studied admixture in the Khoe-San of southern Africa. Now, many of the Khoe-San sub-groups appeared to be admixed, but the "Juj'hoan North" population appeared to be at the "end of the cline": it's impossible to detect admixture in them using allele frequency differences, because, quite simply, there are no populations that are less admixed than them: they're as pure descendants of "Ancestral Bushman" as exist on the earth today.

But the clever thing is that we don't have to detect admixture only using allele frequency differences; we can also use admixture LD, i.e., exploit the correlation between linkage disequilibrium (the co-inheritance of physically separated markers on a chromosome) and allele frequency differences between populations. Pickrell et al. were able to do this not by conjuring up a more unadmixed population than the "Juj'hoan North" one available to them, but by splitting up that population, and using one half to find allele frequency differences, and the other half to detect admixture LD.

Admixture LD signal in Lithuanians

Using the aforementioned idea, I set out to see whether Lithuanians, who occupy the European end of the Europe-South Asia cline, present such a signal of admixture LD. I used the Lithuanian_D sample from the Dodecad Project and the Balochi HGDP sample as reference populations (to calculate allele frequency differences), and the Behar et al. (2010) Lithuanians for admixture LD. There were only ~300k SNPs usable in this set, but that was sufficient to detect the signal of admixture LD:

The admixture time estimate is 200.350 +/- 61.608 generations, or 5,810 +/- 1,790 years. This is not very precise, probably because of the small number of SNPs and individuals used, but it certainly points to the Neolithic-to-Bronze Age for the occurrence of this admixture. The date is certainly reminiscent of the expansion of the Kurgan culture out of eastern Europe, or the later Corded Ware culture of northern Europe.
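The year figures quoted in this post are consistent with multiplying the generation estimates by 29 years and rounding to the nearest decade. The 29-year generation length is not stated here; it is inferred from the arithmetic, so it is an assumption in the sketch below, as is the function name.

```python
YEARS_PER_GENERATION = 29  # assumption inferred from the figures in the post

def generations_to_years(mean_gen, se_gen, years_per_gen=YEARS_PER_GENERATION):
    """Convert an estimate and its standard error from generations to years,
    rounding each to the nearest 10 years."""
    def to_years(generations):
        return int(round(generations * years_per_gen, -1))
    return to_years(mean_gen), to_years(se_gen)

print(generations_to_years(200.350, 61.608))  # (5810, 1790)
```

A different generation length would rescale the dates and their standard errors proportionally.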

So, it may well appear that at least some of the people participating in these groups of cultures were indeed influenced by the Indo-Europeans as they expanded from their West Asian homeland. These intruders mixed with eastern Europeans who vacillated during the late Neolithic between a northern Europeoid pole akin to Mesolithic hunter-gatherers from Gotland and Iberia, and a widely dispersed Sardinian-like population that is in evidence at least in the Sweden-Italian Alps-Bulgaria triangle. The gradual appearance of non-mtDNA U related lineages in Siberia and Ukraine is most likely related to this phenomenon.

It would seem that the Proto-Indo-Europeans mixed with different substrata in the four directions of their expansion: Sardinian-like people in southern Europe, Lithuanian-like people in northern Europe, South Indian-like people in South Asia, and East Eurasians in Siberia and east central Asia. Extant groups are descendants of divergent Neolithic population groups, brought closer together (genetically) because of variable admixture with the PIE population and its early offshoots.

Conclusion

There are mutual signals of admixture across a Europe-South Asia cline: Europeans appear to be mixed with South Asians, and South Asians appear to be mixed with Europeans. The simplest explanation for this pattern involves expansion of a third, geographically and genetically intermediate population that affected both Europe and South Asia. We can use the signal of admixture LD to prove that this expansion affected some of the most unadmixed populations in Europe (e.g., Lithuanians), just as it did the most unadmixed populations of India (e.g., Dravidians).

It will be interesting to use these techniques to study signals of admixture in other "end of the line" populations such as Sardinians, South Indians, etc.

UPDATE I (rolloff analysis of Poles):

I have carried out rolloff analysis of my 25-strong Polish_D sample using Lithuanians and Pathans as references:

The signal is fairly distinct, and corresponds to 149.296 +/- 38.783 generations or 4,330 +/- 1,120 years. I am guessing that either the different reference population (Pathans vs. Balochi) or, more likely, the increased number of target individuals (25 vs. 10) has contributed to the narrowing down of the uncertainty. It will be interesting to explore this signal further with more population pairs.

UPDATE II (rolloff analysis of Finns):

I have also used the 1000 Genomes Finnish sample (FIN) in a similar manner as Lithuanians, using 15 individuals to estimate allele frequency differences and 15 for admixture LD, and using the Pathans as a South Asian reference population. There is a clear signal of admixture:

This dates to 104.967 +/- 14.797 generations, or 3,040 +/- 430 years. Finland came under the influence of both Europeans (and likely Indo-Europeans) during the Bronze Age period (a mixture of Battle Axe with local Comb Ceramic seems to have occurred), as well as likely non-European (and likely Uralic) intrusions during the same time frame, as part of the Seima-Turbino phenomenon. It will be interesting to repeat this analysis with an East Eurasian reference population to isolate potential signals of admixture dating to either the Comb Ceramic or Seima-Turbino episodes of migration.

(Note, added Oct 14): I have since carried out rolloff analysis using Nganassans, as suggested in the paragraph above.

UPDATE III (rolloff analysis of Ukrainians):

I have used the Yunusbayev et al. sample of Ukrainians, and estimated its admixture time using Lithuanians and Balochi as reference populations:

The admixture time estimate is 191.078 +/- 35.079 generations, or 5,540 +/- 1,020 years. It seems very similar to that in Lithuanians, with a smaller standard error, perhaps on account of either the larger number of SNPs or larger number of individuals.

It is tempting to associate this admixture signal with the Maikop culture which appeared at around this time. Assuming that North_European/West_Asian (or Lithuanian-like and Balochi-like) gene pools existed north and south of the Pontic-Caspian-Caucasus set of geographical barriers, then the Maikop culture which shows links to both the early Transcaucasian culture and those of Eastern Europe would have been an ideal candidate region for the admixture picked up by rolloff to have taken place. There are, of course, other possibilities.

UPDATE IV (rolloff analysis of Lithuanians with Pathan reference):

I repeated the first analysis of this post, but this time, I used Pathans, rather than Balochi as a reference population:

The admixture time estimate of 217.501 +/- 51.170 generations, or 6,310 +/- 1,480 years, appears to be similar to the original estimate of 5,810 +/- 1,790 years, so it does not appear that the use of Balochi or Pathan as a reference population much affects this result.

October 07, 2012

rolloff analysis of Bulgarians as Sardinian+Pathan

Continuing my rolloff experiments, I have taken the Yunusbayev et al. sample of Bulgarians. This is interesting because of the recent evidence of a Sardinian-like individual from Iron Age Bulgaria, and also as a complement to a similar analysis on the Greeks. Bulgarians are Slavic-speaking, but their ethnogenesis owes a great deal to the Bulgars, adding another potential element of complication. However, the paucity of East Eurasian admixture in Bulgarians, together with their Slavic language, probably suggests that this element represented a small elite that did not have a substantial role in the genetic formation of the Bulgarian population.

The top f3 statistics can be seen below:

Source1 Source2 Target f3 std.err Z SNPs
Kshatriya_M Sardinian Bulgarians_Y -0.003813 0.000295 -12.918 237507
Velamas_M Sardinian Bulgarians_Y -0.003783 0.000285 -13.287 238276
Piramalai_Kallars_M Sardinian Bulgarians_Y -0.003693 0.000306 -12.061 238106
Kanjars_M Sardinian Bulgarians_Y -0.003643 0.000298 -12.227 237838
GIH30 Sardinian Bulgarians_Y -0.003638 0.000259 -14.028 240548
North_Kannadi Sardinian Bulgarians_Y -0.00355 0.000317 -11.187 237882
Muslim_M Sardinian Bulgarians_Y -0.003542 0.000333 -10.632 236964
Chamar_M Sardinian Bulgarians_Y -0.003505 0.000303 -11.585 238882
INS30 Sardinian Bulgarians_Y -0.003467 0.000264 -13.153 240279
Dharkars_M Sardinian Bulgarians_Y -0.003452 0.000309 -11.155 238211
Brahmins_from_Uttar_Pradesh_M Sardinian Bulgarians_Y -0.003448 0.000278 -12.42 238041
Indian_D Sardinian Bulgarians_Y -0.003411 0.000256 -13.308 241225
Iyer_D Sardinian Bulgarians_Y -0.003364 0.000291 -11.568 237509
Jatt_D Sardinian Bulgarians_Y -0.003327 0.000289 -11.513 236735
Pathan Sardinian Bulgarians_Y -0.003212 0.000239 -13.444 240969
Iyengar_D Sardinian Bulgarians_Y -0.003209 0.000308 -10.416 236840
Dusadh_M Sardinian Bulgarians_Y -0.003181 0.000313 -10.172 237512
Sindhi Sardinian Bulgarians_Y -0.003094 0.000239 -12.919 241268
Balochi Sardinian Bulgarians_Y -0.002804 0.00024 -11.686 240924

To maximize the number of SNPs and number of individuals, I used the Sardinian+Pathan pair as reference populations. 509,395 SNPs were used for this experiment. The exponential fit can be seen below:

There was a technical issue with the jackknife which I am currently investigating, but the mean time of the admixture was estimated at 126.83004 generations, or 3,680 years. This is similar to the value of 3,850 years I obtained on the Greek sample.
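The idea of reading a date off an exponential decay curve can be sketched as follows. This is not the rolloff program: it fits a straight line to the logarithm of a noiseless synthetic curve, whereas the real analysis works on noisy weighted LD and gets its standard error from a jackknife. The amplitude, the distance grid, and the function name are illustrative assumptions.

```python
import math

def fit_decay_rate(distances_morgans, ld_values):
    """Least-squares slope of log(LD) against genetic distance.

    Returns minus the slope, which is the number of generations under
    the model LD(d) = A * exp(-generations * d).
    """
    xs = distances_morgans
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic curve decaying at 126.83 generations, with an arbitrary amplitude.
true_generations = 126.83
distances = [0.005 * i for i in range(1, 41)]  # 0.5 cM to 20 cM
curve = [0.02 * math.exp(-true_generations * d) for d in distances]

estimate = fit_decay_rate(distances, curve)
print(round(estimate, 2), "generations")
```

With real data the points scatter around the curve, which is why the number of SNPs and individuals affects the width of the error bars.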

If this date is accepted, then the interesting issue is why an individual from Bulgaria was Sardinian-like during the Iron Age. Possibly, either this individual was Sardinian-like in the broad sense, despite having minority West Asian admixture, or a few centuries after the admixture event, there was still an uneven distribution of the constituent elements, with most individuals still predominantly Sardinian-like. The indigenous element was probably the most numerous, so only part of it would have had the opportunity to admix with the intrusive West Asian-like population, and this influence would have spread to the population at large over time.

In any case, this evidence, such as it is, appears consistent with my idea about a Bronze Age invasion of Europe from Asia.

Naturally, only a broad sampling of ancient DNA variation from the Balkans, perhaps targeting different sites, cultures, times, social status, and physical types will be sufficient to track the early appearance of an intrusive population.

September 13, 2012

Polish and German Y-chromosomes

I have often bemoaned the use of present-day populations as stand-ins for dealing with the subject of very old archaeological phenomena such as the Neolithic transition. Of course, I understand that until a few years ago, this was all we had to work with. But, this idea is now suspect, having been made so by a two-pronged attack. On the ancient DNA side, researchers have consistently discovered that for the better part of prehistory, ancient populations did not match modern ones: even if the constituent elements of later evolution could be identified, they were still in polarized non-admixed form as in the case of the Neolithic Swedes. On the recent side, researchers have used surnames or even toponyms to show that ethnic admixture in the recent historical past has shifted Y-chromosome frequencies around.

A new paper in EJHG follows on this tradition by comparing pre- and post-WWII patterns of Y chromosome variation in Germany and Poland. Y-haplogroup frequencies can be seen on top left. From the caption: "Phylogenetic relationship and frequencies of Y-chromosomal haplogroups in the studied populations. Ka Kaszuby; Ko Kociewie; Ku Kurpie; Lu Lusatia; Sl Slovakia; Me Mecklenburg; Ba Bavaria."

It is important to note that the researchers were able to study pre-war populations, because most everybody knows where their patrilineal ancestor lived less than 100 years ago. But, European history consists of many events whose effects on the current population are less known, because they occurred at an older time. In some cases, populations may have migrated (such as the Germans of eastern Europe following WWII); in others, populations that once existed there have almost disappeared, or become much less numerically significant (such as the Ashkenazi Jews due to persecution during WWII and after it through migration to Israel and elsewhere, or various Christian and Jewish communities that once flourished throughout the Middle East). Other, less known groups, such as the Old Prussians or the Jassic speakers of Hungary, have presumably been absorbed by surrounding majorities, or through a process of elite dominance.

In many cases, the available information, in the form of linguistic, genealogical, or historical evidence, can be used to remove layers of admixture, migration, and extinction in history; but, the gap between the deep prehistoric past and the recent, historical one cannot be bridged by these methods alone. Ultimately, ancient DNA researchers must close in on the present by targeting more recent populations for analysis. As evidence of genetic change continues to amass on both sides of the divide, I suspect that this will come naturally, although I do expect some resistance to the findings as they begin to touch upon the most cherished origin traditions of the multitude of extant European nations.

European Journal of Human Genetics advance online publication 12 September 2012; doi: 10.1038/ejhg.2012.190

Contemporary paternal genetic landscape of Polish and German populations: from early medieval Slavic expansion to post-World War II resettlements

Krzysztof Rebala et al.

Abstract

Homogeneous Proto-Slavic genetic substrate and/or extensive mixing after World War II were suggested to explain homogeneity of contemporary Polish paternal lineages. Alternatively, Polish local populations might have displayed pre-war genetic heterogeneity owing to genetic drift and/or gene flow with neighbouring populations. Although sharp genetic discontinuity along the political border between Poland and Germany indisputably results from war-mediated resettlements and homogenisation, it remained unknown whether Y-chromosomal diversity in ethnically/linguistically defined populations was clinal or discontinuous before the war. In order to answer these questions and elucidate early Slavic migrations, 1156 individuals from several Slavic and German populations were analysed, including Polish pre-war regional populations and an autochthonous Slavic population from Germany. Y chromosomes were assigned to 39 haplogroups and genotyped for 19 STRs. Genetic distances revealed similar degree of differentiation of Slavic-speaking pre-war populations from German populations irrespective of duration and intensity of contacts with German speakers. Admixture estimates showed minor Slavic paternal ancestry (~20%) in modern eastern Germans and hardly detectable German paternal ancestry in Slavs neighbouring German populations for centuries. BATWING analysis of isolated Slavic populations revealed that their divergence was preceded by rapid demographic growth, undermining theory that Slavic expansion was primarily linguistic rather than population spread. Polish pre-war regional populations showed within-group heterogeneity and lower STR variation within R-M17 subclades compared with modern populations, which might have been homogenised by war resettlements. Our results suggest that genetic studies on early human history in the Vistula and Oder basins should rely on reconstructed pre-war rather than modern populations.

Link

August 22, 2012

East Eurasian-like ancestry in Northern Europe (part 3)

(This is the third part of the series. See part 1 and part 2.)

In the first two parts of the series, I showed that northern European populations show hints of East Eurasian ancestry when compared against Sardinians. I used Dai, Han, and Karitiana as reference populations for East Eurasia. In the current post, I extend this analysis by using HGDP Papuans and the Onge (Reich et al. 2009) from the Andaman Islands.

The f4 statistics using Karitiana, Papuan, and Onge populations can be found in this spreadsheet.

Below, you can see that they are all near perfectly correlated with each other.

The visual appraisal is confirmed when we calculate the correlation coefficients:

The fact that all three populations track the same signal is strong evidence for the direction of gene flow: from Asia into northern Europe. If the signal were present in only one of the three populations, then it could conceivably be an artefact of gene flow in the opposite direction (from northern Europeans to the affected population). But, explaining the same pattern in all three populations this way would require northern European-like admixture in the Andaman Islands, Papua New Guinea, and South America, which does not appear very parsimonious.

While the signals from the three populations are correlated, their intensity varies. The Z-scores provide a measure of this intensity. The mean Z-scores using a Karitiana, Papuan, and Onge reference across all populations are respectively -17.7, -8.0, and -6.0.
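
For readers unfamiliar with how such Z-scores arise, here is a minimal sketch. It assumes the standard definition of f4 as the mean over SNPs of (a - b)(c - d) on allele frequencies, and a delete-one-block jackknife over equal-sized contiguous SNP blocks for the standard error. The function names and the block count are illustrative assumptions; this is not the fourpop implementation, which may weight blocks differently.

```python
import math


def f4(freqs_a, freqs_b, freqs_c, freqs_d):
    """Mean over SNPs of (a - b) * (c - d), computed on allele frequencies."""
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freqs_a, freqs_b, freqs_c, freqs_d)]
    return sum(products) / len(products)


def _drop_block(values, lo, hi):
    """Return the values with positions lo..hi-1 removed."""
    return values[:lo] + values[hi:]


def f4_z_score(freqs_a, freqs_b, freqs_c, freqs_d, n_blocks=10):
    """Z-score of f4 from a delete-one-block jackknife (equal-sized blocks).

    n_blocks is an illustrative default; any SNPs left over after the last
    full block are kept in every leave-one-out estimate.
    """
    pops = [list(p) for p in (freqs_a, freqs_b, freqs_c, freqs_d)]
    block = len(pops[0]) // n_blocks
    if block == 0:
        raise ValueError("need at least one SNP per block")
    estimate = f4(*pops)
    leave_one_out = []
    for i in range(n_blocks):
        lo, hi = i * block, (i + 1) * block
        leave_one_out.append(f4(*[_drop_block(p, lo, hi) for p in pops]))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out)
    return estimate / math.sqrt(variance)
```

The sign of the Z-score follows the sign of f4, so which populations occupy which slot determines whether a shift shows up as negative or positive; the magnitude reflects how consistently the signal appears across blocks of the genome.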

While I did not include the Han reference of part 1 in this analysis, inspection of the f4 statistics (which can be obtained at the bottom of that part) suggests that the Z-scores become more significant when using an Onge, Papuan, Han, and Karitiana reference, in that order. For example, for the Finnish_D population, they are: -10.037, -13.2949, -23.9305, and -27.764 respectively.

It thus appears that the element contributing East Eurasian-like ancestry in northern Europeans was derived from the northern spectrum of East Eurasians; the Karitiana may live in South America today, but they trace their ancestors to northern Eurasia, having entered the Americas c. 15ka.

In my opinion, the signal has been formed by a superposition of a few factors:

The fact that Y-haplogroup R, the main lineage in modern northern Europeans has a common origin (Y-haplogroup P) with haplogroup Q, the main lineage in modern Amerindians, and many Siberians. We can hypothesize that the population that brought R into Europe was intermediate genetically across the Caucasoid-Mongoloid spectrum. In West Eurasia, this population admixed with the Palaeo-West Eurasians (Y-haplogroups IJ, G, and possibly LT), and contributed their DNA primarily to the northern Europeoids.
Other population movements of more regional impact, such as Y-haplogroup N, which affected mainly Uralic, Baltic, and East Slavic populations, as well as elements from the mixed West/East Eurasian mtDNA contact zone that ancient DNA analysis has revealed in Eastern Europe and Siberia.

The raw dumps of fourpop output for Papuan and Onge reference can be found here.

East Eurasian-like admixture in Northern Europe (part 2)

This is a continuation of my earlier post. Please refer to it for the methodology. A new part 3 can be found here.

I have repeated the experiment with a much larger set of populations:

English_D, British_D, Ukranians_Y, Karitiana, Spaniards, Sardinian, Serb_D, Mordovians_Y, Irish_D, French, Finnish_D, Chuvashs_16, Romanian_D, N_Italian_D, French_Basque, Austrian_D, Russian_D, Hungarians_19, Kent_1KG, German_D, Belorussian, Tuscan, Lithuanian_D, Orkney_1KG, Dutch_D, TSI30, Ukrainian_D, Bulgarians_Y, Bulgarian_D, Russian, Swedish_D, Pais_Vasco_1KG, French_D, Castilla_Y_Leon_1KG, Lithuanians, San, Polish_D, Romanians_14, Orcadian, Cornwall_1KG, Valencia_1KG, North_Italian, FIN30, Norwegian_D, CEU30

I used Sardinians as the Caucasoid reference population, Karitiana for Mongoloids, and San for Africans. The latter two were chosen because they live at maximally opposite corners of the Earth (South America vs. South Africa).

A first plot of the f4 statistics used for f4 regression ancestry estimation is seen below:

Clearly, some evidence of a cline is present, but several populations appear to deviate from it. In order to get the cleanest possible cline, I carried out the following greedy procedure: I calculate the correlation coefficient of this set, and iteratively remove one population that leads to the maximum improvement of the correlation, until no further improvement takes place. The following populations were removed with this procedure:

Spaniards, Serb_D, Romanian_D, N_Italian_D, Tuscan, TSI30, Bulgarians_Y, Bulgarian_D, Castilla_Y_Leon_1KG, Romanians_14, Valencia_1KG

This seems to make sense, as all these are southern European populations. Note that their removal does not mean that they do not partake in the same phenomenon as northern Europeans: they also exhibit Karitiana-shift relative to the Sardinians, but there are probably other confounding factors that make them fall "off-cline". Including them would diminish the clarity of the cline for Northern European populations. The regression of the remaining populations can be seen on the right:

f4 regression ancestry estimation results are shown on the left. These appear to be much higher than was the case with the Han and Dai in the previous experiment.
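
The greedy pruning procedure described earlier (repeatedly drop the one population whose removal most improves the correlation, until nothing improves) can be sketched as follows. This is an illustrative reconstruction rather than the script actually used: it assumes each population is a pair of f4 statistics, scores a set by the absolute Pearson correlation, and adds a minimum set size and a small tolerance as safeguards that are not part of the description above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def _score(points):
    """Absolute correlation of the (x, y) pairs in a name -> (x, y) dict."""
    xs, ys = zip(*points.values())
    return abs(pearson_r(xs, ys))


def greedy_prune(points, min_size=3, tol=1e-9):
    """Greedily remove populations while the correlation keeps improving.

    points maps a population name to its (x, y) pair of statistics.
    min_size and tol are assumed safeguards: two points always correlate
    perfectly, and tol avoids chasing floating-point noise.
    Returns (kept, removed): a dict of survivors and the removal order.
    """
    kept = dict(points)
    removed = []
    best = _score(kept)
    while len(kept) > min_size:
        trials = []
        for name in kept:
            rest = {k: v for k, v in kept.items() if k != name}
            trials.append((_score(rest), name))
        trial_score, name = max(trials)
        if trial_score <= best + tol:
            break  # no removal improves the correlation any further
        best = trial_score
        removed.append(name)
        del kept[name]
    return kept, removed
```

Because the procedure is greedy, it finds a locally cleanest cline, not necessarily the globally best subset; the populations it drops should be read as "off-cline", not as lacking the signal.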

I can't say that I've made any obvious mistakes, but these admixture proportions are substantial, and call for an explanation. Whatever their true levels, I am fairly confident on at least a few points:

First, it is evident that northern Europeans have higher levels of this element than southern Europeans; the latter are not altogether deficient in it, but they fall "off-cline", making estimation of their admixture proportions more difficult.

Second, within northern Europe, there is a fairly clear east-west cline of diminishing Amerasian-like admixture. The minimum occurs in Sardinians and secondarily in Southwest Europe. Romance, Celtic, and Germanic populations all have less of it than Balto-Slavic and Uralic ones. And, some populations of northeastern Europe seem to have a noticeable excess of it.

The groups with the most Amerasian-like admixture possess Y-haplogroup N, a clear trace of eastern ancestry that is not shared by most Europeans. The arrival of this haplogroup, either with Comb Ceramic of the Baltic Neolithic or later with Seima Turbino Bronze Age expansions is probably responsible for the local excess in Northeastern Europe. The Chuvash are, of course, a Turkic population but of Finno-Ugrian genetic origin.

But, the presence of this element even in Western Europe cannot be explained on the basis of typically Mongoloid elements which are almost completely lacking there. If Mesolithic Europeans were themselves Asian-shifted, then this would account for the presence of the element, but not necessarily for its clinal manifestation. The double (north-south and east-west) cline shows every sign of an intrusive element. So, for the time being, I will propose that this is associated with late (e.g., Copper and Bronze Age) phenomena, such as the northern stream of the Bronze Age Indo-European invasion of Europe.

This may be due to either:

(i) northern Indo-European groups picking up some native east European or Siberian elements as they made their way into Europe,
or (ii), more likely in my opinion, the Y-haplogroup R1 group of people, whose closest relatives are in Central/South Asia (R2), and whose more distant relatives (Q) are in Siberia and the Americas, having been from the beginning an "intermediate population" between West and East Eurasia. The R1 group of people, in its R1b and R1a varieties, first appears in Europe during the Copper Age, and is lacking in early Neolithic sites.

Eight years ago, and in a totally different context, I wrote:

Similarly, 9 out of 10 Basques are descended from a man who has also fathered 9 out of 10 Kets from Siberia and 9 out of 10 Maya Indians from America. That man, founder of haplogroup P thus has descendants who belong to two of the major human races (or three, if Amerindians are considered as separate from Asian Mongoloids)

...

In conclusion, human continental populations form groups of genetic and phenotypic similarity, and these groups can be considered races in the phenetic sense. However, these groups are not monophyletic, hence in the cladistic sense they should not be considered as valid taxa. Since the principle of common descent is generally applied in modern systematics (or at least it should!), I think it's best not to recognize human subspecies.

If these data pan out, it may be revealed that the European branch of the Caucasoids is actually a product of admixture too, with at least two of its constituent elements being the "Palaeo-West Eurasians" (Y-haplogroups G, IJ, possibly LT) and the "Neo-NW Eurasians" (Y-haplogroups N1 and R1), with the "Neo-Afrasians" (Y-haplogroup E1b1b) forming a third element.

(A raw dump of fourpop output can be found here).

August 20, 2012

Visualizing admixture differences with ACD tool

Vaêdhya has created a new ACD tool that allows one to visualize differences between sets of populations in terms of admixture components. He also posts two examples of the application of his tool on data generated by myself in the Dodecad Project, as well as by the Harappa Project.

I have speculated about the origins of Indo-Iranians before, noting that the evidence links even the Kurds with a "South Asian" component; in subsequent higher-resolution analysis, such as the K12b, it appeared that this component was related to the Gedrosia component. In any case, the evidence is clear about the links of different Iranian and Indo-Aryan groups, so it is nice that this can be made evident with the ACD tool and data from the Harappa Project. Notice the excess of the Baloch (~Gedrosia) component in Kurds and Iranians in contradistinction to the Indo-European Armenians and Semitic Assyrians. It is fairly clear to me that the Iranian ancestral homeland is to be sought to the east, with the Bactria-Margiana Archaeological Complex (BMAC) being a good candidate for its location.

In a second plot, Vaêdhya uses Dodecad data to contrast patterns of differences in Northeastern Europe. Here, too, the patterns are clear, with Finns, and secondarily Russians, showing an excess of Siberian ancestry relative to Poles. This is, no doubt, due to the Finnic element, which links Finns, and the Uralic substratum in Russians, with Siberia. A second contrast is between Finns and Russians/Poles. The latter have more of the Caucasus component, a probable legacy of the Bronze Age Indo-European invasion of Europe. A final contrast is the higher Atlantic_Med element in Poles, which suggests an excess of early Neolithic farmer ancestry, or admixture with West European populations such as Germans and others who possess more of this component than Slavs.

July 28, 2012

Complex Y chromosome structure in East Tyrol (and more IE thoughts)

Cultural diversity can disappear in a few generations, but genetic diversity -barring major genocides or disasters- usually persists.

The East Tyrol region in Austria has been Germanic-speaking since the Middle Ages, but historical evidence documents the presence of Romance, Germanic, and Slavic groups in its territory. How can we untangle the origin of the different groups when they are all jumbled up together now, and all Germanic-speaking? Previous research has shown that patrilineal groups can be isolated on the basis of surnames, but in the case of East Tyrol, the wide adoption of surnames happened after the region had become linguistically Germanic.

The authors of the new paper exploited the structure of local toponyms, in particular pasture names. The figure on the left shows the concentrations of Slavic (panel A), Romance (panel B), and Germanic (panel C) pasture names. While Germanic pasture names are evenly distributed, there is a contrast between those of Slavic and Romance origin. From the paper:

From the 853 analyzed pasture names in East Tyrol 71% were derived from Germanic (Bavarian) etymons, 17% from Slavic etymons, and 12% from Romance etymons. While pasture names with Germanic etymons were evenly distributed in high density within the whole study area the names with Slavic etymons were spatially focused in the east and north of East Tyrol (Fig. 2). Geographically, these are the lower Drau, Isel, Kals, Virgen and the Defereggen valleys (Fig. 1). No names with Slavic etymons were found in the southwestern Puster valley (Fig. 2). The pasture names with Romance etymons focus mainly in the southern part of East Tyrol (Gail, Puster, and Villgraten valley, Fig. 2). The slight northeastward trend observed in the distribution of Romance etymons is solely caused by the Kals valley, a medieval Romance linguistic enclave, which was separated from the Romance main territory in the 10th century [36]. On the basis of these results, East Tyrol was divided into two regions of former Romance (Puster, Gail, and Villgraten valley; region A) and Slavic (Isel, lower Drau, Defereggen, Virgen, and Kals valley; region B) main settlement (Fig. 2).

The authors dissected the occurrence of different haplogroups in the two contrasting regions (A: Romance, and B: Slavic) in some great detail:

Splitting the East Tyrolean population sample into regions A and B resulted in a partitioning of haplogroups E-M78, R-M17, R-M412/S167*, and R-S116*. E-M78, R-M17 and R-S116* Y chromosomes were exclusively found in region B whereas samples assigned to R-M412/S167*, R-U106/S21, and R-U152/S28 reached higher frequencies in region A (Fig. 3, Table S7). When attributing the samples to the fathers' and grandfathers' places of birth/residence, as reported by the participants, practically identical patterns were obtained for most of the haplogroups (Fig. 3).

Y chromosomes belonging to haplogroups G-P15, I-M253, and J-M304 showed much lower regionalization in their frequencies (Fig. 3) at all three generation levels.

The non-localization of G-P15, I-M253, and J-M304 seems reasonable, as these may represent what is common in these populations (and one could indeed speculate -on the basis of current ancient DNA knowledge- that they correspond to Neolithic, Paleolithic, and Bronze Age processes respectively).

Two of the most interesting findings are:

Haplogroup R-M412/S167* was found at low frequencies in the combined East Tyrolean sample. However, the R-M412/S167* chromosomes were sorted by the subdivision of the study area and reached in region A levels of ~14% whereas their frequency in region B was well below the 5% threshold. At the probands and fathers level of analysis region A featured approximately fourfold higher frequencies of these chromosomes than region B. This ratio changed to about nine when placing the samples at the grandfathers' places of birth/residence. These contrasts remained statistically significant after correcting for multiple comparisons [22] at the fathers and grandfathers analysis level.

and:

The western border of the geographic expansion of haplogroup R-M17 Y chromosomes is to be found in Central Europe and largely follows the political border separating present-day Poland (57%) and Germany (East: ~30%, South: ~14%, West: ~10%) [42]. Frequencies of about 15% and 10% were also found for Austria [18] and North-East Italy [48], respectively. In South Italy and in West Europe R-M17 chromosomes are not present at informative frequencies.

In this study, the proportion of Y chromosomes carrying the derived M17 allele was 14.1%, a value that nearly perfectly matched those reported for West Austria (North Tyrol, 15.4%) and South Germany (Munich; 14.3%) [18], [42]. However, haplogroup R-M17 was completely absent in the East Tyrolean sub-sample from region A, but made up to 16% in region B. This result remained practically unchanged when assigning the probands to their respective fathers' or grandfathers' places of birth/residence (Fig. 3).

The new study reinforces my belief that R-M17 was not originally present in the Italo-Celtic branch of Indo-European. Together with the paucity of the same lineage in Albanians (~5%), Armenians (less than 5%), and its quite uneven distribution in Greeks, it is becoming increasingly clear that R-M17 may represent a late entrant that minimally affected southern and western Europe.

The source of its spread was probably a trans-Caspian (?) Central Asian staging point, from which a counter-clockwise route into Europe spawned the northern (Germanic and Balto-Slavic) groups of Europe; the Indo-Iranians remained longer in their BMAC homeland, finally breaking down during the 2nd millennium BC. This would also harmonize with the increasing evidence for complementary R-M17 distributions in Europe and Asia, associated with the Z93 marker.

It might appear that Z93+ chromosomes may track the later expansion of the Indo-Iranian world. I have observed before that R-M17 seems distributed in a long arc north and east of the Caspian, and it is perhaps in different points along this arc that the dominant European (NW) and Asian (SE) types emerged out of the early Neolithic population.

Combining this insight with an analysis of Y chromosome variation within the Graeco-Armeno-Aryan group, it appears that Graeco-Armenian is characterized predominantly by J2+R1b related lineages, while Indo-Iranian by J2+R1a related lineages. The evidence for Tocharian would involve J2+R1b related lineages. Overall, it would appear that the earliest J2 core of PIE affected two different groups of populations living on complementary sides of the Caspian:

The trans-Caspian R-M17 population followed an early (3rd, or late 4th millennium BC?) north-west trajectory into Europe (associated with northern European groups such as Balto-Slavic and Germanic) as well as a later expansion (2nd millennium BC? associated with climatic deterioration in BMAC) that brought Iranian speakers to the steppe, as well as to Iran, and Indo-Aryans to South Asia.
The cis-Caspian, trans-Caucasian R-M269 population followed an early (late 4th millennium, early 3rd millennium?) expansion into Europe, probably together with J2 in the Balkans (Graeco-Phrygian, perhaps Thracian), and arriving in the form of Bell Beakers in Western Europe (Italo-Celtic), as well as a later (2nd-1st millennium BC?) expansion to the east (Tocharians)

This long excursus was necessary as a preamble to an explanation on what happened in Europe itself, which brings us back to the topic of the current paper:

The lack of structure between regions A and B with respect to haplogroup J, together with the great difference in levels of this haplogroup between Italy and the Celtic world, suggests that Italian J-related lineages may have been inflated in proto-historical and historical times. There are candidates a-plenty: Greeks, Etruscans, Trojans to name but three. Excess of J in Italy, relative to the Celtic world, clearly relates to the abundant traditions of eastern origins for the historical groups of Italy.
It would appear that during proto-history, most of Europe was dominated by three sets of IE people (R-M269 in the west, who had transmitted Proto-Celto-Italic; R-M17 in the northeast of Proto-Balto-Slavic speech, and Proto-Germanic in-between, participating in both worlds, and --appropriately-- often linked with either Italo-Celtic or Balto-Slavic linguistically)
There were other (now-extinct) groups as well: the Illyrians vs. Thracians in the Balkans with complementary Y chromosome distributions, the former including an extra chunk of aboriginal legacy (haplogroup I), no doubt due to the much more difficult terrain of the western Balkans. These are contrasted with our final group, the Greeks who straddled three worlds (the Paleo-Mediterranean world of the first farmers, the Thraco-Phrygian world linked to the Indo-Iranians at a deeper level, and the Anatolian world)

The boundaries between these various groups were a little blurred in the course of history. But, apparently, they were still a little clearer during the Middle Ages, and probably much clearer before the Völkerwanderung of the Germans, and the expansion of the Slavs.

Geneticists are executing a remarkable pincer movement, zeroing in on the period of European ethnogenesis from both the remote past and the present: through a study of ancient DNA from the dawn of history, they are beginning to understand how Europe was peopled, layer after layer of settlement; and through the study of surnames and toponyms they are drilling ever deeper into the pre-genealogical past. Together with much anticipated technological progress related to full genome sequencing and ancient DNA extraction, it will not be long before the history of Europe will be laid bare in remarkable detail.

PLoS ONE 7(7): e41885. doi:10.1371/journal.pone.0041885

Pasture Names with Romance and Slavic Roots Facilitate Dissection of Y Chromosome Variation in an Exclusively German-Speaking Alpine Region

Harald Niederstatter et al.

The small alpine district of East Tyrol (Austria) has an exceptional demographic history. It was contemporaneously inhabited by members of the Romance, the Slavic and the Germanic language groups for centuries. Since the Late Middle Ages, however, the population of the principally agrarian-oriented area is solely Germanic speaking. Historic facts about East Tyrol's colonization are rare, but spatial density-distribution analysis based on the etymology of place-names has facilitated accurate spatial mapping of the various language groups' former settlement regions. To test for present-day Y chromosome population substructure, molecular genetic data were compared to the information attained by the linguistic analysis of pasture names. The linguistic data were used for subdividing East Tyrol into two regions of former Romance (A) and Slavic (B) settlement. Samples from 270 East Tyrolean men were genotyped for 17 Y-chromosomal microsatellites (Y-STRs) and 27 single nucleotide polymorphisms (Y-SNPs). Analysis of the probands' surnames revealed no evidence for spatial genetic structuring. Also, spatial autocorrelation analysis did not indicate significant correlation between genetic (Y-STR haplotypes) and geographic distance. Haplogroup R-M17 chromosomes, however, were absent in region A, but constituted one of the most frequent haplogroups in region B. The R-M343 (R1b) clade showed a marked and complementary frequency distribution pattern in these two regions. To further test East Tyrol's modern Y-chromosomal landscape for geographic patterning attributable to the early history of settlement in this alpine area, principal coordinates analysis was performed. The Y-STR haplotypes from region A clearly clustered with those of Romance reference populations and the samples from region B matched best with Germanic speaking reference populations. 
The combined use of onomastic and molecular genetic data revealed and mapped the marked structuring of the distribution of Y chromosomes in an alpine region that has been culturally homogeneous for centuries.

Link

July 18, 2012

fastIBD over 2,257 Europeans

Razib points me towards a very interesting new paper that applies fastIBD over the large POPRES dataset of Europeans. The most interesting thing about this is that the authors develop techniques for estimating the time depth of the pattern of common ancestry across Europe, and hence are able to conclude that the Slavic expansion has played a bigger role in European history than the Germanic one.

A worthwhile improvement would be to apply a clustering algorithm over the fastIBD output, as I did back in January; that way, one does not have to arbitrarily partition Europe into regions, but can let the partitions jump out of the data.
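
As a rough illustration of letting partitions emerge from the data, the sketch below groups populations by pairwise IBD sharing. It assumes a hypothetical table of mean sharing between population pairs and uses simple single-linkage grouping at a threshold. This is a stand-in technique chosen for brevity, not the clustering algorithm used in the January analysis, and the function name and input layout are made up for the example.

```python
def cluster_by_sharing(sharing, threshold):
    """Group populations linked by pairwise sharing at or above a threshold.

    sharing maps a (pop_a, pop_b) tuple to a hypothetical mean IBD sharing
    value. Populations end up in the same group whenever a chain of
    above-threshold pairs connects them (single linkage).
    Returns a sorted list of sorted population-name lists.
    """
    parent = {}

    def find(pop):
        # Union-find root lookup with path halving.
        parent.setdefault(pop, pop)
        while parent[pop] != pop:
            parent[pop] = parent[parent[pop]]
            pop = parent[pop]
        return pop

    for (pop_a, pop_b), value in sharing.items():
        root_a, root_b = find(pop_a), find(pop_b)
        if value >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for pop in list(parent):
        groups.setdefault(find(pop), set()).add(pop)
    return sorted(sorted(members) for members in groups.values())
```

The threshold is itself an arbitrary choice, so in practice one would scan a range of values, or use a model-based clustering method that selects the number of groups, and check which partitions are stable.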

A different idea to confirm the scenario presented in this paper would be to drill into different European populations. For example, in the case of the Italians, it would be worthwhile to identify whether there are particular sub-populations with likely Greek or Albanian ancestry who share an excess of IBD with modern Greeks and Albanians.

Population averages may mask such interesting patterns lurking in the data. For example, sub-clusters within populations can be identified with both fineSTRUCTURE and fastIBD, and the corresponding clusters can be assessed with supervised ADMIXTURE to detect how they differ from each other. For example, using this technique, I was able to infer 3 sub-clusters within the ethnic Greek population:

pop8 (mainland Greek) with ~23% North_European
pop11 (Greek Cypriot) with ~5% North_European
pop14 (Cretan, islander, mainland+Asia Minor) with ~12% North_European
I also have a strong hunch, based on a few half Pontic Greek+half mainland Greek data points, that unmixed Pontic Greeks would be related to pop22 (Northeastern Anatolia) with ~5% North_European

Based on these results and the fastIBD analysis of Ralph and Coop (the POPRES Greek sample is from northern Greece), it might appear that a hefty portion of the North_European component in Greeks may date to the medieval period, since it is relatively smaller in eastern Greeks and Cypriots and also in the South Italian/Sicilian cluster pop16 of a different analysis, with Italians as a whole lacking the eastern European affiliations of some Greek groups.

Interestingly, ~5% North_European levels would be similar to those of Armenians, who are the closest linguistic cousins of the Greeks within the Indo-European family, as well as the Anatolian Turkish cluster pop13 at ~9%.

Overall, it would appear that some mainland Greek groups received some input as the result of the medieval Slavic intrusions, since the mainland North_European excess appears as a "wedge" within the South Italy/Sicily/Crete/Anatolia/Armenia arc and the fastIBD pattern of sharing suggests that this is due to fairly recent connections.

As I have pointed out before, one limitation of the method of counting shared blocks of ancestry is that it does not disclose the directionality of gene flow. For example, gene flow between Germans and Slavs is detected in this study, which could be ascribed to Germans living in eastern Europe and/or to Slavs becoming acculturated Germans as a result of living within Germanic states or intermarrying with them prior to the age of the nation state.

Finally -and most interestingly- I hope that similar haplotype-based methods can be applied to a wider dataset, because, as it is becoming clear, Europe has not been isolated from Asia or Africa during its long history. The authors mention "Slavic or Hunnic" as an explanation for the pattern of shared ancestry in eastern Europe, but it is only by including Asian groups that we can detect the existence of real Hunnic (or Avar, or Mongol, or Pecheneg, or, ...) ancestry.

Moreover, I am confident that the Bronze Age is well within the power of haplotype-based methods to detect IBD. For example, South Asian populations clearly show differential patterns of affiliation with modern West Eurasian groups, most of which can date to no later than the Bronze Age. Together with the gradual incorporation of the new ancient DNA genomes that are bound to be coming our way soon, it seems that our picture of not only recent history, but also of late prehistory is bound to become much sharper.

arXiv:1207.3815v1 [q-bio.PE]

The geography of recent genetic ancestry across Europe

Peter Ralph, Graham Coop
(Submitted on 16 Jul 2012)

The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of recent genealogical ancestry over the past three thousand years at a continental scale. We detected 1.9 million shared genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 10-50 genetic common ancestors from the last 1500 years, and upwards of 500 genetic ancestors from the previous 1000 years. These numbers drop off exponentially with geographic distance, but since genetic ancestry is rare, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1000 years. There is substantial regional variation in the number of shared genetic ancestors: especially high numbers of common ancestors between many eastern populations likely date to the Slavic and/or Hunnic expansions, while much lower levels of common ancestry in the Italian and Iberian peninsulas may indicate weaker demographic effects of Germanic expansions into these areas and/or more stably structured populations. Recent shared ancestry in modern Europeans is ubiquitous, and clearly shows the impact of both small-scale migration and large historical events. Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world.

Link

February 29, 2012

Serbian Y-chromosomes

Gene. 2012 Jan 31. [Epub ahead of print]

High levels of Paleolithic Y-chromosome lineages characterize Serbia.

Regueiro M, Rivera L, Damnjanovic T, Lukovic L, Milasin J, Herrera RJ.

Abstract

Whether present-day European genetic variation and its distribution patterns can be attributed primarily to the initial peopling of Europe by anatomically modern humans during the Paleolithic, or to latter Near Eastern Neolithic input is still the subject of debate. Southeastern Europe has been a crossroads for several cultures since Paleolithic times and the Balkans, specifically, would have been part of the route used by Neolithic farmers to enter Europe. Given its geographic location in the heart of the Balkan Peninsula at the intersection of Central and Southeastern Europe, Serbia represents a key geographical location that may provide insight to elucidate the interactions between indigenous Paleolithic people and agricultural colonists from the Fertile Crescent. In this study, we examine, for the first time, the Y-chromosome constitution of the general Serbian population. A total of 103 individuals were sampled and their DNA analyzed for 104 Y-chromosome bi-allelic markers and 17 associated STR loci. Our results indicate that approximately 58% of Serbian Y-chromosomes (I1-M253, I2a-P37.2, R1a1a-M198) belong to lineages believed to be pre-Neolithic. On the other hand, the signature of putative Near Eastern Neolithic lineages, including E1b1b1a1-M78, G2a-P15, J1-M267 and J2-M172 and R1b1a2-M269 accounts for 39% of the Y-chromosome. Furthermore, an examination of the distribution of Y-chromosome filiations in Europe indicates extreme levels of Paleolithic lineages in a region encompassing Serbia, Bosnia-Herzegovina and Croatia, possibly the result of Neolithic migrations encroaching on Paleolithic populations against the Adriatic Sea.

Link

December 23, 2011

Multiple origins of Russian mtDNA

First PC of mtDNA variation on the left. From the paper:

The genetic distances from the Russians to the European language groups indicate that the gene pool of present-day Russians bears the influence of Slavic, Baltic, Finno-Ugric and, to a lesser extent, Germanic groups, as well as Iranian and Turkic groups.

...

The results of this study strongly suggest that the impact of the pre-Slavic (Finno-Ugric) population on the East European Plain is the most important factor for the northward and southward differentiation of the present-day Russian gene pool. This explanation supports the view proposing the genetic influence of Finno-Ugrians on the formation of the northern regions of Russia, which was inferred from mtDNA marker studies of some Russian populations (Grzybowski et al., 2007) and Y-chromosome analysis (Balanovsky et al., 2008).

Being quite distant from the Finno-Ugric group, the Southern Russians consequently differ from the Northern Russians in their closeness to the Germanic group. This difference indicates that the Germanic people played a significant role in the development of the southern, but not the northern segment of the Russian gene pool. In general, the Germanic influence on the formation of the Russians is not as obvious as the impact of the Slavic, Baltic, and Finno-Ugric people. However, strong interactions between the Germanic and Slavic tribes have been found in archeological materials dating from the mid-first millennium B.C. to the early first millennium A.D. These interactions were the strongest on the northern coast of the Black Sea, in the area of the multiethnic Chernyakhov archeological culture (second to fifth centuries A.D.). In the second half of the first millennium A.D., the descendants of this culture colonized the southern regions of the historical Russian area (Sedov, 1994, 1995). However, there is no evidence in the historical literature of the interaction between the Germanic tribes and the Slavs (and later, the Russians) after the Slavic colonization of the East European Plain. Therefore, the Germanic influence could not have occurred after the early part of the first millennium A.D., which was before the eastward Slavic migration (Sedov, 1994, 1995). Apparently, the impact of the Germanic people on the Chernyakhov Slavs affected the gene pool of modern Southern Russians, consequently differentiating them from the Northern Russians (Fig. 6).

Am J Phys Anthropol DOI: 10.1002/ajpa.21649

Russian ethnic history inferred from mitochondrial DNA diversity

Irina Morozova et al.

With the aim of gaining insight into the genetic history of the Russians, we have studied mitochondrial DNA diversity among a number of modern Russian populations. Polymorphisms in mtDNA markers (HVS-I and restriction sites of the coding region) of populations from 14 regions within present-day European Russia were investigated. Based on analysis of the mitochondrial gene pool geographic structure, we have identified three different elements in it and a vast “intermediate” zone between them. The analysis of the genetic distances from these elements to the European ethnic groups revealed the main causes of the Russian mitochondrial gene pool differentiation. The investigation of this pattern in historic perspective showed that the structure of the mitochondrial gene pool of the present-day Russians largely conforms to the tribal structure of the medieval Slavs who laid the foundation of modern Russians. Our results indicate that the formation of the genetic diversity currently observed among Russians can be traced to the second half of the first millennium A.D., the time of the colonization of the East European Plain by the Slavic tribes. Patterns of diversity are explained by both the impact of the native population of the East European Plain and by genetic differences among the early Slavs.

Link

October 01, 2011

Secular trends in some Russian populations

Anthropol Anz. 2011;68(4):367-77.

Secular trends in some Russian populations.

Godina EZ

Abstract
Secular changes of body measurements in children have been the subject of studies in many different countries. In recent years, there has been an increase in BMI associated with a significant trend towards obesity in both Europe and the US. The aim of the present study was to analyze trends in body measurements and BMI in Russia from the 1960's to the beginning of the 21st century. This was done at three locations of the Russian Federation: the city of Moscow, the cities of Saratov and Naberezhnye Chelny in the Volga-river area. In addition, data on secular changes of Abkhazian children were analyzed. A large number of anthropometric measurements were taken on each individual including height, weight, arm, leg and trunk lengths (estimated), body diameters and circumferences, skinfold thickness, head and face dimensions. Stages of secondary sex characteristics also were evaluated; data on menarcheal age were collected by status-quo and retrospective methods. Changes in hand grip strength have been evaluated in some of the samples. While stature was increasing during these years, weight, chest circumference and BMI were characterized by negative changes, which became more obvious in elder girls. Changes in handgrip strength also showed negative trends. There were noticeable changes in head and face measurements, which were expressed in more elongated head and face forms, i.e. the head became longer and narrower with narrower and higher faces. Secular changes in head and facial morphology may be considered part of the general trend.

Link