Sunday 24 June 2012

Ancient Gotlanders vs worldwide variation

EDIT 30/08-2013: I now consider this analysis outdated. Please check the more recent posts with reanalysis of Ajv52, Ajv70, Ire8, Gok4 and Ste7.

I got some second opinions after my last analysis, so this time I have compared the ancient Gotlanders "Comp22k" to most of my worldwide dataset. The new analysis gave much the same result for the ancient Gotlanders as the previous one; for some of the runners-up, however, the result changed somewhat.

Analysis technical details: ADMIXTURE run with K=8, n=2254 individuals, 22k SNPs. All individuals were phased and split into two haploid "individuals" before the ADMIXTURE analysis. After the ADMIXTURE analysis the two halves of each individual were merged back into one individual. This was done because the ancient data from the Gotlanders is "haploid" while all the other individual and population data is diploid.
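The merge step can be sketched roughly as follows. This is only an illustration, assuming the merge is a plain average of the cluster fractions of the two haploid halves (the details of the merge are not given above). The function name `merge_haploid_rows` and the sample values are hypothetical.

```python
from collections import defaultdict


def merge_haploid_rows(rows):
    """Average the cluster fractions of all haploid rows sharing an ID.

    rows: list of (individual_id, [fraction per cluster]) tuples, one tuple
    per haploid "individual" that went into ADMIXTURE.
    Assumption: merging means taking the arithmetic mean per cluster.
    """
    grouped = defaultdict(list)
    for individual_id, fractions in rows:
        grouped[individual_id].append(fractions)

    merged = {}
    for individual_id, haplotypes in grouped.items():
        n_clusters = len(haplotypes[0])
        merged[individual_id] = [
            sum(h[i] for h in haplotypes) / len(haplotypes)
            for i in range(n_clusters)
        ]
    return merged


# Hypothetical two-cluster values, not taken from the analysis.
rows = [
    ("sample1", [0.75, 0.25]),   # first haplotype of a diploid sample
    ("sample1", [0.25, 0.75]),   # second haplotype of the same sample
    ("ancient1", [0.5, 0.5]),    # already haploid, only one row
]
merged = merge_haploid_rows(rows)
```

An already haploid sample such as the ancient data contributes a single row and passes through unchanged.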


The table below has been sorted in decreasing order by the percentage belonging to the "European" cluster. Each cluster has been named after the geographic area where it shows the highest frequency. The identified clusters are American, Oceania, East-Asia, Siberia, Mediterranean, Europe, Central-Asia and Africa.

The graph below has been sorted in decreasing order by the percentage belonging to the "European" cluster, as above.

As we can see, the Saamis went down from number two in the first analysis to number ten in this analysis. The main reason is that the Saamis have a major Siberian component, which they share with the Chuvash. The Chuvash, on the other hand, moved from number four to number twenty. The previous analysis appears to have moved the "European" component far to the north-east.

The Finns, that is our project Finns, appear to have stayed high on the list and only moved from number three to number four. The 1000 Genomes Finns, who were not part of the previous analysis, moved into second place.

The Estonians went up the list from number five in the previous analysis to number three. The Lithuanians went up from number seven to number five.

Norwegians and Swedes went up from number nine and ten to number six and seven.


As we can see, the ancient Gotlanders had a higher "European" component than any of today's populations.

What immediately catches the eye is the higher "Africa" component. It is not similar to that of any of the contemporary populations; not until we reach Spain do we see anything close to this frequency. I can think of several hypotheses: 1) it is a glitch in ADMIXTURE, 2) it is a remnant of contact with Africans in the Iberian Ice Age refugium, 3) some other unknown reason.

We see that the Central-Asian component of the ancient Gotlanders is the smallest in the dataset of neighbouring populations. The closest neighbours are the 1000G Finns and the Saami. This may suggest that Central-Asian influence reached the ancient Gotlanders, Finns and Saami to a lesser extent than it reached other Fennoscandian populations.

We see that the Mediterranean component of the ancient Gotlanders is low and in the same neighbourhood as for Finns, Saamis and Estonians. This indicates that Mediterranean influence reached the ancient Gotlanders, Finns, Saamis and Estonians to a lesser degree.

As for the Siberian and other more distant components, the ancient Gotlanders appear to have been less influenced by these components than the Saamis and Finns, similarly to the Estonians, but more than the Norwegians, Swedes and Lithuanians. This could indicate a later Siberian influence that didn't reach the ancient Gotlanders. Note the unfamiliar composition of elements: the ancient Gotlanders have the highest frequency of the American and Oceanic components in the dataset and zero clustering to the East-Asian component.


In earlier analyses I have shown that Estonians and, in particular, Lithuanians appear to have some of the lowest mutation densities relative to the continental European populations. As far as I can tell at this stage, this is not incompatible with their also having the components most similar to those of the ancient Gotlanders. The Saamis, on the other hand, appeared to show higher mutation counts relative to most other populations. More analysis with other software, such as Chromopainter, may shed some more light on this question.

Note that if we remove the Siberian, American, Oceanic and East-Asian components from the Saamis, the Saami would appear as number four among the populations with the highest European component, with the ancient Gotlanders as number three, both Finnish samples as number one and two, and the Estonians and Lithuanians as runners-up.
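A minimal sketch of this kind of thought experiment, assuming that "removing" a component means dropping it and rescaling the remaining components so they again sum to one. The helper name `renormalise_without` and all numbers are hypothetical; they are not values from the table.

```python
def renormalise_without(fractions, dropped):
    """Drop the named components and rescale the rest to sum to 1.

    fractions: dict mapping component name -> fraction for one population.
    dropped:   set of component names to remove.
    """
    kept = {name: value for name, value in fractions.items()
            if name not in dropped}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("nothing left after dropping components")
    return {name: value / total for name, value in kept.items()}


# Hypothetical population with a large Siberian share.
example = {
    "European": 0.5,
    "Siberian": 0.25,
    "Mediterranean": 0.125,
    "Central-Asian": 0.125,
}
adjusted = renormalise_without(example, {"Siberian"})
european_after = adjusted["European"]   # 0.5 / 0.75, roughly 0.667
```

Applying the same function to every population and re-sorting by the adjusted "European" value would give a ranking of the kind described above.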


Individual estimated admixtures:

Top 10 individuals with the smallest estimated component difference:

(Updated 26/06/12)
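One way a "smallest component difference" list could be produced is sketched below. It assumes the difference is the sum of absolute differences between an individual's cluster fractions and those of the ancient Gotlanders; the measure is not specified above, so treat this as an assumption. The names and numbers are hypothetical.

```python
def component_difference(a, b):
    """Sum of absolute per-cluster differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def closest_individuals(target, individuals, n=10):
    """Return the n individuals whose profiles differ least from target.

    individuals: dict mapping individual name -> list of cluster fractions.
    Returns a list of (name, difference) pairs, smallest difference first.
    """
    scored = [(component_difference(target, fractions), name)
              for name, fractions in individuals.items()]
    scored.sort()
    return [(name, diff) for diff, name in scored[:n]]


# Hypothetical two-cluster profiles.
target = [0.5, 0.5]
individuals = {
    "A": [0.5, 0.5],
    "B": [0.25, 0.75],
    "C": [1.0, 0.0],
}
top = closest_individuals(target, individuals, n=2)
```

A Euclidean distance would work just as well here; the absolute-difference form is chosen only because it is easy to read off a table of percentages.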

10 comments:

  1. It's "Mediterranean", not "Medeterian", at least in English.

    Anyhow, the cluster you label "European" is not the real thing but some North or NW European subcluster. The "Mediterranean" cluster should also be "European" because when you compare European and Transmediterranean peoples (West Asians, North Africans), all Europeans cluster together and Basques score as high as Lithuanians in that (or more).

    So you are confusing (and possibly leading others to confusion) when using the term "European" there.

  2. Thanks, this is simply great! I didn't expect to see the K value 8. It would be outstanding to generate an admix tool for processing individual results. Maybe it is possible to do in cooperation with Gedmatch.

  3. I don't have a big urge to take part in the discussion about Europeanness, but maybe we could use the terms South and North European. In principle many people migrated to Europe over a long time, and now we consider all admixes that are big enough to be European, but if the admix is smaller we don't consider it to be European, regardless of its age. For example, the American (Amer Indian) is very old in Europe, but we still classify it as non-European admix. But as I said, you can have different opinions, and there is little sense in this classifying.

  4. Maju: The reason why I didn't use the term "European" for the "Mediterranean" component (thanks for the right spelling) is that this component spreads far into the Middle East and peaks among the Bedouins. At some higher K, however, this component splits into two components that peak in Sardinians and Bedouins and represent the Mediterranean and the Middle East. The Middle East component didn't spread significantly to the European populations of interest, so I used K=8. But formally, at K=8, the Bedouins, not the Sardinians, peaked in this component, so I chose not to label it European.

  5. For example, the American (Amer Indian) is very old in Europe, but we still classify it as non-European admix.

    In this analysis the noise threshold of the "exotic" components is rather high. That is because the levels of the "exotic" components are inflated in this analysis.

    1. By noise threshold, I mean noise upper limit.

  6. It would be better if you included the Neolithic farmer sample (Gok4) in this analysis.

  7. Onur: There is a limitation in the number of SNPs available from the ancient Gotlanders that match available reference genomes. This makes the analysis less clean-cut than those made from several hundred thousand SNPs, and results in more spillover between the different components.

  8. Onur: Yes, it could be interesting; maybe one could infer "old" and "new" Mediterranean components among modern Fennoscandians.