Monday 10 March 2014

La Braña 1 closest to Basque, Sardinians and Lithuanians in genealogy

This is the second look at the recently released La Braña 1 diploid genome. The previous analysis was done with the PLINK MDS tool and gave only a limited overview. This time I first use the Chromopainter-Finestructure unlinked model and then the Chromopainter-Finestructure linked model.

Unlinked model

The unlinked model appears to give a result consistent with the previous analysis, with La Braña 1 in the same heatmap box as Finns and Saamis, especially what appear to be East Finns and North Saamis. However, also notice that La Braña 1 shows affiliation in the heatmap to Basque, Lithuanians and a mixed group of Scandinavians and Finns. This affiliation is interesting, as Saamis and Finns in particular do not show much sharing with Lithuanians and Basque. It may be that La Braña 1 is "ancestral" to these modern populations, but that over the 7000 years since then the populations have drifted and/or mixed with other populations.

ChunkCount 54k unlinked aggregated

The PCA plot generated from Finestructure shows a slightly different placement from the PLINK MDS plot in the previous analysis (this may be due to pruning). In dimension 3 (horizontal) La Braña 1 still clusters with Finns and Saamis, but in dimension 2 (vertical) La Braña 1 is at the same height as the Basque.

 ChunkCount 54k unlinked PCA 
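For readers who want to see how a PCA can be derived from a chunkcount matrix, here is a minimal Python sketch. It is not the Finestructure code itself: the function name `chunkcount_pca`, the way the diagonal is filled in and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def chunkcount_pca(counts, n_components=2):
    """PCA of a square chunk-count matrix (hypothetical helper, not Finestructure's own)."""
    x = np.array(counts, dtype=float)
    n = x.shape[0]
    diag = np.eye(n, dtype=bool)
    # An individual does not copy from itself, so the diagonal is empty.
    # Assumption: fill it with the mean of the other entries in the same row.
    row_means = (x * ~diag).sum(axis=1) / (n - 1)
    x[diag] = row_means
    # Centre each row so that only differences in copying profile remain.
    x -= row_means[:, None]
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, explained

# Toy matrix (made up): individuals 0-1 share many chunks, as do individuals 2-3.
toy = [[0, 10, 2, 2],
       [10, 0, 2, 2],
       [2, 2, 0, 10],
       [2, 2, 10, 0]]
scores, explained = chunkcount_pca(toy)
print(scores[:, 0])
```

In this toy case the first dimension puts individuals 0 and 1 on one side and individuals 2 and 3 on the other. The sign of a dimension is arbitrary, which is worth keeping in mind when comparing plots from different runs.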

Linked model

As I understand from the authors, the unlinked and linked models may give different time depths, with the unlinked model reflecting the most ancient ancestry and the linked model more recent ancestry. This analysis may confirm that statement. The difference between the two models may come from the "masking" effect of recombination, which may hide in the linked model ancient relationships that are seen in the unlinked model. To my knowledge this is the first haplotype-based analysis of La Braña 1, as the La Braña 1 genome was phased.

As we can see, the linked analysis shows a very different clustering of La Braña 1. He is in the same heatmap box as Western Europeans, in particular Basque and Sardinians. La Braña 1 also shows strong "heat" toward the Lithuanians, but the strong affiliation to Finns and Saamis that we saw earlier is much weaker in the linked model, although still quite easy to see.

 ChunkCount 54k aggregated linked

We can also see in the heatmap that there is a strong divergence between haplotypes painted and haplotypes donated for La Braña 1. La Braña 1 finds related haplotypes among the modern populations, but the modern populations find more closely related haplotypes among themselves. This suggests that La Braña 1 has haplotype variation that no longer exists in the modern populations, and also that recombination since his time has left only very small La Braña 1-like haploblocks in the modern populations.
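The painted-versus-donated difference can be read directly off a chunkcount matrix by comparing one individual's row with his column. The sketch below assumes that rows are recipients and columns are donors; the function name `painting_asymmetry` and the numbers are made up for illustration and are not taken from the actual run.

```python
import numpy as np

def painting_asymmetry(counts, index):
    """Return (chunks received, chunks donated) for one individual.

    Assumption: counts[i, j] is the number of chunks individual i copies
    from individual j (rows are recipients, columns are donors).
    """
    x = np.asarray(counts, dtype=float)
    received = np.delete(x[index, :], index).sum()  # what `index` copies from the others
    donated = np.delete(x[:, index], index).sum()   # what the others copy from `index`
    return received, donated

# Toy matrix (made up): individual 0 plays the ancient sample.
toy = [[0, 30, 30, 30],
       [5, 0, 40, 40],
       [5, 40, 0, 40],
       [5, 40, 40, 0]]
received, donated = painting_asymmetry(toy, 0)
print(received, donated)
```

A large gap between the two totals, as in this toy example, is the kind of pattern described above: the ancient individual has to copy from modern donors, while the modern individuals prefer to copy from each other.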

The PCA for the linked model shows a different clustering from the unlinked model. This time La Braña 1 is at the center of the plot, between the Scandinavians and Vologda Russians.

 ChunkCount 54k linked PCA


As we can see, the different models appear to give different results. This is because in genealogy La Braña 1 is closest to Basque, Sardinians and Western Europeans, but in ancient ancestry closest to Finns and Saamis. The genealogical clustering with Basque, Sardinians and Western Europeans in the linked model makes sense, as the La Braña 1 individual was found in what is today northern Spain. This means that La Braña 1-like haplotypes are still very much present among Western Europeans today, suggesting continuity in the autosomes.

As a side note, also notice the change of clustering for the Saamis, Mordovians and Vologda Russians between the unlinked and linked models. In the linked model Saamis and Finns cluster together and the Vologda Russians and Mordovians cluster with Eastern Europeans, but in the unlinked model Finns are separate from the Saamis, while the Saamis cluster with Mordovians and Vologda Russians. Also, in the unlinked model the Mordovians and Vologda Russians are part of a greater cluster that also includes Saamis and Finns.

Tuesday 18 February 2014

La Braña 1 diploid genome vs Europeans first look

I finally got my hands on the latest La Braña 1 diploid genotype from the Iñigo Olalde et al. 2014 paper. In previous analyses, La Braña 1 and other ancient genomes have only been available in a "haploid" state, meaning that it has not been possible to phase them and analyze them with high-resolution linked haplotype-based models, only with unlinked single-SNP models. I also had to "haploidize" all the other individuals I compared him with, whether in Chromopainter-Finestructure, ADMIXTURE or PLINK runs. This could have affected those analyses, as the present one seems to show, at least for the La Braña 1 part.
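To make "haploidizing" concrete: one common way to do it is to draw one of the two alleles at random at every site and treat the individual as homozygous for it. The sketch below assumes that approach; it is not necessarily the exact procedure used for the earlier runs, and the function name `haploidize` and the array layout are hypothetical.

```python
import numpy as np

def haploidize(genotypes, seed=0):
    """Make diploid genotypes pseudo-haploid by sampling one allele per site.

    genotypes: integer array of shape (n_individuals, n_snps, 2), alleles coded 0/1.
    Returns an array of the same shape in which both alleles at a site are
    the randomly sampled one (an assumed convention for illustration).
    """
    g = np.asarray(genotypes)
    rng = np.random.default_rng(seed)
    pick = rng.integers(0, 2, size=g.shape[:2])               # 0 or 1 per individual and site
    chosen = np.take_along_axis(g, pick[..., None], axis=2)   # shape (n_individuals, n_snps, 1)
    return np.repeat(chosen, 2, axis=2)

# Toy data (made up): two individuals, three SNPs.
toy = np.array([[[0, 1], [1, 1], [0, 0]],
                [[1, 0], [0, 0], [1, 1]]])
print(haploidize(toy))
```

Homozygous sites are unchanged by this, while every heterozygous site loses half of its information, which is one way such data can end up placed differently from true diploid data.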

I initially used a 1.1 million SNP diploid La Braña 1 genotype, of which 54k SNPs matched the current 289k standard panel used in most of the project's standard runs. I further LD-pruned these in PLINK down to 25k SNPs and ran PLINK's own MDS plotting function. It gave the following very familiar "V" shape, with Saamis and Finns on one branch and Vologda Russians, Mordovians and Lithuanians on the other branch.
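As a rough illustration of what an MDS plot of this kind computes, here is a minimal Python sketch of classical multidimensional scaling on identity-by-state (IBS) distances. It is a sketch under stated assumptions, not PLINK's implementation: the function names `ibs_distance` and `classical_mds` are hypothetical, and missing genotypes and LD pruning are ignored.

```python
import numpy as np

def ibs_distance(genotypes):
    """Pairwise distance = 1 - IBS similarity.

    genotypes: array of shape (n_individuals, n_snps) with allele counts 0, 1 or 2.
    """
    g = np.asarray(genotypes, dtype=float)
    diff = np.abs(g[:, None, :] - g[None, :, :])  # 0, 1 or 2 alleles differ per SNP
    return diff.mean(axis=2) / 2.0

def classical_mds(dist, n_dims=2):
    """Classical (metric) MDS: coordinates from a distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering       # double-centred squared distances
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:n_dims]        # largest eigenvalues first
    vals = np.clip(vals[order], 0.0, None)         # guard against small negative values
    return vecs[:, order] * np.sqrt(vals)

# Toy genotypes (made up): four individuals, six SNPs.
toy = [[0, 0, 1, 2, 2, 0],
       [0, 1, 1, 2, 2, 0],
       [2, 2, 0, 0, 1, 1],
       [2, 2, 0, 0, 0, 1]]
print(classical_mds(ibs_distance(toy), n_dims=2))
```

Each output column corresponds to one plotted dimension (D1, D2 and so on). As with PCA, the direction of each axis is arbitrary.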

Please note that this is a preliminary analysis and not a state-of-the-art one, but as we can see from the positioning, La Braña 1 without much doubt clusters with Saamis and Finns in dimension 1 (horizontal) and 2 (vertical), but not in dimension 3, where Scandinavians and Lithuanians cluster closest.

My first impression is that it is North Saamis and Eastern or Northern Finns who cluster closest to him in the first two dimensions. In dimension 3 he instead clusters closest with Scandinavians and Lithuanians, but the distance appears huge. It may suggest that La Braña 1 has variation in this dimension that has largely disappeared but still exists to some extent among Scandinavians and Lithuanians.

The haplotype-based analysis that will certainly come later should give a better picture of the clustering. The La Braña 1 individual will be included in all future Chromopainter-Finestructure haplotype-based analyses, and very probably the 24,000-year-old individual from Siberia as well.

La Braña 1 diploid genome 25k Plink MDS D1-D2

 La Braña 1 diploid genome 25k Plink MDS D1-D3

If we assume that La Braña 1 will also end up at this position in the future Chromopainter-Finestructure haplotype analysis (as seen in the last run), we may get an idea of what has happened genetically in Europe since 7000 years before present. Please note that the figures below are not made from the above MDS run but from an earlier Chromopainter-Finestructure run.

Dimension 1 - Agricultural expansion

Dimension 2 - Expansion from the East

 Dimension 3 - Expansion from Northern and Southern edges

Individual results D1-D2
 Individual results D1-D3 

Thursday 13 February 2014

Europeans and Native Americans

Updated 14/02/2014 with individual results PCA plots

This is a further investigation of the previous posts about East-Asian influences in Europe. This time I have extended the earlier studies by adding Native Americans. The results mostly appear as expected but also contain some surprises.

The first dimension in this new Finestructure run is one we have seen several times in previous runs. It peaks on one side among Northern Europeans and on the other among South-East Asians and especially Papuans/Melanesians. It will not be discussed further here.

The second dimension (X-axis below), on the other hand, clearly separates Europeans from Native Americans, Siberians and East-Asians. The third dimension in turn separates Native Americans from East-Asians, with the Siberians in between.

Dimension 2 (X-axis/horizontal) and 3 (Y-axis/vertical)

If we zoom in on the European panel we see, as expected, that especially Saamis, Mordovians and Vologda Russians pull left toward the common East-Asian, Siberian and Native American end of dimension 2. Note that Finns don't show the same level of pull to the left as the Saamis, Mordovians and Vologda Russians.

 Dimension 2 (X-axis/horizontal) and 3 (Y-axis/vertical) zoomed Europe

Individual results D2-D3 as above

However, if we look at dimension 3 (vertical) we clearly see Saamis and Finns pulling toward the Native American end of the dimension at about the same level of intensity. This pull appears to be lacking among Vologda Russians and Mordovians, who pull toward the common East-Asian, Siberian and Native American end of dimension 2 (horizontal). This seems to suggest (as also noted by commenters on earlier posts of this blog) that there are different influences from the East in Europe.

Let us move on to dimension 4 (vertical, keeping dimension 2 horizontal). This is the same dimension as in the usual European PCA plot regularly shown on this blog, with the characteristic "V" shape. It is the branch with Finns and Saamis (top), with Sardinians and Basque at the root (bottom). As we can see, Siberians, East-Asians and Native Americans appear to cluster consistently at separate vertical levels along this "European" dimension. I would guess the informed reader would agree without much thought with what we see at the top, where Siberians are placed at the same level as Saamis and Finns in the upper part of the vertical dimension.

  Dimension 2 (X-axis/horizontal) and 4 (Y-axis/vertical) zoomed Europe

Individual results D2-D4 as above, zoomed into Europe

However, if we move down to the middle we see the cluster of East-Asian groups. If we move horizontally to the right from this vertical level, we see that this "East-Asian" level actually ends up with the Mordovians, Vologda Russians, Russians and many Lithuanians, with other Eastern Europeans in close proximity. This clustering is also difficult to explain, but it is striking that the East-Asians appear to line up with at least certain Eastern Europeans, given the known history of the area.

If we move even further down we reach the Native Americans, and if we move right from the vertical level of the Native Americans we meet Southern European populations like Italians, Basque and Sardinians. This is a big surprise to me, and I find it difficult to explain. I have considered post-1492 admixture among Native Americans, but these samples were screened for outside admixture before being included in the analysis.

Wednesday 5 February 2014

Europeans, East-Asians and Africans

This is a continuation of the previous post, where I investigated haplotype variation between Europeans, East-Asians and Siberians. This time I investigate further by including Africans, as this sheds some more light on the haplotype variation seen between Europeans, East-Asians and Siberians.

The first dimension is surprising. Since it captures the greatest haplotype variation in the dataset, I would expect it to separate Africans from non-Africans. To my surprise it does not: strangely enough, Europeans cluster by themselves, peaking among northern Europeans, while Africans appear to show similarity with the East-Asians. Notice here that the PCA distance between Africans and East-Asians appears rather small.

Dimension 1 - brown Finns/Saami - blue Africans

This gradient map appears strikingly similar to dimension 1 in the previous Eurasian analysis. My interpretation is that this connection between Africans and East-Asians reflects remnants of a Papuan and/or Melanesian-like population among today's East-Asians. In an earlier post I suggested that Africans, especially San and Pygmies, and Papuans/Melanesians still show a genetic connection, and that it was a San/Pygmy-like population that first migrated along the southern coast of Asia.

The second dimension is also surprising, as instead of showing African vs non-African variation it shows a common African-European variation vs East Asians and Siberians.

Dimension 2 - brown - Africans/Europeans - blue - East Asians/Siberians

This gradient map also shows a very striking similarity to dimension 2 in the previous Eurasian analysis. As we can see, this dimension does not only separate Europeans from East-Asians/Siberians, but separates Europeans and Africans together from East-Asians and Siberians. I am very unsure about the interpretation, but like dimension 1 it appears to be ancient.

These two dimensions can be combined in one plot, and as we can see this plot is identical to the PCA plot of dimensions 1 and 2 in a previous analysis investigating the relationship between Europeans, East Asians and Siberians. This may suggest that the previous indication of East-Asian ancestry among Southern Europeans may be due to shared Papuan-Melanesian ancestry among East-Asians and Southern Europeans.

PCA Dimension 1 (horizontal) and 2 (vertical) Overview

 PCA Dimension 1 (horizontal) and 2 (vertical) Overview Europe

   PCA Dimension 1 (horizontal) and 2 (vertical) Overview Europe individual results 

The third dimension finally appears to be a true African vs non-African dimension. The PCA coordinate distance between Africans and non-Africans is very large, and outside Africa the haplotype variation appears rather uniform in comparison, suggesting a bottleneck and/or founder effect after leaving Africa. The fact that this dimension only appears as number three, suggesting that this variation is smaller than that of the previous dimensions, makes the previous dimensions intriguing. Maybe it is just the effect of oversampling from the European region, or maybe it is traces of ancient migrations or mixing from before the out-of-Africa events.

 Dimension 3 - blue - Africans, brown - Non-Africans.

This perhaps makes dimensions 2 and 3 the best for investigating African and East-Asian/Siberian minority ancestry among Europeans. As we can see, Spanish and Sardinians appear to have the most African-like minority admixture, while Saamis, Mordovians and Vologda Russians show the most East-Asian or Siberian admixture.

PCA Dimension 2 (horizontal) and 3 (vertical)

 PCA Dimension 2 (horizontal) and 3 (vertical) Europe Overview

European zoomed gradient maps:

Dimension 1 - Europe

 Dimension 2 - Europe (Note Saamis should be blue)

  Dimension 3 - Europe  

Wednesday 29 January 2014

Is there an "East-Asian" influence in Continental Europeans? Part II

This post goes further than the previous post, "Is there an "East-Asian" influence in Continental Europeans?". It elaborates on the separate Finestructure run using European, Siberian and East-Asian samples shown in the lower part of the previous blog post. There we only looked at dimensions 1 and 2; here we move on to the higher dimensions.

The third dimension is actually the same dimension we have seen many times in the PCA plot for the European panel in this project, giving the characteristic "V" shape where South Europeans cluster at the root, Finns and Saamis on one branch and Eastern Europeans on the other. This dimension is the Saami-Finnish branch variation vs South Europe. The variation peaks on one side among Saamis and Finns, and as we can see from the gradient map it also exists consistently among the Siberians but not among the East-Asians. On the other side it peaks among Sardinians, Basque and Italians, and the East-Asians cluster here with the South Europeans. I am very unsure about the interpretation, but as neither Siberians nor East-Asians are at the extreme on either side of the variation, I tend to believe it may represent gene flow from Europe toward Siberia and East Asia. Maybe the northern spread represents gene flow of Saami/Finnish-like hunter-gatherers eastward into Siberia, and the southern part gene flow from Europe toward East Asia through today's India.

Dimension 3 - peaks among Saami, Finns vs Sardinians and Basque

Dimension 4 is the equivalent of the other branch of the "V" in Europe. At one extreme we find the Lithuanians, Mordovians and other Eastern European populations (there is actually a Chukchi individual here with a very slightly higher value than the top Lithuanian). At the other extreme are Basque, Western Europeans, Saamis, Scandinavians and also the East-Asians. Here too I am unsure about the interpretation, but it appears to show the same consistency as dimension 3, this time with a different spread. It may be gene flow spreading from Eastern Europe eastward through Siberia.

 Dimension 4 - peaks among Lithuanians vs Basque 

Dimension 5 appears to peak among East-Asians on one side and Siberians on the other, with the Europeans in between. As we can see, there is a tendency toward Siberian-like influence in the western part of Europe.

  Dimension 5 - peaks among Siberians vs East-Asians

This dimension appears interesting with regard to the question of whether there is any East-Asian influence among continental Europeans. So if we zoom in on Europe and remove the PCA elements from outside Europe, leaving only the European ones, we can get a more detailed view.

Dimension 5 - peaks among Siberians vs East-Asians

What is very striking here is that it appears to peak among Eastern Europeans and to some degree also Finns, but is weakest among the Basque, Western and South-West Europeans and among Norwegians and Saamis. It may suggest gene flow from East Asia that has split in half an earlier haplotype distribution that may have stretched from Western Europe to Siberia but now only remains among Western Europeans, Scandinavians and Siberians. This dimension has a striking resemblance to another dimension in Europe that I had earlier believed to be internal European variation.

The PCA coordinates for all individuals and all dimensions can be downloaded here.

Friday 24 January 2014

Is there an "East-Asian" influence in Continental Europeans?

Updated 27/01/14

This question has been following me since the previous blog post, where I found this geographical distribution of chunkcount PCA dimension 4 among Eurasian populations.

Everybody would probably agree about the northern distribution shown in blue and green, apparently showing a genetic connection between North-East European populations like Saamis and Finns and Northern Siberian populations all the way from Fennoscandia to Beringia in Eastern Siberia, as shown many times on this blog, in university research and by other bloggers. However, if we look closer at the color distribution for the map:

We should of course not take this color distribution too literally, as PCA plot distributions can be affected by many things, but as we can see the red grade is close to the brown. This probably means that the whole area from western continental Europe to East and South-East Asia shows haplotype similarity. So there appears to have been not only a northern East-West influence but a southern East-West influence as well.

To investigate this further I did a separate Chromopainter run using 23k linked SNPs on chromosome 1, with a selection of individuals that had been phased together with over 2000 individuals using a high number of iterations in BEAGLE to keep the error rate to a minimum. I used Chromopainter's admixture functionality to design an admixture test using the relevant East-Asian, African and Siberian populations. As far as I know from previous screening with ADMIXTURE, all these individuals appear unmixed, without any European admixture.

As we can see, North-East Europeans appear to score low on East-Asian influence, while continental Europeans from western and central Europe appear to score rather high vs the East-Asians. As expected, we see the largest African-like influence among the more southern populations and the North-Siberian influence in the more North-East European populations. Please note that as this result is based on only one chromosome, it does not always correlate 100% with the result of a whole-genome analysis. There is a strong negative correlation between the East-Asian and Siberian component at -0.56, and an even stronger negative correlation between the Siberian and African component. The correlation between the East-Asian and Siberian component is weak at -0.17.
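For anyone who wants to compute this kind of correlation between components from their own table of admixture fractions, a minimal Python sketch could look like the following. The function name `component_correlations` and all numbers in the toy table are made up for illustration and are not the results discussed above.

```python
import numpy as np

def component_correlations(fractions, names):
    """Pearson correlation for every pair of admixture components.

    fractions: array of shape (n_individuals, n_components).
    names: one label per component (column).
    """
    r = np.corrcoef(np.asarray(fractions, dtype=float), rowvar=False)
    return {(names[i], names[j]): float(r[i, j])
            for i in range(len(names))
            for j in range(i + 1, len(names))}

# Toy table (made-up values): one row per individual.
names = ["East-Asian", "Siberian", "African"]
toy = [[0.10, 0.02, 0.05],
       [0.08, 0.05, 0.03],
       [0.03, 0.12, 0.01],
       [0.02, 0.15, 0.02]]
for pair, value in component_correlations(toy, names).items():
    print(pair, round(value, 2))
```

One caveat when reading such numbers: fractions that are constrained to add up to a fixed total tend to be negatively correlated with each other by construction, so a negative correlation alone may say less than it seems to.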

We can further see this using an area plot of the same data. The continental European populations from western and central Europe appear quite consistently to be closer to the East-Asian populations.

In Finestructure (from the previous whole-genome run using superindividuals) the clustering appears to confirm what was observed above. The East-Asian influence appears rather consistent for all European populations included. In the North-East European populations this East-Asian influence appears to be lacking or weaker, but the North-Siberian influence appears as expected from earlier analyses.

Individual results for project participants for PCA dimension 4, shown in the first image above. The first column shows the actual PCA plot value and the second the ranking, with the top being most East-Asian-like and the bottom most Siberian-like.

This is a separate PCA run using Europeans, East Asians and Siberians as reference. The X-axis (horizontal) shows dimension 1 and the Y-axis (vertical) shows dimension 2. The X-axis shows the European vs East-Asian influence, the Y-axis the European vs Siberian influence.

Europe vs East Asia and Siberians overview

 Europe vs East Asia and Siberians zoomed at Europeans

  Europe vs East Asia and Siberians zoomed at Europeans detailed 

Gradient map of PCA dimension 1 - blue most East-Asian, brown least East-Asian like.

 Gradient map of PCA dimension 2 - blue least Siberian like, brown most Siberian like 

Friday 10 January 2014

Eurasian variation gradient maps

This is a graphic presentation of the 7 Eurasian haplotype-based (chunkcount) PCA dimensions from the latest project run using 289k linked SNPs. The average number of SNPs in each chunk segment in the world panel is 13 (with a heavy overrepresentation of Europeans, and Northern Europeans in particular). This analysis is at this stage experimental. Please ignore the coloring in Africa, Australia and Greenland, as no populations from these regions are included.

The first dimension peaks on one side among Finns and Saamis (brown), and on the other side among Panya, Chukchi and Cambodians (light blue or blue). It appears that Fennoscandians in general belong to this brown component. This component appears identical to dimension 1 in the previous analysis. As this is the first dimension, it also explains the largest variation in haplotypes between the populations. As the "blue side" here appears mostly along the coast, peaking among populations that show affiliation to Papuans and Melanesians, I suspect it separates this ancient population from old European hunter-gatherers.

Eurasian dimension 1

The second dimension peaks on one side among Lithuanians and Scandinavians and on the other side among "The Others", the group containing the remaining individuals; within the remaining panel it peaks among Miao and some other East-Asian populations. It seems to show a clear division between West and East Eurasian populations. Fennoscandians in general belong to this western cluster.

Eurasian dimension 2

The third dimension peaks on one side among "The Others" and secondarily among Bedouins and other Middle Eastern populations, and on the other side among North Siberian populations. It seems to represent influence from the African continent. This influence has reached as far north as southern Scandinavia, but to a lesser extent Saamis and Finns. This dimension may be related to dimension 3 in the previous analysis.

Eurasian dimension 3

The fourth dimension peaks among Dai, Cambodians and Han in South-East Asia on one side and among Koryak, Yukaghir and Nganasans in North Siberia on the other. Saamis, Finns and to a degree Scandinavians seem most similar in variance to the North Siberian group, while continental Europeans appear more similar to the South-East Asian group.

Eurasian dimension 4

The fifth component peaks among Saamis, Finns and some South-East Asian populations on one side and among Northern Siberians on the other. Scandinavians appear less related to this dimension. This dimension may be related to dimension 3 in the previous analysis.

This component appears more difficult to explain, as other analyses have shown no connection between Saamis, Finns and South-East Asians, but rather to North Siberians as in dimension 4. It may be an effect of having a large sample of Europeans and small samples of other populations; however, it is striking that the clustering appears consistent among the various non-European populations and not spread out randomly as if there were no structure. As far as I know and can remember, no analysis on such a wide scale has been done before using linked haplotypes, so it may be something not seen before.

Eurasian dimension 5

The sixth dimension peaks among East-Asians on one side and the Indian subcontinent on the other. Saamis and Finns appear a little closer to the Indian subcontinent than Scandinavians.

Eurasian dimension 6

The seventh dimension peaks among the Lithuanians and Koryak on one side and among the Chukchi, South Indians and some North Siberians on the other. In more general terms, the heatmap shows similarity between Eastern Europe and South-East Asians. Like dimension 5, this dimension appears difficult to explain, and what was said about that dimension applies here as well.

Eurasian dimension 7

The remaining higher dimensions appear to show local variation between Siberian groups.