
Wednesday 13 November 2013

Ajv70 and modern European variation II

Updated: 20 Nov 2013. This is an updated analysis of the previous Ajv70 analysis. See the previous blog post about the improvements. This analysis is based on 444k SNPs after all filtering (mostly imputed for the modern populations; for Ajv70 all SNPs are actual calls).

Ajv70 appears different from Ajv52, who in the previous analysis appeared to be of mixed ancestry between a Saami-like population and a Baltic-like or Eastern European-like population. Ajv70, on the other hand, clusters at large with Saamis, Vologda Russians and Mordovians, and in particular with Saamis.

CC Euro Overview 444k SNP

As we can see, the heatmap ancestry profile for Ajv70 and the profiles of the other closely clustering groups seem to resemble each other. Note that Mordovians and Vologda Russians do not cluster with Saamis and Finns on the heatmaps in Chromopainter-Finestructure linked mode, but with Eastern Europeans. This is likely because the linked mode reflects more recent ancestry while the unlinked mode shows more ancient ancestry. Vologda Russians and Mordovians do seem to have more recent Eastern European admixture, while in the past being part of the Finnic and Volga Finnic language area. This is seen in both linked and unlinked mode, but it probably weighs more in linked mode, making them cluster with Eastern Europeans there. The unlinked mode also gives a lower resolution, as it is based only on allele frequencies, which adds to the uncertainty.

The low affiliation of Ajv70 with the Mediterranean populations on the heatmaps is matched only by Saamis and Finns among the modern populations. Finns appear more influenced by Eastern European populations and Scandinavians than Ajv70 is, while Vologda Russians and Mordovians appear more influenced by Eastern European populations. Saamis show less affiliation to Scandinavians and Eastern Europeans than both Finns and Ajv70, but the Ajv70 heatmap profile does seem to resemble the Saami profile most.

PCA dimensions 4-5, which show internal European variation, do not at first glance support the common heatmap clustering of Ajv70 with Saamis, Vologda Russians and Mordovians: Ajv70 appears among Scandinavians on the PCA, and the Saamis/Finns and Vologda Russians/Mordovians appear at very different locations. However, the heatmaps do not indicate any Scandinavian-like ancestry for Ajv70 either, even though the individual is plotted in this cluster. What could explain this seemingly contradictory result, that Ajv70 resembles Saamis on the heatmap but not on the PCA, is that Ajv70 is not close in genealogy to the populations it clusters closest with on the heatmap, and that the Saamis, Finns, Mordovians and Vologda Russians are both more dissimilar and more similar to "Others" than the other populations are, although what this dissimilarity and similarity consist of can differ between them.

CC Euro PCA 444k D4-D5

First, about "not close in genealogy". This is best observed if we include related Orcadians and unrelated British on a European PCA. The Orcadians would form a clear outgroup from the British on the PCA, even though their ancestry profile on the heatmaps would not look much different from that of the British, apart from their internal sharing due to close relatedness, and they would otherwise branch close to the unrelated British.

For the same reason Ajv70 would not cluster with Saamis and Finns on the PCA plot, even though they cluster together on the heatmap and Ajv70 otherwise shows a profile very similar to Finns and especially Saamis. The explanation would therefore be that Ajv70 was not part of the founder effect that makes modern-day Saamis and Finns cluster through closer relatedness to each other. Ajv70 therefore appears among the next best group it could cluster with on the PCA plot, the Scandinavians. In earlier linked analyses Scandinavians show both Saami and Finnish admixture compared to continental Europeans, so this would make sense.

Second, about "dissimilarity" and "similarity" versus external influences, or "Others". In dimensions 1 and 2, versus "Others" (the rest of the world panel), we seem to capture variation that could also explain why the Ajv70 heatmap cluster includes Mordovians and Vologda Russians, even though European PCA dimensions 4-5 do not support this clustering.


CC Euro PCA 444k D1-D2

This PCA plot describes the degree of similarity and dissimilarity versus "Others". As we can see, Lithuanians, Scandinavians and many more occupy the upper left extreme and the middle of the plot, showing variation closest to "Others", while the Saamis show the variation most distant from "Others" in the lower right. Ajv70 is within the range of the Saamis in this plot. One Vologda Russian is also within this range; otherwise the Vologda Russians and Mordovians are the next to follow, to the left of the Saamis and Ajv70. In PCA dimension 3 we also have a dimension where the Saamis, Vologda Russians, Mordovians and to some extent Finns are closest to "Others" (not shown).

The position on these dimensions can have multiple explanations, depending on whether and to what degree each individual has been influenced by the category "Others" and the very diverse panel included in it. This means that the dissimilarity of Saamis, Mordovians and Vologda Russians is not necessarily the same dissimilarity, only that each is more dissimilar relative to the "Others" category. For example, the Saamis probably pull strongly towards "Others" in dimension 3 because of what appears to be Siberian-like minority ancestry, while Ajv70, which shows no similar pull, probably pulls towards "Others" because of what may be minority African-like ancestry (this may also be an erroneous affiliation due to contamination). This could affect the Finestructure tree clustering in unlinked mode, adding Vologda Russians and Mordovians to the Saami, Finn and Ajv70 cluster.

Conclusion: Ajv70 appears to have the heatmap profile most similar to the Saamis, but this individual does not seem to have been part of the founder effect that produced the higher genetic sharing between modern Saamis and Finns. Ajv70 therefore does not cluster with the Finns and Saamis on the PCA, and instead clusters "incorrectly" with their closest neighbours, the Scandinavians. Ajv70 also shows no significant degree of influence from the Baltic or Eastern European populations, unlike Ajv52, where such influence made Ajv52 appear to shift away from the Ajv70 position among Scandinavians into open space closer to the Eastern European populations. Ajv52's clustering with Scandinavian-Saami mixed individuals once some Baltic-like admixture is added seems to further support that Ajv70 mostly resembles a Saami-like population. Next, it would make sense to check whether Ire8 may have been part of the common modern Saami-Finnish founder effect. Earlier analysis may suggest this to a certain degree.

Edit 19/11-13: broader overview with more European and Middle Eastern populations. It is again very clear that Ajv70 clusters with Uralic or formerly Uralic populations, and especially the Saami.


 CC Euro Overview extended 444k SNP 

Individual results:

CC Euro haploid 444k

CC Euro diploid 444k

Friday 8 November 2013

Ajv52 and European variation II

This is an updated analysis of the previous Ajv52 analysis. In the previous analysis it was difficult to find any proper affiliation for the Ajv52 individual. The reason now seems to be that I only did the mapping and base quality filtering, but not the additional filtering needed to remove what could be contamination. As suggested to me by the author of Skoglund 2012, this would make the ancient genomes appear more African-like than they really were. So this time I followed the remaining contamination procedure, except for gap filtering, as this wasn't described in detail in the supplementary material, and I removed all positions with multiple reads (the author randomly chose one read if there were multiple). This time I also didn't do any LD pruning in PLINK, as the authors of Chromopainter-Finestructure commented that LD would be taken into consideration in the Chromopainter unlinked mode. I also grouped all individuals not of interest into superindividuals ("others") in Finestructure, as also recommended by the Chromopainter-Finestructure authors. This analysis is therefore based on 261k SNPs after contamination filtering.
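The two ways of handling positions covered by more than one read mentioned above (dropping them entirely, or randomly picking one read) can be sketched in Python. This is only an illustration under stated assumptions: the `pileup` dictionary, the function name `pseudo_haploid_calls` and the toy positions are hypothetical and do not come from the actual pipeline or from the Skoglund 2012 supplementary material.

```python
import random

def pseudo_haploid_calls(pileup, drop_multi=False, seed=None):
    """Return one base call per site from a toy pileup.

    pileup: dict mapping (chrom, pos) -> list of bases observed in reads.
    drop_multi=True  -> discard sites covered by more than one read.
    drop_multi=False -> pick one read at random at such sites.
    """
    rng = random.Random(seed)
    calls = {}
    for site, bases in pileup.items():
        if not bases:
            continue  # no coverage, nothing to call
        if len(bases) > 1:
            if drop_multi:
                continue  # strict variant: remove multi-read positions
            calls[site] = rng.choice(bases)  # sampling variant
        else:
            calls[site] = bases[0]
    return calls

# Hypothetical toy data: one single-read site, one two-read site, one empty site.
pileup = {("1", 1000): ["A"], ("1", 2000): ["C", "T"], ("1", 3000): []}
strict = pseudo_haploid_calls(pileup, drop_multi=True)
sampled = pseudo_haploid_calls(pileup, seed=1)
```

The strict variant keeps fewer sites, while the sampling variant keeps every covered site at the cost of a random choice where reads disagree.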

As we can see below, Ajv52 clusters at large with other groups of mixed Scandinavian and Saami or Finnish ancestry, and in particular with the group of mixed Scandinavian-Saami ancestry (NOR-SAM). This is interesting in light of my earlier Ire8 analysis (Ire8 will be reanalyzed later using a similar approach), where Ire8 very clearly clustered with the Saami-Finnish group. Looking at the heatmap, we can see that Ajv52 shows an increased affiliation to the Baltic populations compared to others in the larger group, while at the same time showing an increased affiliation to Saamis and Finns that the Baltics, continental Europeans and even Scandinavians do not have.

CC Europe Overview 261k SNP 

This suggests that Ire8 may have represented the receiving population of ancient Gotland, which had similarity to modern-day Saamis and Finns, while Ajv52 may represent a mix between this population and a population migrating across from the Baltic region.

EDIT 9 Nov 13: I have added individual results. I have added heatmap results in both haploid and diploid mode, as the Ajv52 analysis in Chromopainter has been done in haploid mode (Ajv52 has only "homozygous" data). Diploid mode has been made by using superindividuals.

In diploid mode we see that Ajv52 clusters with NO6, NO7 and SWE40. In haploid mode we see that Ajv52 clusters with SWE40_A, SWE40_B, NO7_A, NO7_B, NO6_A, NO6_B, SWE7_A and SWE11_A. All these individuals are of mixed Scandinavian-Saami ancestry.
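How the haploid labels (for example SWE40_A and SWE40_B) relate to the diploid individuals (SWE40) can be illustrated with a small Python sketch. It assumes the `_A`/`_B` suffix convention seen in the labels above; the function name is hypothetical, and this only builds the grouping, it is not the actual Finestructure superindividual configuration.

```python
from collections import defaultdict

def group_haplotypes(labels, suffixes=("A", "B")):
    """Group haplotype labels such as 'SWE40_A' under their individual 'SWE40'.

    Labels without a recognised suffix (e.g. a haploid-only ancient sample)
    form a group of their own.
    """
    groups = defaultdict(list)
    for label in labels:
        base, sep, suffix = label.rpartition("_")
        if sep and suffix in suffixes:
            groups[base].append(label)
        else:
            groups[label].append(label)
    return dict(groups)

labels = ["SWE40_A", "SWE40_B", "NO7_A", "NO7_B", "SWE7_A", "Ajv52"]
superindividuals = group_haplotypes(labels)
```

Each key in `superindividuals` would then correspond to one diploid unit, with Ajv52 staying a single-haplotype group.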

CC Europe Haploid 261k SNP

 CC Europe Diploid 261k SNP 

Edit 11 Nov 13: The PCA plot for dimensions 4 and 5 clearly shows that Ajv52 is shifted toward Eastern Europe compared to Ajv70, which is in the Scandinavian cluster. See the discussion of Ajv70's position on the PCA versus the heatmap in the Ajv70 post, as it relates to Ajv52 as well.

CC Europe diploid D4-D5 261k SNP

Edit 19 Nov 13: Extended overview of Ajv52's clustering with European populations. We still see that Ajv52 clusters with individuals of mixed Scandinavian-Saami ancestry.

 CC Europe extended Overview 261k SNP  

Tuesday 15 October 2013

Digging deeper in Fennoscandian ancestry II

This is a continuation of the previous analysis (not a new analysis run!), and this time we look at the possibilities of differentiating Scandinavians (Swedes, Norwegians, Danes). This time all individuals who showed considerable Saami (either North-Saami-like or South-Saami-like) or Finnish admixture in the previous analysis were removed from further analysis, to keep outside influence to a minimum, and lumped into the "other" group containing the rest of the world.

The data was run through Finestructure's clustering process, but it was unable to differentiate the remaining Scandinavians, meaning they appeared to the software to be one population. However, in the PCA plot it was possible to infer structure. As we saw in the previous analysis, the first two dimensions of the PCA plot appear to reflect the level of external influence in Scandinavians. In both dimensions the Norwegians clustered in the upper left corner, closest to the "others", while Swedes clustered mostly in the lower right corner but with a huge spread over a large part of the plot. The single Danish individual appears to cluster with the Norwegians.

PCA dimension 1 (horizontal) and 2 (vertical)

Dimension 3, on the other hand, appears to be internal between Norwegians and Swedes. It is shown below together with dimension 2 and seems to give a better clustering. The Danish individual is still in the Norwegian cluster, but closer to the Swedes.

PCA dimension 2 (H) and 3 (V)

So the Chromopainter-Finestructure pipeline appears to be able to differentiate Norwegian and Swedish ancestry even when using only 289k SNPs. The division isn't entirely clear-cut, but there have been population movements between these countries for many centuries, so the classification labels may not be entirely correct, and some individuals also have mixed backgrounds.

Saturday 5 October 2013

Digging deeper in Fennoscandian ancestry I

Over the last few months the authors of Chromopainter-Finestructure have made me aware that I have not been doing the analysis completely "by the book", even though the software manuals have not said so explicitly.

From the start my practice has been to run the Chromopainter-Finestructure analysis with the whole world panel, and then later extract subdata from the output file of each run into a European panel and a Fennoscandian panel for more detailed local analysis. However, you should actually use the "superindividual" and "continent" functionality in Finestructure to get the analysis right, with no file editing necessary, and you also need to do a new ChromoCombine run with the force files containing these superindividuals.

I have tested the difference against my earlier practice by applying these new settings to the data from the last standard run (this is NOT a new run!) and comparing the results. It appears that, following the advice from the authors, it is possible to extract even more information from the run at the current resolution than I was aware of, even at the Fennoscandia level. In this reanalysis I have grouped everyone outside Fennoscandia into the superindividual "Others".
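The idea of collapsing everyone outside the region of interest into one "Others" unit can be illustrated on a toy coancestry (chunk count) matrix. This is a conceptual sketch only and not what Finestructure does internally: the function name, the toy sample names and numbers, and the choice to sum donor columns and average recipient rows are all assumptions made for illustration.

```python
def collapse_to_superindividual(names, matrix, group, label="Others"):
    """Collapse the samples in `group` into a single row and column.

    matrix[i][j] = chunks that recipient i copies from donor j.
    Donor columns of the group are summed; recipient rows are averaged.
    """
    keep = [i for i, n in enumerate(names) if n not in group]
    grp = [i for i, n in enumerate(names) if n in group]
    if not grp:
        return list(names), [row[:] for row in matrix]
    new_names = [names[i] for i in keep] + [label]
    collapsed = []
    for i in keep:
        row = [matrix[i][j] for j in keep]
        row.append(sum(matrix[i][j] for j in grp))  # total copied from group
        collapsed.append(row)
    size = len(grp)
    super_row = [sum(matrix[i][j] for i in grp) / size for j in keep]
    super_row.append(sum(matrix[i][j] for i in grp for j in grp) / size)
    collapsed.append(super_row)
    return new_names, collapsed

# Hypothetical toy panel: two Fennoscandian samples and two outside samples.
names = ["SAA1", "FIN1", "YRI1", "CHB1"]
matrix = [
    [0, 6, 1, 3],
    [5, 0, 2, 3],
    [1, 1, 0, 8],
    [2, 2, 6, 0],
]
new_names, collapsed = collapse_to_superindividual(names, matrix, {"YRI1", "CHB1"})
```

With this choice the row totals of the remaining individuals are unchanged, so each Fennoscandian sample still accounts for all of its copied chunks, only with the outside donors merged into one column.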

First, the heatmaps. As before, the heatmap identifies a range of main clusters; some may notice that the number of identified clusters seems to be lower than in previous runs.

CC Fennoscandia 289k Aggregated

CC Fennoscandia 289k Raw

We then turn our attention to the PCA plots to see if we can infer more resolution than from the standard heatmaps. As the superindividual "Others" contains much of the rest of the world, it appears quite distant in both the heatmap tree and the PCA. The "Others" group dominates the first 3 PCA dimensions, as in sum it appears far away from the Fennoscandia group. The direction of "Others" is indicated on the plots.

PCA D1-D2 Fenno vs Others

PCA D1-D3 Fenno vs Others 

As I understand it, PCA D1-D3 reflect the level of external influence on Fennoscandians, possibly from continental Europe, as DK1 is closest to "Others" in both dimensions. Scandinavians are closest to "Others" in both dimensions 1 and 2, while Finns and Saamis are closest to "Others" in dimension 3, especially FI18. In any case, all 3 of these dimensions can be used to differentiate Scandinavians from Finns and Saamis. We had a similar dimension in the earlier local Fennoscandia analysis.

In dimension 4 we get a dimension that is able to differentiate Saamis from Finns and Scandinavians. We see North-Saami individuals at the extreme. If we combine this dimension with any other two dimensions we would get a plot differentiating Scandinavians, Finns and Saamis. Notice that we now also get a better grouping of the Finns.

PCA D1-D4 Fenno vs Others

At this point, using my earlier method, we would not get any more information than in the previous analysis. Using these new settings, however, we can dig even deeper in the PCA plot, as shown below when we move to dimension 5.

As we can see below, SWE7, a South-Saami individual, clearly stands out in a dimension of its own, separate from dimension 4, which peaks in North-Saami. This means that with this new method the project can explicitly differentiate South-Saami ancestry from North-Saami ancestry.

In earlier analyses the South-Saami ancestry appeared to blend in somewhere between North-Saami and Scandinavians, but as this shows, it has a dimension of its own, even though South-Saamis do share ancestry with North-Saamis in dimension 4. The North-Saami, however, share very little of this South-Saami-specific component, yet it appears far more common among other Fennoscandians, both Scandinavians and even some Finns, than the North-Saami-specific component.

PCA D1-D5 Fenno vs Others 

This means that we can combine dimensions 4 and 5 to map explicitly for Saami ancestry, both North and South, in Finns and Scandinavians.

PCA D4-D5 Fenno vs Others

There are more dimensions after dimension 5, but they become less clear and increasingly reflect individual variation. However, dimension 6 (vertical axis) may be worth looking at in the future, as it appears to be populated by what seem to be western Finns, central Swedes and two Saamis, depending on where you set the borderline.

 PCA D1-D6 Fenno vs Others 

CONCLUSION: This study shows that by using superindividuals one can extract even more detailed ancestry information from autosomal genetic data within Fennoscandia. This new knowledge will be used in future project updates.





Wednesday 18 September 2013

Saami ancestry and the MDLP Oracle-x Population Fitting

Update 08.10.2014: The MDLP calculators have been updated and no longer give the results shown below. This post is therefore considered outdated and should not be used as a guide to the MDLP calculators.

This is a short test of the ability of MDLP Oracle-x Population Fitting at Gedmatch to detect Saami ancestry. Only the MDLP calculators have been tested, as these are the only ones with a Saami reference population.

As the test shows, looking for Saami ancestry with these calculators may give very different and even erroneous results (no Saami ancestry, or minor Saami ancestry, when there is actually major Saami ancestry), but one clearly stands out as the preferred choice. The "test subject" (with consent) is a North-Saami individual participating in the project, with mostly Saami ancestry.


Absent = Saami ancestry not detected.
Minor = Saami ancestry detected but as minority ancestry.
Top but minor = Saami ancestry detected as top population but with less than 50%.
Top majority = Saami ancestry detected as top population with more than 50%.
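The four categories above can be written as a small decision rule. The following Python sketch is illustrative only: the function name, the input format and the example percentages are hypothetical (they are not results from any MDLP calculator), and treating exactly 50% as "Top but minor" is an assumption, since the definitions above only cover less than and more than 50%.

```python
def classify_saami_result(populations, saami_name="Saami"):
    """Classify a calculator result into one of the four categories.

    populations: list of (population name, percent) tuples.
    """
    saami_pct = dict(populations).get(saami_name, 0.0)
    if saami_pct <= 0:
        return "Absent"
    top_name = max(populations, key=lambda item: item[1])[0]
    if top_name != saami_name:
        return "Minor"
    # Assumption: exactly 50% is not counted as a majority.
    return "Top majority" if saami_pct > 50 else "Top but minor"

# Hypothetical example outputs, for illustration only.
examples = {
    "absent": [("Finnish", 60.0), ("Norwegian", 40.0)],
    "minor": [("Finnish", 70.0), ("Saami", 30.0)],
    "top_minor": [("Saami", 45.0), ("Finnish", 30.0), ("Norwegian", 25.0)],
    "top_major": [("Saami", 80.0), ("Finnish", 20.0)],
}
labels = {key: classify_saami_result(value) for key, value in examples.items()}
```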



As the results show, MDLP K=5 Oracle X "Pct. Calc. Option 1" appears to be the preferred choice for detecting Saami ancestry.

EDIT 20/9-13

Please note that none of the other functionalities of the different versions of the MDLP calculators, like Oracle and Oracle-4, were able to find that the Saami individual had majority Saami ancestry either. Some were able to infer Saami minority ancestry and some didn't detect anything at all. The exception again is MDLP K=5, which in Oracle managed to get a Saami population as number two, and as 3 of 4 in Oracle-4. The MDLP 27 calculator (not in Gedmatch) managed to infer the correct population using the Gaussian method in 1-population mode, but failed in the 2, 3 and 4 population approximations.


Wednesday 11 September 2013

La Braña 2 and modern European variation

This is a reanalysis of the La Braña individuals, but this time separately. La Braña 2 matched the 1000 Genomes reference panel at 56k SNPs. These SNPs were used together with the 288k SNPs from the standard population that match the 1000 Genomes reference SNPs to impute the missing 56k SNPs from La Braña, as described earlier. These SNPs were then further LD-pruned in PLINK to 26k SNPs and run through the Chromopainter-Finestructure unlinked pipeline using the world panel. The European panel was later extracted from the Chromopainter output files and run through Finestructure using 21k SNPs.

The heatmap, tree structure and PCA plots below show a somewhat different result than for La Braña 1, as La Braña 2 appears to cluster with the Scandinavian-Saamis (individuals with both Scandinavian and Saami background).

This means that the original analysis of the composite La Braña needs to be adjusted in light of the findings here. La Braña 2 appears most similar to individuals of mixed Scandinavian and Saami ancestry.

CC Euro unlinked 21k



 CC Euro unlinked 21k detailed 

 CC Euro unlinked 21k D1-D2

  CC Euro unlinked 21k D1-D3

EDIT: 20/9-13

Monday 9 September 2013

La Braña 1 and modern European variation.

This is a reanalysis of the La Braña individuals, but this time separately. La Braña 1 matched the 1000 Genomes reference panel at 129k SNPs. These SNPs were used together with the 288k SNPs from the standard population that match the 1000 Genomes reference SNPs to impute the missing 129k SNPs from La Braña, as described earlier. These SNPs were then further LD-pruned in PLINK to 47k SNPs and run through the Chromopainter-Finestructure unlinked pipeline using the world panel. The European panel was later extracted from the Chromopainter output files and run through Finestructure using 38k SNPs.

The heatmap and tree structure below show a somewhat different result than for the composite La Braña individual consisting of both La Braña 1 and La Braña 2. As we can see, La Braña 1 no longer clusters completely with the Finns and Saamis, but appears to have a more intermediate position between the Saami-Finnish cluster and the Eastern European cluster.

CC Euro 47k unlinked La Braña 1

This is also reflected in PCA D1-D2 (horizontal and vertical), where La Braña 1 appears in the middle between the Finnish-Saami cluster and the Eastern Europeans.

 CC Euro 47k unlinked La Braña 1 detailed 

CC Euro 47k PCA unlinked D1-D2

CC Euro 47k PCA unlinked D1-D3

The analysis of La Braña 2 will follow later.

EDIT 23/9-13

Download individual results