Wednesday, 11 September 2013

La Braña 2 and modern European variation

This is a reanalysis of the La Braña samples, this time treated separately. La Braña 2 matched the 1000 Genomes reference panel at 56k SNPs. These SNPs were used together with the 288k SNPs from the standard population set that match the 1000 Genomes reference SNPs to impute the missing SNPs for La Braña, as described earlier. The resulting SNPs were then LD pruned in PLINK down to 26k SNPs and run through the ChromoPainter-fineSTRUCTURE unlinked pipeline using the world panel. The European panel was later extracted from the ChromoPainter output files and run through fineSTRUCTURE using 21k SNPs.
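To make the LD pruning step more concrete, here is a minimal Python sketch of the general idea: drop a SNP when its genotypes are too strongly correlated (r²) with a nearby SNP that has already been kept. This is not PLINK's implementation and not the commands used for this post. PLINK's own pruning (for example its `--indep-pairwise` option) slides a window along each chromosome with a step size, while the version below is a simplified greedy pass. The function names `r_squared` and `ld_prune`, the window of 50 SNPs, the r² threshold of 0.2 and the toy genotypes are all assumptions chosen for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as uncorrelated
    return cov * cov / (var_x * var_y)


def ld_prune(genotypes, window=50, threshold=0.2):
    """Greedy pruning sketch (hypothetical helper, not PLINK's algorithm).

    genotypes: list of SNPs in genomic order, each a list of allele counts per individual.
    Returns the indices of the SNPs that are kept.
    """
    kept = []
    for i, snp in enumerate(genotypes):
        # Only compare against kept SNPs that lie within `window` positions of SNP i.
        nearby = [j for j in kept if i - j < window]
        if all(r_squared(snp, genotypes[j]) <= threshold for j in nearby):
            kept.append(i)
    return kept


# Toy data: 4 SNPs typed in 6 individuals (made-up values).
genotypes = [
    [0, 1, 2, 1, 0, 2],  # SNP 0
    [0, 1, 2, 1, 0, 2],  # SNP 1: identical to SNP 0, so r^2 = 1
    [1, 0, 1, 2, 1, 1],  # SNP 2: uncorrelated with SNP 0
    [2, 1, 0, 1, 2, 0],  # SNP 3: mirror image of SNP 0, so r^2 = 1
]

print(ld_prune(genotypes))  # [0, 2]
```

In the toy data, SNPs 1 and 3 carry no information beyond SNP 0, so only SNPs 0 and 2 survive. An unlinked analysis treats the remaining markers as approximately independent, which is why a pruning step of this kind comes before it.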

The heatmap, tree structure and PCA plots below show a somewhat different result than for La Braña 1: La Braña 2 clusters with the Scandinavian-Saami group (individuals with both Scandinavian and Saami background).

This means that the original analysis of the composite La Braña needs to be adjusted in light of these findings. La Braña 2 appears most similar to individuals of mixed Scandinavian and Saami ancestry.

CC Euro unlinked 21k

CC Euro unlinked 21k detailed

CC Euro unlinked 21k D1-D2

CC Euro unlinked 21k D1-D3

EDIT: 20/9-13

  1. You write: "La Braña 2 appears most similar to individuals of mixed Scandinavian and Saami ancestry". It doesn't make sense to me, because on the plots La Brana 2 is closest to SWE38 (me), an individual of mixed Swedish and Forest-Finnish ancestry.

    1. This analysis needs an update, as I have only done mapping and base quality filtering and have not followed all the steps to remove SNPs that potentially contain contamination.

    2. I may add here too that this analysis is based on only 26k markers, and that the analysis itself rests on the assumption and model of unlinked markers. The 289k SNP analysis I have been doing for some time now is based on the assumption and model of linked SNPs. The latter gives far higher resolution and uses far more markers; the former has far less resolution and therefore a much higher margin of error. You can actually see this by comparing the similar PCA plots of Europe from the two analyses. The 289k linked clustering is far better than the 26k analysis this post is built on.