Wednesday 15 May 2013

La Braña individuals and the 1000G European populations

I have finally managed to extract 183k SNPs from the La Braña individuals that match SNPs in the 1000 Genomes Project (via the dbSNP hg18 database). I also believe I have, to a large extent, followed the quality filtering procedures used by the authors of the original research paper. As in the earlier ancient Gotlander analysis, the La Braña genotypes are "haploid", meaning that we have only half of the actual diploid genotype at each site, as shown in the example below:

Example, with haplotypes 1 and 2 read vertically:

AC -> C
GC -> G
TT -> T

This means that we can't phase the haplotypes. However, the La Braña individuals do appear to be very similar, so there is reason to believe that they share long runs of similar haplotype segments, or runs of homozygosity (ROH), meaning that to some extent we do have haplotypes.

Knowing that I don't really have the La Braña haplotypes, I still ran the La Braña data through the Chromopainter-Finestructure pipeline against the 1000 Genomes Project individuals. I got the following result for ChunkCounts. Note that all other individuals have been converted from diploid to haploid to allow a proper comparison.
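To make the diploid-to-haploid conversion concrete, here is a minimal sketch in Python. It assumes that one allele is picked at random at each site, which is a common way to build "pseudo-haploid" data but is my assumption here, not necessarily the exact procedure used for this run. The function name to_pseudo_haploid is hypothetical.

```python
import random

def to_pseudo_haploid(genotypes, rng=None):
    """Keep one allele per site from a list of two-letter diploid genotypes.

    Assumption: the allele is chosen uniformly at random at each site.
    """
    if rng is None:
        rng = random.Random()
    # rng.choice on a string such as "AC" returns either "A" or "C".
    return [rng.choice(genotype) for genotype in genotypes]

# The three example sites from the post.
diploid = ["AC", "GC", "TT"]
haploid = to_pseudo_haploid(diploid, random.Random(0))  # seeded for repeatability
print(haploid)
```

A homozygous site such as TT always gives T, while a heterozygous site such as AC gives A or C depending on the draw, so the original diploid genotype can no longer be recovered.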


The result seems first to indicate a minority African-like influence (first column on the left) for this composite individual. Further, it appears to have a minority East Asian-like influence (last column on the right) somewhat similar to that of the Finns. The coloring against the Europeans seems to indicate that the Finns are closest (lower right blue box), the Iberians second (up and to the left of the Finns) and the Brits third (upper left blue box). The relationship to the Tuscans (down and to the left of the Brits) appears to be the most distant among the 1000 Genomes European populations.

The worldwide PCA seems to indicate the same as above. The La Braña individual is seen here as the green dot a little outside the European cluster, pulling toward the Africans to the left and upwards toward the East Asians. The plot is similar to the one in the original research paper, but in the paper the Finns (red) were more contracted and La Braña was alone in the space towards the East Asians.

The ChunkLength heatmap and PCA appear not to provide any useful information (weird clustering on the trees and PCA) because of the very low linkage (c=0.05) between the markers used.

I also ran the Chromopainter-Finestructure unlinked model to check whether the linked model had managed to capture any linkage. As shown below, we see much the same here, meaning the genotype data are practically unlinked.
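One way to put a number on "much the same" would be to correlate the off-diagonal entries of the linked and unlinked coancestry matrices. The sketch below is only an illustration of that idea: the two small matrices are made-up values, not output from this run, and the function names are hypothetical.

```python
from math import sqrt

def off_diagonal(matrix):
    """Flatten a square matrix, skipping the diagonal (self-copying is not used)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical chunk count matrices for three individuals.
linked = [[0.0, 5.0, 3.0], [4.0, 0.0, 2.0], [1.0, 6.0, 0.0]]
unlinked = [[0.0, 5.5, 2.5], [4.5, 0.0, 2.0], [1.5, 6.0, 0.0]]

r = pearson(off_diagonal(linked), off_diagonal(unlinked))
print(round(r, 3))
```

A correlation close to 1 would support the reading that the linked model adds little beyond the unlinked one for this marker set.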





Finally, I also did a quick and dirty unsupervised and supervised ADMIXTURE run for this dataset. In the former there appears to be no minority admixture, although it detected some minority admixture in some of the Finns. In the latter, supervised run, which estimates only the ancestry of La Braña and takes the ancestry of all others as given, La Braña clusters with the Finns, as also seen in the Chromopainter-Finestructure analysis.

ADMIXTURE unsupervised K=3

ADMIXTURE supervised, taking the ancestry of all others as given (K=6).


ADMIXTURE supervised, taking British, East Asians and Africans as given (K=3).




Sunday 28 April 2013

Updated Europe Analysis

The updated Europe analysis shows much the same as before. The addition of Germans, Danes and Poles appears to fill more of the gaps in the PCA plots. The addition of Greeks and Albanians has also created a small group in the PCA that seems to place the Balkans on the PCA map in the vicinity of the Romanians and Bulgarians.

In this run the Scandinavians do not switch between the eastern and western branches when comparing the CC and CL runs, as they did in the World Analysis. They do, however, branch out at the highest level of the Western European tree, likely due to Saami and/or Finnish-like influence as seen on the heatmaps. On the other hand, we can see that among the European populations the Germans, including diaspora individuals of part German ancestry, are the Scandinavians' closest relatives in continental Europe.

It's also interesting to see that, according to the color scale, the Scandinavian cluster and the German cluster appear to have very little asymmetry between them, suggesting a close genealogy between them. Against the Saami and Finns, however, the asymmetry appears to be greater, except for individuals of mixed Scandinavian and Saami/Finn background.

The asymmetry occurs because Scandinavians find their closest haplotype neighbours among Saami and Finns, whereas Saami and Finns find their closest haplotype neighbours to a lesser extent among Scandinavians and more among themselves. This does not appear to be the case for Scandinavians versus Germans, where both find their closest haplotypes in each other to the same extent.

On both CC and CL, Saami and Finns form their own branch, separating them from both the western and eastern European branches. On the CL heatmap the closest relatives of the Finns and Saami appear to be the Vologda Russians. The asymmetry does not seem to be great, and this is what makes the Vologda Russians branch out earliest from the East European branch.
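The asymmetry described here can be sketched as the difference between what a group receives from another group and what it donates back. The Python below is a minimal illustration under two assumptions of mine: rows of the matrix are recipients and columns are donors, and the numbers are invented for the example rather than taken from the actual heatmaps.

```python
def asymmetry(coancestry):
    """Return A where A[i][j] = received by i from j minus received by j from i."""
    n = len(coancestry)
    return [[coancestry[i][j] - coancestry[j][i] for j in range(n)]
            for i in range(n)]

labels = ["Scandinavian", "Finn", "German"]

# Hypothetical aggregated chunk lengths: rows are recipients, columns are donors.
chunks = [
    [0.0, 30.0, 25.0],  # Scandinavians receive 30 from Finns, 25 from Germans
    [20.0, 0.0, 10.0],  # Finns receive 20 from Scandinavians, 10 from Germans
    [25.0, 12.0, 0.0],  # Germans receive 25 from Scandinavians, 12 from Finns
]

asym = asymmetry(chunks)
for label, row in zip(labels, asym):
    print(label, row)
```

In this toy example the Scandinavian-Finn pair has an asymmetry of 10 (30 received against 20 given back), while the Scandinavian-German pair has an asymmetry of 0, which is the pattern the paragraph above describes.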


CL Euro Aggregated

CL Euro Raw

CL Euro PCA D1-D2

CC Euro Aggregated

CC Euro Raw

CC Euro PCA D1-D2

Wednesday 24 April 2013

Updated World Analysis

The updated world analysis shows much the same as before. The tree structure differs between the number of chunks shared (CC) and the total chunk length shared (CL). In the latter (CL), Scandinavians cluster with the western European populations, but in the former (CC) they cluster with the eastern populations.

The heatmaps for both measures appear to show a roughly equal relationship to western and eastern populations, suggesting that Scandinavians are intermediate between these two major branches, in addition to having more Caucasus and Middle East ancestry than Finns and Saami. In addition, Saami and Finns appear to show a relationship to Vologda Russians and Russians that Scandinavians have less of. Lithuanians appear to be closer to Scandinavians than to Finns and Saami in general.

CL Aggregated

CL Raw

CC Aggregated

CC Raw

Tuesday 23 April 2013

Updated local Fennoscandian analysis


New individuals added include a Dane, who appears to cluster with the Scandinavians. This time BEAGLE was not run with default settings but with settings that increase phasing accuracy, and this seems to have changed the clustering for some individuals, such as FI2, who now appears to cluster with Finns who appear to have part Saami minority ancestry. As before, we can differentiate between the different groups. The images below are available in full size at the source and can be viewed by right-clicking and selecting "open image" (works in Windows 7 and 8), or downloaded by saving the link.





CL Aggregated


CL Raw


CL Individual PCA


CC Aggregated


CC Raw


CC Individual PCA

Monday 1 April 2013

Possible application of asymmetries in Chromopainter

For some time I have tried to understand more about why asymmetries occur between received and donated chunks in Chromopainter. From an earlier post I came to understand that large asymmetries between them mean the populations are mutationally more distant, and hence more distantly related, and the opposite if the asymmetries are smaller.

In essence, as I understand it, the more closely related the segments are to each other, the more symmetric they will be. In a world with no mutations the segments would be identical, as in an IBD (identical by descent) analysis, and divided only by recombination, so there would be no asymmetry.

Motivated by a post on another blog called "No Mongolian admixture in Poland", I tried to see whether these asymmetries could provide any information, not only about Poles but about other European populations as well.

First let us look at the CL asymmetries in Europe. The central position of the Poles shows that their asymmetrical distances are small, with only minor deviations towards Finns and Saami on one side and towards French and Romanians on the other.



We can then look at a number of European populations versus Siberian, East Asian and Native American populations (please note that the color scales are identical for all):





The tables show in general that Native Americans and Siberians appear to have smaller asymmetries to Europeans than East Asians do: they appear green, while East Asians, who have larger asymmetries, appear more in red.

However, among the European populations there are different asymmetries towards these eastern populations. Continental European populations such as Poles, French, Hungarians and Chuvash appear to have similar, less asymmetric profiles, while the northern populations (Finns, Saami, Scandinavians and Vologda Russians) have similar, more asymmetric profiles. The Romanians also have a profile similar to the continental Europeans, but with even smaller asymmetries towards the eastern populations.

This must be said to be surprising, as for example Saami and Vologda Russians in particular, among the northern European populations, appear in absolute CL terms to have the largest CL shared with these eastern populations.

This may mean that even though the sharing with these eastern populations is large, it is not necessarily closer in genealogy than for those who share less CL.

As a side note, looking back at earlier posts, the above may explain the positions of some Siberians versus Europeans on MDS plots: not close to Saami or Finns but at the same position on the Y-axis as French and Romanians, suggesting similar genetic variation on that dimension.

Little Study of the Saami, Finns and Scandinavians

Tuesday 5 February 2013

Updated Europe analysis

New updated Europe regional analysis. I have removed the Orcadians as they give odd clusterings in the PCA because of relatedness, but one British group that clustered with them in earlier analyses works fine as a stand-in. In the regional analysis more substructure is revealed. The Caucasus is not included.

As we can see, for the Fennoscandians much is as before, but the addition of more continental Europeans has revealed more structure. We see that the 2 new Germans have pulled the diaspora individuals Ux, who have at least partly German and English ancestry, into one group close to one of the British groups, also containing a few French and one British.

The 1 new Austrian appears to cluster closest with the Romanians. We see 3 of the 4 new Poles cluster with the Ukrainian and Belarusian cluster, while the last clusters with the Lithuanians. Most of the new Russians appear to cluster together with the 2 Estonians already in the project. However, note that the Estonians appear to pull away from the Russian cluster on the main PCA D1-D2 plot.


CC Europe Aggregated
CC Euro Populations

CC Euro Raw

(PCA plots in order D1-Dx+)




Friday 1 February 2013

Updated World Analysis

This time I reduced the sample size for Basques, Sardinians, Italians, Bulgarians and Spanish to one each because of capacity problems, as I would like to limit the analysis process to two threads to reduce the work of adding and formatting input files. However, they didn't cluster on their own as I expected, unlike the more distant populations. At the same time I increased the number of individuals from geographically closer populations to make sure the sample size is large enough to give good phasing.

What is new this time is that we have new samples from Germany, Austria, Poland and Russia. In the world analysis, as said before, it's more difficult to differentiate similar European populations, but small sample sizes also play a role here, as Spanish, Iberians, Italians and French Basques didn't separate into separate clusters as in the previous analysis. This analysis will be presented in the next post.

Note that Romanian7 and Romanian3 appear to be Roma people, as they cluster partly with different tribes in Pakistan.


CC World Aggregated

CC World Raw

CL World Aggregated

CL World Raw