Monday, 10 March 2014

La Braña 1 closest to Basques, Sardinians and Lithuanians in genealogy

This is the second look at the recently released La Braña 1 diploid genome. The previous analysis was done using the Plink MDS tool and gave a limited overview. This time I am using first the Chromopainter-Finestructure unlinked model and then the Chromopainter-Finestructure linked model.

Unlinked model

The unlinked model appears to give a result consistent with the previous analysis, with La Braña 1 in the same heatmap box as Finns and Saamis, especially what appear to be East Finns and North Saamis. However, also notice that La Braña 1 shows affiliation in the heatmap to Basques, Lithuanians and a mixed group of Scandinavians and Finns. This affiliation is interesting, as Saamis and Finns in particular do not show much sharing with Lithuanians and Basques. It may be that La Braña 1 is "ancestral" to these modern populations, but that over the generations spanning 7000 years the populations have drifted and/or mixed with other populations.

ChunkCount 54k unlinked aggregated

The PCA plot generated from Finestructure looks slightly different from the Plink MDS plot in the previous analysis (this may be due to pruning). In dimension 3 (horizontal) La Braña 1 still clusters with Finns and Saamis, but in dimension 2 (vertical) La Braña 1 is at the same height as the Basques.

 ChunkCount 54k unlinked PCA 

Linked model

As I understand from the authors, the unlinked and linked models may give different time depths, with the unlinked model showing the most ancient time depth and the linked model more recent ancestry. This analysis may confirm this statement. The difference between the linked and unlinked models may be the "masking" effect of recombination, which may hide the ancient relationships seen in the unlinked model. This is to my knowledge the first haplotype-based analysis of La Braña 1, as La Braña 1 was phased.
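To make the idea of chunks concrete, here is a toy Python sketch (not the Chromopainter algorithm itself, which infers the painting with a hidden Markov model). It assumes we already have a made-up painting that names the donor population copied at each consecutive SNP, and it summarises how many contiguous chunks each donor contributes and how long they are on average. The donor labels, the painting and the function name `summarise_chunks` are all hypothetical.

```python
from itertools import groupby


def summarise_chunks(donor_per_snp):
    """Return {donor: (chunk_count, mean_chunk_length_in_snps)}.

    A chunk is a run of consecutive SNPs copied from the same donor.
    """
    runs = {}
    for donor, run in groupby(donor_per_snp):
        runs.setdefault(donor, []).append(len(list(run)))
    return {d: (len(lengths), sum(lengths) / len(lengths))
            for d, lengths in runs.items()}


# Hypothetical painting of ten consecutive SNPs on one haplotype.
painting = ["Basque"] * 4 + ["Saami"] + ["Basque"] * 3 + ["Saami", "Finn"]
summary = summarise_chunks(painting)

for donor, (count, mean_len) in summary.items():
    print(f"{donor}: {count} chunks, mean length {mean_len:.1f} SNPs")
```

In this toy painting the Basque and Saami donors contribute the same number of chunks, but the Basque chunks are longer. Longer shared runs are generally taken to point to more recent common ancestry, since recombination has had less time to break them up.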

As we can see, the linked analysis shows a very different clustering of La Braña 1. He is in the same heatmap box as Western Europeans, and in particular Basques and Sardinians. La Braña 1 also shows strong "heat" towards the Lithuanians, but the strong affiliation to Finns and Saamis we saw earlier is much weaker in the linked model, though still quite easy to see.

 ChunkCount 54k aggregated linked

We can also see in the heatmap that there is strong divergence between haplotypes painted and haplotypes donated for La Braña 1. La Braña 1 finds related haplotypes among the modern populations, but the modern populations find more closely related haplotypes among themselves. This suggests that La Braña 1 has haplotype variation that no longer exists in the modern populations, and also that recombination in the modern populations would make the La Braña 1 haploblocks very small.
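The painted versus donated asymmetry can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chunk count matrix is stored as a nested dict where `chunkcounts[r][d]` is the number of chunks recipient `r` copies from donor `d`, and all numbers below are invented, not taken from the actual run.

```python
def asymmetry(chunkcounts, individual):
    """For each other population, return (chunks `individual` receives from it)
    minus (chunks it receives from `individual`)."""
    received = chunkcounts[individual]
    donated = {r: row.get(individual, 0.0)
               for r, row in chunkcounts.items() if r != individual}
    return {other: received.get(other, 0.0) - donated[other]
            for other in donated}


# Invented toy matrix: rows are recipients, columns are donors.
toy = {
    "LaBrana1": {"Basque": 30.0, "Saami": 20.0},
    "Basque": {"LaBrana1": 5.0, "Saami": 45.0},
    "Saami": {"LaBrana1": 4.0, "Basque": 46.0},
}

diff = asymmetry(toy, "LaBrana1")
print(diff)
```

A large positive difference for every population is the pattern described above: the ancient individual copies from the moderns, while the moderns mostly copy from each other.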

The PCA for the linked model shows a different clustering from the unlinked model. This time La Braña 1 is at the center of the plot, between the Scandinavians and Vologda Russians.

 ChunkCount 54k linked PCA


As we can see, the different models appear to give different results. This is because in genealogy La Braña 1 is closest to Basques, Sardinians and Western Europeans, but in ancient ancestry closest to Finns and Saamis. The clustering in genealogy with Basques, Sardinians and Western Europeans in the linked model makes sense, as the La Braña 1 individual was found in what is today northern Spain. This means that La Braña 1-like haplotypes are still very much present among Western Europeans of today, suggesting continuity in the autosomes.

As a side note, also notice the change of clustering for the Saamis, Mordovians and Vologda Russians between the unlinked and linked models. In the linked model Saamis and Finns cluster together and the Vologda Russians and Mordovians cluster with Eastern Europeans, but in the unlinked model Finns are separate from the Saamis, while the Saamis cluster with Mordovians and Vologda Russians. Also, in the unlinked model the Mordovians and Vologda Russians are part of a greater cluster that also includes Saamis and Finns.

  1. The affiliation to North Saamis especially seems to change a lot between unlinked and linked mode, from what looks to be the closest to Braña to one of the most distant, which may indicate how much drift there has been since.

    Does dimension 1 in the PCAs reveal anything?

    1. I think it's the "mask" effect of recombination. In the linked mode it looks for the closest haplotype in terms of distance in both mutations and recombination, and the linked mode doesn't find these among the Saamis but among the Basques.

  2. Interesting results! Would it be possible to get PCAs with individual codes? It is difficult to distinguish the different groups with those colored dots.

  3. By the way, could you check La Braña against Siberians, East Asians and Amerindians?

    This f3 test suggests he is in line with most Europeans in Karitiana affinity but more shifted towards Han (between Adygei and North Russians; the populations are from HGDP) than most of them.

  4. Hi Anders,

    If possible, could you try this with MA-1, using only Asian (Middle Eastern, Caucasian, Central Asian, South Asian, East Asian, Southeast Asian, Northeast Asian-Siberian, and Oceanian) populations. It seems ANE peaks in Asia, rather than Europe, so it would be very interesting to see him analyzed in an Asian context.

    1. I plan to use MA-1 in the next ordinary run, plus an imputed La Braña 1 genome to match the remaining 289k SNPs in the standard run.

    2. Hi Anders,


      If possible, could I email you my raw-data? I'm not European, but from South Central Asia, so I was wondering if you'd be willing to use my raw-data? I'd truly appreciate this. Thank you in advance.

  5. Anders,
    Is it from the segment size that you estimate the time depth in unlinked and linked form, small segment = older, large segment = younger connection?

    1. No, it's from chunkcounts alone. There is a very high correlation between chunkcount and chunklength anyway, but it might be worth checking out.

  6. The Chromopainter/FineSTRUCTURE analysis Lazaridis&co did on Loschbour and Stuttgart aligns with these results.

    Loschbour joins a Northeast European group that contains three subgroups, one group with Lithuanians, Estonians, some Finns and Belorussians, one group with all Vologda Russians and Mordovians and some Finns, and one group with all Ukrainians and some Belorussians (and single individuals from Czech and Finland).

    1. As I understand it the Chromopainter-Finestructure analysis used there was with the linked model. In the case of La Braña 1 the clustering is different between the linked and unlinked model.

    2. True, it will have to wait until Loschbour genome is released to see if different modes produce different results.

      The study also contains additional evidence for different sources of "easterness" in Finns, Mordovians and Vologda Russians, in addition to the ALDER run where the latter two showed a Han admixture signal that was missing from Finns. They compared strongest two-way admixture signals for different populations from fits involving at least one ancient genome and from fits involving only modern genomes. For Finns, Lithuanians, Estonians and Icelandic the strongest ancient signal came from "Loschbour and Abkhazian" fit, and the strongest modern signal for all of them was "Sardinian and Karitiana". The ancient signal was clearly stronger for them all.

      The difference between signal strengths would probably have been even bigger for Finns if the two individuals clustering with Mordovians and Vologda Russians in their FINESTRUCTURE run had been removed from the sample (which had 7 individuals in total).

      For Mordovians the strength of ancient signal (also Loschbour + Abkhazian) and modern signals (Sardinian + Surui) were almost the same. For Vologda Russians, who all clustered with Mordovians in their FINESTRUCTURE run, the strongest ancient signal was "Loschbour + Chukchi", and the modern signal "Sardinian + Chukchi" was clearly stronger for them than the ancient signal. Unfortunately they didn't test the single Saami sample they had.

    3. ????

      page 121

      "using the Han as a reference and found a significant
      (Z>3) curve for three populations (for Finnish, Z=1.27, w"

    4. Yes, the three populations with a significant (Z>3) curve were Mordovians, Russians and Chuvash (table S14.14)

    5. Yes, but not missing.

      Page 121 Table S14.13

      "but the three populations violating our
      model (Table S14.13) are clearly to the right,
      sharing relatively more alleles with the Han"

      Finns are one of the populations sharing more alleles with the Han and therefore deviate from the main Europeans, who form a cline from Stuttgart to Estonia.

    6. Yes, but that can be increased by many things like a different and older type of gene flow, or including the fact that there are two individuals in the sample who cluster with Mordovians and Vologda Russians in heatmap at page 152, unlike others who cluster with Estonians or Lithuanians.

      There are also the f3 signal differences between strength and type for Finns and Russians/Mordovians, page 81 and 82:

      "In a case where admixture was previously detected, e.g.,
      the French for the pairing (Sardinian, Karitiana), a much lower f3-statistic (Zdiff=4.3 standard errors)
      is produced by the pairing (Stuttgart, MA1) involving two ancient samples."

      A similar thing happens with Finns, for whom admixture is detected for the pairing (Sardinian, Karitiana), but a much lower f3-statistic (Zdiff=4.1 standard errors) is produced by the pairing (Abkhazian, Loschbour). This is the same as with Icelandic, Estonians or Lithuanians, but not for Russians:

      "The strongest decrease in the value of the f3-statistics when we including ancient genomes is observed
      for Europeans, for which Zdiff>3 except for three cases:
      (i) Ashkenazi Jews where the (Stuttgart, MA1) pairing produces Zdiff=2.3 lower statistic than the
      (Basque, Dinka) (the Dinka may reflect small recently gene flow from Africans4).
      (ii) Maltese where the (Stuttgart, MA1) pairing is Zdiff=2 lower than the similar (Basque, Esan)
      (iii) Russians where the (Loschbour, Chuckchi) pairing is Zdiff=2.7 lower than (Chuckchi, Sardinian)."

      The Mordovian signal is barely >3, and the Russian signal doesn't even involve Amerindians but Chukchis.

      Also, when compared to Amerindians and different types of East Asians including Han, Finns don't plot with Mordovians or Russians:

    7. The eastern influence in Finns and Russians is Native American and Han. Finns have more of the Native American one, which probably indicates an ancient far-northern gene flow along the coastline of the Barents Sea.

  7. It would be interesting to see how much of the Amerindian component Eskimos have.

    1. Eskimos (from East Greenland) are more like normal East Asians and Siberians, so Russians and Mordovians show more shift towards them. They are probably from a more recent migration than the mainstream Amerindian one.

      "If we zoom on the European panel we see as expected that especially Saamis, Mordovians and Vologda Russians pulls left toward the common East-Asian, Siberian and Native American dimension 2. Note that Finns doesn't show the same level of pull towards left as the Saamis, Mordovians and Vologda Russians."

      "However if looking at dimension 3 (vertical) we clearly see Saamis and Finns pulling towards the Native Americans dimension at about same level of intensity. It appears to be lacking among Vologda Russians and Mordovians who pull toward the common East-Asian, Siberian and Native American dimension 2 (horizontal). This seem to suggest (also noted by commentators of this blog from earlier posts) that there is different influences from the East in Europe."

  8. Anders, your blog is very informative and interesting, thanks for posting this stuff. I have one question, do you have any Eurogenes K13 results of Sami people?

    1. Thanks. No I have not run any Saami individuals vs the Eurogenes K13.

  9. My gedmatch says I have a lot of La Brana 1. Does this help?