
Saturday 26 May 2012

Fennoscandians and Neolithic Gotlanders

EDIT 30/08-2013: I now consider this analysis outdated. Please check the more recent posts with reanalysis of Ajv52, Ajv70, Ire8, Gok4 and Ste7.

At the end of April 2012 the genomes of ancient Gotland hunter-gatherers were published by Uppsala University. I will now provide an ADMIXTURE-based analysis of these samples using project and publicly published population panels.

The samples were downloaded from the authors' website and then subjected to most of the quality controls applied by the authors. I extracted the relevant SNPs for the Ajv samples and Ire8, ending up with 22k SNPs after combining the three individuals into a composite individual.

The composite sample was then subjected to control ADMIXTURE and MDS analyses using "haplotized" (phased in BEAGLE) French, Yoruba and Han as reference populations. The composite individual clustered entirely with the French.




I then merged the composite individual with a "haplotized" (phased in BEAGLE) European panel, with populations all from within the European continent and drawn from different publicly available population panels, and ran it through ADMIXTURE at K=3. Closely related individuals in the panel were removed. One Russian was removed because of odd clustering, reason unknown.

The result was then averaged over populations to get an overview:
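A minimal sketch of this averaging step, assuming the ADMIXTURE output has already been read into one row of K=3 proportions per individual, with a population label for each row. The function name, the labels `PopA`/`PopB` and the numbers are invented for illustration and are not taken from the actual run.

```python
from collections import defaultdict


def average_by_population(q_rows, labels):
    """Average per-individual ancestry proportions within each population."""
    sums = {}
    counts = defaultdict(int)
    for row, pop in zip(q_rows, labels):
        if pop not in sums:
            sums[pop] = [0.0] * len(row)
        # Add this individual's proportions to the running total of its population.
        sums[pop] = [s + v for s, v in zip(sums[pop], row)]
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}


# Hypothetical K=3 proportions (illustrative values only).
q_rows = [
    [0.80, 0.05, 0.15],
    [0.70, 0.10, 0.20],
    [0.20, 0.30, 0.50],
]
labels = ["PopA", "PopA", "PopB"]

pop_means = average_by_population(q_rows, labels)
for pop, means in pop_means.items():
    print(pop, [round(m, 3) for m in means])
```

Averaging hides the spread within a population, so a population mean can look intermediate even when its individuals fall into two distinct groups.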



As we can see, the individuals clustered into three main components: North-East Europe, Caucasus and the Mediterranean. The composite Neolithic hunter-gatherer from Gotland tops the list for the North-East Europe component, followed by the Saami and, very close behind, the Finns.

As we can see, the old Gotlanders appear to have significant Mediterranean influence but almost zero Caucasian influence. This may be due to admixture or to common ancestry with the Mediterranean populations. The Caucasian influence appears to be almost as low as in the Basques. This may suggest that the Caucasian influence arrived later than the Mediterranean one.

In modern individuals you may see that Scandinavians, British, Lithuanians, Estonians and Finns appear less affected by Caucasian influence than continental European populations like Chuvash, Vologda Russians, Belorussians, Hungarians, French, Spanish, Romanians, Italians and Sardinians. Among the Saami we see an elevated Caucasian influence. It is possibly linked somehow to the Chuvash. See posts below.

The Mediterranean influence appears to have been smaller in the North-East, among Chuvash, Saami, Finns, Vologda Russians and Estonians. Note the somewhat higher rate in the old Gotlander. All this suggests that the Mediterranean influence arrived second, after the initial post-glacial migration, and was followed later by a migration from the Caucasus. That later migration from the Caucasus did not reach the Basques. There may be support for this suggestion in the paper by Huyghe 2010:


The Caucasian migration appears to have reached the Saami in particular from the vicinity of the Chuvash and then through the Vologda Russians. See the Chromopainter analysis below.

SUMMARY: It appears that Finns and Saami are the best match among modern populations to these ancient Gotland hunter-gatherers. Note, however, that this analysis does not include enough Saami to provide proper clustering in ADMIXTURE or MDS. This means that Finns and Saami have quite often been lumped together in this kind of analysis.

(Updated 31/5/2012)


Thursday 10 May 2012

Scandinavians - Young East, Old West?

(Updated 7th June 2012)

This time I will look more into the details of the Scandinavians, based on the main post with the latest large Chromopainter run (see two posts below).

In the last post we could infer something about the genetic history of Finns. Here I will try to look into the genetic data of our Scandinavian participants, together with Saami and Lithuanians as reference.

Please note that I don't have genealogical info for everyone, so the analysis will contain a certain degree of uncertainty.

Proportions:

First note that the bars and colors for the Koryak in this presentation are made independently of those for the other populations. The score for proportions is calculated by simply multiplying the proportions for each population. The rest of the tables use averages.
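The multiplication behind the proportions score can be sketched as below. This is only an illustration under the assumption that each individual has one proportion per donor population and that these sum to one; the function name and the two example individuals are hypothetical.

```python
import math


def proportion_score(proportions):
    """Multiply one individual's donor proportions into a single score."""
    return math.prod(proportions)


# Two hypothetical individuals with ten donor populations each.
even = [0.10] * 10               # ancestry spread evenly over all donors
skewed = [0.55] + [0.05] * 9     # ancestry concentrated in one donor

print(proportion_score(even))
print(proportion_score(skewed))
```

For proportions that sum to one, the product is largest when all proportions are equal, so a low score points to an individual whose proportions are concentrated in a few donor populations.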

The proportion scores for Scandinavians appear to show little variation, and the proportions to all donor populations appear similar. The only differences we appear to see are for the Saami and Saami-mixed individuals, the Lithuanians, and Swedes who have partly Finnish origin. Our Lithuanian controls show the strongest affiliation with the Lithuanian donor population.


Number of Shared Segments (ChunkCounts)

Same comment as above.



Total Shared (ChunkLength)

Same comment as above. Note that the average sums to 707 for all.


Shared Segment Size:

The segment size is the total length of shared segments (ChunkLength) divided by the number of shared segments (ChunkCount), after correcting for proportions.
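As a rough illustration of the basic ratio, the sketch below divides a total shared length by a number of shared segments. The correction for proportions is not shown, and the function name and the numbers are made up for the example.

```python
def segment_size(chunk_length, chunk_count):
    """Average shared segment size: total shared length per shared segment."""
    if chunk_count <= 0:
        raise ValueError("chunk_count must be positive")
    return chunk_length / chunk_count


# Two hypothetical recipients with the same total shared length to one donor.
few_long = segment_size(chunk_length=80.0, chunk_count=100.0)
many_short = segment_size(chunk_length=80.0, chunk_count=160.0)

print(few_long, many_short)
```

With the same total length, the recipient with fewer segments has larger segments on average, which is the pattern read here as a more recent connection.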

Here we see our Lithuanian controls dominate the top, with the largest segments to Lithuanians and Belorussians, while the Saami are in the lowest part of the table with the smallest segments, except for larger segments to the Koryak and Chuvash.
We see in general that French, British and Lithuanians appear closest to the Swedes, as they have the largest segment sizes. The most distant appear to be Chuvash, Italians and Romanians. The same holds for Norwegians.

We also see that, after the Lithuanians and Estonians, it is mostly Norwegians who in general appear to have the smaller segment sizes to the continental European donor populations. In the upper half of the table Swedes dominate, until we reach the Saami at the top of the table.

This may suggest that many Swedes are in general closer to the continental Europeans than Norwegians are.

Mutation densities or counted mutations divided by proportion:

The counts of related but not identical haplotypes, corrected for proportions, show as expected the Lithuanian controls at the bottom of the table, with the least general mutation divergence to the continental European populations, especially to Lithuanians but also to Belorussians and Hungarians.

In general we see that Swedes appear to have the least divergence to the Belorussians, Lithuanians and British. Their largest divergence appears to be to Chuvash, Romanians and Hungarians. The Norwegians appear to be least divergent to Hungarians, Belorussians and Lithuanians. Their largest divergence is to the Chuvash, Romanians and Italians. In total Norwegians appear to have somewhat larger divergence than Swedes. That is consistent with Norwegians having smaller segment sizes.

So it appears from the general inference that Swedes and Norwegians have different divergence influences when it comes to mutations. That Norwegians are mutationally closest to Hungarians, while Hungarians are at the same time more distant to Swedes, is a bit surprising, but the difference is not great and may be within the margin of error.

We see the Saami distributed mostly in the uppermost part of the table, while the Swedes and Norwegians appear to be distributed down to the lower half. In the lowest part we find the Lithuanian controls. At the top we see U1, who has genealogically confirmed half Scandinavian and half British Isles origin, plus some minor non-European admixture. The origin of NO14 is likewise half Fennoscandian and half Belgian/Dutch. This may for some reason explain their higher mutation density, which is similar to that of one of the most divergent Saami.

The close runners-up SWE8 and SWE19 have full or partial ancestry from central Sweden. This may suggest the existence of an "older Scandinavian" stock in central Sweden that has been less affected by later migrations from continental Europe.


Please note that the correlation between mutation density and segment size is -0.61. This means that, in general, when segment size increases mutation density tends to decrease, and vice versa.
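A correlation like this one can be computed as a plain Pearson coefficient over the individuals. The sketch below assumes one segment size and one mutation density per individual. The values are invented to show a negative relationship and are not the project data, so the result will not reproduce -0.61.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-individual values (illustrative only).
segment_sizes = [0.50, 0.55, 0.60, 0.70, 0.80]
mutation_density = [9.0, 9.5, 8.0, 7.5, 6.0]

r = pearson(segment_sizes, mutation_density)
print(round(r, 2))
```

A coefficient around -0.6 describes a tendency rather than a rule: individual participants can still combine a high mutation density with an average segment size.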


Summary:

Scandinavians appear to have received different influences at different times. The average segment size also seems to suggest that part of the Swedish population diverged later from the continental populations than Norwegians did. There may also exist Scandinavian "isolates", as seen through the high mutation counts after correcting for proportions, especially for some Swedes, even though the average mutation density appears similar to that of Lithuanians.

However, this interpretation must be taken with caution, as these individuals do not appear as consistent as the Saami in their placement in the different tables. As an example, SWE8, who appears as a close runner-up to SA2 in mutation density, appears average when it comes to segment size and ChunkCount density. SA2, however, does appear consistent in all these categories, with small segment size, high ChunkCount density and high mutation density, all consistently suggesting a large divergence time.






Friday 4 May 2012

Differentiating Finns - Eastern and Western Histories?

(Updated 7th June 2012.)

This time I will look more into the details of the previous post with the large Chromopainter run.

While generating the data and writing the previous post I noticed there was a certain variation among Finns. Here I will try to look into the genetic data of our Finnish participants, together with Saami, Estonians and Lithuanians as reference.

Please note that I don't have genealogical info for everyone, so the analysis will contain a certain degree of uncertainty.


Proportions:

Let's first look at the proportions. First note that the bars and colors for the Koryak in this presentation are made independently of those for the other populations. The score is calculated by simply multiplying the proportions for each population. Later, averages will be used.



The score basically appears as the reverse of the Koryak and Chuvash. We clearly see a gradient from the Saami to the Estonians and Lithuanians. Lithuanians and Estonians show, of course, the strongest similarity to the Lithuanian and Belorussian donor populations, while the Vologda Russian influence appears to be similarly distributed among Finns and even among the Saami. Chuvash and Koryak seem to be the main reason for the division of Finns here.

Number of Shared Segments or ChunkCounts

Same comments as above.



Total Shared or ChunkLength

Same comment as above.




Segment size:

The segment size is the total shared length divided by the number of shared segments, corrected for proportion. We still see the Saami and Lithuanians at opposite ends.



As we can see here from the occurrence of green, Finns appear to share larger segment sizes with Vologda Russians, Lithuanians and Belorussians, some Finns even with some Chuvash influence. On the other hand, Finns have the smallest segments to Italians, Romanians and Hungarians.


Mutation densities or counted mutations divided by proportion:

The mutation densities again show the Saami and Estonians at opposite ends of the scale.

As we can see here, Lithuanians appear to have the most red and yellow compared with Finns. Not until we reach the Saami does it turn green. Then we have the Hungarians and British. The most distant appear to be the Chuvash, Romanians and French.

Conclusion:

The data seem to suggest different influences on the Finnish populations that divide them genetically. The different segment sizes and mutation densities suggest that the influences arrived at different times, from the same or different sources.

Thursday 3 May 2012

Geneflows in Fennoscandia

After some trial and error I may have arrived at a useful, and maybe even powerful, way to analyze the project participants' genetic history using the Chromopainter software (see the earlier posting using the Chromopainter and Finestructure software). The analysis provides in-depth insight that cannot be shown using software like ADMIXTURE and MDS/PCA, which simply compare allele frequencies from genotypes at SNPs assumed to be independent.

In this analysis I will show that ADMIXTURE and MDS/PCA analyses are not directly wrong, but that these programs omit interesting genetic histories.

Technical info:

289k SNPs from 22 chromosomes. Genotypes phased to haplotypes in BEAGLE. A set of 70 Fennoscandian recipients run through Chromopainter with 10 donor populations of 8 individuals each. HapMap recombination used. Chromopainter run in donor mode with 10 iterations.

INITIAL ANALYSIS:

Proportions:

The highlights of the proportions are that the Swedes show the highest proportions to the French and British. This is in accordance with the earlier MDS plot analysis. They also show the lowest proportions to the Koryak and the Chuvash.

The Saami show the highest proportions to the Vologda Russians and the Chuvash, and the lowest proportions to the Koryak and the Romanians. Note, however, that the Saami have the highest proportion to the Koryak of any group in the panel.

The Norwegians have, like the Swedes, the highest proportions to the French and British, and the lowest proportions to the Koryak and the Chuvash.

The four Lithuanian project participants of course show the strongest relationship to the Lithuanian donor population, with Belorussians and Vologda Russians as runners-up. We will see that these Lithuanians will be very useful as controls.

The Finns, like the Saami, also have Vologda Russians at the top of their proportion list, but have Lithuanians instead of Chuvash in second place. The Finns have Koryaks and Italians at the bottom of their list.

Estonians appear closest to the Lithuanians and the Belorussians. Like the Finns, they have the Koryaks and the Italians at the bottom of their list.


So it is clear from proportions alone that the Swedes and Norwegians appear very similar to each other when it comes to sources of influence, and that Finns and Saami have different influences both from the Scandinavians and from each other. The positions of these groups on the earlier MDS plot support this conclusion.

To emphasise these differences more strongly I have ranked the recipient individuals according to influence from 1 to 70 (70 being the total number of project participants). Here a small number means lower influence and a high number means higher influence compared with other Fennoscandians. Note again the differences between Scandinavians (Norwegians, Swedes) and Finns, Lithuanians and Estonians. The Saami in particular appear to rank low to most other populations in the panel.
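The ranking can be sketched as follows for a single donor population, assuming one proportion per recipient. The function name and the four example values are hypothetical, and ties are broken by order of appearance, which is an assumption and not necessarily how the original tables were built.

```python
def rank_ascending(values):
    """Rank values from 1 (lowest influence) to len(values) (highest)."""
    # Indices sorted by value; Python's sort is stable, so ties keep input order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


# Hypothetical proportions of four recipients to one donor population.
proportions = [0.12, 0.05, 0.30, 0.08]
print(rank_ascending(proportions))
```

Ranks discard the size of the gaps between individuals, so they make an ordering easier to read without saying how large the differences are.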


The next tables support the descriptions given above.

Number of shared segments (ChunkCounts):


Please note that a higher number of shared segments usually means a closer relationship, while a lower number of segments usually means a more distant relationship.

However, it is possible that a lower number of shared segments also reflects a more recent shared history. To check this possibility you also have to take the total shared length into consideration and calculate the actual shared segment size. If the segment size is large, the history is more recent. See the next two tables.


Total length of shared segments (ChunkLength):


Please note that a higher total length of shared segments usually means a closer relationship, while a lower total length usually means a more distant relationship.

However, it is possible that the total length of shared segments is more fragmented, that is, made up of a higher number of shared segments. This would mean that the shared segments are older. To check this possibility you also have to take the number of shared segments into consideration and calculate the actual shared segment size. See the tables above and below.

Segment size (ChunkLength/ChunkCount):


Please note that smaller segment sizes should usually mean older segments, while bigger segments should usually mean newer segments, because recombination breaks up segments over time.

As we can see here, Estonians, Lithuanians and Finns appear to have the largest segments versus the donor populations. The Lithuanian controls confirm this. The Scandinavians and the Saami appear to have the smallest segments. This suggests that Estonians, Lithuanians and Finns have a more recent connection to the continental donor populations.

Number of related but not identical haplotypes (MutProb):


Please note that a higher count of related but not identical haplotypes should usually mean a closer relationship, as it correlates with higher proportions and the other observations seen above. There may, however, be more to the data.

As we can see here, Estonians, Lithuanians and Finns appear to have on average the lowest number of counted mutations versus the donor populations. The Scandinavians and the Saami appear to have the largest number of counted mutations. This suggests that Estonians, Lithuanians and Finns have a more recent connection to the continental donor populations.

Proportions Correlations:

Correlations 70 participants vs donor populations:


Positive correlations suggest complementary gene flows: if you are high on Koryak proportions you will likely also have a high Chuvash proportion. Negative correlations are the opposite: if you have high Chuvash proportions, it will be at the expense of your French proportions. The correlations suggest clear affiliations for Fennoscandians versus the donor populations.


LOOKING MORE IN-DEPTH INTO THE NUMBERS

As seen in the tables above, there is no doubt that there are influences from different directions on different groups in Fennoscandia. Could the data be manipulated in a way that lets us infer more about the genetic history of the influences on these groups?


In the last table we saw how the counted mutations for related but not identical haplotypes appear to be distributed very similarly to the observations in the other tables. There is obviously a high correlation. However, how well do these correlate with, for example, proportions? The correlation is not perfect. Is it possible that the counted mutations are denser per proportion for some than for others, and is it possible to infer something about history from this?

Let's try it. We simply take the number of counted mutations for related but not identical haplotypes and divide it by the observed proportions.

A simple example to illustrate what I am trying to do:

Ind 1: Counted mutations 10. Proportion 100% or 1. Density = 10/1 = 10
Ind 2: Counted mutations 10. Proportion 50% or 0.5. Density = 10/0.5 = 20

As we can see here, even though the number of counted mutations for related but not identical haplotypes is the same for Ind 1 and Ind 2, Ind 2 actually has a 100% higher density of mutations than Ind 1. I call this correcting for proportion.
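The same arithmetic written as a small function, using the two individuals from the example above. The function name is only illustrative.

```python
def mutation_density(counted_mutations, proportion):
    """Counted mutations per unit of proportion ("correcting for proportion")."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in the interval (0, 1]")
    return counted_mutations / proportion


ind1 = mutation_density(10, 1.0)   # 10 mutations over a 100% proportion
ind2 = mutation_density(10, 0.5)   # 10 mutations over a 50% proportion

# Relative difference of Ind 2 against Ind 1, as a fraction.
relative_increase = (ind2 - ind1) / ind1

print(ind1, ind2, relative_increase)
```

Because the proportion is in the denominator, very small proportions inflate the density strongly, so densities for donors with near-zero proportions should be read with extra care.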

Let's check it against the real data:


Let's first check the controls.

The Lithuanian participants have the lowest number of mutations for related but not identical haplotypes after correcting for proportions, even though they have the highest proportions and the highest counted mutations to the Lithuanian donor population before the correction. This is because the Lithuanians in our panel are the closest to the donor Lithuanians, which confirms that they have the least divergence time, that is, the least time to develop differentiating mutations from the donor population.

This means we can say something about divergence times for the proportions observed. It can be a powerful tool.

Overview:

At first we see that two groups stand out at the extremes. The Estonians appear as absolutely the closest in general to the continental donor populations, even to the Koryaks. Here it appears that Estonians are closest to Vologda Russians. This also makes sense when looking at the MDS plots published earlier. On the other side we have the Saami, who appear closest to the Belorussians but in general have the largest mutational distances to all the other donor populations, except the Koryaks, where the Norwegians beat the Saami by a few points.

This suggests that the Estonians have the shortest divergence time to the donor populations and the Saami the longest.

Discussion:

Let's first look at some observations:

Koryak: We see that even though the Saami have the highest proportions, they also have the second-highest divergence, after Norwegians and before Swedes. The lowest divergence is among Estonians, Finns and Lithuanians. This suggests that the Koryak influence among the Scandinavians and the Saami is older than what is found in the south-east of Fennoscandia and the Baltics.

Chuvash: Similarly to the above, we see that the Saami appear to have not only the largest Chuvash proportion but also the largest divergence from the Chuvash. The divergence appears similar for Scandinavians. South-east Fennoscandia and the Baltics appear to have had a more recent influx of Chuvash-like related haplotypes. Note that Finns reach the "yellow" color code to the Chuvash. This may indicate that Finns are in an intermediate zone. They may have partly more divergent haplotypes than populations further south, such as in Lithuania and Estonia.

Estonians: The Estonians appear to show the shortest divergence times to the Vologda Russians, Belorussians, Chuvash and Lithuanians. This makes sense, as Estonians are close neighbours of these populations or live in the wider region. Note that the Estonians appear as close to the Lithuanian donor population as the Lithuanian project participants do. This may be due to the margin of error (the exact numbers I do not know) or to a somewhat different history for the project Lithuanians versus the donor Lithuanians.

Finns: Finns appear to show somewhat less divergence to the Lithuanians and Hungarians. The first observation makes sense, as Lithuanians are geographically close. Why the Hungarians seem to be closer is a mystery, but the Lithuanians show a similar, though closer, divergence pattern for these populations.

Norwegians: The Norwegians seem to show the lowest divergence to Hungarians and Belorussians. This is also odd and a mystery, but it may have something to do with the high frequency of Y-chromosome R1a in Norway.

Swedes: The Swedes show the lowest divergence time to Belorussians and Lithuanians.

Saami: The Saami, who have the largest divergence time to almost all populations except Koryaks, appear to have a somewhat lower divergence time to Belorussians. This is a mystery, but it may be connected to the similarly lower divergence times for Scandinavians.

Summary observations: It appears that, in order of intensity of influence, Estonians, Lithuanians and Finns have the shortest divergence times to ALL the continental donor populations. On the other side we have the Swedes, Norwegians and the Saami with the longest divergence times.

Geneflows:

The data then obviously suggest that Lithuanians and Estonians are indeed continental populations. What is maybe more unexpected is that Finns, too, appear to have been strongly influenced by more recent immigration or gene flow from continental Europe. This inflow appears to have come through Lithuania and Estonia.

This recent inflow to the Finns even appears to be somewhat stronger than the inflow from continental Europe to Norway and Sweden. Please note that we do not have any donor population from Denmark, Germany or Poland. This may affect the result somewhat for Scandinavians. It may be that the Hungarian and Belorussian influence is indirect, acting as a proxy for Germany or Poland. However, it is obviously intriguing that Finns in general appear to show lower divergence to continental European populations, while Scandinavians seem to show this somewhat less.

The Saami appear to have received the least gene flow, at least more recently, and have therefore kept an older divergence time. The generally higher density of mutations towards almost all populations suggests that the Saami mutations are largely of European origin. If we remove the Koryak and even the Chuvash from the average counted mutations, the Saami still hold the position as the most divergent population (see table). Why the Saami appear to show the lowest divergence time to Belorussians and Italians is a mystery.

It is also interesting to see the intermediate positions Scandinavians have when it comes to divergence times. They almost look like the Saami minus what we observe further south.

This is how I see it currently. I cannot guarantee that what is published here is absolutely correct. It may be wrong. Please take this into consideration when reading this post.

END.

NOTE! Please be aware that the observed connections may be proxies for populations not included in the panel. Having Chuvash scores doesn't mean you are Chuvash. It means that you have something that resembles Chuvash.