In the previous analyses on this blog we used autosomal genotypes, not autosomal haplotypes. A genotype is an unordered collection of genetic data. The raw data file you get from 23andme or FamilyFinder is a collection of genotype data. The file is sorted by physical position on the chromosomes, but the order of the two alleles (A, C, G or T) reported at each SNP is random or simply alphabetical.
You do not inherit your autosomes in random or alphabetical order. You actually receive your autosomes from each of your parents as smaller or larger haplotype segments. A haplotype is basically your autosomal genotype split in two, where each half usually comes from one of your parents. To turn an autosomal genotype into autosomal haplotypes, the genotype must go through a procedure called phasing, in which segments or blocks of haplotypes are reconstructed. These segments or blocks consist of a range of SNPs that lie close to each other, and because they are close to each other they are unlikely to be split by recombination in any one generation.
You probably share these haplotypes with many fellow Fennoscandians and Europeans. However, you likely share more haplotypes with people or populations that belong to the same ethnicity, are close in geography, or are close in history through known historical or prehistoric events. Segment length follows a similar pattern: you share longer segments with such people or populations than with others. Finally, you likely also share more mutations with these populations than with others.
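To show why phasing is needed at all, here is a small Python sketch. It is my own toy illustration, not part of Chromopainter or of any phasing program, and the four genotypes in it are made up. It lists every pair of haplotypes that is consistent with the same unordered genotype data. Real phasing software has to pick the most likely pair, typically by comparing with haplotypes seen in other people.

```python
from itertools import product

def possible_phasings(genotypes):
    """Enumerate every pair of haplotypes consistent with unordered genotypes.

    genotypes: list of 2-character strings, one per SNP, e.g. ["AG", "CC"].
    Returns a sorted list of (haplotype_1, haplotype_2) string pairs.
    """
    # Only heterozygous SNPs are ambiguous; homozygous ones phase trivially.
    het = [i for i, g in enumerate(genotypes) if g[0] != g[1]]
    pairs = set()
    # At each heterozygous SNP either allele may sit on haplotype 1.
    for flips in product([False, True], repeat=len(het)):
        flip_at = dict(zip(het, flips))
        hap1, hap2 = [], []
        for i, g in enumerate(genotypes):
            a, b = (g[1], g[0]) if flip_at.get(i, False) else (g[0], g[1])
            hap1.append(a)
            hap2.append(b)
        # The two haplotypes have no inherent order, so store the pair sorted.
        pairs.add(tuple(sorted(("".join(hap1), "".join(hap2)))))
    return sorted(pairs)

# Hypothetical raw data for four neighbouring SNPs (made-up values).
raw = ["AG", "CC", "CT", "AG"]
for h1, h2 in possible_phasings(raw):
    print(h1, h2)
```

With three heterozygous SNPs there are already four possible haplotype pairs, and the number doubles with every additional heterozygous SNP, which is why the raw file alone cannot tell you which alleles were inherited together.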
NEW SOFTWARE CHROMOPAINTER AND FINESTRUCTURE:
There is a new software package, Chromopainter and Finestructure, that exploits autosomal haplotype information in genetic analysis. I have now done a preliminary analysis using chromosomes 1-6. The goal is to use all 22 autosomes. The software generates a type of plot that may be new to many of you, called a heatmap. These maps use colors to show the relationships between individuals and groups, with yellow being the most distant genetically and blue being the closest. Genetically similar people usually form groups based on their similarity. The program generates 3 basic heatmaps called the chunkcount map, the chunklength map and the mutation map.
RESULTS: CHUNKCOUNTS
The chunkcount map shows the number of haplotypes shared between individuals, including haplotypes that are not identical but related. It can tell more about common ancestry, possibly ancient, between groups. In Fennoscandia the program structures the participants into 5 main groups: 1) North-Saami except SA3 2) South-Saami with SA3 and SWE7 3) Finns 4) Swedes and Norwegians 5) Mixed groups consisting of mixed individuals of Scandinavian-Saami and Swedish-Finnish background and some Ostrobothnian Finns.
The tree shows that the North-Saami and Finns share a subbranch, while the South-Saami and the Mixed group similarly share a closely related subbranch. This subbranch is, at the next higher level, shared with the Lithuanians, Belorussians and the Vologda Russians. The Scandinavians, on the other hand, share a higher-level branch with the French, Italians, Hungarians and Romanians. Please note that the Scandinavians and the Saami/Finns split from each other at the highest level.
This map can also be presented with population labels, which can be useful when reading the PCA plot:
For individual assessment of relationships the pairwise plot should be used:
The identified populations can be presented in a PCA plot:
RESULTS: CHUNKLENGTHS
The chunklengths show the length of the shared segments counted in the chunkcounts above. Longer segments usually mean more recent common ancestry, while shorter segments mean the opposite. The tree structure is much the same as described for chunkcounts, but the South-Saami group has disappeared and been divided between the North-Saami and the Mixed group. There has also been some minor movement of Finns between the Finns and the Mixed group. The Finns also appear to have stronger internal sharing in segment length than in number of segments. Otherwise the heatmap appears much the same as for chunkcounts.
This map can also be presented with population labels, which can be useful when reading the PCA plot:
For individual assessment of relationships the pairwise plot should be used:
The identified populations can be presented in a PCA plot:
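To make the difference between chunk counts and chunk lengths concrete, here is a minimal Python sketch. It assumes a simplified "painting" in which each SNP of one person's haplotype is labelled with the single donor it was copied from. The donor labels A, B and C and the positions are made up, and as far as I understand the real program works with probabilities and expected values rather than one hard label per SNP, so this only illustrates the idea.

```python
from collections import defaultdict
from itertools import groupby

def summarise_painting(painting, positions):
    """Summarise one painted haplotype.

    painting:  donor label copied from at each SNP, e.g. ["A", "A", "B"].
    positions: position of each SNP (same length, increasing), in any unit.
    Returns (chunk_counts, chunk_lengths), both dicts keyed by donor label.
    """
    if len(painting) != len(positions):
        raise ValueError("painting and positions must have the same length")
    counts = defaultdict(int)
    lengths = defaultdict(int)
    index = 0
    # A chunk is a maximal run of consecutive SNPs copied from the same donor.
    for donor, run in groupby(painting):
        size = len(list(run))
        start, end = positions[index], positions[index + size - 1]
        counts[donor] += 1
        # Simplification: chunk length is last SNP minus first SNP in the run,
        # so a chunk covering a single SNP contributes zero length.
        lengths[donor] += end - start
        index += size
    return dict(counts), dict(lengths)

# Hypothetical painting of ten SNPs spaced 100 units apart (made-up values).
painting = ["A", "A", "A", "B", "B", "A", "C", "C", "C", "C"]
positions = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
counts, lengths = summarise_painting(painting, positions)
print("chunk counts: ", counts)
print("chunk lengths:", lengths)
```

In this toy example donor A contributes the most chunks but donor C contributes the longest one, which is the kind of difference that can make the chunkcount and chunklength heatmaps group people differently.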
RESULTS: MUTATIONS
The mutation matrix shows the number of SNPs that have mutated compared to other haplotypes. This map is not as clear-cut as the earlier ones and should strictly be considered preliminary; it probably needs more chromosomes added to get a better assignment and tree structure. In this plot most Scandinavians are spread out on a large branch consisting of the panel of continental Europeans like French, Italians, Hungarians, Romanians and Belorussians, and only a few are placed with Lithuanians and Vologda Russians. The North-Saami are all in one group here, while the South-Saami and Finns have dispersed into what appear to be two mixed groups.
This map can also be presented with population labels, which can be useful when reading the PCA plot:
For individual assessment of relationships the pairwise plot should be used:
The identified populations can be presented in a PCA plot:
PRELIMINARY END DISCUSSION:
FINNS AND THE SAAMI
The number of shared segments and the length of these segments appear extreme among the Saami and, to a lesser extent, among the Finns (something similar is seen with continentals like Lithuanians and Italians).
There can be several reasons for these higher numbers of shared segments and their greater length: 1) a founder effect giving a pool of similar haplotypes 2) genetic drift that kills off or fixes haplotypes 3) lack of gene inflow and/or outflow, reducing the diversity of haplotypes that can match with other populations 4) mutations found only within the group, reducing the matching of segments with other populations.
Given the large number of shared haplotypes and the great length of these haplotypes, you may be tempted to suggest that any outlier status among the Saami, and secondarily also the Finns, is due to a shallow founder effect followed by genetic drift. However, the mutation matrix may suggest, at least partly, a different history.
This brings us back to the mutation heatmap, where the Saami separate at a higher branch. Here it appears that the Saami have their own mutations and share them only to some extent with the Finns or the Mixed group. The mutations may in part explain the high internal sharing among the Saami and Finns. They have their own separate mutational history, but the Finns/Mixed group have closer ties to the surrounding populations, at least partly through individuals in these groups. It is tempting to think that the sharing between Finns and Saami is due to Saami admixture, or alternatively that it is due to a common source or to Finnish admixture before the Finns' genetic affiliation with continental populations. This is in accordance with the MDS plot analysis, which places the Finns and Saami as outliers.
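Reasons 1 to 3 can be illustrated with a toy simulation. The Python sketch below is a plain neutral Wright-Fisher model with no mutation and no migration. It is not part of the Chromopainter analysis, and the population sizes, number of founder haplotypes and number of generations are arbitrary assumptions. It simply tracks how many distinct haplotypes survive over time in a small and in a large isolated population.

```python
import random

def drift(pop_size, n_founder_haplotypes, generations, seed=0):
    """Neutral Wright-Fisher resampling without mutation or migration.

    Returns the number of distinct haplotypes present in each generation,
    starting with generation 0.
    """
    rng = random.Random(seed)
    # Generation 0: founder haplotypes spread evenly over the population.
    population = [i % n_founder_haplotypes for i in range(pop_size)]
    history = [len(set(population))]
    for _ in range(generations):
        # Each individual of the next generation copies a random parent.
        population = [rng.choice(population) for _ in range(pop_size)]
        history.append(len(set(population)))
    return history

# Arbitrary illustration values: same founders, different population sizes.
small = drift(pop_size=50, n_founder_haplotypes=20, generations=100)
large = drift(pop_size=5000, n_founder_haplotypes=20, generations=100)
print("small population:", small[0], "->", small[-1], "distinct haplotypes")
print("large population:", large[0], "->", large[-1], "distinct haplotypes")
```

Without new mutations or gene inflow the number of distinct haplotypes can only stay the same or go down, and it tends to go down much faster in the small population. The survivors are then shared by many people, which is the kind of pattern that shows up as high internal sharing in the heatmaps.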
The possibility of introduction of "foreign" haplotypes not seen in the panel should also be investigated further. It may explain the two strange Italians at the upper left corner of the mutation matrix.
SCANDINAVIANS
In this analysis the Scandinavians clearly separate from other populations in the number of shared haplotypes and the length of these shared haplotypes, with an internal sharing not so different from that of most of the continental populations. They are always connected to continental European populations like the French. They do not, however, appear different from the continental European populations when it comes to the mutation matrix. Here they appear to share as much with them as the Saami share among themselves. It is currently difficult to know how old this large mutational sharing actually is, but it is apparently old enough to make them separate into their own group when it comes to shared haplotypes and haplotype length. In the MDS plot the Scandinavians appear close to continental European populations like the French, Italians and so on.
BIG PICTURE
If we look at the bigger picture we see that most of continental Europe is tied together through mutations more strongly than other populations are, making the continental populations harder to separate even at this level (6 chromosomes). We see that the Lithuanians seem to have a stronger affiliation to the large continental European cluster, including the Scandinavians, but this affiliation is weaker for the Vologda Russians. The connection is even weaker for the Finns and almost non-existent for the Saami. This is in accordance with the MDS plot.
(Updated 28/5/2012)
My wife and I enjoy your blog. My wife is half Danish and has many Swedish and Norwegian cousins. Would you like us to send her 23andme file?