Friday, 18 November 2011

Autosomal Haplotype Clustering Patterns - Actual or Error? (updated)

I found the following pattern of haplotype clustering among individuals from Norway, Sweden and Finland on Chr 1, using 38.5k SNPs:

Unique haplotypes not shared with others - 1 344 (typically between 15 and 50 per individual)
Haplotypes shared between 2 individuals - 157
Haplotypes shared between 3 individuals - 30
Haplotypes shared between 4 individuals - 3
Haplotypes shared between 5 individuals - 2
Haplotypes shared between 6 individuals - 2
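
A tally like the one above can be produced by counting, for each haplotype, how many individuals carry it. The following is only a minimal Python sketch of that idea, not the software actually used for these numbers; the function name `sharing_spectrum`, the input layout and the toy sample IDs are assumptions made for illustration. The `reported` counts are the ones listed above.

```python
from collections import Counter

def sharing_spectrum(carriers):
    """Return {number of carriers: number of haplotypes with that many carriers}."""
    return Counter(len(set(individuals)) for individuals in carriers.values())

# Hypothetical toy input: haplotype ID -> individuals carrying it.
toy = {
    "h1": ["NO1"],
    "h2": ["NO1", "SE1"],
    "h3": ["FI1"],
    "h4": ["NO2", "SE1", "FI1"],
}
toy_spectrum = sharing_spectrum(toy)

# Counts reported in the post (number of carriers -> number of haplotypes).
reported = {1: 1344, 2: 157, 3: 30, 4: 3, 5: 2, 6: 2}
total = sum(reported.values())
shared = total - reported[1]          # haplotypes carried by 2 or more individuals
unique_fraction = reported[1] / total

print(dict(toy_spectrum))
print(f"total={total}, shared={shared}, unique fraction={unique_fraction:.3f}")
```

On the reported counts this gives 1 538 haplotypes in total, of which roughly 87% are unique to a single individual.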

As this shows, widely shared haplotype clusters are much rarer than those shared by only two individuals, and the "unique" haplotype clusters, found in a single individual, are by far the most common.

This raises the question of why this is so. I suspect the following reasons:

1. The effect of recombination splitting or killing haplotypes. However, the maximum haplotype size in clusters is 500 SNPs. Reducing it to 100 SNPs only lowered the count to 1324 unique haplotype clusters. Reducing it to a maximum of 10 SNPs gave 1189, and reducing it to 5 SNPs still left 962. Only at a maximum of 2 SNPs did the count drop sharply, to 277 unique haplotype clusters.
2. The effect of limited population data. It is possible that more individuals and populations would reduce the number of unique haplotype clusters.
3. The effect of undetected errors in the genotypes. However, I found no correlation between a high number of unique haplotypes in an individual and a high detected genotype error rate for that individual.
4. The effect of incorrect phasing, as a result of genotype errors and/or ordinary phasing error from the model used.
5. The effect of haplotype or mutation extinction. Recent individual haplotypes or mutations generally have limited spread, while older haplotype clusters or mutations have a larger geographic spread.
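
To put the numbers in point 1 on a common scale, the sketch below expresses each count as a fraction of the count at the 500-SNP maximum. It assumes that the 1 344 unique haplotypes reported above correspond to the 500-SNP setting, which the post implies but does not state outright; the variable names are illustrative only.

```python
# Unique haplotype clusters by maximum haplotype size (in SNPs), from the post.
# Assumption: the baseline of 1344 corresponds to the 500-SNP maximum.
unique_by_max_size = {500: 1344, 100: 1324, 10: 1189, 5: 962, 2: 277}

baseline = unique_by_max_size[500]
retained = {size: count / baseline for size, count in unique_by_max_size.items()}

for size in sorted(retained, reverse=True):
    count = unique_by_max_size[size]
    print(f"max {size:>3} SNPs: {count:>4} unique ({retained[size]:.1%} of baseline)")
```

If that assumption holds, most unique clusters survive until the limit drops below about 5 SNPs, which would fit the idea that these clusters are short.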

So what I infer from this is that these unique haplotype clusters are rather small. The numbers were generated with software made for finding genetic diseases from haplotypes, where you mark individuals with a certain trait as cases and check them against the controls. If any haplotype is strongly associated with the trait, it is found. Such haplotypes are usually not very large; just check the SNPs used in the 23andMe health section. The same is the case with these haplotypes.

For many haplotypes the software infers parent-child relationships between them, indicating that haplotype mutations are in the picture, at least when I check at the individual level.

3 comments:

  1. A few days ago I discussed a PLoS ONE paper on the haploid genetics of Chinese and others, specifically CNVs, and what most caught my attention was that the amount of possibly individualized material is c. 30-40%. Of course the sample size may cast some doubt, but your example and previous studies seem to corroborate this finding. Considering that another 30% is shared by all humans, individual (or quasi-individual) variation is like 50% of all genetic variation at human scale.

    If we look at a more comparable population, such as the East Asian sample, we still find that c. 35% of the variation is uniquely individual, while c. 45% is shared across the region (not much more than at global scale) and almost 20% is shared by two of the three subgroups, leaving only a tiny 5% to illustrate the differences between Japanese, Han and South China minorities.

    You may have stumbled upon this very matter: almost NO genetic variation is shared within populations; it is either shared across ethnic boundaries or almost irreducibly individual.

  2. Self-correction:

    Maybe better than "almost NO genetic variation is shared within populations" would be "very little genetic variation is shared within populations". It is still significant after all.

  3. Your explanation makes sense compared with the findings in this post. It may be that recombination is the primary mutation or haplotype killer, limiting the spread of individually accumulated mutations/haplotypes. Just remember that only half of the total amount of haplotypes from each of your parents reaches you. If you have brothers/sisters you only share about 1/4, and it declines rapidly down the cousin ladder.

    My primary goal with this attempt is to find ethnically or geographically specific mutations/haplotypes for use in ancestry and/or genealogy. I suspect I have found 194 exclusively Fennoscandian mutations/haplotypes when excluding the individually unique ones. That is not many compared to the whole panel of 38.5k SNPs on Chr 1, but it is something.