
Philosophy

The aim of an association study is to analyze the genetic variation in one or more regions of interest and to detect nonrandom association between alleles and the studied phenotype. This genetic variation is generated by a complex stochastic process that includes mutation, genetic drift, recombination, and sometimes natural selection. In population genetics, this process is typically modeled using the "ancestral recombination graph", or ARG (reviewed by Nordborg, 2001). In the ARG, the ancestry of each individual locus can be described as a bifurcating tree (a coalescent tree). TreeLD uses all of the information in the marker data to infer these trees at selected loci in the region of interest. Once the ancestry of a locus is known, we can assess the likelihood that a disease-causing allele arose on this ancestry by looking at the distribution of cases and controls among the tips of the tree.

Figure 1: Hypothetical example of a coalescent genealogy for a sample of 28 chromosomes, at the locus of a disease susceptibility gene. Each tip at the bottom of the tree represents a sampled chromosome; the lines indicate the ancestral relationships among the chromosomes. The two black circles on the tree represent two independent mutation events producing susceptibility variants. These are inherited by the chromosomes marked with gray filled circles. Individuals carrying those chromosomes will be at increased risk of disease. This means that there will be a tendency for chromosomes from affected individuals to cluster together on the tree, in two mutation-carrying clades. The degree of clustering depends in part on the penetrance of the mutation.

Figure 1 illustrates the utility of this approach. In the displayed tree, the chromosomes from individuals with the disease phenotype cluster together, which is good evidence that the tree describes the ancestry of a locus containing a disease mutation. If the chromosomes from affected individuals were instead distributed randomly among the tips of the tree, this would strongly indicate the absence of a disease mutation. Thus we can use the ancestry of a locus as an indicator for the presence of one or more disease mutations at the locus of interest. For the purpose of this model, a locus is not a single base pair in the sequence, but a short region of a few kb.
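
The clustering argument above can be made concrete with a small sketch. The code below is not TreeLD's likelihood computation; it swaps in a simple permutation heuristic purely for illustration. The toy tree, the tip labels, and the function names (`tips`, `clades`, `best_clade_score`, `permutation_p_value`) are all hypothetical. For a fixed tree, it scores how strongly case chromosomes concentrate in any single clade, then compares that score with the scores obtained when the same number of case labels is scattered at random over the tips.

```python
import random

# Hypothetical toy genealogy: nested tuples are internal nodes, strings are tips.
TREE = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))


def tips(node):
    """Return the list of tip labels below a node."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]


def clades(node):
    """Yield the tip set of every internal node (clade), root included."""
    if isinstance(node, str):
        return
    yield frozenset(tips(node))
    for child in node:
        yield from clades(child)


def best_clade_score(tree, cases):
    """Largest excess of cases over the chance expectation in any proper clade."""
    all_tips = tips(tree)
    case_fraction = len(cases) / len(all_tips)
    best = 0.0
    for clade in clades(tree):
        if len(clade) == len(all_tips):
            continue  # the root clade contains every tip and is uninformative
        excess = len(clade & cases) - case_fraction * len(clade)
        best = max(best, excess)
    return best


def permutation_p_value(tree, cases, n_perm=2000, seed=0):
    """Fraction of random case assignments that cluster at least as strongly."""
    rng = random.Random(seed)
    all_tips = tips(tree)
    observed = best_clade_score(tree, cases)
    hits = 0
    for _ in range(n_perm):
        shuffled = frozenset(rng.sample(all_tips, len(cases)))
        if best_clade_score(tree, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


clustered = frozenset("abcd")  # all cases fall in one clade
scattered = frozenset("aceg")  # cases spread evenly over the tree
p_clustered = permutation_p_value(TREE, clustered)
p_scattered = permutation_p_value(TREE, scattered)
```

Under these assumptions, the clustered assignment yields a small permutation p-value and the scattered one does not, mirroring the contrast drawn for Figure 1. A real analysis would also have to account for uncertainty in the inferred tree, which this sketch ignores.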


Sebastian Zoellner 2005-01-27