TreeLD also implements a significance test for the presence of a
disease locus in the region of interest based on the output of the
treepeeling program. During the tree peeling step, the likelihood of
a disease gene being present is obtained by finding the focal point
and the set of penetrance parameters that generate the highest
likelihood. From this, a likelihood ratio (LR)
is calculated by dividing this maximized likelihood by the likelihood of the
null hypothesis. While asymptotic theory suggests that the LR statistic is
approximately chi-square-distributed, our simulation studies have shown
that applying this distribution is conservative. Nevertheless, the
p-value that is generated by this distribution can act as an indicator
for the strength of the association signal. Therefore, the program
provides two p-values from the LR test: the unmodified
p-value and a second one that is corrected for multiple tests by a
Bonferroni correction. Please note that
for a tight grid of focal points, the signal at adjacent focal points
is highly correlated and the Bonferroni correction is very
conservative. Nevertheless, it provides a useful starting point for the
analysis. If the resulting p-values are indicative but not
significant, a more appropriate p-value can be obtained by permuting
the phenotypes among the individuals in the sample. This shuffling
occurs over all trees at once and thus generates an overall p-value
that does not need correction for multiple testing. To perform this
analysis, rerun the
peeling algorithm and check the Run permutation box. The p-value is then displayed in the text console window
at the end of the run. The random phenotypes that are generated by
the shuffling depend on the seed with which the program is
started. Thus, if the user wants to apply the same random
phenotype assignments to multiple datasets, the program has to be
restarted with the same seed in every analysis.
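
The contrast between the Bonferroni correction and the permutation p-value can be illustrated with a small sketch. This is not TreeLD's implementation: the function names, the 0/1 coding of phenotypes and markers, and the stand-in statistic (a difference in carrier frequency instead of the tree-based LR) are all assumptions made for illustration, as is the common "+1" convention in the permutation estimate. The sketch shows that shuffling the phenotypes once per replicate and taking the maximum over all focal points gives a single overall p-value, and that a fixed seed reproduces the same shuffles.

```python
import random


def stand_in_statistic(phenotypes, marker):
    """Hypothetical stand-in for the LR at one focal point: absolute
    difference in carrier frequency between cases (1) and controls (0)."""
    cases = [g for g, p in zip(marker, phenotypes) if p == 1]
    controls = [g for g, p in zip(marker, phenotypes) if p == 0]
    if not cases or not controls:
        return 0.0
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def max_over_focal_points(phenotypes, markers):
    """Strongest signal across all focal points for one phenotype assignment."""
    return max(stand_in_statistic(phenotypes, m) for m in markers)


def permutation_p_value(phenotypes, markers, n_permutations=1000, seed=1):
    """Overall p-value: share of shuffled datasets whose maximum statistic
    is at least as large as the observed maximum."""
    rng = random.Random(seed)  # same seed -> same sequence of shuffles
    observed = max_over_focal_points(phenotypes, markers)
    shuffled = list(phenotypes)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # one shuffle is applied to all focal points
        if max_over_focal_points(shuffled, markers) >= observed:
            exceed += 1
    # The +1 terms are an assumed convention that avoids a p-value of zero.
    return (exceed + 1) / (n_permutations + 1)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

For example, `permutation_p_value(phenotypes, markers, seed=42)` returns the same value on every call with the same data, whereas a different seed produces a different set of shuffled phenotypes and may give a slightly different estimate.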
As described before, the likelihood of a focal point will depend on its distance to the location of the disease mutation. Therefore, if the focal points are distributed on a tighter grid in the region of interest, the power of a test for association may be increased.