Effect of outliers among the sampled trees
Due to computational constraints, only a limited number of trees can be
drawn from the posterior distribution. This makes the posterior
likelihood that is calculated at each location susceptible to being
influenced by outliers with unusually high likelihoods. These occur when individual trees with a low
posterior likelihood but high support for the presence of the disease
location are sampled. This may result in the posterior likelihood
being overestimated at that location. If the map of focal points is
sufficiently dense (see 8.3), the influence of an outlier may be
indicated by a single spike in the posterior distribution, where one
focal point has a high posterior likelihood but neither of its
neighboring focal points shows an elevated signal. If a focal point is
really close to the locus of disease mutation, then the neighboring
focal points will also be in the proximity of the disease mutation and
therefore also show an increased posterior likelihood. On the other
hand, if the increased likelihood at a focal point is due to an
outlier, adjacent focal points will not show an increased likelihood
for the presence of a disease mutation. While the impact of outliers
can be reduced by sampling more trees from the posterior distribution,
some outliers may have a likelihood that is orders of magnitude higher
than the likelihood of all other trees that are sampled at the same
focal point. Thus it may not be computationally feasible to sample
enough trees to control for the effect of outliers.
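The neighbor comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of the program: the function name find_isolated_spikes, the input format (one posterior likelihood per focal point, ordered by map position) and the 10-fold threshold are all assumptions chosen for the example.

```python
from typing import List, Sequence


def find_isolated_spikes(
    likelihoods: Sequence[float],
    ratio: float = 10.0,  # assumed threshold, not a value from the manual
) -> List[int]:
    """Return indices of interior focal points whose posterior likelihood
    exceeds that of BOTH adjacent focal points by more than `ratio`-fold."""
    spikes = []
    for i in range(1, len(likelihoods) - 1):
        left = likelihoods[i - 1]
        here = likelihoods[i]
        right = likelihoods[i + 1]
        # A true signal should also raise the neighbors; a spike does not.
        if here > ratio * left and here > ratio * right:
            spikes.append(i)
    return spikes


# Hypothetical profile: a broad peak around index 3 and a lone spike at index 7.
profile = [1.0, 1.2, 8.0, 20.0, 9.0, 1.1, 1.0, 50.0, 0.9, 1.0]
print(find_isolated_spikes(profile))  # [7]
```

The broad peak is not flagged because its neighbors are elevated as well. A flagged index is only a candidate for an outlier and still needs to be checked with additional trees.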
On the other hand, a signal that is generated by an outlier in the
tree distribution will usually not be repeated if a second set of
trees is generated and analyzed. Therefore, it is advisable to verify
any peak in the posterior distribution by generating additional trees
at locations where a signal is generated. To make sure that this new
set of trees is independent of the first set of trees, it may be
necessary to restart the analysis from a random tree and to repeat the
MCMC, including burn-in.
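One possible way to compare the two runs is sketched below in Python. It assumes that each run has already been summarized as a list of posterior likelihoods, one per focal point, and it treats a focal point as elevated when it exceeds the median of its run by a chosen factor. The function names, the use of the median as background and the 10-fold factor are assumptions made for this example. Rerunning the MCMC itself is not shown.

```python
from statistics import median
from typing import Sequence


def is_elevated(likelihoods: Sequence[float], index: int, ratio: float = 10.0) -> bool:
    """True if the focal point at `index` exceeds the run's median
    posterior likelihood by more than `ratio`-fold (assumed criterion)."""
    background = median(likelihoods)
    return likelihoods[index] > ratio * background


def peak_replicates(
    first_run: Sequence[float],
    second_run: Sequence[float],
    index: int,
    ratio: float = 10.0,
) -> bool:
    """True only if the same focal point is elevated in both independent runs."""
    return is_elevated(first_run, index, ratio) and is_elevated(second_run, index, ratio)


# Hypothetical summaries of two independent runs over the same focal points.
first_run = [1.0, 1.1, 40.0, 0.9, 1.0]
second_run = [1.0, 0.9, 1.2, 1.1, 1.0]   # signal at index 2 is not repeated
third_run = [1.0, 1.1, 30.0, 1.0, 0.9]   # signal at index 2 is repeated

print(peak_replicates(first_run, second_run, 2))  # False
print(peak_replicates(first_run, third_run, 2))   # True
```

A peak that appears in only one of the two runs is more plausibly due to an outlier in the sampled trees than to a disease mutation near that focal point.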
Sebastian Zoellner
2005-01-27