Next: Genetic map distances
Up: Documentation for TreeLD, version
Previous: The interface
Input format
The input file supplied by the user must give the positions of the
markers used, the phenotypes of the individuals studied, and the
phased genotypes of the sample. Figure 4 shows a schematic of the
input file; the quantities in it are defined as follows:
Figure 4:
Schematic of an input file. Each
oval box indicates the input information for one individual. Figure
5 shows a specific example.
![Schematic of an input file](img11.png)
- Comments
- All lines at the beginning of the file that start with a hash (#)
are ignored by the program and can therefore be used for comments.
- P
- The line that designates the map positions of the markers
starts with a P (upper case). The positions should be separated by a
single whitespace character.
- Position(i)
- Each of these
numbers gives the position of a marker in base pairs, relative to an
arbitrary point of reference. The loci must be listed in consecutive
order along the chromosome (i.e. the positions have to be
increasing). Position(i) should be separated from Position(i+1) by a
single whitespace character.
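The rules for the position line can be checked before running the program. The following is a minimal sketch, not part of TreeLD: the function name `parse_position_line` is hypothetical, positions are assumed to be whole numbers of base pairs, and "increasing" is read as strictly increasing. Note that `split()` is more lenient than the format, since it also accepts runs of several whitespace characters.

```python
def parse_position_line(line):
    """Parse a 'P ...' line into a list of marker positions in base pairs."""
    fields = line.split()
    if not fields or fields[0] != "P":
        raise ValueError("position line must start with an upper-case 'P'")
    # Assumption: positions are whole numbers of base pairs.
    positions = [int(field) for field in fields[1:]]
    if not positions:
        raise ValueError("position line lists no markers")
    # Assumption: 'increasing' is read as strictly increasing.
    for left, right in zip(positions, positions[1:]):
        if right <= left:
            raise ValueError(f"positions not increasing: {left} then {right}")
    return positions
```

The length of the returned list is the number of markers that every haplotype line must match.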
For each individual i in the sample, the phenotype and genotype
information must be provided. The first line indicates the phenotype,
the second and possibly third line contain the genotype information.
- Phenotype(i)
- This
floating point number indicates the phenotype of the individual i. For
a QTL-study, this can be the measured quantitative trait. In a
case-control study, the phenotypes should be indicated as 1.0 for
cases and 0.0 for controls.
- 1/2
- This entry specifies how many chromosomes share the
phenotype given on the same line. For case-control studies on
autosomal loci this will be a 2, while for male X-chromosomal loci or
for non-transmitted haplotypes in a TDT this will be a 1.
- Haplotype(i.x)
- These lines contain the one
or two haplotypes, as indicated on the line with the phenotype
information. The state of each SNP is indicated by a 1 or a 2, without
a space between the individual characters. It does not matter which
allele of the SNP is assigned to which number, as no information about
ancestral state is used. The number of markers given in each of
these lines must match the number of marker positions provided in the
line starting with a P.
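The per-individual records described above can be read with a short loop. This is a sketch under stated assumptions rather than the program's own reader: the name `parse_individuals` is hypothetical, the number of markers is passed in by the caller, and blank lines are skipped, which the format description does not mention.

```python
def parse_individuals(lines, n_markers):
    """Parse phenotype/haplotype records into (phenotype, haplotypes) tuples."""
    # Assumption: blank lines carry no information and may be skipped.
    rows = [line.strip() for line in lines if line.strip()]
    records = []
    i = 0
    while i < len(rows):
        fields = rows[i].split()
        if len(fields) != 2:
            raise ValueError(f"expected 'phenotype count', got {rows[i]!r}")
        phenotype = float(fields[0])
        count = int(fields[1])  # number of chromosomes sharing the phenotype
        if count not in (1, 2):
            raise ValueError(f"chromosome count must be 1 or 2, got {count}")
        haplotypes = rows[i + 1 : i + 1 + count]
        if len(haplotypes) != count:
            raise ValueError("input ends before all haplotypes were read")
        for hap in haplotypes:
            # Each SNP state is a '1' or a '2', one character per marker.
            if len(hap) != n_markers or set(hap) - {"1", "2"}:
                raise ValueError(f"invalid haplotype line: {hap!r}")
        records.append((phenotype, haplotypes))
        i += 1 + count
    return records
```

For example, `parse_individuals(["0.0 2", "22211", "11112"], 5)` yields one control with two haplotypes of five markers each.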
An example of a simple input file is given in the file SampleInput.txt, which is shown in figure 5.
Figure 5:
Example of an input file. The first
line is a comment that is ignored by the program. The second line
indicates the location of the 5 markers at bp 820, 22312,..., 82290
relative to an arbitrary starting point. The first three individuals
in the sample are the cases, designated by the phenotype of 1.0. The 2
after the phenotype indicates that these are diploid individuals. The
last three individuals are controls as indicated by the phenotype
0.0.
![Example of an input file (SampleInput.txt)](img12.png)

    # SampleInput.txt, ...
    ...
    12111
    0.0 2
    22211
    11112
    0.0 2
    11122
    11111
Please note that the chromosomes in the input file are assumed to be
unrelated. Thus, data generated in a case-control study
can be used without modification. See section 5.3 for instructions on how
to generate an input file for trios.
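Phased case-control data held in memory can be written out in the layout described in this section. The sketch below is illustrative only: `format_input` is a hypothetical helper, and the positions and haplotypes in the usage note are made-up values, not taken from SampleInput.txt.

```python
def format_input(positions, individuals, comment=None):
    """Return input-file text for a list of (phenotype, haplotypes) records."""
    lines = []
    if comment is not None:
        lines.append(f"# {comment}")  # leading comment line, ignored on input
    lines.append("P " + " ".join(str(p) for p in positions))
    for phenotype, haplotypes in individuals:
        if len(haplotypes) not in (1, 2):
            raise ValueError("each individual needs one or two haplotypes")
        for hap in haplotypes:
            if len(hap) != len(positions):
                raise ValueError(f"haplotype {hap!r} does not match marker count")
        # Phenotype line: the phenotype, then the number of chromosomes.
        lines.append(f"{float(phenotype)} {len(haplotypes)}")
        lines.extend(haplotypes)
    return "\n".join(lines) + "\n"
```

Calling `format_input([100, 200, 300], [(1, ["121", "212"]), (0, ["111"])])` produces a position line, a diploid case with two haplotype lines, and a control with a single haplotype line.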
Sebastian Zoellner
2005-01-27