
Input File Formats
- Files are tab-delimited, and no header is needed unless otherwise specified
- The alleles of one genotype are separated by one space
- Markers / marker genotypes are listed in ascending order of their physical positions
Parameters
| Default threshold for SNPs | Threshold applied to SNPs not listed in the SNP-specific threshold file (see below) |
| Max inter-SNP distance | Marker pairs farther apart than this distance are excluded from the R2 calculation |
| MAF cut-off | Only markers with a minor allele frequency (MAF) higher than this value are considered, except those that are listed in the forced selection file or that have a SNP-specific threshold greater than the Default threshold for SNPs parameter |
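The interplay between the MAF cut-off, the forced selection file and the SNP-specific thresholds can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the program's actual code: the function names are hypothetical, markers are assumed to be biallelic, and a pair lying exactly at the maximum distance is assumed to be kept.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF for one marker.

    `genotypes` is a list of strings such as "1 2" (two alleles separated
    by one space); allele 0 is treated as missing and ignored.
    Assumes a biallelic marker; a monomorphic marker gets MAF 0.0.
    """
    counts = Counter()
    for genotype in genotypes:
        for allele in genotype.split(" "):
            if allele != "0":
                counts[allele] += 1
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


def passes_maf(marker, genotypes, maf_cutoff, forced, snp_thresholds,
               default_threshold):
    """Apply the MAF cut-off with the two exceptions described above."""
    if marker in forced:
        return True  # listed in the forced selection file
    if snp_thresholds.get(marker, default_threshold) > default_threshold:
        return True  # SNP-specific threshold above the default threshold
    return minor_allele_frequency(genotypes) > maf_cutoff


def pair_within_distance(pos_a, pos_b, max_distance):
    """True if a marker pair is close enough to enter the R2 calculation.

    Assumption: a pair exactly at `max_distance` is still included.
    """
    return abs(pos_a - pos_b) <= max_distance
```

For example, `passes_maf("rs100003", genotypes, 0.05, {"rs100003"}, {}, 0.5)` returns `True` whatever the genotypes are, because the marker is in the forced selection set.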
Sample Datasets
- Click here to download the sample input files
- Click here to download the sample output files
File Descriptions
PED (in .ped extension)
It describes the familial relationships among individuals and their genotypes at each marker. Each line represents one individual, and the genotypes are listed in the same order as their corresponding markers in the MAP file (see below).
| Family ID | Unique family ID |
| Individual ID | Unique ID within family |
| Father ID | Individual ID of father, 0 = founder |
| Mother ID | Individual ID of mother, 0 = founder |
| Gender ID | 1 = male and 2 = female |
| Affection | Reserved for association studies, 0 = unknown |
| Marker genotypes | Genotypes in columns: 1 = A, 2 = C, 3 = G, 4 = T, 0 = missing data |
A typical PED file looks like this:
| family ID | individual ID | father ID | mother ID | sex | affection | genotype 1 | genotype 2 | genotype 3 | . . . |
| 4567 | 1 | 0 | 0 | 1 | 1 | 1 1 | 1 2 | 2 2 | . . . |
| 4567 | 2 | 0 | 0 | 2 | 1 | 1 1 | 1 1 | 1 2 | . . . |
| 4567 | 3 | 1 | 2 | 1 | 2 | 1 1 | 1 2 | 1 2 | . . . |
| 4567 | 4 | 1 | 2 | 2 | 2 | 1 1 | 1 2 | 2 2 | . . . |
| 4567 | 5 | 1 | 2 | 2 | 1 | 1 1 | 1 1 | 2 2 | . . . |
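As a rough illustration of the layout described above, one PED line could be parsed as follows. This is a sketch under the stated format only (tab-delimited fields, six leading columns, two space-separated alleles per genotype); `PedRecord` and `parse_ped_line` are hypothetical names, not part of the program.

```python
from typing import List, NamedTuple, Optional, Tuple

# Allele coding from the table above; 0 (missing data) maps to None.
ALLELE_CODES = {"1": "A", "2": "C", "3": "G", "4": "T", "0": None}


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    father_id: str   # "0" = founder
    mother_id: str   # "0" = founder
    sex: str         # "1" = male, "2" = female
    affection: str   # "0" = unknown
    genotypes: List[Tuple[Optional[str], Optional[str]]]


def parse_ped_line(line: str) -> PedRecord:
    """Split one tab-delimited PED line into its fixed columns and genotypes."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError("a PED line needs at least six columns")
    genotypes = []
    for field in fields[6:]:
        alleles = field.split(" ")
        if len(alleles) != 2 or any(a not in ALLELE_CODES for a in alleles):
            raise ValueError(f"bad genotype field: {field!r}")
        genotypes.append((ALLELE_CODES[alleles[0]], ALLELE_CODES[alleles[1]]))
    return PedRecord(*fields[:6], genotypes)


# Individual 3 from the example above.
record = parse_ped_line("4567\t3\t1\t2\t1\t2\t1 1\t1 2\t1 2")
print(record.individual_id, record.genotypes)
```

The genotype list keeps the column order, so its i-th entry corresponds to the i-th marker in the MAP file.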
MAP (in .map extension)
It gives the chromosomal position of each marker. A header line is required.
A typical MAP file looks like this:
| chromosome | marker | position |
| 3 | rs100001 | 10010000 |
| 3 | rs100002 | 10020000 |
| 3 | rs100003 | 10030000 |
| 3 | rs100004 | 10040000 |
| 3 | rs100005 | 10050000 |
| 3 | rs100006 | 10060000 |
| 3 | rs100007 | 10070000 |
| 3 | rs100008 | 10080000 |
| 3 | rs100009 | 10090000 |
| 3 | rs100010 | 10100000 |
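Before running the program it can be useful to confirm that a MAP file has the expected three columns and that markers appear in ascending order of physical position. The sketch below assumes a tab-delimited file whose first line is the header; `read_map` and `check_ascending` are hypothetical helper names.

```python
def read_map(lines):
    """Parse MAP lines into (chromosome, marker, position) tuples.

    The first line is assumed to be the header and is skipped.
    """
    rows = iter(lines)
    next(rows, None)  # skip the header line
    markers = []
    for line in rows:
        line = line.rstrip("\n")
        if not line:
            continue
        chromosome, marker, position = line.split("\t")
        markers.append((chromosome, marker, int(position)))
    return markers


def check_ascending(markers):
    """Raise ValueError if positions decrease within a chromosome."""
    for (chrom_a, name_a, pos_a), (chrom_b, name_b, pos_b) in zip(markers, markers[1:]):
        if chrom_a == chrom_b and pos_b < pos_a:
            raise ValueError(f"{name_b} ({pos_b}) is listed after {name_a} ({pos_a})")


example = [
    "chromosome\tmarker\tposition",
    "3\trs100001\t10010000",
    "3\trs100002\t10020000",
    "3\trs100003\t10030000",
]
markers = read_map(example)
check_ascending(markers)  # passes silently for the example above
```

With a real file, `read_map(open("sample.map"))` would work the same way, where `sample.map` is a placeholder file name.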
Forced selection
List of markers forced to be selected as tags, e.g. those with known genotype information
A typical forced selection file looks like this:
| marker |
| rs100003 |
| rs100005 |
| rs100009 |
Forced non-selection
List of markers that are never selected as tags and can only be tagged by other markers, e.g. those with assay design problems
A typical forced non-selection file looks like this:
| marker |
| rs100002 |
| rs100006 |
| rs100007 |
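The forced selection and forced non-selection files share the same one-column layout, so a single reader can serve both. The sketch below is an illustrative sanity check rather than anything the program is documented to do: it flags markers that appear in both lists or that are missing from the MAP file. The helper names are hypothetical, and skipping a first line that reads "marker" is an assumption about whether the file carries a header.

```python
def read_marker_list(lines):
    """Read one marker name per line, skipping blanks and an optional header."""
    markers = []
    for index, line in enumerate(lines):
        name = line.strip()
        if not name:
            continue
        if index == 0 and name.lower() == "marker":
            continue  # assumed header line
        markers.append(name)
    return markers


def check_forced_lists(selected, non_selected, map_markers):
    """Return (markers in both lists, markers absent from the MAP file)."""
    overlap = set(selected) & set(non_selected)
    unknown = (set(selected) | set(non_selected)) - set(map_markers)
    return sorted(overlap), sorted(unknown)


selected = read_marker_list(["marker", "rs100003", "rs100005", "rs100009"])
non_selected = read_marker_list(["marker", "rs100002", "rs100006", "rs100007"])
map_markers = [f"rs1000{i:02d}" for i in range(1, 11)]  # rs100001 .. rs100010
print(check_forced_lists(selected, non_selected, map_markers))
```

For the example files on this page both returned lists are empty, since no marker appears in both files and all of them are in the MAP file.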
SNP-specific threshold
Each threshold (between 0 and 1 inclusive) is the minimum R2 required for the marker to be considered tagged by another marker. Markers of particular interest, e.g. functional SNPs, should be given a higher threshold (e.g. 0.8).
A typical SNP-specific threshold file looks like this:
| marker | threshold |
| rs100001 | 0.8 |
| rs100004 | 0.8 |
| rs100008 | 0.4 |
| rs100010 | 0.3 |
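How the SNP-specific thresholds combine with the default threshold can be sketched as a simple lookup with a fallback. The code below is a minimal illustration, not the program's implementation: the function names are hypothetical, the file is assumed to be tab-delimited with an optional "marker / threshold" header line, and an R2 exactly equal to the threshold is assumed to count as tagged.

```python
def read_thresholds(lines):
    """Read `marker<TAB>threshold` lines into a dict, validating the range."""
    thresholds = {}
    for index, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue
        marker, value = line.split("\t")
        if index == 0 and value.lower() == "threshold":
            continue  # assumed header line
        threshold = float(value)
        if not 0.0 <= threshold <= 1.0:
            raise ValueError(f"threshold for {marker} is outside [0, 1]")
        thresholds[marker] = threshold
    return thresholds


def is_tagged(marker, r2_with_tag, thresholds, default_threshold):
    """True if the R2 with a tag meets the marker's own or the default threshold."""
    return r2_with_tag >= thresholds.get(marker, default_threshold)


thresholds = read_thresholds([
    "marker\tthreshold",
    "rs100001\t0.8",
    "rs100004\t0.8",
    "rs100008\t0.4",
    "rs100010\t0.3",
])
print(is_tagged("rs100001", 0.75, thresholds, default_threshold=0.5))  # False
print(is_tagged("rs100002", 0.75, thresholds, default_threshold=0.5))  # True
```

In this example rs100001 needs an R2 of at least 0.8, so 0.75 is not enough, while rs100002 has no entry in the file and falls back to the default of 0.5.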
Should you have any comments, please contact us at pcsham@hku.hk
Copyright © 2006-2010 Pak Sham. All rights reserved