Jurg Ott / 22 July 2015
Rockefeller University, New York
ott@rockefeller.edu / ottjurg@psych.ac.cn

TLINKAGE programs for 2-locus traits

INTRODUCTION

The TLINKAGE programs described here are an extension of the general LINKAGE programs for genetic linkage analysis. The extension allows a disease phenotype to be under the control of two loci. The current version (20 Apr 2007) corrected a bug that had an effect only with relatively large numbers of alleles at the marker loci.

The two postulated disease loci are typically unlinked (on two different chromosomes), each with two alleles (one normal, one being the disease allele), although there is no such restriction in this implementation. Below, the two disease loci are implemented as "null loci". Each of the disease loci may be linked with a marker or map of markers. Typically, two recombination fractions will be estimated, that between disease locus 1 and marker 1, and that between disease locus 2 and marker 2.

If each of the two disease loci has two alleles and, thus, three genotypes, there is a total of 9 possible genotype patterns at the two loci jointly. Which of these confer susceptibility (are associated with positive penetrance) is often unknown, but specific patterns of interaction between the two disease loci have been described (Risch 1990). Below, technicalities of implementation of the TLINKAGE programs are provided. The programs are written in Free Pascal and have been compiled for Windows and Linux. Note that the programs will not check whether program constants are exceeded. In case of problems, check the first few lines of program output -- they will tell you the maximum parameter values allowed in the programs.
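The count of 9 joint genotype patterns can be checked by enumeration. The following is a minimal Python sketch, not part of TLINKAGE; the function names are made up for illustration, and genotypes are treated as unordered allele pairs.

```python
from itertools import combinations_with_replacement, product


def genotypes(n_alleles):
    """Unordered genotypes at one locus, e.g. 2 alleles -> 1/1, 1/2, 2/2."""
    alleles = range(1, n_alleles + 1)
    return [f"{a}/{b}" for a, b in combinations_with_replacement(alleles, 2)]


def joint_genotypes(n_alleles_locus1, n_alleles_locus2):
    """All genotype combinations at two loci considered jointly."""
    return list(product(genotypes(n_alleles_locus1), genotypes(n_alleles_locus2)))


# Two disease loci with two alleles each: 3 x 3 joint patterns.
for first, second in joint_genotypes(2, 2):
    print(first, second)
```

Each printed pair corresponds to one cell of a two-locus penetrance table.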

If research papers refer to these programs, the appropriate reference is Lathrop and Ott (1990) (see below).

IMPLEMENTATION

  ----------------------------
  Program name  Corresponds to
  ----------------------------
  TUNK          UNKNOWN
  TMLINK        MLINK
  TLINKM        LINKMAP
  TILINK        ILINK
  ----------------------------

The TLINKAGE programs implement a new locus type #4, called the null type because it may have no associated phenotype in the pedigree file.

Each null locus is associated with the same phenotype. The number of null loci corresponds to the number of loci jointly responsible for a disease phenotype and is given as the last entry on the first line of the datafile (after the program number). In this version of TLINKAGE, you must have either two null loci or none. If two null loci are defined, the datafile contains two consecutive locus descriptions for them, but they correspond to only one phenotype in the pedfile. That phenotype must be an affection status phenotype, and it may be at any position among the phenotypes (i.e., the null loci are not restricted to be the first two loci in the datafile). NOTE, however, the following restriction: If the two null loci are loci 1 and 2, their order must be 1 2 and not 2 1. For example, with loci 3 and 4 being markers, orders 1 3 2 4 and 4 1 3 2 are all right, but orders 2 3 1 4 and 4 2 3 1 are not. An analogous restriction on order applies when the null loci are numbered other than 1 and 2.
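The order restriction can be checked before a run. The sketch below is a hypothetical Python helper, not part of TLINKAGE; it only tests that the first null locus precedes the second in the given locus order.

```python
def null_order_ok(locus_order, null_loci=(1, 2)):
    """Return True if the first null locus comes before the second one.

    locus_order: locus numbers in the order given in the datafile.
    null_loci:   (first, second) locus numbers of the two null loci.
    """
    first, second = null_loci
    return locus_order.index(first) < locus_order.index(second)


# The four orders discussed above, with loci 3 and 4 being markers.
for order in ([1, 3, 2, 4], [4, 1, 3, 2], [2, 3, 1, 4], [4, 2, 3, 1]):
    print(order, "ok" if null_order_ok(order) else "not allowed")
```

When the null loci carry other numbers, pass them as null_loci, for example null_loci=(2, 3).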

In the datafile, the description of a null locus contains only the number of alleles and gene frequencies, e.g.:

 4   2          << null locus, number of alleles
 0.9  0.1       << allele frequencies

except that after the last null locus, a line specifying the number of liability classes must be present, followed by one or more tables of penetrances (as many tables as there are liability classes). Each such table has a single entry for each genotype combination at the two loci and is arranged as shown in the example below. The numbers to be entered in each table are only the penetrances, in this case the 3 x 3 = 9 numbers in the body of the table (do not enter any of the genotypes such as 1/2 or 2/2).

Repeat this table with different entries for different liability classes. Remember that only the last null locus has an associated phenotype in the pedfile. An example may look as follows:

   -----------------------------
   First       Second null locus
   null        -----------------
   locus         1/1  1/2  2/2
   -----------------------------
    1/1           0    0    0
    1/2           0   0.8  0.8
    2/2           0   0.8   1
   -----------------------------
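To illustrate how such a table ends up in the datafile, here is a small Python sketch that writes the null-locus part of a datafile: two null locus descriptions, the number of liability classes, and one penetrance table per class. It is not part of TLINKAGE. The function names and the number formatting (five decimals for frequencies, three for penetrances) are assumptions modelled on the sample datafiles.

```python
def null_locus_lines(allele_freqs):
    """Locus type 4 (null locus), number of alleles, then allele frequencies."""
    return [f"4 {len(allele_freqs)}",
            " ".join(f"{f:.5f}" for f in allele_freqs)]


def penetrance_rows(table):
    """One output line per genotype at the first null locus (penetrances only)."""
    rows = []
    for row in table:
        if any(not 0.0 <= x <= 1.0 for x in row):
            raise ValueError("penetrances must lie between 0 and 1")
        rows.append(" ".join(f"{x:.3f}" for x in row))
    return rows


def null_loci_block(freqs1, freqs2, tables):
    """Lines describing both null loci; 'tables' holds one table per liability class."""
    lines = null_locus_lines(freqs1) + null_locus_lines(freqs2)
    lines.append(str(len(tables)))  # number of liability classes
    for table in tables:
        lines.extend(penetrance_rows(table))
    return lines


# The example table above: rows = first null locus, columns = second null locus.
example_table = [[0, 0, 0],
                 [0, 0.8, 0.8],
                 [0, 0.8, 1]]
print("\n".join(null_loci_block([0.9, 0.1], [0.9, 0.1], [example_table])))
```

With more than one liability class, pass several tables in the list; they are written one after another.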

The current 2-locus version of LINKAGE allows analysis of autosomal loci only. Please make sure you do not use these programs for X chromosomal loci -- there is no check in the programs to ensure that you are adhering to this restriction.

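The different steps for running the TLINKAGE programs are analogous to those for the LINKAGE programs: prepare the datafile and the pedfile (the latter processed by MAKEPED), copy them to datafile.dat and pedfile.dat, run TUNK, and then run the analysis program (TMLINK, TLINKM, or TILINK).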

Instead of the steps indicated above, after preparing input files you may invoke the RUN command, which will copy the relevant infiles and invoke the analysis programs. Input file names must be given on the command line. Example: RUN TESTML.DAT TEST.PED TMLINK.

KNOWN BUGS

In releases prior to 13 Feb 1991, two bugs were present in these programs: The programs did not work correctly when individuals with unknown disease status were present and when more than one liability class was used. Both bugs were fixed by Joseph Terwilliger.

SAMPLE INPUT FILES

The files TESTML.DAT, TESTLM.DAT, and TESTIL.DAT are sample datafiles for TMLINK, TLINKM, and TILINK, respectively. A test pedigree file (after processing by MAKEPED) is included as TEST.PED. To run the test example for TMLINK, copy the test files to the directory in which you want to carry out the TLINKAGE runs. Then give the following commands at the Windows command prompt:

  copy  testml.dat  datafile.dat
  copy  test.ped  pedfile.dat
  tunk
  tmlink

In a Linux terminal you would type:

  cp  testml.dat  datafile.dat
  cp  test.ped  pedfile.dat
  ./tunk
  ./tmlink

In Windows, these commands may all be executed by one single command (RUN batch file), in this case:

  run testml.dat test.ped tmlink
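On systems without the RUN batch file, the same sequence can be scripted. The following Python sketch is an illustration only, not a replacement for RUN: the function names are invented here, and it assumes the TLINKAGE executables (tunk, tmlink, ...) sit in the directory given as exe_dir.

```python
import os
import shutil
import subprocess


def build_steps(datafile, pedfile, program):
    """Return the file copies and the program sequence for one TLINKAGE run."""
    copies = [(datafile, "datafile.dat"), (pedfile, "pedfile.dat")]
    programs = ["tunk", program.lower()]  # TUNK always runs before the analysis
    return copies, programs


def run_analysis(datafile, pedfile, program, exe_dir="."):
    """Copy the input files to their fixed names, then run TUNK and the analysis."""
    copies, programs = build_steps(datafile, pedfile, program)
    for source, target in copies:
        shutil.copyfile(source, target)
    for name in programs:
        # check=True stops the sequence if a program exits with an error code
        subprocess.run([os.path.join(exe_dir, name)], check=True)


# Example call, mirroring the commands above:
# run_analysis("testml.dat", "test.ped", "tmlink")
```

Keeping build_steps separate from run_analysis makes it possible to inspect the planned steps without executing anything.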

In the regular LINKAGE programs, various utility programs such as LCP and PREPLINK may be used for the creation of the input files. These programs have not been adapted for the TLINKAGE programs.

The sample files mentioned above refer to the situation of two disease loci (one trait) and a single marker locus that is linked with disease locus 2. The other disease locus is assumed somewhere else in the genome, unlinked with any tested marker locus. The input datafile for the TMLINK program looks as follows:

3 0 0 5 2 <<< Number of loci, risk locus, sexlinked (if 1), program code, # null loci
0 0.0 0.0 0  << Mut Locus, Mut Rates (male, female), Haplotype Frequencies (if 1)
1 2 3
4 2    <<- null locus, number of alleles
0.97000 0.03000
4 2    <<- null locus, number of alleles
0.99000 0.01000
1    <<- number of liability classes
0.000 0.000 0.000
0.000 0.000 0.000
0.000 0.000 1.000
3 4   << Allele numbers, Number of alleles
0.25000 0.25000 0.25000 0.25000
0 0  << Sex Difference, Interference (If 1 or 2)
0.50000 0.00000 << Recombination Values
2 0.10000 0.40000 << Rec. varied, Increment, Finishing value
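Because the programs do little checking of their input, it can help to inspect the first line of a datafile before a run. The Python sketch below is a hypothetical helper, not part of TLINKAGE. It assumes that the first line always holds five integers, as in the sample above, and that explanatory comments start with "<<".

```python
def strip_comment(line):
    """Drop the trailing '<<' annotation that documents a datafile line."""
    return line.split("<<", 1)[0].strip()


def parse_header(line):
    """Parse and sanity-check the first line of a TLINKAGE datafile."""
    fields = [int(x) for x in strip_comment(line).split()]
    if len(fields) != 5:
        raise ValueError("expected five integers on line 1")
    keys = ("n_loci", "risk_locus", "sexlinked", "program_code", "n_null_loci")
    header = dict(zip(keys, fields))
    if header["sexlinked"] != 0:
        raise ValueError("TLINKAGE handles autosomal loci only")
    if header["n_null_loci"] not in (0, 2):
        raise ValueError("number of null loci must be 0 or 2")
    return header


first_line = ("3 0 0 5 2 <<< Number of loci, risk locus, sexlinked (if 1), "
              "program code, # null loci")
print(parse_header(first_line))
```

The two checks mirror restrictions stated in this document: autosomal loci only, and either two null loci or none.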

On line 1, the program code (5 for TMLINK) is not used by the program (it was used in previous versions of LINKAGE). The user must therefore make sure that the structure of the datafile is appropriate for the program being used. The input pedfile (after processing by the MAKEPED program) corresponding to the above datafile looks as shown below. It refers to two parents, one unaffected, the other affected, and two affected children.

1 1 0 0 3 0 0 1 1  1  1 2   Ped: 1  Per: 1
1 2 0 0 3 0 0 2 0  2  3 3   Ped: 1  Per: 2
1 3 1 2 0 4 4 2 0  2  1 3   Ped: 1  Per: 3
1 4 1 2 0 0 0 2 0  2  1 3   Ped: 1  Per: 4
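The pedfile lines above can be read as follows. This Python sketch is illustrative only and not part of TLINKAGE; it assumes the usual post-MAKEPED column layout (pedigree, person, father, mother, first offspring, next paternal sib, next maternal sib, sex, proband, then phenotypes), a single liability class (so no liability class column follows the affection status), one marker, and the usual coding 2 = affected. The Person type and function name are invented for the example.

```python
from collections import namedtuple

Person = namedtuple("Person", "pedigree person father mother sex affection marker")

SAMPLE_PEDFILE = """\
1 1 0 0 3 0 0 1 1  1  1 2   Ped: 1  Per: 1
1 2 0 0 3 0 0 2 0  2  3 3   Ped: 1  Per: 2
1 3 1 2 0 4 4 2 0  2  1 3   Ped: 1  Per: 3
1 4 1 2 0 0 0 2 0  2  1 3   Ped: 1  Per: 4
"""


def parse_pedfile_line(line):
    """Parse one post-MAKEPED line with one affection status and one marker."""
    tokens = line.split()
    if "Ped:" in tokens:  # drop the trailing 'Ped: .. Per: ..' annotation
        tokens = tokens[:tokens.index("Ped:")]
    values = [int(t) for t in tokens]
    pedigree, person, father, mother = values[0:4]
    sex = values[7]               # columns 5-7 are the sib/offspring pointers
    affection = values[9]         # column 9 is the proband indicator
    marker = tuple(values[10:12]) # two alleles of the single marker
    return Person(pedigree, person, father, mother, sex, affection, marker)


people = [parse_pedfile_line(l) for l in SAMPLE_PEDFILE.splitlines() if l.strip()]
affected = [p.person for p in people if p.affection == 2]
print("affected individuals:", affected)
```

For the sample pedigree this lists one affected parent and both children, in agreement with the description above.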

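In the file holding the pedigree data, the phenotypes are listed in input order, that is, the order in which the loci are described in the datafile: first the single disease phenotype (shared by the two null loci), then the marker genotypes. For the pedfile above this is "disease - marker"; with two markers it is "disease - marker 1 - marker 2".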

If the two disease loci are taken to be linked with a marker locus each, the corresponding datafile (for TMLINK) may look as follows, where locus order is "marker 1 - disease 1 - disease 2 - marker 2" (chromosome order):

4 0 0 5 2 <<< Number of loci, risk locus, sexlinked (if 1), program code, # null loci
0 0.0 0.0 0  << Mut Locus, Mut Rates (male, female), Haplotype Frequencies (if 1)
3 1 2 4
4 2    <<- null locus, number of alleles
0.97000 0.03000
4 2    <<- null locus, number of alleles
0.99000 0.01000
1    <<- number of liability classes
0.000 0.000 0.000
0.000 0.000 0.000
0.000 0.000 1.000
3 4   << Allele numbers, Number of alleles (marker 1)
0.25000 0.25000 0.25000 0.25000
3 3   << Allele numbers, Number of alleles (marker 2)
0.25000 0.25000 0.50000
0 0  << Sex Difference, Interference (If 1 or 2)
0.0001 0.50 0.0 << Recombination Values
1 0.10 0.45 << Rec. varied, Increment, Finishing value
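The last line of a TMLINK datafile names the recombination fraction to be varied, the increment, and the finishing value. The Python sketch below lists the values that would be visited, assuming MLINK-style stepping (start at the initial value on the "Recombination Values" line and add the increment while the finishing value is not exceeded). That stepping rule is an assumption made for this illustration and should be checked against the program output; the function name is invented.

```python
def theta_grid(start, increment, finish):
    """Recombination fractions from start to finish in steps of increment."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    values = []
    k = 0
    # Multiply instead of accumulating to limit floating-point drift;
    # the small tolerance keeps a finishing value that is hit exactly.
    while start + k * increment <= finish + 1e-9:
        values.append(round(start + k * increment, 6))
        k += 1
    return values


# First sample datafile: interval 2 varied, starting at 0.0.
print(theta_grid(0.0, 0.10, 0.40))
# Second sample datafile: interval 1 varied, starting at 0.0001.
print(theta_grid(0.0001, 0.10, 0.45))
```

In the second datafile the middle interval (between the two disease loci) stays at 0.50, which expresses that the two disease loci are unlinked.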

PROGRAM EXTENSIONS

The basic version of TLINKAGE described and available for downloading here is relatively slow. Several extensions have been published (Dietter et al. 2004; Wu and Shete 2005; Shete and Zhou 2006) and should perform well, although I have not personally tried them.

LITERATURE

Dietter J, Spiegel A, an Mey D, Pflug HJ, Al-Kateb H, Hoffmann K, Wienker TF, Strauch K (2004) Efficient two-trait-locus linkage analysis through program optimization and parallelization: application to hypercholesterolemia. Eur J Hum Genet 12, 542-550

Lathrop GM, Ott J (1990) Analysis of complex diseases under oligogenic models and intrafamilial heterogeneity by the LINKAGE programs. Am J Hum Genet 47, A188 (abstr)

Risch N (1990) Linkage strategies for genetically complex traits. I. Multilocus models. Am J Hum Genet 46, 222-228

Schork NJ, Boehnke M, Terwilliger JD, Ott J (1993) Two trait locus linkage analysis: a powerful strategy for mapping complex genetic traits. Am J Hum Genet 53, 1127-1136

Shete S, Zhou X (2006) TLINKAGE-IMPRINT: a model-based approach to performing two-locus genetic imprinting analysis. Hum Hered 62, 145-156

Wu CC, Shete S (2005) Analysis of genes for alcoholism using two-disease-locus models. BMC Genet 6 (Suppl 1), S149