Shared Genomic Segments

Jurg Ott / 25 Jan 2020
Rockefeller University, New York

This set of programs estimates the probability (“p-value”) for multiple individuals or families to share an estimated disease position (Horpaopan et al 2020, to be submitted). The approach is implemented in the sharedSNP program, which is available for Windows and Linux. The program reads its settings from a parameter file. An example file, sharedsnp.param, is as follows:

61          numfam          Line 1
9999        numperm         Line 2
100         distlimit, kb   Line 3
2           m               Line 4
RR61.het    het file        Line 5
RR          prefix          Line 6
ShSnpRR100  output ID       Line 7

=== Parameter file for SharedSNP program ===

A family/individual is "sharing" a SNP if its position is within distlimit (kb) of any of the Hmax positions.
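
The sharing rule above can be illustrated with a minimal Python sketch. This is not the sharedSNP source code: the function name is_sharing is hypothetical, and the sketch assumes that positions are given in base pairs on the same chromosome and that "within" includes the boundary.

```python
def is_sharing(snp_pos_bp, hmax_positions_bp, distlimit_kb):
    """Return True if the SNP lies within distlimit_kb of any Hmax position.

    Assumptions (not taken from sharedSNP itself): positions are in base
    pairs on one chromosome, and a distance exactly equal to the limit
    still counts as sharing.
    """
    limit_bp = distlimit_kb * 1000  # convert kb to bp
    return any(abs(snp_pos_bp - h) <= limit_bp for h in hmax_positions_bp)


# With distlimit = 100 kb, a SNP 50 kb from an Hmax position is shared,
# while one 200 kb away from the nearest Hmax position is not.
near = is_sharing(1_050_000, [1_000_000], 100)
far = is_sharing(1_200_000, [1_000_000, 5_000_000], 100)
```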

Line 1
Number of family or individual files to read. File names begin with the prefix given on line 6. These files must be generated with the PH program. They have names like RR1.negpos.res, RR2.negpos.res, and so on.

Line 2
Number of permutations; may be zero.

Line 3
distlimit, in kb (see above)

Line 4
m > 0 to print markers with Nf >= m (m = 2 recommended)
m = 0 for no output

Line 5
Full name of het file; holds all variants

Line 6
Prefix for input file names, preceding the individual number. For example, for RR1.negpos.res, enter RR.
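
As an illustration of this naming scheme, the following Python sketch lists the input file names expected for a given prefix and number of files. The function name input_file_names is hypothetical, and the sketch assumes that files are numbered consecutively from 1 up to the number of files given on line 1.

```python
def input_file_names(prefix, numfam):
    """List the expected input file names, e.g. RR1.negpos.res, RR2.negpos.res.

    Assumption: individual numbers run consecutively from 1 to numfam.
    """
    return [f"{prefix}{i}.negpos.res" for i in range(1, numfam + 1)]


# For the example parameter file (prefix RR, 61 files):
names = input_file_names("RR", 61)
```

Checking that each of these files exists before starting a long permutation run can catch a mistyped prefix early.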

Line 7
Output file identifier. For example, ShSnpRR100 identifies output files generated by sharedSNP with distlimit = 100 kb. Any identifier will do, but including information such as the distlimit value makes the files easier to find.
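
For users who want to check a parameter file before running sharedSNP, the seven lines described above can be read with a short Python sketch. It is not part of the sharedSNP distribution. The names SharedSnpParams and parse_params are hypothetical, and the sketch assumes, based on the example file only, that the value is the first whitespace-delimited item on each line and that the remainder of the line is a comment.

```python
from typing import NamedTuple


class SharedSnpParams(NamedTuple):
    numfam: int          # line 1: number of family/individual files
    numperm: int         # line 2: number of permutations (may be zero)
    distlimit_kb: float  # line 3: distance limit in kb
    m: int               # line 4: print markers with Nf >= m (0 = no output)
    het_file: str        # line 5: full name of het file
    prefix: str          # line 6: prefix for input file names
    output_id: str       # line 7: output file identifier


def parse_params(text):
    """Parse parameter file contents (assumed layout: value first on each line)."""
    values = [line.split()[0] for line in text.splitlines() if line.strip()]
    if len(values) < 7:
        raise ValueError("expected 7 parameter lines, found %d" % len(values))
    return SharedSnpParams(
        int(values[0]), int(values[1]), float(values[2]), int(values[3]),
        values[4], values[5], values[6],
    )


example = """61 numfam Line 1
9999 numperm Line 2
100 distlimit, kb Line 3
2 m Line 4
RR61.het het file Line 5
RR prefix Line 6
ShSnpRR100 output ID Line 7
"""
params = parse_params(example)
```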

A utility program, sigruns, can find runs of variants shared by Nf or more individuals.

More detailed information is forthcoming.