Jurg Ott, Josephine Hoh / 10 Oct 2012
ott@rockefeller.edu 

User's guide to the p53MH program

For each site in the DNA sequence of a gene, this program computes a score between 0 and 100. A high score indicates an increased probability that the n nucleotides starting at that site represent a p53 binding site, where n = 2 × 10 + (selected length of spacer region). The algorithm is based on our published description [1]. The program is written in Free Pascal (an extension of Turbo Pascal) and is available for Windows and Linux. A slightly modified version of our algorithm has more recently been developed for p63 binding sites [2].
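The window length n can be illustrated with a short Python sketch. This is not part of the p53MH program (which is written in Free Pascal) and it does not compute any score; it only shows how n follows from the spacer length and which stretches of a sequence would be examined. The function names and the 0-based start positions are assumptions made for illustration.

```python
SEGMENT_LENGTH = 10  # the "10" in n = 2 * 10 + spacer length


def window_length(spacer_length: int) -> int:
    """Return n, the number of nucleotides examined at each site."""
    if spacer_length < 0:
        raise ValueError("spacer length cannot be negative")
    return 2 * SEGMENT_LENGTH + spacer_length


def candidate_windows(sequence: str, spacer_length: int):
    """Yield (start, window) for every site with n nucleotides available.

    Start positions are 0-based here; how p53MH numbers sites is not
    specified in this guide.
    """
    n = window_length(spacer_length)
    for start in range(len(sequence) - n + 1):
        yield start, sequence[start:start + n]
```

For example, a spacer length of 3 gives window_length(3) == 23, and a sequence shorter than n yields no windows at all.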

Input files

1. Gene names

This file, p53names.dat, contains a list of file names that should be processed. Each file (gene or other sequence) will be analyzed by the algorithm. The program will read names and process the corresponding files one after the other until it encounters a blank line (or end of file) in the p53names.dat file. Text below such an empty line will be ignored.
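The reading rule described above can be sketched in Python as follows. This is only an illustration of the stated behavior, not the program's own code. It assumes one file name per line and treats a line containing only whitespace as blank; the function name read_names is hypothetical.

```python
def read_names(path="p53names.dat"):
    """Collect file names up to the first blank line or end of file."""
    names = []
    with open(path) as handle:
        for line in handle:
            name = line.strip()
            if not name:
                # A blank line ends the list; anything below it is ignored.
                break
            names.append(name)
    return names
```

A file listing mdm2.dat and waf1.txt, followed by a blank line and further text, would yield only those two names.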

2. Random seed

For the random number generator, the user must prepare a file called seed.txt containing a negative integer, the random number seed. At the end of each run, the p53MH program overwrites this file with its ending seed, which then serves as the starting seed for the next run.
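A seed file can be prepared with any text editor. As a minimal sketch, the Python helpers below write and check such a file. They assume that the file holds the integer as its first token; the exact layout p53MH accepts is not specified in this guide, and the helper names are hypothetical.

```python
def write_seed(seed: int, path="seed.txt"):
    """Write a negative integer seed to the seed file."""
    if seed >= 0:
        raise ValueError("the seed must be a negative integer")
    with open(path, "w") as handle:
        handle.write(f"{seed}\n")


def read_seed(path="seed.txt") -> int:
    """Read the seed back and confirm that it is negative."""
    with open(path) as handle:
        seed = int(handle.read().split()[0])
    if seed >= 0:
        raise ValueError("the seed must be a negative integer")
    return seed
```

Calling write_seed(-12345) before the first run would create a suitable starting file under these assumptions.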

3. Gene file(s)

There must be one such file for each name listed in the p53names.dat file. Each file may contain any text on the first few lines but must have a blank line before the actual sequence starts. Examples: mdm2.dat, waf1.txt.
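The layout of a gene file can be checked before a run with a sketch like the one below. It is not the program's own reader. It assumes that everything after the first blank line belongs to the sequence and that whitespace inside the sequence can be dropped; how p53MH itself treats spaces, digits, or later blank lines is not stated here. The function name read_sequence is hypothetical.

```python
def read_sequence(path):
    """Return the sequence that follows the first blank line of a gene file."""
    with open(path) as handle:
        lines = handle.read().splitlines()
    for index, line in enumerate(lines):
        if not line.strip():
            break  # first blank line found; the sequence starts after it
    else:
        raise ValueError("no blank line found before the sequence")
    # Join the remaining lines and drop any whitespace within them.
    return "".join("".join(line.split()) for line in lines[index + 1:])
```

A file without any blank line raises an error, which flags the most likely formatting mistake.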

4. Parameter file

This file, p53param.dat, contains various parameters that control analysis features. After the actual parameter values, brief explanations are listed in the sample file. More detailed explanations are as follows (each input line may contain one or more parameters):

Running the p53MH program

Windows

There are two options:
  1. Traditional approach. Open a command window (“DOS box”) and change directories until you are in the directory (folder) where the p53MH program and its files reside. Then enter the command p53MH.
  2. Double click on the p53MH (p53MH.exe) file. The program will then execute and show progress on screen. Once it finishes, the window closes automatically. Alternatively, to execute the program and save screen output to a file, double click on the run (run.bat) file. After the program finishes, screen output may be viewed by inspecting the screen.out file.
The program currently checks array bounds and may therefore be somewhat slow. Once all bugs have been eliminated, array bounds checking will be disabled.

Linux

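This is analogous to option 1 above: open a terminal, change to the directory where the p53MH program and its files reside, and run the program from there.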
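On either platform, screen output can also be saved to a file from a script. The Python sketch below uses the standard subprocess module to run a command and write its output to screen.out, the file name used by run.bat. It is an illustration only; the contents of run.bat are not reproduced here, and the function name run_and_capture is hypothetical.

```python
import subprocess


def run_and_capture(command, log_path="screen.out"):
    """Run a command and save its screen output to a log file.

    Returns the exit code reported by the command.
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # keep error messages in the same file
        )
    return completed.returncode
```

Assuming the executable sits in the current directory, the call would be run_and_capture(["./p53MH"]) on Linux or run_and_capture(["p53MH.exe"]) on Windows.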

Output files

The following output files will be written by the program (some of them only for certain parameter settings):
  1. p53res.out: Detailed output.
  2. p53ord.out: Essentially the same output as p53res.out, but in a format that is easy to import into a spreadsheet.
  3. p53psum.out: Written only if computer simulation is requested (input line 2); contains p-values for sums of order statistics for the scores.
  4. p53pmin.out: Analogous output for the minimum p-value for sums of order statistics (one result per gene).

References

[1] Hoh J, Jin S, Parrado T, Edington J, Levine AJ, Ott J (2002) The p53MH algorithm and its application in detecting p53-responsive genes. Proc Natl Acad Sci USA 99, 8467-8472

[2] Perez CA, Ott J, Mays DJ, Pietenpol JA (2007) p63 consensus DNA-binding site: identification, analysis and application into a p63MH algorithm. Oncogene 26, 7363-7370