**** Parameter List for SEQSEE ****

Users should feel free to copy this file to their own directory and
make any changes they feel appropriate. Parameter entries are preceded
by 2 consecutive angle brackets, the order of the parameters must be
maintained! Comments and blank lines can be placed anywhere.

PARAMETER FILE VERSION:
>> 1.4

****************************************************************
Id code for main SEQSEE driver.
>> SEQSEE_V1.4

Programs that the SEQSEE driver will be calling
>> /usr/local/bin/seqhelp
>> /usr/local/bin/seqed
>> /usr/local/bin/seqret
>> /usr/local/bin/stats
>> /usr/local/bin/alexis
>> /usr/local/bin/seqsearch
>> /usr/local/bin/fleqsee
>> /usr/local/bin/moment
>> /usr/local/bin/hydro
>> /usr/local/bin/fast_align
>> /usr/local/bin/sb_align
>> /usr/local/bin/nw_align
>> /usr/local/bin/mult_align
>> /usr/local/bin/psearch
>> /usr/local/bin/hsearch
>> /usr/local/bin/dotplot
>> /usr/local/bin/refscan
>> /usr/local/bin/browse

Automatically enter editor when results found (1=yes,0=no).
>> 1

****************************************************************
Id code for seqret. Do not change this line.
>> SEQRET

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Update output file every 'x' proteins which are processed.
>> 1000

****************************************************************
Id code for stats function. Do not change this line.
>> STATS

Location of SEQBANK database
>> /usr/local/lib/seqsee/databases/SEQBANK.db

hydrophobicity table
>> /usr/local/lib/seqsee/tables/kyte.parms

These thresholds are dependent on the hydrophobicity table used.
>> 0.10     hydrophobic proteins threshold
>> -6.00    hydrophilic proteins threshold
>> 0.85     protein insoluble threshold
>> 1.90     protein generally does not fold threshold
>> 0.77     protein insoluble threshold
>> 1.43     protein generally does not fold threshold

Hydrophobic Amino Acids
>> ACFGHILMVWY    one letter codes
>> 52.44          average percent of these amino acids in a protein

Hydrophilic Amino Acids
>> DEKNPQRST      one letter codes
>> 47.56          average percent of these amino acids in a protein

molecular weight table
>> /usr/local/lib/seqsee/tables/mol.weights

molecular volume table
>> /usr/local/lib/seqsee/tables/mol.volume

molecular surface area table
>> /usr/local/lib/seqsee/tables/mol.surfarea

molecular partial specific volume table
>> /usr/local/lib/seqsee/tables/mol.parspecvol

molecular polar, nonpolar, surface area table
>> /usr/local/lib/seqsee/tables/mol.asa

molecular fraction buried
>> /usr/local/lib/seqsee/tables/mol.fracbur

fraction of amino acids buried
>> /usr/local/lib/seqsee/tables/fracbur.parms

****************************************************************
Function ID code for alexis
>> ALEXIS

Location of programs alexis will be running
>> /usr/local/bin/a_membrane
>> /usr/local/bin/a_motif
>> /usr/local/bin/a_homol
>> /usr/local/bin/a_moment
>> /usr/local/bin/a_gor
>> /usr/local/bin/a_cfas

Correlation tables for predicting protein-structural classes
>> /usr/local/lib/seqsee/tables/alexis.norm    (for most sequences)
>> /usr/local/lib/seqsee/tables/alexis.cys     (for heavy cys sequences)

Remove intermediate results files?
>> 1    (1=yes, 0=no)

****************************************************************
Identification code for the following set of parameters.
Do not change this line.
>> A_MEMBRANE

Location of membrane spanning hydrophobicity parms
>> /usr/local/lib/seqsee/tables/membrane.parms

Nature of membrane spanning test (scaling constants)
>> -9.02 170.00 14.27

****************************************************************
Identification code for the following set of parameters.
Do not change this line.
>> A_HOMOL

Enter the location of the seqbank database.
>> /usr/local/lib/seqsee/databases/SEQBANK.db

Tell program the location of the similarity scoring matrix.
>> /usr/local/lib/seqsee/tables/wt.rbo

Homologous segments must have a certain minimum test stat before
the secondary structure they represent is counted.
>> 3.20

Improve prediction by weighting of scores because of unequal
representation of secondary structure in the database.
>> 1.000    /* betastrand represents 28% of seqbank */
>> 0.820    /* coil represents almost 37% of seqbank */
>> 0.780    /* helix represents almost 35% of seqbank */

Offset and multiplier needed to normalize prediction scores to
mean=1000 and stddev=200.
>> 494.00 0.89

Improve prediction by applying smoothing function
>> 1    /* number of times to apply smoothing function */

Improve prediction by biasing random coils at sequence ends
>> 1    /* 1=yes, 0=no */

Improve prediction by class weighting.
>> 1       /* 1=yes, 0=no */
>> 1.10    /* beta Scores */
>> 1.30    /* helix Scores */

Improve prediction by smoothing the predicted structure
>> 1    /* 1=yes, 0=no */

****************************************************************
Identification code for the following set of parameters.
Do not change this line.
>> A_MOMENT

Tell program the location of the Chou-Fasman parameters.
>> /usr/local/lib/seqsee/tables/moment.cfas

Tell program the location of the hydrophobicity parms which are
biased for BetaStrands.
>> /usr/local/lib/seqsee/tables/moment.bhydro

Tell program the location of the hydrophobicity parms which are
biased for Helices.
>> /usr/local/lib/seqsee/tables/moment.hhydro

Beta Strand Prediction Parameters
>> 7                /* window size */
>> 1 2 3 4 3 2 1    /* cfas weighting factors */
>> 2                /* number of periodicity tests */
>> 160 180          /* periodicity angles */

Coil Prediction Parameters
>> 5            /* window size */
>> 2 3 4 3 2    /* cfas weighting factors */

Helix Prediction Parameters
>> 11                       /* window size */
>> 2 3 3 3 3 3 3 3 3 3 2    /* cfas weighting factors */
>> 2                        /* number of periodicity tests */
>> 100 110                  /* periodicity angles */

Multiplier and offset needed to normalize prediction scores to
mean=1000 and stddev=200
>> 831.00 13.30

Improve prediction by applying smoothing function
>> 1    /* number of times to apply smoothing function */

Improve prediction by biasing random coils at sequence ends
>> 1    /* 1=yes, 0=no */

Improve prediction by class weighting
>> 1       /* 1=yes, 0=no */
>> 0.95    /* beta Scores */
>> 1.05    /* helix Scores */

Improve prediction by smoothing the predicted structure
>> 1    /* 1=yes, 0=no */

****************************************************************
Identification code for the following set of parameters.
Do not change this line.
>> A_GOR

Tell program the location of the GOR parms
>> /usr/local/lib/seqsee/tables/gor.data

Offset and multiplier needed to normalize prediction scores to
mean=1000 and stddev=200.
>> 966.00 13.10

Improve prediction by applying smoothing function
>> 0    /* number of times to apply smoothing function */

Improve prediction by biasing random coils at sequence ends
>> 1    /* 1=yes, 0=no */

Improve prediction by class weighting
>> 1       /* 1=yes, 0=no */
>> 1.08    /* beta Scores */
>> 1.16    /* helix Scores */

Improve prediction by smoothing the predicted structure
>> 1    /* 1=yes, 0=no */

****************************************************************
Identification code for the following set of parameters.
Do not change this line.
>> A_CFAS

Tell program the location of the weighting parameters
See the default listed here to understand the input format.
>> /usr/local/lib/seqsee/tables/cfas.data

BetaStrand window size
>> 7
Weighting factors within this window for BetaStrand
>> 1 2 3 4 3 2 1

Coil Window Size
>> 5
Weighting factors within this window for Coil
>> 1 2 3 2 1

Helix Window Size
>> 9
Weighting factors within this window for Helix
>> 1 2 3 4 5 4 3 2 1

Offset and multiplier needed to normalize prediction scores to
mean=1000 and stddev=200.
>> 953.00 13.50

Improve prediction by applying smoothing function
>> 1    /* number of times to apply smoothing function */

Improve prediction by biasing random coils at sequence ends
>> 1    /* 1=yes, 0=no */

Improve prediction by class weighting
>> 1       /* 1=yes, 0=no */
>> 1.02    /* beta Scores */
>> 1.00    /* helix Scores */

Improve prediction by smoothing the predicted structure
>> 1    /* 1=yes, 0=no */

****************************************************************
Function ID code for motif searching program (motifs from literature)
>> LIT_MOTIF

Location of motifs databases
>> /usr/local/lib/seqsee/databases/seqmotif1.db

Printing Parameters
>> 100    Print stats summary every 'x' motifs processed
>> 0      Print individual motifs which match (1=yes, 0=no)

****************************************************************
Function ID code for motif searching program (computer generated dbase)
>> COMP_MOTIF

Location of motifs databases
>> /usr/local/lib/seqsee/databases/seqmotif2.db

Printing Parameters
>> 100    Print stats summary every 'x' motifs processed
>> 0      Print individual motifs which match (1=yes, 0=no)

****************************************************************
Id code for seqsearch function. Do not change this line.
>> SEQSEARCH

Number of SEQSITE databases
>> 3

Location of SEQSITE databases
>> /usr/local/lib/seqsee/databases/SEQSITE.db    (general sequence motifs)
>> /usr/local/lib/seqsee/databases/PHOSITE.db    (general phosphorylation sites)
>> /usr/local/lib/seqsee/databases/EPISITE.db    (antigenic sites)

****************************************************************
Function ID code. Do not change this line.
>> FLEQSEE

Type of output, 0 = weighted scores, 1 = raw scores
>> 1

Location of flexibility parameters
>> /usr/local/lib/seqsee/tables/fleqsee.parms

Manipulating Flexibility Scores
>> 7                Window size
>> 1 2 3 4 3 2 1    Weighting constants based on window size

****************************************************************
Function ID code. Do not change this line.
>> MOMENT

Type of output, 0 = weighted scores, 1 = raw scores
>> 1

Location of hydrophobicity parameters (hmom.* files)
>> /usr/local/lib/seqsee/tables/hmom.cornet

Nature of periodicity tests
>> 8          number of tests
>> 0 5 0      type(0=beta, 1=coil, 2=helix), window size, periodicity angle
>> 0 5 160
>> 0 5 170
>> 0 5 180
>> 2 9 90
>> 2 9 100
>> 2 9 110
>> 2 9 120

smoothing function to be applied 'x' times
>> 2

****************************************************************
Function ID code. Do not change this line.
>> HYDRO

Type of output, 0 = weighted scores, 1 = raw scores
>> 1

Location of hydrophobicity parameters (hphob.* files)
>> /usr/local/lib/seqsee/tables/hphob.kyte

Manipulating hydrophobicity scores
>> 7                Window size
>> 1 2 3 4 3 2 1    Weighting constants based on window size

****************************************************************
Function ID code. Do not change this line.
>> FAST_ALIGN

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Tell program the location of the similarity scoring matrix.
>> /usr/local/lib/seqsee/tables/wt.align

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Cut-off score for similar tuples. Note that this score depends on
the matrix selected above. For example, in the matrix 'wt.align',
FYE is similar to YFN if the cutoff score is 50 or less.
>> 48

Update output file every 'x' proteins which are processed.
>> 1000

Penalize the alignment score 'x' points every time a gap needs to be
introduced. The value of 'x' depends on the similarity scoring
matrix, a typical value being the 3rd or 4th highest number in the
matrix.
>> 20

Penalize the alignment score 'x' points for each entry in the gap.
This will keep the gap from getting too large.
>> 5

****************************************************************
Id code for exhaustive alignment on SEQBANK database.
>> SB_ALIGN

Enter the location of the SEQBANK database.
>> /usr/local/lib/seqsee/databases/SEQBANK.db

Tell program the location of the similarity scoring matrix.
See the default listed here to understand the input format.
>> /usr/local/lib/seqsee/tables/wt.rbo

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Random number seed used to jumble sequences.
>> 13791

sorting alignment scores
   0 = sort by raw score (tends to overlook smaller sequences)
   1 = sort by raw score / sequence len (fast, generally more accurate)
   2 = sort by jumbling (very slow but most accurate)
>> 1

These parameters are only used if sort by jumbling option chosen.
Number of jumbles based on current test stat. (6 entries only!)
(e.g., if after 18 jumbles the test stat exceeds 2 std dev, keep going).
   jumbles   std dev
>> 3         0.00
>> 8         1.00
>> 18        2.00
>> 50        3.00
>> 150       4.00
>> 500       9999.00   (this tstat value is ignored here)

Update output file every 'x' proteins processed.
>> 10

Penalize the alignment score 'x' points every time a gap needs to be
introduced. The value of 'x' depends on the similarity scoring
matrix, a typical value being the 3rd or 4th highest number in the
matrix.
>> 10

Penalize the alignment score 'x' points for each entry in the gap.
This will keep the gap from getting too large.
>> 2

Penalty for a gap within a random coil region
>> 0

Penalty for a gap at the end of a helix or beta strand structure
>> 1

Penalty for a gap in the middle of a helix or beta strand structure
>> 4

****************************************************************
Identification code for the following set of parameters.
>> NW_ALIGN

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Tell program the location of the similarity scoring matrix.
Matrices such as Dayhoff can be used.
See the default listed here to understand the input format.
>> /usr/local/lib/seqsee/tables/wt.rbo

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Random number seed used to jumble sequences.
>> 13791

sorting alignment scores
   0 = sort by raw score (tends to overlook smaller sequences)
   1 = sort by raw score / sequence len (fast, generally more accurate)
   2 = sort by jumbling (very slow but most accurate)
>> 1

These parameters are only used if sort by jumbling option chosen.
Number of jumbles based on current test stat. (6 entries only!)
(e.g., if after 18 jumbles the test stat exceeds 2 std dev, keep going).
   jumbles   std dev
>> 3         0.00
>> 8         1.00
>> 18        2.00
>> 50        3.00
>> 150       4.00
>> 500       9999.00   (this tstat value is ignored here)

Update output file every 'x' proteins processed.
>> 50

Penalize the alignment score 'x' points every time a gap needs to be
introduced. The value of 'x' depends on the similarity scoring
matrix, a typical value being the 3rd or 4th highest number in the
matrix.
>> 10

Penalize the alignment score 'x' points for each entry in the gap.
This will keep the gap from getting too large.
>> 2

****************************************************************
ID function code. Do not change this line.
>> MULT_ALIGN

Tell program the location of the similarity scoring matrix.
>> /usr/local/lib/seqsee/tables/wt.rbo

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Random number seed used to jumble sequences.
>> 13791

sorting alignment scores
   0 = sort by raw score (tends to overlook smaller sequences)
   1 = sort by raw score / sequence len (fast, generally more accurate)
   2 = sort by jumbling (very slow but most accurate)
>> 0

These parameters are only used if sort by jumbling option chosen.
Number of jumbles based on current test stat. (6 entries only!)
(e.g., if after 18 jumbles the test stat exceeds 2 std dev, keep going).
   jumbles   std dev
>> 3         0.00
>> 8         1.00
>> 18        2.00
>> 18        3.00
>> 18        4.00
>> 18        9999.00   (this tstat value is ignored here)

Print pairwise alignments? (1=yes, 0=no)
>> 0

Consensus percent - Print the amino acid in the consensus sequence
if it is found above the consensus percent threshold.
>> 70

Penalize the alignment score 'x' points every time a gap needs to be
introduced. The value of 'x' depends on the similarity scoring
matrix, a typical value being the 3rd or 4th highest number in the
matrix.
>> 10

Penalize the alignment score 'x' points for each entry in the gap.
This will keep the gap from getting too large.
>> 2

****************************************************************
Identification code for this function. Do not change this line.
>> PSEARCH

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Location of structurally determined database.
>> /usr/local/lib/seqsee/databases/SEQBANK.db

Allow multiple matches for a search string in a sequence
>> 0    1 = yes, 0 = no

Update output file every 'x' proteins which are processed.
>> 200

****************************************************************
Identification code for this function. Do not change this line.
>> HSEARCH

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Location of structurally determined database.
>> /usr/local/lib/seqsee/databases/SEQBANK.db

Tell program the location of the similarity scoring matrix.
Matrices such as Dayhoff can be used.
>> /usr/local/lib/seqsee/tables/wt.align

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Update output file every 'x' proteins which are processed.
>> 200

****************************************************************
Function ID code. Do not change this line.
>> DOTPLOT

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of each sequence database file
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Tell program the location of the similarity scoring matrix.
See the default listed here to understand the input format.
>> /usr/local/lib/seqsee/tables/wt.align

What minimum value from the similarity scoring matrix would
constitute a near match?
>> 5

Length Penalty Value: subtract x*lenPenalty from our score
where 'x' is the number of amino acids.
>> 5

Threshold Score (homologous segments must score above)
>> 80

msearchFlag - Does multiple scans down diagonals
Only turn this flag on if database is small. (0 = off, 1 = on)
>> 0

Update output file every 'x' proteins which are processed.
>> 200

****************************************************************
Identification code for this function. Do not change this line.
>> REFSCAN

What format is the sequence database?
>> 2      1 = SWISS-PROT, 2 = PIR,
          3 = SWISS_PROT (intelligenetics version),
          4 = PIR (intelligenetics version)
>> 3      Number of files that compose the database

Location of files in references database
>> /usr/local/databases/pir/PIR_1.ADB
>> /usr/local/databases/pir/PIR_2.ADB
>> /usr/local/databases/pir/PIR_3.ADB

Update output file every 'x' proteins which are processed.
>> 1000

****************************************************************
Identification code for browse function. Do not change this line.
>> BROWSE

Location of SEQBANK database
>> /usr/local/lib/seqsee/databases/SEQBANK.db

Location of PIRSEE databases (PIR Titles + ID codes)
>> /usr/local/lib/seqsee/databases/PIRSEE.db

Location of SWISSEE databases (SWISS_PROT Titles + ID codes)
>> /usr/local/lib/seqsee/databases/SWISSEE.db

Location of Default Parameters file for SEQSEE
>> /usr/local/lib/seqsee/seqsee.parms