PENCE / CIHR-Group
Joint Software Centre

Funding for this software has been provided in part by the
Canadian Institutes of Health Research (CIHR Group)
and the
Protein Engineering Networks of Centres of Excellence (PENCE).


gsc - NMR Chemical Shift Comparison

Version: 1.2 - Dec 2002

Purpose: gsc is a Tcl/Tk program that makes it easy to compare the chemical shifts of selected atoms between chemical shift files.

Table of Contents

  1. Introduction

  2. Preparations

  3. Running the program

Overview

Often a researcher will have two or more similar sequences and will want to know how similar the proteins are in terms of chemical shift. Are the chemical shifts more similar in certain regions than in others? Can shift patterns be detected in certain atoms? Perhaps the user has a set of predicted chemical shifts for a sequence and wants to compare these to a different or true set of shifts. These are the types of questions that gsc tries to address.

The user enters shift data for sequences which are identical or highly homologous. The user then selects the atom type (e.g., hn, ha, ca, etc.) and the region of the sequences to study. The program draws a line graph depicting the difference in shifts between the sequences. The user can then save the plot to a PostScript file.

Copyright and Acknowledgements

Wolfram Gronwald, R. Boyko, and B.D. Sykes. GSC: a graphical program for NMR chemical shifts comparison. CABIOS Vol. 13 no. 5, 557-558 (1997)

Copyright (C) 1999 - No portion of this program may be incorporated into other programs or sold for profit without express written consent of the authors.

Download

Select the version of gsc corresponding to your operating system.

PC(Linux): gsc v1.2 (0.91 MB)
Solaris: gsc v1.2 (1.04 MB)
SGI(Irix6.5): gsc v1.2 (1.50 MB)

Installation

Once you have downloaded the software, uncompress and untar the files. For example:

	> uncompress gsc-v1.2-sgi6.tar.Z
	> tar xvf gsc-v1.2-sgi6.tar
	> cd gsc-v1.2-sgi6

Look at the README file for details on installation.

	> more README

Installation is simple: you only need to know where you want to put the executables and where to put the documentation, library and example files. The installation script prompts you for the names of these directories.

	> ./Install

Finally, you can test the program by going to the directory where it is installed and typing its name. The README file also explains how to set your path environment variable to include the location of the executable.

Data files for the example displayed in the screen snapshot are found in the INSTALL_LIB/gsc/examples directory.

Input Data Files

The gsc program can only read shift files which are in PPM format.

gsc does not verify amino acid codes or atom names. However, amino acid ID numbers must be ordered from lowest to highest. Only shifts which have the same shift name can be compared! Please note the implications of this statement:

  1. One cannot directly compare the shifts of two sequences when amino acid substitutions occur. Figuring out the proper shift offset in such a comparison will be left to a future version of gsc.

  2. The user may need to renumber the amino acids in one of the shift files. One cannot compare "1:LYS_65:HA" in one shift file to "1:LYS_1:HA" in another file even though this was the obvious intention. Use the renum_ppm program in the "camra" package to do this.

  3. Atom names in whatever specification format used must be consistent between the shift files. For example, you cannot compare "1:LYS_1:HA" and "1:LYS_1:HA1".
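The renumbering mentioned in item 2 amounts to adding a constant offset to every amino acid ID number. The following Python sketch illustrates the idea only; it is not the renum_ppm program from the "camra" package, and the function name renumber_shift_name is hypothetical.

```python
import re

# Shift names have the form molNum:Residue_ResId:atom, e.g. 1:LYS_65:HA.
SHIFT_NAME = re.compile(r"^(\d+):([A-Za-z]+)_(-?\d+):(\S+)$")

def renumber_shift_name(name, offset):
    """Return the shift name with its amino acid ID number shifted by offset."""
    m = SHIFT_NAME.match(name)
    if m is None:
        raise ValueError(f"not a PPM shift name: {name!r}")
    mol, residue, res_id, atom = m.groups()
    return f"{mol}:{residue}_{int(res_id) + offset}:{atom}"

# An offset of -64 turns residue 65 into residue 1.
print(renumber_shift_name("1:LYS_65:HA", -64))  # prints 1:LYS_1:HA
```

Applying the same offset to every data line of one shift file makes its shift names line up with those of the other file.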

As a final note, it is possible to input shift data where the name and value fields are not necessarily the first and second fields respectively. The user can set the fields via the "Shift Data Format" FILE menu option. One particular application of this is for a program called orb which has valid shift values of interest in the 4th field as well.
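To illustrate what choosing the name and value fields means, here is a minimal Python sketch of a reader with configurable field positions. It is an assumption about the behaviour, not gsc's own code; the function name read_shifts, its parameters, and the sample line (including its values) are invented for the example and do not reproduce real orb output.

```python
def read_shifts(lines, name_field=1, value_field=2):
    """Map shift name -> value (None if unknown). Field numbers are 1-based."""
    shifts = {}
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) < max(name_field, value_field):
            continue  # line is too short to hold both fields
        name = fields[name_field - 1]
        text = fields[value_field - 1]
        try:
            value = float(text)
        except ValueError:
            value = None  # e.g. asterisks marking an unknown shift
        shifts[name] = value
    return shifts

# Hypothetical line with a second shift value in the 4th field.
sample = ["! comment", "1:LYS_1:HA  4.32  0.0  4.36"]
print(read_shifts(sample))                 # {'1:LYS_1:HA': 4.32}
print(read_shifts(sample, value_field=4))  # {'1:LYS_1:HA': 4.36}
```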

Example PPM shift files should be available in INSTALL_DIR/lib/gsc/examples.


Basic Program Usage

  1. Start the program by typing 'gsc'.

    If you do not get a graphical window, check with your system administrator to make sure the program has been installed and is accessible to you. If you customize the colors, fonts, size, etc of the window, these changes will be saved in your $HOME/.gsc file.

  2. Enter the properly formatted shift file that you wish to study in the "Shift Data File" field and press return.

    Once the shift data file is entered, gsc will take a few seconds to read the shifts. It will display the sequence IDs of the first and last shifts read in the "xMin" and "xMax" fields respectively.

  3. Next enter shift files for comparison in the "Compare File 1" or "Compare File 2" fields.

    The user can compare the shifts in the above data file to one or two other sets of shift data.

  4. (optional) Change the "xMin" or "xMax" fields.

    This option is useful if you want to compare amino acids in a specific range of your sequence. Remember to press return after changing a number.

  5. Select the atom you wish to study from the "Atoms" menu.

    Click and hold the mouse button, then move the cursor to highlight the atom of choice.

  6. Click the calculate button.

    The program calculates the shift differences. Understanding and saving the output is dealt with in the next two sections.


Understanding the Output

  1. A line chart is drawn for each comparison file entered. The first line chart is represented by rectangles and the second by ovals.

  2. Each rectangle/oval represents a matched amino acid shift difference. A match only occurs when:

    • amino acid ID numbers correspond (see renum_ppm)
    • amino acids are identical (e.g., no substitutions)
    • both shift values are available

    A line connecting two ovals or rectangles indicates that the two amino acids have consecutive ID numbers.

  3. If an amino acid has more than one matching atom, then the averaged difference is plotted. An example of this would be glycine where HA1 and HA2 atoms can be matched between shift files.

  4. At the bottom of the graph(s) the program prints the average error and correlation coefficient of all matched shift differences.

  5. The "yMax" field determines the maximum and minimum numeric labelling on the y-axis. If "yMax" is set to 0, then the program will calculate a suitable value based on the shift differences it finds. A "yMax" greater than 0 means the program will not change the scale on the y-axis and will plot the points according to the current scale. Whenever the user changes the "yMax" field, the program will update the graph.
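The matching and averaging rules above can be sketched in Python as follows. This is not gsc's implementation, and several details are assumptions: the difference is taken as data file minus compare file, an atom type such as HA is assumed to match HA, HA1, HA2 and HA#, the "average error" is taken to be the mean absolute difference, and the correlation coefficient is taken to be the Pearson correlation between the two sets of matched shift values. All function names are hypothetical.

```python
import re
from math import sqrt
from collections import defaultdict

NAME = re.compile(r"^(\d+):([A-Za-z]+)_(-?\d+):(\S+)$")

def residue_differences(ref, other, atom_type):
    """Average (ref - other) per amino acid over matching atoms of one type.

    ref, other: dicts mapping shift name -> float, or None if unknown.
    Returns ({res_id: averaged difference}, [(ref value, other value), ...]).
    """
    atom_pat = re.compile(rf"^{re.escape(atom_type)}[0-9#]*$", re.IGNORECASE)
    per_residue = defaultdict(list)
    pairs = []
    for name, a in ref.items():
        m = NAME.match(name)
        b = other.get(name)  # identical name => same ID, amino acid and atom
        if m is None or a is None or b is None:
            continue  # no match unless both shift values are available
        if atom_pat.match(m.group(4)):
            per_residue[int(m.group(3))].append(a - b)
            pairs.append((a, b))
    averaged = {r: sum(d) / len(d) for r, d in sorted(per_residue.items())}
    return averaged, pairs

def mean_abs_difference(pairs):
    """Mean absolute difference over all matched shifts (assumed 'average error')."""
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def pearson(pairs):
    """Pearson correlation between the two lists of matched shift values."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)

# Invented values: glycine has two HA atoms, ASP_3 is unknown in one file.
ref = {"1:GLY_1:HA1": 4.00, "1:GLY_1:HA2": 3.80,
       "1:ALA_2:HA": 4.30, "1:ASP_3:HA": None}
other = {"1:GLY_1:HA1": 3.90, "1:GLY_1:HA2": 3.90,
         "1:ALA_2:HA": 4.10, "1:ASP_3:HA": 4.60}
averaged, pairs = residue_differences(ref, other, "HA")
print(averaged, mean_abs_difference(pairs), pearson(pairs))
```

In this example glycine contributes one plotted point (the average of its HA1 and HA2 differences), and ASP_3 contributes none because one of its values is unknown.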


Customizing the Graph

The following settings are available under the GRAPHICS menu item. All changes are saved to the user's ./gscDefaults file. A user can restore the default settings by simply deleting this file.

If you want to change the colors of certain fields, a list of valid color names can be found in $OPENWINHOME/lib/rgb.txt or /usr/X11R6/lib/X11/rgb.txt. If you cannot find either file, ask your system administrator.

The program puts the graph text in default areas. You can move any text by holding down the leftmost mouse button over the text and dragging it to the area you wish. This may be useful when the text and graph lines overlap.

Fonts for the graph can be changed via the fonts GRAPHICS menu item. When selecting a font, it is desirable to select a PostScript font available on your laser printer. Use the Save plot FILE menu option to save your plot to a PostScript file.


Output Difference Files

It can be very insightful to look at specific atom shift differences for amino acids in greater detail.

The numeric shift differences that gsc has plotted can be viewed via the "show output..." FILE menu option. The user can also choose to save copies of the output files by selecting the "Save output..." FILE menu option.

The shift differences are printed one per line. If more than one atom match is found in a given amino acid, the average difference for that amino acid is also printed.


Final Comments

We felt that having more than two compare files would make the graph hard to read. If you really want to depict several shift file comparisons on the same graph, you may be able to input the numeric output files into some other graphing package.

Do not edit the ./gscDefaults file directly because it is overwritten whenever you run gsc and make a change to your options. If the user does not have ./gscDefaults, the program will use the default file from INSTALL_DIR/lib/gsc/gscDefaults.

Appendix 1: PPM Formatted Shift Files

The following rules define a shift file in PPM format:
  1. There can be data and non-data lines. Non-data lines are preceded with a comment character '!' in the first column.

  2. Each data line contains one atomic chemical shift name and one or more shift value fields separated by one or more blank characters.

  3. The atomic chemical shift name field is the first field and of the form:
    	molNum:Residue_ResId:atom
    
    where
    	molNum = Molecular Number (an integer)
    	Residue = Amino acid in 3 letter code (character string)
    	ResId = Amino acid ID number (an integer)
    	atom = Atom name (character string)
    
For example, 1:GLU_95:HB1 has molecular Number = 1, Amino acid = GLU, Amino acid ID number = 95, and atom name = HB1

The shift name field has no blank characters, and amino acids are expected to have ResIds which are ordered from lowest to highest. A shift value field is specified either as a real number or with asterisks '*' to denote an unknown value. The value "-999.99" or "999.99" is also understood by several programs to mean an unknown value.

Here is an example of a typical PPM shift file:

!
!Sequence: ADQ
!
1:ALA_1:N           ***.**
1:ALA_1:C           174.00
1:ALA_1:CA           51.90
1:ALA_1:CB           18.80 
1:ALA_1:HN          ***.**
1:ALA_1:HA            4.15
1:ALA_1:HB#           1.57
1:ASP_2:N           120.50
1:ASP_2:C           175.80
1:ASP_2:CA           54.70
1:ASP_2:CB           41.20 
1:ASP_2:HN          ***.**
1:ASP_2:HA            4.67
1:ASP_2:HB1           2.72
1:ASP_2:HB2           2.60
1:ASP_2:CG           **.**
1:ASP_2:HD2           *.**
1:GLN_3:N           119.60
1:GLN_3:C           175.80
1:GLN_3:CA           55.70
1:GLN_3:CB           30.20 
1:GLN_3:HN            8.24
1:GLN_3:HA            4.42
1:GLN_3:HB1           2.12
1:GLN_3:HB2           2.00
1:GLN_3:CG           33.70 
1:GLN_3:HG1           2.38
1:GLN_3:HG2           2.38
1:GLN_3:CD          180.00 
1:GLN_3:NE2         ***.**
1:GLN_3:HE21          7.37
1:GLN_3:HE22          6.71
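
The rules above can be checked with a short parser. The Python sketch below is an illustration written from the rules in this appendix, not code taken from gsc; the function name parse_ppm is hypothetical. It assumes that the shift value is in the second field, that ResId ordering is checked separately for each molNum, and that "-999.99" and "999.99" should be treated as unknown (the text only says that several programs do so).

```python
import re

NAME = re.compile(r"^(\d+):([A-Za-z]+)_(-?\d+):(\S+)$")
UNKNOWN = {"-999.99", "999.99"}  # assumed to mean "unknown value"

def parse_ppm(text):
    """Yield (mol_num, residue, res_id, atom, value) for each data line.

    value is None when the shift is unknown.
    """
    last_id = {}  # highest ResId seen so far, per molNum
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("!"):
            continue  # blank line or non-data (comment) line
        fields = line.split()
        m = NAME.match(fields[0])
        if m is None or len(fields) < 2:
            raise ValueError(f"line {lineno}: malformed data line")
        mol, res_id = int(m.group(1)), int(m.group(3))
        if res_id < last_id.get(mol, res_id):
            raise ValueError(f"line {lineno}: ResId out of order")
        last_id[mol] = res_id
        raw = fields[1]
        value = None if "*" in raw or raw in UNKNOWN else float(raw)
        yield mol, m.group(2), res_id, m.group(4), value

# A few lines taken from the example file above.
sample = """!Sequence: ADQ
1:ALA_1:N           ***.**
1:ALA_1:CA           51.90
1:ASP_2:HB1           2.72
"""
for record in parse_ppm(sample):
    print(record)
```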


Questions to: bionmrwebmaster@biochem.ualberta.ca