csi - Chemical Shift Index

PENCE / CIHR-Group
Joint Software Centre

Funding for this software has been provided in part by the
Canadian Institutes of Health Research (CIHR Group)
and the
Protein Engineering Networks of Centres of Excellence (PENCE).

Version: 2.0 - Dec 2002

Latest News
Overview
Reference and Copyright
Download / Installation
How to Use
Programs in csi
Bugs

Overview

CSI is a program for determining secondary structure in proteins from the chemical shift indices of 1H and 13C nuclei. CSI is written in the C programming language and runs on most UNIX machines (Sun, SGI and NeXT workstations).

Reference

Wishart, D.S. and B.D. Sykes. The 13C chemical shift index. A simple method for the identification of protein secondary structure using 13C chemical shift data. J. Biomol. NMR 4:171-180 (1994).

Authors: David Wishart, Brian Sykes

Programming for CSI: Leigh Willard, Tim Jellard

Download

Select the version of csi corresponding to your operating system.

PC(Linux): csi v2.0 (0.09 MB)
Solaris: csi v2.0 (0.08 MB)
SGI(Irix6.5): csi v2.0 (0.11 MB)

Installation

Once you have downloaded the software, uncompress and untar the archive. For example:

	> tar xvf csi-v2.0-sgi6.tar
	> cd csi-v2.0-sgi6

Look at the README file for details on installation.

	> more README

Installation is simple: you only need to decide where to put the executables and where to put the documentation, library and example files. The installation script prompts you for the names of these directories.

	> ./Install

Finally, you can test the program by going to the directory where it is installed and typing its name. The README file also explains how to set your path environment variable to include the location of the executable.

How to use CSI

Before running the program you must prepare an input file containing the chemical shifts, residue names (1-letter code) and residue numbers. The format for this input file is quite general and examples of the kinds of input files that are allowed can be found in the SAMPLE/ directory. Every input file should have:
- One or two comment lines marked by a ">" character
- A column header line containing a "#" character as the first character in the line, followed by "AA" to indicate the amino acid residue column, followed by the chemical shift title columns ("CA", "HA" etc.)
- The convention used for naming the chemical shift columns is:
```
            i)    CA = alpha carbon
            ii)   HA = alpha proton
            iii)  CO = carbonyl carbon
            iv)   CB = beta carbon
```
- The chemical shifts corresponding to each column title. Note that proton shifts need to be given with two decimal places and carbon chemical shifts with one decimal place.
The following is a typical chemical shift input file for a protein:
```
        >Calmodulin (Drosophila Melanogaster)
        >M. Ikura, L.E. Kay and A. Bax, Biochemistry 29, 4659 (1990)

        #   AA  HA      CA      CB      CO
        1   A   4.15    51.8    18.8    174.0
        2   D   4.67    54.8    41.2    175.8
        3   Q   4.42    55.6    30.2    175.8
        4   L   4.70    54.5    43.7    177.7
        5   T   4.51    60.7    71.3    175.8
        6   E   4.02    60.1    29.2    179.4
        7   E   4.12    60.0    28.9    179.1
        8   Q   3.93    58.7    29.2    178.3
        9   I   3.77    66.3    37.8    177.9
```
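
As an illustration of the format described above, the following is a minimal Python sketch of a reader for such a file. It is not part of CSI; the function name `parse_shift_table` and the returned dictionary layout are assumptions made for this example. It is also more lenient than CSI may be, since it strips leading whitespace before looking for the ">" and "#" characters.

```python
from typing import Dict, List, Tuple


def parse_shift_table(text: str) -> Tuple[List[str], List[dict]]:
    """Parse a CSI-style chemical shift table (hypothetical helper, not part of CSI)."""
    columns: List[str] = []
    rows: List[dict] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and ">" comment lines
        if line.startswith("#"):
            # Header line: "#", then "AA", then the shift column titles.
            header = line[1:].split()
            columns = header[1:]  # drop the "AA" residue column title
            continue
        if not columns:
            continue  # ignore anything that appears before the header line
        fields = line.split()
        shifts: Dict[str, float] = {
            name: float(value) for name, value in zip(columns, fields[2:])
        }
        rows.append({"num": int(fields[0]), "aa": fields[1], "shifts": shifts})
    return columns, rows


if __name__ == "__main__":
    sample = """>Calmodulin (Drosophila Melanogaster)
#   AA  HA      CA      CB      CO
1   A   4.15    51.8    18.8    174.0
2   D   4.67    54.8    41.2    175.8
"""
    columns, rows = parse_shift_table(sample)
    print(columns)
    print(rows[0])
```

Keeping the column titles from the header line, rather than assuming a fixed order, reflects the fact that the input format lets columns appear in any order or be left out entirely.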
When you are preparing your input file you must make sure to do the following:
- Make sure that the "#" character appears as the first character in the column header line. This character is used by the program to identify the start of the chemical shift table.
- Because glycine has two alpha protons, it is important to replace these two chemical shifts with the AVERAGE chemical shift. Hence all glycines should only have a single HA chemical shift listed in the chemical shift table.
- In cases where no chemical shift data is available, the missing value must be marked with a "0". Absolutely NO other character (a blank, a * or a hyphen) is allowed. This also applies to the CB chemical shift of glycine, which should always be entered as a "0". NOTE: if you are missing chemical shift information for an entire column, it is best to delete that column from your input file (see explanation below).
- Check for typos, missing numbers, sequence correctness etc.
- Note that because there are important chemical shift differences between reduced and oxidized cysteine we have adopted the convention (using the single letter amino acid code) that "C" = oxidized cysteine and "B" = reduced cysteine.
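
Some of the checks in this list can be automated before running CSI. The sketch below is only an illustration under stated assumptions: the helper names `glycine_ha` and `check_row` are hypothetical, and the rules encoded are just the ones given above (a single averaged HA for glycine, "0" as the only missing-value marker, glycine CB always "0", two decimal places for protons and one for carbons).

```python
import re
from typing import List

PROTON = re.compile(r"^\d+\.\d{2}$")  # two decimal places, e.g. 4.15
CARBON = re.compile(r"^\d+\.\d$")     # one decimal place, e.g. 51.8


def glycine_ha(ha_a: float, ha_b: float) -> str:
    """Average the two glycine alpha-proton shifts into the single HA entry."""
    return f"{(ha_a + ha_b) / 2:.2f}"


def check_row(columns: List[str], fields: List[str]) -> List[str]:
    """Return the problems found in one data row (an empty list means OK).

    `columns` holds the shift column titles (without "AA"); `fields` holds
    the whitespace-separated tokens of the row, starting with the residue
    number and the one-letter residue code.
    """
    problems: List[str] = []
    if len(fields) != len(columns) + 2:
        return ["wrong number of fields"]
    residue = fields[1]
    for name, value in zip(columns, fields[2:]):
        if value == "0":
            continue  # "0" is the only accepted marker for a missing shift
        if name == "CB" and residue == "G":
            problems.append("glycine CB must be 0")
        pattern = PROTON if name.startswith("H") else CARBON
        if not pattern.match(value):
            problems.append(f"{name}: bad value {value!r}")
    return problems


if __name__ == "__main__":
    print(glycine_ha(3.80, 4.00))
    print(check_row(["HA", "CA"], ["3", "D", "-", "54.8"]))
```

A check like this cannot confirm that the sequence itself is correct, so the manual review for typos and sequence errors is still needed.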
When you have finished preparing your input file (or if you just want to test the program on the sample data in SAMPLE/) you may start the program by typing:
```
                               csi
```
and pressing the Return key. A menu will appear and the rest of the program is essentially self guiding.

Programs in CSI

There are three different programs in CSI:

Adjust/Re-reference Chemical Shifts

The first program is used to add (or subtract) values to an entire column in your input file. It will prompt you for each column in turn and ask if you want to make changes to it. This function is quite useful for correcting carbon chemical shifts which are often referenced to different "0" point compounds (Dioxane, TSP, DSS, TMS, acetone etc.).
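
The adjustment itself amounts to adding a constant to one column. The following Python sketch shows the idea outside of CSI; the function name `rereference`, the row layout and the offset value are assumptions for illustration only, and the offset is not a recommended correction for any particular reference compound. Entries equal to 0 are left alone here because "0" marks a missing shift in the input file; whether CSI itself treats them this way is not stated in this document.

```python
from typing import Dict, List


def rereference(rows: List[Dict[str, float]], column: str,
                offset: float) -> List[Dict[str, float]]:
    """Return a copy of `rows` with `offset` (ppm) added to one shift column."""
    adjusted: List[Dict[str, float]] = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's data is not modified
        value = row.get(column, 0.0)
        if value != 0.0:  # 0 marks a missing shift, so it is not shifted
            new_row[column] = round(value + offset, 2)
        adjusted.append(new_row)
    return adjusted


if __name__ == "__main__":
    shifts = [{"HA": 4.15, "CA": 51.8}, {"HA": 4.67, "CA": 0.0}]
    # 1.5 ppm is an arbitrary example offset, not a real correction value.
    for row in rereference(shifts, "CA", 1.5):
        print(row)
```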

Calculate Raw CSI / Calculate Filtered CSI

The second program calculates chemical shift indices and determines the secondary structure. Two methods are offered for calculating the chemical shift index. The first method produces "raw" chemical shift index values, while the second method uses edge detection, pattern recognition and digital smoothing to produce "filtered" chemical shift index values. Filtered CSIs are generally easier to read, but occasionally some important information is lost in the filtering process.
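
To give a feel for what a "raw" index is, the sketch below assigns -1, 0 or 1 depending on whether a shift lies below, within or above a tolerance band around a reference value, which is the general idea behind a three-state index. This is a simplified illustration and not the CSI source code: the function name `raw_index` is hypothetical, the reference value and tolerance in the usage example are placeholders rather than values taken from CSI, and none of the filtering steps are shown.

```python
from typing import Optional


def raw_index(shift: float, reference: float, tolerance: float) -> Optional[int]:
    """Three-state index of one shift relative to a reference band.

    Returns None when the shift is missing (entered as 0 in the input file).
    """
    if shift == 0:
        return None  # no data for this nucleus
    if shift > reference + tolerance:
        return 1
    if shift < reference - tolerance:
        return -1
    return 0


if __name__ == "__main__":
    # Placeholder reference (4.35 ppm) and tolerance (0.10 ppm), chosen only
    # to exercise the three branches; consult the cited reference for the
    # values actually used by the method.
    for shift in (4.02, 4.40, 4.70, 0):
        print(shift, raw_index(shift, reference=4.35, tolerance=0.10))
```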

When you run CSI you will produce an output file that looks something like this:

#
############################################################################
#  Program...:                        CSI (c)
#  Version...:                         2.0
#  Location..:                 University of Alberta
#                          Protein Engineering Network of
#                              Centres of Excellence
#  Input.....:                     SAMPLE/csi.sample
#  Date......:                 Sat Aug 14 11:56:17 1993
#############################################################################
#
#
#     A       HA       CA       CO       CB       Consensus
#
3     M       0 C      0 C      0 C      NA          0 C 
4     T       0 C     -1 C      0 C      NA          0 C 
5     D      -1 H      1 H      1 H      NA         -1 H 
6     Q      -1 H      1 H      1 H      NA         -1 H 
7     Q      -1 H      1 H      1 H      NA         -1 H 
8     A      -1 H      1 H      1 H      NA         -1 H 
9     E      -1 H      1 H      1 H      NA         -1 H 
10    A      -1 H      1 H      1 H      NA         -1 H

The residue numbers and residue letters appear in the first two columns, while the chemical shift indices corresponding to the CA, CO, CB and/or HA chemical shifts appear in the following columns. Beside each index is a letter (H, C or B) indicating the secondary structure assigned to that residue. Note that:

H = alpha helix
B = beta strand
C = coil (includes turns, loops etc.)

If there is no chemical shift data for CA, CO, CB or HA, the corresponding chemical shift index column is marked "NA" (not available). If a chemical shift is given as "0", the data will be "filled in" with a neighbour's chemical shift index. If all four, or any three, of the chemical shifts (HA, CA, CO, CB) are present, the program automatically calculates a consensus secondary structure, which appears in the far right column marked "CONSENSUS".
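
For downstream processing, a table like the one shown above can be read back with a few lines of Python. This is a sketch under assumptions, not a documented interface: the function name `parse_csi_output` is hypothetical, and the code assumes that each table row looks exactly like the sample (an index followed by a letter for every column, or a single "NA") and that the text passed in contains only the table, not the summary section.

```python
from typing import Dict, List, Optional, Tuple

Cell = Optional[Tuple[int, str]]  # (index, structure letter), or None for "NA"


def parse_csi_output(text: str) -> List[dict]:
    """Read the residue rows of a CSI output table (hypothetical helper)."""
    columns: List[str] = []
    rows: List[dict] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            # The column header is the "#" line whose next token is the
            # residue column title ("A" in the sample output).
            if len(fields) > 2 and fields[1] in ("A", "AA"):
                columns = fields[2:]
            continue
        if not columns or not fields[0].isdigit():
            continue
        tokens = iter(fields[2:])
        cells: Dict[str, Cell] = {}
        for name in columns:
            token = next(tokens, "NA")
            if token == "NA":
                cells[name] = None  # no shift data for this nucleus
            else:
                cells[name] = (int(token), next(tokens, "?"))
        rows.append({"num": int(fields[0]), "aa": fields[1], "cells": cells})
    return rows


if __name__ == "__main__":
    sample = """#     A       HA       CA       CO       CB       Consensus
#
3     M       0 C      0 C      0 C      NA          0 C
5     D      -1 H      1 H      1 H      NA         -1 H
"""
    for row in parse_csi_output(sample):
        print(row["num"], row["aa"], row["cells"]["Consensus"])
```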

At the bottom of each CSI table is a section called "Secondary Structure Summary" which provides a short summary of the secondary structures identified by the program.
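
The exact layout of that summary is not reproduced here, but the underlying idea of collapsing per-residue assignments into contiguous segments can be sketched as follows. The function name `summarize` and the (state, first residue, last residue) tuple layout are assumptions for illustration, not the format CSI prints.

```python
from itertools import groupby
from typing import List, Tuple


def summarize(assignments: List[Tuple[int, str]]) -> List[Tuple[str, int, int]]:
    """Collapse (residue number, H/B/C) pairs into (state, first, last) runs."""
    segments: List[Tuple[str, int, int]] = []
    for state, run in groupby(assignments, key=lambda item: item[1]):
        members = list(run)
        segments.append((state, members[0][0], members[-1][0]))
    return segments


if __name__ == "__main__":
    # Consensus letters for residues 3-10 of the sample output above.
    consensus = [(3, "C"), (4, "C"), (5, "H"), (6, "H"),
                 (7, "H"), (8, "H"), (9, "H"), (10, "H")]
    for state, first, last in summarize(consensus):
        print(f"{state}: residues {first}-{last}")
```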

Graphical Output

The third program (CSI_GRAPH) creates graphs of the output of CSI. This program **only** runs on SGI computers.

To start CSI_GRAPH, select the "Graphical Output" menu option and you will be prompted for the name of the csi file that you previously prepared by running the "Calculate CSI" programs.

When CSI_GRAPH is running, it draws the CSI plots in the lower left region of the screen. Up to five plots are drawn (one each for HA, CA, CB and CO, and one for CONSENSUS). Each plot appears on the screen for a few seconds before it is erased and the next one is produced. A typical run of CSI_GRAPH takes about 20 seconds. Note that all graphs are plotted with a black foreground and a yellow background for higher contrast.

After CSI_GRAPH has run it prepares output rgb files. Typically these files are given the default titles of:

      splotCA.rgb  
      splotHA.rgb  
      splotCO.rgb  
      splotCB.rgb  
      splotCONS.rgb

The purpose of these files is to allow the user to print or view the CSI plots at their leisure. Note that the "CONS" files contain the consensus CSI plots and are generally the ones that are most useful in producing figures and visualizing structure.

The files containing the "rgb" suffix may be viewed on the SGI computer with the "ipaste" command by typing the following:

     ipaste myfile.rgb

where myfile.rgb is either splotCA.rgb, splotHA.rgb, splotCB.rgb, splotCO.rgb or splotCONS.rgb.

Bugs

- CSI_GRAPH plots the CSI graphs to the screen. The graphs will disappear from the screen in a few seconds. Users should not try to move these graphs to a different location on the screen or try to close the window in which they are plotted.
- CSI_GRAPH cannot plot more than 400 residues.
- CSI_GRAPH tries to convert the .rgb output files to PostScript using a program called "tops". If you do not have this program, CSI_GRAPH will complain. After CSI_GRAPH is done, you will have to use some other program to convert the rgb files to PostScript.

Questions to: bionmrwebmaster@biochem.ualberta.ca
