Sputnik

Overview

Sputnik is a C program that searches DNA sequence files in FASTA format for microsatellite repeats. The sequence file is specified on the command line, and each hit is written to stdout along with its position in the sequence, its length, and a score determined by the length of the repeat and the number of errors.

Sputnik uses a recursive algorithm to search for repeated nucleotide patterns whose repeat unit is 2 to 5 nucleotides long. Insertions, mismatches and deletions are tolerated but lower the overall score. Sputnik does not search against a "library" of known microsatellites. Instead, it reads through the entire sequence, assumes that a repeat exists at every position, compares subsequent nucleotides against that assumption and applies a simple scoring rule. Each nucleotide that matches the value predicted by the assumed repeat adds to the score, and each "error" subtracts from it. If the score rises above a preset threshold, the region is written out along with its position and score. If the score falls below a cutoff threshold, the search is abandoned and begun again at the next nucleotide. When an error is encountered, each of the three possible kinds of error (mismatch, insertion and deletion) is assumed in turn, and a recursive call to the comparison routine is made for each. If the score resulting from any of these is above the cutoff threshold, it is returned, and the best of the three is pursued.
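The search described above can be sketched in Python as follows. This is an illustrative simplification, not Sputnik's actual C source: the scoring constants (MATCH_POINTS, ERROR_POINTS, REPORT_SCORE, FAIL_SCORE), the recursion limit MAX_DEPTH, the decision to skip single-nucleotide units, and the function names extend and find_repeats are all assumptions made for this example.

```python
# Illustrative sketch only; all constants and names below are assumed,
# not taken from the Sputnik source code.
MATCH_POINTS = 1    # added for each nucleotide that matches the prediction
ERROR_POINTS = -6   # added (i.e. subtracted) for each error
REPORT_SCORE = 8    # a region scoring at least this much is reported
FAIL_SCORE = -1     # the search is abandoned when the score drops below this
MAX_DEPTH = 3       # maximum number of errors explored recursively


def extend(seq, unit, pos, phase, score, depth):
    """Extend an assumed repeat of `unit` from `pos`.

    `phase` is the index into `unit` of the next predicted nucleotide.
    Returns (best_score, end) where `end` is the exclusive end position
    at which the best score was reached.
    """
    best = (score, pos)
    while pos < len(seq):
        if seq[pos] == unit[phase]:
            score += MATCH_POINTS
            pos += 1
            phase = (phase + 1) % len(unit)
            if score > best[0]:
                best = (score, pos)
            continue
        # An error: charge the penalty, then try all three explanations.
        score += ERROR_POINTS
        if score < FAIL_SCORE or depth >= MAX_DEPTH:
            break
        nxt = (phase + 1) % len(unit)
        candidates = [
            extend(seq, unit, pos + 1, nxt, score, depth + 1),    # mismatch
            extend(seq, unit, pos + 1, phase, score, depth + 1),  # insertion
            extend(seq, unit, pos, nxt, score, depth + 1),        # deletion
        ]
        # Pursue whichever explanation gives the best score.
        best = max([best] + candidates, key=lambda c: c[0])
        break
    return best


def find_repeats(seq):
    """Return a list of (start, end, unit, score) hits, 0-based, end exclusive."""
    hits = []
    i = 0
    while i < len(seq):
        best_hit = None
        for k in range(2, 6):  # repeat units of 2 to 5 nucleotides
            unit = seq[i:i + k]
            if len(unit) < k:
                break
            if len(set(unit)) == 1:  # assumed: ignore units such as "AA"
                continue
            score, end = extend(seq, unit, i + k, 0, 0, 0)
            if score >= REPORT_SCORE and (best_hit is None or score > best_hit[3]):
                best_hit = (i, end, unit, score)
        if best_hit:
            hits.append(best_hit)
            i = best_hit[1]  # continue after the reported region
        else:
            i += 1           # no repeat here; try the next nucleotide
    return hits


if __name__ == "__main__":
    print(find_repeats("GGTT" + "AC" * 10 + "TTGG"))  # [(4, 24, 'AC', 18)]
```

With these assumed constants, a repeat must accumulate a score of at least 5 before a single error can be tolerated, so short or heavily interrupted repeats are dropped while long repeats with an occasional error are still reported.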

Setup

To use Sputnik, you must set up the Sputnik environment by running a setup command once per login session. You may optionally place the command in your .cshrc (C shell users) or .profile (Bourne shell users) so that it runs automatically at login.

For csh and tcsh:

source /usr/local/setup/sputnik.setup.csh

For sh and bash:

. /usr/local/setup/sputnik.setup.sh

Usage

Sputnik is invoked with the command sputnik. It takes as an argument the name of a file of sequences in FASTA format.

Examples

The following is an example PBS script that runs a Sputnik job on LION-XE for a maximum of 2 hours. The input file input.fa is in FASTA format and, for the purposes of this example, is located in /home/foo/sputnik. Since Sputnik sends its output to stdout (standard output), the job output will be in the normal PBS output file.

#PBS -l nodes=1:ppn=1
#PBS -l walltime=2:00:00
#PBS -j oe
#PBS -q lionxe-serial

# set up the sputnik environment
. /usr/local/setup/sputnik.setup.sh

# change the current working directory to the directory where
# the input file can be found
cd /home/foo/sputnik

# run the sputnik command
sputnik input.fa

Further information on PBS scripts and submitting jobs on the LION-XE cluster can be found in the User Guides section of the HPC website.

Documentation

Information on Sputnik can be found on LION-XE in the file /usr/global/sputnik/README.


Please send questions or suggestions about this web page to beatnic@aset.psu.edu

ASET | ITS | Penn State