RepeatMasker
Overview
RepeatMasker is a program that screens DNA sequences for interspersed
repeats and low complexity DNA sequences. The output of the program is
a detailed annotation of the repeats that are present in the query
sequence as well as a modified version of the query sequence in which
all the annotated repeats have been masked (default: replaced by
Ns). On average, about 50% of a human genomic DNA sequence is masked
by the program. Sequence comparisons in RepeatMasker are performed by
the program cross_match, an efficient implementation of the
Smith-Waterman-Gotoh algorithm developed by Phil Green.
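To check how much of a sequence was masked, the Ns in the masked output can be counted. The following is a minimal sketch, assuming the default masking with Ns described above; masked_fraction is a hypothetical helper, not part of RepeatMasker, and it also counts any Ns that were already present in the input.

```shell
# masked_fraction FILE: print the percentage of sequence characters that are N.
# FASTA header lines (starting with ">") are skipped.
# Hypothetical helper for illustration; not part of RepeatMasker.
masked_fraction() {
    awk '!/^>/ {
             total  += length($0)        # count all sequence characters
             masked += gsub(/[Nn]/, "")  # gsub returns the number of Ns replaced
         }
         END {
             if (total > 0) printf "%.1f\n", 100 * masked / total
             else print "0.0"
         }' "$1"
}
```

RepeatMasker normally writes the masked sequence next to the input with a .masked suffix, so a typical call would be masked_fraction input.fa.masked.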
Setup
No setup is necessary for this package.
Usage
RepeatMasker is invoked with the command RepeatMasker, followed by any
options and the name of the input file.
Examples
The following is an example PBS script that runs a RepeatMasker job on LION-XE
for a maximum of 2 hours. The input file input.fa is in FASTA format and is
located in the directory /home/foo/repeatmasker. The -mus option tells
RepeatMasker to mask against rodent-specific and mammalian-wide repeats
instead of primate repeats, which are the default. Output appears in
/home/foo/repeatmasker because that is the directory in which the command is
run.
#PBS -l nodes=1:ppn=1
#PBS -l walltime=2:00:00
#PBS -j oe
#PBS -q lionxe-serial
# set up the RepeatMasker environment
. /usr/local/setup/repeatmasker.setup.sh
# change the current working directory to the directory where
# the input file can be found
cd /home/foo/repeatmasker
# run the RepeatMasker command
RepeatMasker -mus input.fa
Further information on PBS scripts and submitting jobs on the LION-XE cluster
can be found in the User Guides section of the HPC website.
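Because the example job expects input.fa to be a FASTA file, a quick check before running RepeatMasker can save a wasted queue slot. The following is a rough sketch only; looks_like_fasta is a hypothetical helper, and it tests nothing more than that the first non-blank line begins with ">".

```shell
# looks_like_fasta FILE: succeed if FILE is a non-empty regular file whose
# first non-blank line starts with ">". Hypothetical helper; a rough check,
# not a full FASTA validator.
looks_like_fasta() {
    [ -f "$1" ] && [ -s "$1" ] || return 1
    awk 'NF { found = 1; ok = ($0 ~ /^>/); exit }
         END { exit !(found && ok) }' "$1"
}
```

In a job script it could be placed before the RepeatMasker command, for example: looks_like_fasta input.fa || { echo "input.fa does not look like FASTA" >&2; exit 1; }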
Documentation
Documentation on RepeatMasker can be found on LION-XE in the file
/usr/global/repeatmasker/repeatmasker.help.
Please send questions or suggestions about this web page to beatnic@aset.psu.edu