NWChem

Overview

NWChem is a computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters. It aims to be scalable both in its ability to treat large problems efficiently, and in its usage of available parallel computing resources.

NWChem has been developed by the Molecular Sciences Software group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). Most of the implementation has been funded by the EMSL Construction Project.

An extensive list of NWChem's capabilities and functionality can be found by clicking on the Capabilities link on the navigation tabs on the NWChem website at URL http://www.emsl.pnl.gov/docs/nwchem. It is highly recommended that you review this information when evaluating the use of NWChem for your research needs.

Release Notes

The latest NWChem release notes may be found by clicking on the Release Notes link on the navigation tabs on the NWChem website at URL http://www.emsl.pnl.gov/docs/nwchem.

The release notes describe the new features available in our NWChem version, functional differences from earlier releases you may have used, and other important information. It is highly recommended that you review these notes.

Citing NWChem

Our license agreement with EMSL requires that all scholarly works created using NWChem prominently cite the use of NWChem. The required and proper citation information may be found by clicking on the Citation link on the navigation tabs on the NWChem website at URL http://www.emsl.pnl.gov/docs/nwchem/

Documentation

Documentation on NWChem can be found by clicking on the User's Manual link on the navigation tabs on the NWChem website at URL http://www.emsl.pnl.gov/docs/nwchem.

Setup

NWChem requires a configuration file in your home directory so that standard database files that contain the original force field information can be located. These database files reside in a directory that is specified in file $HOME/.nwchemrc.

You should link to the master system-wide configuration file by issuing the following command:

ln -s /usr/global/NWChem-xc/data/default.nwchemrc $HOME/.nwchemrc

This will create the symbolic link .nwchemrc in your home directory, which NWChem will use to locate the database files.

Please note that you only need to issue this instruction once. You will only need to execute it again should you delete or corrupt your .nwchemrc file.
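Because the link only needs to exist once, the command can be wrapped in a guard that does nothing when .nwchemrc is already present. The following is a minimal sketch, not a site-provided tool: the function name ensure_nwchemrc and its optional arguments are hypothetical, and only the two default paths come from the instructions above.

```shell
# Hypothetical helper (not shipped with NWChem): create the .nwchemrc link
# only when it is missing. Both arguments are optional and default to the
# paths given in the Setup instructions.
ensure_nwchemrc() {
    target="${1:-/usr/global/NWChem-xc/data/default.nwchemrc}"
    link="${2:-$HOME/.nwchemrc}"

    if [ -e "$link" ]; then
        # An existing file or a working link is left untouched.
        echo "kept $link"
    else
        # -f also replaces a dangling link that no longer resolves.
        ln -sf "$target" "$link" && echo "linked $link"
    fi
}
```

Calling ensure_nwchemrc with no arguments from a login shell should have the same effect as the ln -s command above on first use, and should leave an existing configuration alone afterwards.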

Usage

NWChem can be run interactively in serial mode, and in batch mode in both serial and parallel.

Serial mode works on all systems, while parallel mode is currently available only on a few clusters. LION-XC and LION-XB are the only parallel batch systems supported. Check with the RCC group about parallel availability on other systems.

Serial NWChem

To start an interactive serial NWChem job, run the following command:

/usr/global/NWChem-xc/serial/bin/nwchem input.nw

To run a serial batch NWChem job on a batch cluster such as LION-XC using the PBS queueing system, use a PBS script such as the following:

#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:10:00
#PBS -j oe

# change the current working directory to the directory where
# the input deck input.nw can be found
cd $PBS_O_WORKDIR

echo " "
echo "Starting job on `hostname` at `date`"
echo " "

# start serial NWChem with input deck input.nw
/usr/global/NWChem-xc/serial/bin/nwchem input.nw

echo " "
echo "Completing job on `hostname` at `date`"
echo " "

Additional information on PBS scripts and submitting jobs to PBS can be found in the appropriate system's User Guide in the User Guides section of this website.
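NWChem writes its results to standard output, so in a batch script it is often convenient to capture them in a file named after the input deck. The following is a minimal sketch, assuming the deck name ends in .nw; the function names output_name and run_deck are hypothetical, and only the default executable path comes from this page.

```shell
# Hypothetical helpers for the run line of a PBS script.

# Map an input deck name to an output file name: input.nw -> input.out
output_name() {
    printf '%s\n' "${1%.nw}.out"
}

# Run a deck and capture standard output and standard error in the output
# file. The second argument (the executable) is optional and defaults to
# the serial NWChem path used above.
run_deck() {
    deck="$1"
    nwchem_bin="${2:-/usr/global/NWChem-xc/serial/bin/nwchem}"

    if [ ! -r "$deck" ]; then
        echo "cannot read input deck: $deck" >&2
        return 1
    fi
    "$nwchem_bin" "$deck" > "$(output_name "$deck")" 2>&1
}
```

In the script above, the line that starts NWChem could then be replaced by run_deck input.nw, which would leave the results in input.out in the working directory.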

Parallel NWChem

An MPI version of NWChem is available on clusters that support Infinipath MPI. Currently this includes clusters LION-XB and LION-XC. A parallel version for non-Infinipath clusters will be made available. Please check with beatnic for availability on non-Infinipath machines.

Since Infinipath MPI is only available on batch clusters, no interactive instructions are provided.

Parallel NWChem does not require any special keywords to be added to the input file. Simply invoke the parallel version of NWChem to run in parallel. No shared memory version is available, but additional processor cores can be used on a node by passing the proper arguments to PBS. When using multiple cores per node, please be sure to reference the Performance Tips section below for important information regarding memory usage. The following example uses 4 nodes and 4 cores per node for a total of 16 processors.

#PBS -l nodes=4:ppn=4
#PBS -l walltime=0:10:00
#PBS -j oe

# change the current working directory to the directory where
# the input deck input.nw can be found
cd $PBS_O_WORKDIR

echo " "
echo "Starting job on `hostname` at `date`"
echo " "

# use mpirun to call parallel NWChem with input deck input.nw
# (launcher location assumed to be /usr/global/bin/mpirun; verify on your cluster)
/usr/global/bin/mpirun /usr/global/NWChem-xc/mpi/nwchem input.nw

echo " "
echo "Completing job on `hostname` at `date`"
echo " "

Additional information on PBS scripts and submitting jobs to PBS can be found in the appropriate system's User Guide in the User Guides section of this website.
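Under PBS, the file named by $PBS_NODEFILE normally lists one host name per allocated core, so a job script can confirm how many MPI processes a request such as nodes=4:ppn=4 actually provides. The following is a minimal sketch; the function names are hypothetical, and whether the count has to be passed to the MPI launcher explicitly depends on the local MPI installation.

```shell
# Hypothetical helpers: inspect the PBS node file from inside a job script.
# Each takes an optional file name and defaults to $PBS_NODEFILE.

# Total number of allocated cores (one line per core in the node file)
count_procs() {
    nodefile="${1:-$PBS_NODEFILE}"
    echo $(( $(wc -l < "$nodefile") ))
}

# Number of distinct nodes in the allocation
count_nodes() {
    nodefile="${1:-$PBS_NODEFILE}"
    echo $(( $(sort -u "$nodefile" | wc -l) ))
}
```

For the request in the example script, count_procs should report 16 and count_nodes should report 4; echoing both next to the start time makes it easier to spot a job that received fewer cores than intended.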

NWChem Performance Tips

Disk I/O and processor performance both affect the overall performance of NWChem.

For most users, a wise choice of scratch directory (the scratch_dir directive in the NWChem input file) will have the most impact on application performance. Using the default local temporary disk space provides the highest I/O performance. If your analysis requires large amounts of scratch disk space, first try to run NWChem jobs on nodes with large local /tmp space before switching to larger network-based scratch directories.
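Before choosing between local /tmp and a network scratch directory, it can help to check how much space a candidate directory has free. The following is a minimal sketch using the POSIX df output format; the function names and the idea of a threshold in kilobytes are illustrative assumptions, not site policy.

```shell
# Hypothetical helpers: check free space in a candidate scratch directory.

# Print the free space, in 1024-byte blocks, of the file system holding
# the given directory (defaults to /tmp).
free_kb() {
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

# Succeed when the directory has at least the requested kilobytes free.
# Usage: has_scratch_space DIRECTORY MIN_KB
has_scratch_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

A job script could, for example, test has_scratch_space /tmp 50000000 (about 50 GB) and fall back to a network scratch directory only when the test fails.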

To use system memory to store integrals in-core rather than on disk, try adding this to the NWChem input file:

scf
  semidirect memsize 2000000000 filesize 0
end

which will store 2 gwords (16 gbytes) of integrals in memory and none on disk. You can try more by increasing that number. NWChem does not crash your job if the integral cache cannot be allocated in the memory available to it, so there is no harm in setting this number to a large value, as long as it does not roll over the maximum integer size on your machine (this shouldn't be an issue on a 64-bit machine).

This will work in a DFT calculation as well as in an SCF calculation.
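The memsize value is counted in words rather than bytes, and the 2 gwords = 16 gbytes figure above corresponds to 8 bytes per word. The conversion can be sketched with shell arithmetic; the function names are hypothetical, and decimal gigabytes (10^9 bytes) are assumed.

```shell
# Hypothetical helpers: convert between NWChem words and bytes,
# assuming 8 bytes per word as in the example above.

# Words -> bytes
words_to_bytes() {
    echo $(( $1 * 8 ))
}

# Decimal gigabytes -> words, for choosing a memsize value
gb_to_words() {
    echo $(( $1 * 1000000000 / 8 ))
}
```

With these, words_to_bytes 2000000000 gives 16000000000 and gb_to_words 16 gives 2000000000, matching the setting shown above.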

If disk I/O is very fast, the difference between in-core and out-of-core algorithms is small. If you have fast disk, an optimal approach would be to use both memory and disk for integral caching, but limit the disk cache because huge files usually slow things down. Experiments suggest that disk access starts to get slow once the file grows past about 2 GB. You can opt for disk caching as well by changing the filesize in the semidirect line to something other than zero.

Semidirect is most useful when calculating the integrals is expensive, i.e., when high angular momentum functions are used, as in aug-cc-pVQZ. For sp basis sets like 6-31G*, the difference with respect to direct should be small.

When running in parallel with multiple cores on the same node, the memory setting applies to each process, so the usage adds up across the node. For example:
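A small generator makes it easier to try different memory and disk cache sizes. The sketch below prints an scf block for given memsize and filesize values; the function name semidirect_block is hypothetical, and it assumes that filesize is counted in the same 8-byte words as memsize, which should be confirmed against the NWChem User's Manual.

```shell
# Hypothetical helper: print an scf semidirect block.
# Usage: semidirect_block MEMSIZE_WORDS [FILESIZE_WORDS]
# FILESIZE_WORDS defaults to 0, i.e. no disk cache.
semidirect_block() {
    memsize="$1"
    filesize="${2:-0}"
    printf 'scf\n  semidirect memsize %s filesize %s\nend\n' "$memsize" "$filesize"
}
```

Under the stated assumption, semidirect_block 2000000000 250000000 would request a 16 GB memory cache plus a disk cache of about 2 GB; the output can be appended to an input deck with shell redirection.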

memory stack 2048 mb heap 2048 mb global 1024 mb

will use a total of 5 GB (2 + 2 + 1) per MPI process. Thus 4 MPI processes on a node will use 4 * 5 = 20 GB. Node memory can be exhausted quite easily if care is not taken.
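The same arithmetic can be checked before submitting a job. The following is a minimal sketch with hypothetical function names; the arguments are the stack, heap, and global sizes in megabytes from the memory line, followed by the number of MPI processes per node (the ppn value).

```shell
# Hypothetical helpers: estimate NWChem memory use from the memory directive.

# Memory per MPI process, in MB: stack + heap + global
per_process_mb() {
    echo $(( $1 + $2 + $3 ))
}

# Memory per node, in MB: per-process total times processes per node
per_node_mb() {
    echo $(( ($1 + $2 + $3) * $4 ))
}
```

For the example above, per_process_mb 2048 2048 1024 gives 5120 and per_node_mb 2048 2048 1024 4 gives 20480, i.e. 5 GB per process and 20 GB per node. Comparing the second number with the physical memory of the node shows whether the settings need to be reduced.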

For disk I/O and processor hints, feel free to contact beatnic@cac.psu.edu.

Further Information

Further information on NWChem may be found on the NWChem web site at URL http://www.emsl.pnl.gov/docs/nwchem/.


Please send questions or suggestions about this web page to beatnic@aset.psu.edu
