Installation¶

Installing via conda¶

The simplest way to install YANK is via the conda package manager. Packages are provided on the omnia Anaconda Cloud channel for Linux, OS X, and Windows. The yank Anaconda Cloud page has useful instructions and download statistics.

If you are using the Anaconda scientific Python distribution, you already have the conda package manager installed. If not, the quickest way to get started is to install the Miniconda distribution, a lightweight, minimal installation of Anaconda Python.

On Linux, you can install the Python 3.6 version into $HOME/miniconda3 with:

$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash ./Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
$ export PATH="$HOME/miniconda3/bin:$PATH"

On OS X, use the OS X installer instead:

$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
$ bash ./Miniconda3-latest-MacOSX-x86_64.sh -b -p $HOME/miniconda3
$ export PATH="$HOME/miniconda3/bin:$PATH"

You may want to add the new $PATH extension to your ~/.bashrc file to ensure Anaconda Python is used by default. Note that YANK will be installed into this local Python installation, so you do not need to worry about disrupting existing Python installations.

Note

conda installation is the preferred method since all dependencies are automatically fetched and installed for you.

Release build¶

You can install the latest stable release build of YANK via the conda package with

$ conda config --add channels omnia --add channels conda-forge
$ conda install yank

This version is recommended for all users not actively developing new algorithms for alchemical free energy calculations.

Note

conda will automatically install dependencies from binary packages, including difficult-to-install packages such as OpenMM, numpy, and scipy. This is really the easiest way to get started.

Development build¶

The bleeding-edge, very likely unstable development build of YANK is the latest commit on GitHub. It can be obtained by installing from source into whatever conda environment is currently active, as described under Installing from Source.

Warning

Development builds may be unstable and are generally subjected to less testing than releases. Use at your own risk!

Upgrading your installation¶

To update an earlier conda installation of YANK to the latest release version, you can use conda update:

$ conda update yank

Testing your installation¶

Test your YANK installation to make sure everything is behaving properly on your machine:

$ yank selftest

This will check that installation paths are correct and run a battery of tests that ensure any automatically detected GPU hardware is behaving as expected. If the OpenEye toolkit is installed, that installation will be checked as well.

Testing Available Platforms¶

You will want to make sure that all GPU accelerated platforms available on your hardware are accessible to YANK. The simulation library that YANK runs on, OpenMM, can run on CPU, CUDA, and OpenCL platforms. The following command will check which platforms are available:

$ yank platforms

You should see an output that looks like the following:

Available OpenMM platforms:
Reference
CPU
CUDA
OpenCL

If your output is missing an option you expect, such as CUDA on NVIDIA GPUs, check that you have the correct drivers for your GPU installed. Non-standard CUDA installations require setting specific environment variables; please see the appropriate section for setting these variables.
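
The comparison between the platforms you expect and the ones actually reported can be scripted. The following is a minimal sketch, not part of YANK: parse_platforms and missing_platforms are hypothetical helper names, and the code assumes the output has the layout shown above (a header line followed by one platform per line, with the platform name as the last token on the line).

```python
def parse_platforms(output):
    """Return the platform names listed after the 'Available OpenMM platforms' header."""
    names = []
    in_list = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Available OpenMM platforms"):
            in_list = True
            continue
        if in_list:
            if not stripped:
                break  # a blank line ends the list
            # Assumption: the platform name is the last token on the line.
            names.append(stripped.split()[-1])
    return names


def missing_platforms(output, expected=("CUDA", "OpenCL")):
    """Return the expected platforms that do not appear in the output."""
    available = set(parse_platforms(output))
    return [name for name in expected if name not in available]


# Sample text modeled on the output above, with CUDA deliberately left out.
sample = """Available OpenMM platforms:
Reference
CPU
OpenCL
"""
print(missing_platforms(sample))  # ['CUDA']
```

In practice the text passed in would be the captured output of yank platforms; an empty result means every expected platform was reported.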

Configuring Non-Standard CUDA Install Locations¶

Multiple versions of CUDA can be installed on a single machine, such as on shared clusters. If this is the case, it may be necessary to set environment variables to make sure that the right version of CUDA is being used for YANK. You will need to know the full <path_to_cuda_install> and the location of that installation’s nvcc program (by default, <path_to_cuda_install>/bin/nvcc). Then run the following lines to set the correct variables:

export OPENMM_CUDA_COMPILER=<path_to_cuda_install>/bin/nvcc
export LD_LIBRARY_PATH=<path_to_cuda_install>/lib64:$LD_LIBRARY_PATH

You may want to add the new $OPENMM_CUDA_COMPILER variable and $LD_LIBRARY_PATH extension to your ~/.bashrc file to avoid setting them every time. If nvcc is installed in a different folder than in the example, use the correct path for your system.
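
If you launch YANK from a Python script rather than from a shell, the same two variables can be set on a copy of the environment. This is only a sketch under stated assumptions: cuda_environment is a hypothetical helper, and /usr/local/cuda-8.0 is an example path that stands in for your own <path_to_cuda_install>.

```python
import os


def cuda_environment(cuda_root, base_env=None):
    """Return a copy of the environment with the CUDA variables set for cuda_root."""
    env = dict(os.environ if base_env is None else base_env)
    # Point OpenMM at this installation's nvcc (default location: <root>/bin/nvcc).
    env["OPENMM_CUDA_COMPILER"] = os.path.join(cuda_root, "bin", "nvcc")
    # Prepend the matching lib64 directory, keeping any existing entries.
    lib_dir = os.path.join(cuda_root, "lib64")
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    return env


# Hypothetical install location; substitute your own <path_to_cuda_install>.
env = cuda_environment("/usr/local/cuda-8.0", {"LD_LIBRARY_PATH": "/opt/lib"})
print(env["OPENMM_CUDA_COMPILER"])
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting yank, which leaves the parent shell's environment untouched.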

Configuring Your CUDA Devices¶

If you have CUDA-based cards, you will need to configure them to run in shared/Default Compute Mode, especially if you plan to run MPI on multiple CUDA cards.

If you run nvidia-smi on your device, you will see a sample output that looks like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 680     Off  | 0000:03:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 680     Off  | 0000:04:00.0     N/A |                  N/A |
| 30%   32C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 680     Off  | 0000:83:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 680     Off  | 0000:84:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

The Compute M. column on the right side should be set to Default for your device(s). If it is not, you can set the mode of the card(s) with nvidia-smi -i <List of Dev IDs> -c 0, where <List of Dev IDs> is a comma-separated list of GPU indices with no spaces. For the example above, you would write: nvidia-smi -i 0,1,2,3 -c 0.

YANK can also check this status for you through yank selftest. Part of that command attempts to run nvidia-smi and infer the Compute Mode of any CUDA-capable GPU it detects. Because this information is inferred by parsing the nvidia-smi output, it may not be exact, so please double check it yourself.
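
If you want to script the manual check, one option is to ask nvidia-smi for machine-readable output instead of parsing the table. The sketch below is not how yank selftest works; it assumes your driver's nvidia-smi supports the --query-gpu=index,compute_mode and --format=csv,noheader options, and non_default_gpus and query_compute_modes are hypothetical helper names.

```python
import shutil
import subprocess

# Assumption: the installed nvidia-smi supports these query options.
QUERY = ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"]


def non_default_gpus(csv_text):
    """Return (index, mode) pairs for GPUs whose compute mode is not Default."""
    flagged = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, mode = [field.strip() for field in line.split(",", 1)]
        if mode != "Default":
            flagged.append((int(index), mode))
    return flagged


def query_compute_modes():
    """Run nvidia-smi if it is on PATH; return None when it is not available."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(QUERY, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return non_default_gpus(result.stdout)


# Sample text in the assumed CSV layout, with GPU 1 in an exclusive mode.
sample = "0, Default\n1, Exclusive_Process\n2, Default\n3, Default\n"
print(non_default_gpus(sample))  # [(1, 'Exclusive_Process')]
```

Calling query_compute_modes() on a GPU node returns an empty list when every card is already in Default mode.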

Optional Tools¶

The OpenEye toolkit and Python wrappers can be installed to enable free energy calculations to be set up directly from multiple supported small molecule formats, including

SDF
SMILES
IUPAC names
Tripos mol2
PDB

Note that PDB and mol2 are supported through the pure AmberTools pipeline as well, though this does not provide access to the OpenEye AM1-BCC charging pipeline.

Use of the OpenEye toolkit requires an academic or commercial license.

To install these tools into your conda environment, use pip:

$ pip install -i https://pypi.anaconda.org/OpenEye/simple OpenEye-toolkits

Note that you will need to configure your $OE_LICENSE environment variable to point to a valid license file.
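
A quick pre-flight check of that variable might look like the following sketch. check_oe_license is a hypothetical helper; it only confirms that $OE_LICENSE is set and points to an existing file, and says nothing about whether the license itself is valid.

```python
import os


def check_oe_license(env=None):
    """Return (ok, message) describing the state of the OE_LICENSE variable."""
    env = os.environ if env is None else env
    path = env.get("OE_LICENSE")
    if not path:
        return False, "OE_LICENSE is not set"
    if not os.path.isfile(path):
        return False, "OE_LICENSE points to a missing file: " + path
    # The file exists; whether the license is valid is not checked here.
    return True, "OE_LICENSE points to " + path


ok, message = check_oe_license()
print(message)
```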

You can test your OpenEye installation with

$ python -m openeye.examples.openeye_tests

Supported platforms and environments¶

Software¶

YANK runs on Python 3.5 and Python 3.6.

We no longer support Python 2.X.

Dependencies¶

YANK uses a number of tools in order to allow the developers to focus on developing efficient algorithms involved in alchemical free energy calculations, rather than reinventing basic software, numerical, and molecular simulation infrastructure.

Warning

Installation of these prerequisites by hand is not recommended—all required dependencies can be installed via the conda package manager.

Note

This list is taken directly from YANK’s conda-recipe/meta.yaml to provide a single source for dependencies:

  build:
    - python
    - cython
    - setuptools

  run:
    - python
    - pandas
    - numpy >=1.11
    - scipy
    - cython
    - netcdf4 ==1.3.1  # TODO: Fix this right after bugfix: "always return masked array by default, even if there are no masked values"
    - openmm >=7.1
    - mdtraj >=1.7.2
    - openmmtools >=0.15.0
    - pymbar
    - ambermini >=16.16.0
    - docopt
    - openmoltools >=0.7.5
    - mpi4py
    - pyyaml
    - clusterutils
    - sphinxcontrib-bibtex
    - cerberus ==1.1.*
    - matplotlib
    - jupyter
    - pdbfixer
    - libnetcdf >=4.6.0
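
To see which of these run-time dependencies are visible to the current Python interpreter, a check along the following lines can help when diagnosing a broken environment. It is a sketch, not part of YANK: missing_packages is a hypothetical helper, the mapping covers only a subset of the list, and the import names are assumptions where they differ from the conda package names (netcdf4 imports as netCDF4, pyyaml as yaml, and OpenMM 7.x is assumed to be exposed under the simtk package). Versions are not checked.

```python
import importlib.util

# conda package name -> top-level import name (subset of the list above).
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "netcdf4": "netCDF4",
    "openmm": "simtk",  # assumption: OpenMM 7.x is imported as simtk.openmm
    "mdtraj": "mdtraj",
    "openmmtools": "openmmtools",
    "pymbar": "pymbar",
    "docopt": "docopt",
    "pyyaml": "yaml",
    "cerberus": "cerberus",
}


def missing_packages(import_names):
    """Return the sorted package names whose import module cannot be found."""
    return sorted(
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


print(missing_packages(IMPORT_NAMES))
```

An empty list means every module in the mapping was found. Anything reported as missing is best installed through conda, as recommended above.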

Optional¶

mpi4py is needed if MPI support is desired.

Note

The mpi4py installation must be compiled against the system-installed MPI implementation used to launch jobs. Using the conda version of mpi4py together with the conda-provided mpirun is the simplest way to avoid any issues.

The OpenEye toolkit and Python wrappers can be used to enable free energy calculations to be set up directly from multiple supported OpenEye formats, including Tripos mol2, PDB, SMILES, and IUPAC names (requires academic or commercial license). Note that PDB and mol2 are supported through the pure AmberTools pipeline as well, though this does not provide access to the OpenEye AM1-BCC charging pipeline.

cython is an optional dependency for the replica-exchange code.

Hardware¶

Supported hardware¶

YANK makes use of OpenMM, a GPU-accelerated framework for molecular simulation. This allows the calculations to take advantage of hardware that supports CUDA (such as NVIDIA GPUs) or OpenCL (NVIDIA and ATI GPUs, as well as some processors). OpenMM also supports a multithreaded CPU platform which can be used if no CUDA or OpenCL resources are available.

OpenMM requires AMD cards to support the most recent Catalyst drivers and NVIDIA cards to support at least CUDA 7.5.

Recommended Hardware¶

We have found the best price/performance results are currently obtained with NVIDIA GTX-class consumer-grade cards, such as the GTX-780, GTX-980, GTX-1080, and GTX-Titan cards. You can find some benchmarks for OpenMM on several classes of recent GPUs at openmm.org.

Ross Walker and the Amber GPU developers maintain a set of excellent pages with good inexpensive GPU hardware recommendations that will also work well with OpenMM and YANK.

Installing from Source¶

Note

We recommend that only developers who want to modify the YANK code install from source. Users who want to use the latest development version are advised to install the Development build conda package instead.

Installing from the GitHub Source Repository¶

Installing from source is only recommended for developers that wish to modify YANK or the algorithms it uses. Installation via conda is preferred for all other users.

Clone the source code repository from GitHub.

$ git clone git@github.com:choderalab/yank.git
$ cd yank/
$ python setup.py install

If you wish to install into a different path (often preferred for development), use

$ python setup.py install --prefix=<path_to_install_location>

setup.py will try to install some of the dependencies, or at least check that you have them installed and raise an error if you do not. Not all dependencies can be installed via pip, so if installation fails because of unmet dependencies, you will have to install them yourself, for example with conda.

Testing your Installation¶

Test your YANK installation to make sure everything is behaving properly on your machine:

$ yank selftest

This will not only check that installation paths are correct, but also run a battery of tests that ensure any automatically detected GPU hardware is behaving as expected. Please also check that YANK has access to the expected platforms and the correct CUDA version if CUDA is installed in a non-standard location.

Running on the Cloud¶

Amazon EC2 now provides Linux GPU instances with high-performance GPUs and inexpensive on-demand and spot pricing (g2.2xlarge). We will soon provide ready-to-use images to let you quickly get started on EC2.

We are also exploring building Docker containers for rapid, reproducible, portable deployment of YANK to new compute environments. Stay tuned!