Installation

Installing via conda

The simplest way to install YANK is via the conda package manager. Packages are provided on the omnia Anaconda Cloud channel for Linux, OS X, and Windows. The yank Anaconda Cloud page has useful instructions and download statistics.

If you are using the Anaconda scientific Python distribution, you already have the conda package manager installed. If not, the quickest way to get started is to install the Miniconda distribution, a lightweight, minimal installation of Anaconda Python.

On Linux, you can install the Python 3.6 version into $HOME/miniconda3 with:

$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash ./Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
$ export PATH="$HOME/miniconda3/bin:$PATH"

On OS X, use the OS X installer instead:

$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
$ bash ./Miniconda3-latest-MacOSX-x86_64.sh -b -p $HOME/miniconda3
$ export PATH="$HOME/miniconda3/bin:$PATH"

You may want to add the new $PATH extension to your ~/.bashrc file to ensure Anaconda Python is used by default. Note that YANK will be installed into this local Python installation, so you do not need to worry about disrupting existing Python installations.

Note

conda installation is the preferred method since all dependencies are automatically fetched and installed for you.


Release build

You can install the latest stable release build of YANK via the conda package with

$ conda config --add channels omnia --add channels conda-forge
$ conda install yank

This version is recommended for all users not actively developing new algorithms for alchemical free energy calculations.

Note

conda will automatically install dependencies from binary packages, including difficult-to-install packages such as OpenMM, numpy, and scipy. This is really the easiest way to get started.


Development build

The bleeding-edge, absolute latest, very likely unstable development build of YANK is available on GitHub and can be obtained by installing from source; it is installed into whatever conda environment is currently active.

Warning

Development builds may be unstable and are generally subjected to less testing than releases. Use at your own risk!

Upgrading your installation

To update an earlier conda installation of YANK to the latest release version, you can use conda update:

$ conda update yank

Testing your installation

Test your YANK installation to make sure everything is behaving properly on your machine:

$ yank selftest

This will check that installation paths are correct and run a battery of tests that ensure any automatically detected GPU hardware is behaving as expected. If the OpenEye toolkit is installed, that installation will be checked as well.


Testing Available Platforms

You will want to make sure that all GPU-accelerated platforms available on your hardware are accessible to YANK. The simulation library that YANK runs on, OpenMM, can run on CPU, CUDA, and OpenCL platforms. The following command will check which platforms are available:

$ yank platforms

You should see an output that looks like the following:

Available OpenMM platforms:
 0 Reference
 1 CPU
 2 CUDA
 3 OpenCL

If your output is missing an option you expect, such as CUDA on NVIDIA GPUs, then please check that you have the correct drivers for your GPU installed. Non-standard CUDA installations require setting specific environment variables; please see Configuring Non-Standard CUDA Install Locations for setting these variables.
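
When checking several machines, the comparison against the platforms you expect can be scripted. The following is a minimal sketch and not part of YANK: it assumes the output of yank platforms has the shape shown above (an index followed by a platform name on each row), and the function names parse_platforms and missing_platforms are hypothetical.

```python
def parse_platforms(listing):
    """Return platform names from text shaped like the `yank platforms` listing."""
    names = []
    for line in listing.splitlines():
        parts = line.split()
        # Platform rows look like "<index> <name>"; the header line is skipped.
        if len(parts) == 2 and parts[0].isdigit():
            names.append(parts[1])
    return names


def missing_platforms(listing, expected=("CUDA", "OpenCL")):
    """Return the expected platform names that do not appear in the listing."""
    found = set(parse_platforms(listing))
    return [name for name in expected if name not in found]


# Illustrative listing from a machine where CUDA was not detected.
sample = "Available OpenMM platforms:\n 0 Reference\n 1 CPU\n 2 OpenCL\n"
print(missing_platforms(sample))  # ['CUDA']
```

On a real machine, the listing could be captured with subprocess.check_output(["yank", "platforms"], universal_newlines=True) and passed to missing_platforms.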


Configuring Non-Standard CUDA Install Locations

Multiple versions of CUDA can be installed on a single machine, such as on shared clusters. If this is the case, it may be necessary to set environment variables to make sure that the right version of CUDA is being used for YANK. You will need to know the full <path_to_cuda_install> and the location of that installation’s nvcc program (by default it is at <path_to_cuda_install>/bin/nvcc). Then run the following lines to set the correct variables:

export OPENMM_CUDA_COMPILER=<path_to_cuda_install>/bin/nvcc
export LD_LIBRARY_PATH=<path_to_cuda_install>/lib64:$LD_LIBRARY_PATH

You may want to add the new $OPENMM_CUDA_COMPILER variable and $LD_LIBRARY_PATH extension to your ~/.bashrc file to avoid setting them every time. If nvcc is installed in a different folder than the example, please use the correct path for your system.
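
If you launch YANK from a Python script rather than an interactive shell, the same two variables can be set for a single child process. This is a minimal sketch under stated assumptions: the helper name cuda_environment is hypothetical, the CUDA root /usr/local/cuda-8.0 is only an example path, and nvcc and the libraries are assumed to sit in the default bin and lib64 subfolders described above.

```python
import os


def cuda_environment(cuda_root, base_env=None):
    """Return a copy of the environment with the two CUDA variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Same value as: export OPENMM_CUDA_COMPILER=<path_to_cuda_install>/bin/nvcc
    env["OPENMM_CUDA_COMPILER"] = os.path.join(cuda_root, "bin", "nvcc")
    # Same effect as prepending <path_to_cuda_install>/lib64 to LD_LIBRARY_PATH
    lib_dir = os.path.join(cuda_root, "lib64")
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + os.pathsep + existing if existing else lib_dir
    return env


# Example CUDA root only; substitute the path used on your system.
env = cuda_environment("/usr/local/cuda-8.0", base_env={})
print(env["OPENMM_CUDA_COMPILER"])
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run, for example subprocess.run(["yank", "platforms"], env=cuda_environment(...)), which leaves your shell configuration untouched.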

Configuring Your CUDA Devices

If you have CUDA-based cards, you will need to configure them to run in shared/Default Compute Mode, especially if you plan to run MPI on multiple CUDA cards.

If you run nvidia-smi on your device, you will see a sample output that looks like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 680     Off  | 0000:03:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 680     Off  | 0000:04:00.0     N/A |                  N/A |
| 30%   32C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 680     Off  | 0000:83:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 680     Off  | 0000:84:00.0     N/A |                  N/A |
| 30%   33C    P8    N/A /  N/A |      0MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

The Compute M. column on the right side should be set to Default for your device(s). If it is not, you can set the mode of the card(s) with the following: nvidia-smi -i <List of Dev IDs> -c 0, where <List of Dev IDs> is a comma-separated list of GPU indices with no spaces. For this case, you can write: nvidia-smi -i 0,1,2,3 -c 0.
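
The compute mode can also be read in a machine-readable form instead of from the table. The sketch below is an illustration under stated assumptions, not YANK's own check: it assumes your driver's nvidia-smi accepts --query-gpu=index,compute_mode with --format=csv,noheader, the function names are hypothetical, and the sample text is made up to show one card in a non-default mode.

```python
import subprocess


def parse_compute_modes(csv_text):
    """Map GPU index -> compute mode from rows like "0, Default"."""
    modes = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        index, mode = [field.strip() for field in line.split(",", 1)]
        modes[int(index)] = mode
    return modes


def non_default_gpus(csv_text):
    """Return the sorted indices of GPUs whose compute mode is not Default."""
    modes = parse_compute_modes(csv_text)
    return sorted(i for i, mode in modes.items() if mode != "Default")


def query_compute_modes():
    """Run nvidia-smi (assumed to be on PATH) and return its CSV output."""
    command = ["nvidia-smi", "--query-gpu=index,compute_mode",
               "--format=csv,noheader"]
    return subprocess.check_output(command, universal_newlines=True)


# Made-up sample output: GPU 1 is not in Default mode.
sample = "0, Default\n1, Exclusive_Process\n2, Default\n3, Default\n"
print(non_default_gpus(sample))  # [1]
```

Any index reported by non_default_gpus(query_compute_modes()) can then be reset with the nvidia-smi -i <List of Dev IDs> -c 0 command given above.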

YANK can also check this status for you through yank selftest. Part of that command attempts to run nvidia-smi and infer the Compute Mode of any CUDA-capable GPU detected. Because this information is inferred by parsing the output, it may not be exact; please double-check it yourself.


Optional Tools

The OpenEye toolkit and Python wrappers can be installed to enable free energy calculations to be set up directly from multiple supported small molecule formats, including

  • SDF
  • SMILES
  • IUPAC names
  • Tripos mol2
  • PDB

Note that PDB and mol2 are supported through the pure AmberTools pipeline as well, though this does not provide access to the OpenEye AM1-BCC charging pipeline.

Use of the OpenEye toolkit requires an academic or commercial license.

To install these tools into your conda environment, use pip:

$ pip install -i https://pypi.anaconda.org/OpenEye/simple OpenEye-toolkits

Note that you will need to configure your $OE_LICENSE environment variable to point to a valid license file.
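
A common failure is an $OE_LICENSE variable that is unset or points to a file that no longer exists. The following minimal sketch checks only those two conditions; it does not validate the license itself, and the function name check_oe_license is hypothetical.

```python
import os


def check_oe_license(env=None):
    """Report whether OE_LICENSE is set and points to an existing file."""
    env = os.environ if env is None else env
    path = env.get("OE_LICENSE")
    if not path:
        return "OE_LICENSE is not set"
    if not os.path.isfile(path):
        return "OE_LICENSE points to a missing file: {}".format(path)
    return "ok"


# Checks the current environment; the result depends on your machine.
print(check_oe_license())
```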

You can test your OpenEye installation with

$ python -m openeye.examples.openeye_tests

Supported platforms and environments

Software

YANK runs on Python 3.5 and Python 3.6.

We no longer support Python 2.X.
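
To confirm that the interpreter in your active environment is one of the versions named above, a short check such as the following can be used. It is a sketch only; the name is_supported_python and the SUPPORTED set are illustrative and simply restate this section.

```python
import sys

# Versions named in this section; adjust if the supported list changes.
SUPPORTED = {(3, 5), (3, 6)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is in SUPPORTED."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) in SUPPORTED


print(is_supported_python((3, 6, 4)))   # True
print(is_supported_python((2, 7, 15)))  # False
```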

Dependencies

YANK uses a number of tools in order to allow the developers to focus on developing efficient algorithms involved in alchemical free energy calculations, rather than reinventing basic software, numerical, and molecular simulation infrastructure.

Warning

Installation of these prerequisites by hand is not recommended; all required dependencies can be installed via the conda package manager.

Note

This list is taken directly from YANK’s conda-recipe/meta.yaml to provide a single source for dependencies.

  build:
    - python
    - cython
    - setuptools

  run:
    - python
    - pandas
    - numpy >=1.11
    - scipy
    - cython
    - netcdf4 ==1.3.1  # TODO: Fix this right after bugfix: "always return masked array by default, even if there are no masked values"
    - openmm >=7.1
    - mdtraj >=1.7.2
    - openmmtools >=0.15.0
    - pymbar
    - ambermini >=16.16.0
    - docopt
    - openmoltools >=0.7.5
    - mpi4py
    - pyyaml
    - clusterutils
    - sphinxcontrib-bibtex
    - cerberus ==1.1.*
    - matplotlib
    - jupyter
    - pdbfixer
    - libnetcdf >=4.6.0

Optional

Note

The mpi4py installation must be compiled against the system-installed MPI implementation used to launch jobs. Using the conda version of mpi4py together with the conda-provided mpirun is the simplest way to avoid any issues.

  • The OpenEye toolkit and Python wrappers can be used to enable free energy calculations to be set up directly from multiple supported OpenEye formats, including Tripos mol2, PDB, SMILES, and IUPAC names (requires academic or commercial license). Note that PDB and mol2 are supported through the pure AmberTools pipeline as well, though this does not provide access to the OpenEye AM1-BCC charging pipeline.
  • cython is an optional dependency for the replica-exchange code.
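
If an installation misbehaves, it can help to see which of the dependencies listed above are importable from the active environment. The sketch below is a rough diagnostic under stated assumptions: it covers only a subset of the list, the mapping from conda package name to Python import name is written out by hand, the function name missing_packages is hypothetical, and it does not check version constraints.

```python
import importlib.util

# conda package name -> Python import name (a hand-written subset of the list)
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "netcdf4": "netCDF4",
    "mdtraj": "mdtraj",
    "openmmtools": "openmmtools",
    "pymbar": "pymbar",
    "pyyaml": "yaml",
    "docopt": "docopt",
    "cerberus": "cerberus",
    "mpi4py": "mpi4py",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the package names whose import module cannot be located."""
    return sorted(
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


# The result depends on the environment; an empty list means all were found.
print(missing_packages())
```

Anything reported as missing is best installed through conda, as recommended above, rather than by hand.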

Hardware

Supported hardware

YANK makes use of OpenMM, a GPU-accelerated framework for molecular simulation. This allows the calculations to take advantage of hardware that supports CUDA (such as NVIDIA GPUs) or OpenCL (NVIDIA and ATI GPUs, as well as some processors). OpenMM also supports a multithreaded CPU platform which can be used if no CUDA or OpenCL resources are available.

OpenMM requires AMD cards that support the most recent Catalyst drivers and NVIDIA cards that support at least CUDA 7.5.

Installing from Source

Note

We recommend that only developers who want to modify the YANK code install from source. Users who want to use the latest development version are advised to install the Development build conda package instead.

Installing from the GitHub Source Repository

Installing from source is only recommended for developers that wish to modify YANK or the algorithms it uses. Installation via conda is preferred for all other users.

Clone the source code repository from GitHub.

$ git clone git@github.com:choderalab/yank.git
$ cd yank/
$ python setup.py install

If you wish to install into a different path (often preferred for development), use

$ python setup.py install

setup.py will try to install some of the dependencies, or at least check that you have them installed and raise an error if you do not. Note that not all dependencies can be installed via pip, so you will have to install them yourself if installation fails due to unmet dependencies.

Testing your Installation

Test your YANK installation to make sure everything is behaving properly on your machine:

$ yank selftest

This will not only check that installation paths are correct, but also run a battery of tests that ensure any automatically detected GPU hardware is behaving as expected. Please also check that YANK has access to the expected platforms and the correct CUDA version if CUDA is installed in a non-standard location.

Running on the Cloud

Amazon EC2 now provides Linux GPU instances with high-performance GPUs and inexpensive on-demand and spot pricing (g2.2xlarge). We will soon provide ready-to-use images to let you quickly get started on EC2.

We are also exploring building Docker containers for rapid, reproducible, portable deployment of YANK to new compute environments. Stay tuned!