MultiState Module¶

This is the API for the Multi State sub-module and its classes.

MultiStateAnalyzers

MultiState¶

Multistate Sampling simulation algorithms, specific variants, and analyzers

This module provides a general facility for running multistate simulations over multiple thermodynamic states. It includes both a general base class and derived classes for special cases such as parallel tempering (in which the states differ only in temperature).

Provided classes include:

yank.multistate.MultiStateSampler

Base class for general parallel multistate simulations over multiple thermodynamic states.
yank.multistate.ReplicaExchangeSampler

Derived class from MultiStateSampler which allows sampled thermodynamic states to swap based on Hamiltonian Replica Exchange
yank.multistate.ParallelTemperingSampler

Convenience subclass of ReplicaExchangeSampler for parallel tempering simulations (one System object, many temperatures).
yank.multistate.SAMSSampler

Single-replica sampler which samples through multiple thermodynamic states on the fly.
yank.multistate.MultiStateReporter

Replica Exchange reporter class to store all variables and data

Analyzers¶

The MultiState module also provides analysis modules to analyze simulations and compute observables from data generated under any of the MultiStateSampler classes.

Extending and Subclassing¶

Subclassing a sampler and analyzer is done by importing and extending any of the following:

The base MultiStateSampler from multistatesampler

The base MultiStateReporter from multistatereporter

The base MultiStateAnalyzer or PhaseAnalyzer and base ObservablesRegistry from multistateanalyzer

COPYRIGHT¶

Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.

LICENSE¶

This code is licensed under the latest available version of the MIT License.

MultiStateSampler¶

Base multi-thermodynamic state multistate class

class yank.multistate.multistatesampler.MultiStateSampler(mcmc_moves=None, number_of_iterations=1, online_analysis_interval=200, online_analysis_target_error=0.0, online_analysis_minimum_iterations=200, locality=None)[source]¶

Base class for samplers that sample multiple thermodynamic states using one or more replicas.

This base class provides a general facility for multistate simulations, allowing any set of thermodynamic states to be specified. If instantiated on its own, the thermodynamic state index associated with each replica is fixed at creation and replica mixing does not change it, so each replica remains in its original thermodynamic state.

Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.

Parameters:

mcmc_moves : MCMCMove or list of MCMCMove, optional

The MCMCMove used to propagate the thermodynamic states. If a list of MCMCMoves, they will be assigned to the corresponding thermodynamic state on creation. If None is provided, Langevin dynamics with 2 fs timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.

number_of_iterations : int or infinity, optional, default: 1

The number of iterations to perform. Both float('inf') and numpy.inf are accepted for infinity. If you set this to infinity, be sure to also set online_analysis_interval.

online_analysis_interval : None or Int >= 1, optional, default: 200

Choose the interval at which to perform online analysis of the free energy.

After every interval, the simulation will be stopped and the free energy estimated.

If the error in the free energy estimate is at or below online_analysis_target_error, then the simulation will be considered completed.

If set to None, then no online analysis is performed

online_analysis_target_error : float >= 0, optional, default 0.0

The target error for the online analysis measured in kT per phase.

Once the error in the free energy estimate is at or below this value, the phase will be considered complete.

If online_analysis_interval is None, this option does nothing.

Default is set to 0.0 since online analysis runs by default, but a finite number_of_iterations should also be set to ensure there is some stop condition. If the target error is 0 and an infinite number of iterations is set, then the sampler will run until the user stops it manually.

online_analysis_minimum_iterations : int >= 0, optional, default 200

Set the minimum number of iterations which must pass before online analysis is carried out.

Since the initial samples are unlikely to yield a good estimate of the free energy, they are skipped to save time. If online_analysis_interval is None, this option does nothing.

locality : int > 0, optional, default None

If None, the energies at all states will be computed for every replica each iteration. If int > 0, energies will only be computed for states range(max(0, state-locality), min(n_states, state+locality)).
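As a rough illustration of the locality rule above, the following sketch lists the state indices at which energies would be evaluated for a replica currently in a given state. The helper name `local_states` is hypothetical and not part of the YANK API; it only mirrors the range expression quoted in the parameter description.

```python
def local_states(state, n_states, locality=None):
    """Return the state indices at which energies would be evaluated.

    Hypothetical helper mirroring the documented expression
    range(max(0, state - locality), min(n_states, state + locality)).
    """
    if locality is None:
        # No locality: energies are computed at every state.
        return list(range(n_states))
    lower = max(0, state - locality)
    upper = min(n_states, state + locality)
    return list(range(lower, upper))


print(local_states(0, 5, locality=2))  # [0, 1]
print(local_states(2, 5, locality=2))  # [0, 1, 2, 3]
print(local_states(2, 5))              # [0, 1, 2, 3, 4]
```

Because range excludes its upper bound, the expression as written covers `locality` states below the current state but only `locality - 1` states above it.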

Attributes:

n_replicas: The integer number of replicas (read-only).
n_states: The integer number of thermodynamic states (read-only).
iteration: The integer current iteration of the simulation (read-only).
mcmc_moves: A copy of the MCMCMoves list used to propagate the simulation.
sampler_states: A copy of the sampler states list at the current iteration.
metadata: A copy of the metadata dictionary passed on creation (read-only).
is_completed: Check if we have reached any of the stop target criteria (read-only)
Parameters:

number_of_iterations – Maximum number of integer iterations that will be run
online_analysis_interval – How frequently to carry out online analysis in number of iterations
online_analysis_target_error – Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
online_analysis_minimum_iterations – Minimum number of iterations needed before online analysis is run as int
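The interplay between the iteration limit and the online-analysis options can be sketched as two small predicates. This is a minimal sketch under stated assumptions, not the library's code: the function names are hypothetical, and the exact rule for when analysis fires (here, every `interval` iterations once the minimum has passed) is an assumption.

```python
import math


def should_analyze(iteration, interval, minimum_iterations):
    """Decide whether online analysis runs at this iteration (assumed rule)."""
    if interval is None:
        # Online analysis is disabled entirely.
        return False
    return iteration >= minimum_iterations and iteration % interval == 0


def is_completed(iteration, number_of_iterations, last_error, target_error):
    """Return True once either documented stop criterion is met."""
    reached_limit = iteration >= number_of_iterations
    # last_error is None when no online estimate has been produced yet.
    converged = last_error is not None and last_error <= target_error
    return reached_limit or converged


# An unbounded run stops only once the error estimate reaches the target.
print(is_completed(5, math.inf, 0.3, 0.2))  # False
print(is_completed(5, math.inf, 0.1, 0.2))  # True
```

With a target error of 0.0 and an infinite iteration limit, neither predicate is ever satisfied in practice, which is why a finite number_of_iterations is recommended.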

classmethod from_storage(storage)[source]¶

Constructor from an existing storage file.

Parameters:

storage : str or Reporter

If str: the path to the storage file. If Reporter: uses the Reporter options. In the future this will be able to take a Storage class as well.

Returns:

sampler : MultiStateSampler

A new instance of MultiStateSampler (or subclass) in the same state as the last stored iteration.

class Status(iteration, target_error, is_completed)¶

count(value) → integer -- return number of occurrences of value¶

index(value[, start[, stop]]) → integer -- return first index of value.¶: Raises ValueError if the value is not present.

is_completed¶: Alias for field number 2

iteration¶: Alias for field number 0

target_error¶: Alias for field number 1

classmethod read_status(storage)[source]¶

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full sampler object from disk.

Parameters:

storage : str or Reporter

The path to the storage file or the reporter object.

Returns:

status : Status

The status of the calculation. It has three fields: iteration, target_error, and is_completed.

n_states¶: The integer number of thermodynamic states (read-only).

n_replicas¶: The integer number of replicas (read-only).

iteration¶

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves¶

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

sampler_states¶

A copy of the sampler states list at the current iteration.

This can be set only before running.

is_periodic¶: Return True if system is periodic, False if not, and None if not initialized

online_analysis_interval¶: interval to carry out online analysis

metadata¶: A copy of the metadata dictionary passed on creation (read-only).

is_completed¶: Check if we have reached any of the stop target criteria (read-only)

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)[source]¶

Create new multistate sampler simulation.

Parameters:

thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional, default: None

Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution

If len(thermodynamic_states) > len(sampler_states): the first and last states are assigned first, then the remaining sampler_states are spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then only the first thermodynamic_state will be chosen

If len(thermodynamic_states) < len(sampler_states): each thermodynamic_state receives an equal number of sampler_states until too few sampler_states remain to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.

unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energies can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.
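The default distribution rules for initial_thermodynamic_states described above can be sketched as follows. This is one plausible reading of those rules, written for illustration only: the function name `default_initial_states` is hypothetical, and the exact spacing and rounding used by the library may differ.

```python
def default_initial_states(n_states, n_sampler_states):
    """Assign a thermodynamic state index to each sampler state.

    Hypothetical sketch: full passes over all states first, then the
    leftover sampler states are spread evenly from first to last state.
    """
    full_rounds, remainder = divmod(n_sampler_states, n_states)
    indices = []
    for _ in range(full_rounds):
        # Every state receives one sampler state per full pass.
        indices.extend(range(n_states))
    if remainder == 1:
        # A single leftover sampler state goes to the first state.
        indices.append(0)
    elif remainder > 1:
        # First and last states are included, the rest spaced evenly between.
        step = (n_states - 1) / (remainder - 1)
        indices.extend(round(i * step) for i in range(remainder))
    return indices


print(default_initial_states(3, 3))  # [0, 1, 2]
print(default_initial_states(5, 3))  # [0, 2, 4]
print(default_initial_states(2, 5))  # [0, 1, 0, 1, 0]
```

Passing initial_thermodynamic_states explicitly bypasses any default assignment, which is the safer choice when a particular starting layout matters.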

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)[source]¶

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:

tolerance : simtk.unit.Quantity, optional

Minimization tolerance (units of energy/mole/length, default is 1.0 * unit.kilojoules_per_mole / unit.nanometers).

max_iterations : int, optional

Maximum number of iterations for minimization. If 0, minimization continues until converged.

equilibrate(n_iterations, mcmc_moves=None)[source]¶

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:

n_iterations : int

Number of equilibration iterations.

mcmc_moves : MCMCMove or list of MCMCMove, optional

Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

run(n_iterations=None)[source]¶

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:

n_iterations : int, optional

If specified, at most this number of iterations will be run (default is None).

extend(n_iterations)[source]¶

Extend the simulation by the given number of iterations.

Contrary to run(), this will extend the number of iterations past number_of_iterations if requested.

Parameters:

n_iterations : int

The number of iterations to run.
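The difference between run() and extend() can be illustrated with a toy iteration counter. This is a sketch of the documented behavior only, not the YANK implementation; the class `ToySampler` is hypothetical and assumes a finite iteration limit.

```python
class ToySampler:
    """Toy iteration counter mimicking the documented run/extend semantics."""

    def __init__(self, number_of_iterations=1):
        self.number_of_iterations = number_of_iterations
        self.iteration = 0

    def run(self, n_iterations=None):
        # run() never goes past the configured iteration limit.
        remaining = self.number_of_iterations - self.iteration
        if n_iterations is not None:
            remaining = min(remaining, n_iterations)
        self.iteration += max(0, remaining)

    def extend(self, n_iterations):
        # extend() raises the limit when needed, then runs.
        target = self.iteration + n_iterations
        if target > self.number_of_iterations:
            self.number_of_iterations = target
        self.run(n_iterations)


sampler = ToySampler(number_of_iterations=2)
sampler.run()
sampler.run(n_iterations=1)   # No effect: the limit of 2 was already reached.
print(sampler.iteration)      # 2
sampler.extend(n_iterations=1)
print(sampler.iteration)      # 3
```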

classmethod default_options()[source]¶: dict of all default class options (keyword arguments for __init__ for class and superclasses)

options¶: dict of all class options (keyword arguments for __init__ for class and superclasses)

ReplicaExchangeSampler¶

Derived multi-thermodynamic state multistate class with exchanging configurations between replicas

class yank.multistate.replicaexchange.ReplicaExchangeSampler(replica_mixing_scheme='swap-all', **kwargs)[source]¶

Replica-exchange simulation facility.

This MultiStateSampler class provides a general replica-exchange simulation facility, allowing any set of thermodynamic states to be specified, along with a set of initial positions to be assigned to the replicas in a round-robin fashion.

No distinction is made between one-dimensional and multidimensional replica layout. By default, the replica mixing scheme attempts to mix all replicas to minimize the slow diffusion normally found in multidimensional replica exchange simulations. (Modifying the ‘replica_mixing_scheme’ setting allows the traditional ‘neighbor swaps only’ scheme to be used.)
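Whichever mixing scheme selects the candidate pairs, a single swap attempt is conventionally accepted with the Metropolis criterion on reduced potentials. The sketch below shows that standard criterion; it is not taken from the YANK source, and the function names and the `u[replica][state]` layout are assumptions made for illustration.

```python
import math
import random


def swap_log_acceptance(u, i, j, state_of):
    """Log acceptance probability for swapping the states of replicas i and j.

    u[r][s] is the reduced potential of replica r's configuration
    evaluated at thermodynamic state s (assumed layout).
    """
    a, b = state_of[i], state_of[j]
    energy_before = u[i][a] + u[j][b]
    energy_after = u[i][b] + u[j][a]
    return -(energy_after - energy_before)


def attempt_swap(u, i, j, state_of, rng=random):
    """Swap the states of replicas i and j in place if the move is accepted."""
    log_p = swap_log_acceptance(u, i, j, state_of)
    if log_p >= 0.0 or rng.random() < math.exp(log_p):
        state_of[i], state_of[j] = state_of[j], state_of[i]
        return True
    return False


reduced_potentials = [[3.0, 0.0], [0.0, 3.0]]
assignment = [0, 1]
# The swap lowers the total reduced energy, so it is always accepted.
print(attempt_swap(reduced_potentials, 0, 1, assignment), assignment)
```

A neighbor-only scheme would call such a routine on adjacent state pairs, while an all-pairs scheme would attempt many randomly chosen pairs per iteration.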

Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.

Parameters:

mcmc_moves : MCMCMove or list of MCMCMove, optional

The MCMCMove used to propagate the states. If a list of MCMCMoves, they will be assigned to the corresponding thermodynamic state on creation. If None is provided, Langevin dynamics with 2 fs timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.

number_of_iterations : int or infinity, optional, default: 1

The number of iterations to perform. Both float('inf') and numpy.inf are accepted for infinity. If you set this to infinity, be sure to also set online_analysis_interval.

replica_mixing_scheme : ‘swap-all’, ‘swap-neighbors’ or None, Default: ‘swap-all’

The scheme used to swap thermodynamic states between replicas.

online_analysis_interval : None or Int >= 1, optional, default: 200 (inherited from MultiStateSampler)

Choose the interval at which to perform online analysis of the free energy.

After every interval, the simulation will be stopped and the free energy estimated.

If the error in the free energy estimate is at or below online_analysis_target_error, then the simulation will be considered completed.

online_analysis_target_error : float >= 0, optional, default 0.0 (inherited from MultiStateSampler)

The target error for the online analysis measured in kT per phase.

Once the error in the free energy estimate is at or below this value, the phase will be considered complete.

If online_analysis_interval is None, this option does nothing.

online_analysis_minimum_iterations : int >= 0, optional, default 200 (inherited from MultiStateSampler)

Set the minimum number of iterations which must pass before online analysis is carried out.

Since the initial samples are unlikely to yield a good estimate of the free energy, they are skipped to save time. If online_analysis_interval is None, this option does nothing.

Examples

Parallel tempering simulation of alanine dipeptide in implicit solvent (replica exchange among temperatures). This is just an illustrative example; use the ParallelTemperingSampler class for actual production parallel tempering simulations.

Create the system.

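>>> import math
>>> import os
>>> import tempfile
>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> from yank.multistate import ReplicaExchangeSampler, MultiStateReporter
>>> testsystem = testsystems.AlanineDipeptideImplicit()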

Create thermodynamic states for parallel tempering with exponentially-spaced schedule.

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(n_replicas-1)) - 1.0) / (math.e - 1.0)
...                 for i in range(n_replicas)]
>>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T)
...                         for T in temperatures]
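To see what this schedule produces without installing OpenMM, the same formula can be evaluated on plain floats. This is only a unit-free sketch of the expression above; the function name `exponential_temperatures` is hypothetical, and it assumes at least two replicas.

```python
import math


def exponential_temperatures(t_min, t_max, n_replicas):
    """Exponentially spaced temperatures from t_min to t_max (plain floats)."""
    return [
        t_min + (t_max - t_min) * (math.exp(i / (n_replicas - 1)) - 1.0) / (math.e - 1.0)
        for i in range(n_replicas)
    ]


temperatures = exponential_temperatures(298.0, 600.0, 3)
print([round(t, 1) for t in temperatures])
```

The spacing widens toward the high-temperature end, so more replicas sit near T_min than near T_max.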

Initialize simulation object with options. Run with a GHMC integrator.

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = ReplicaExchangeSampler(mcmc_moves=move, number_of_iterations=2)

Create simulation with its storage file (in a temporary directory) and run.

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1)
>>> simulation.create(thermodynamic_states=thermodynamic_states,
...                   sampler_states=states.SamplerState(testsystem.positions),
...                   storage=reporter)
>>> simulation.run()  # This runs for a maximum of 2 iterations.
>>> simulation.iteration
2
>>> simulation.run(n_iterations=1)
>>> simulation.iteration
2

Resume a simulation from an existing storage file and extend it beyond the original number of iterations.

>>> del simulation
>>> simulation = ReplicaExchangeSampler.from_storage(reporter)
>>> simulation.extend(n_iterations=1)
>>> simulation.iteration
3

You can extract various information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every iteration run so far.

>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1)
>>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4))
>>> len(sampler_states)
3
>>> sampler_states[-1].positions.shape  # Alanine dipeptide has 22 atoms.
(22, 3)

Clean up.

>>> os.remove(storage_path)

Parameters:

number_of_iterations – Maximum number of integer iterations that will be run
replica_mixing_scheme – Scheme which describes how replicas are exchanged each iteration as string
online_analysis_interval – How frequently to carry out online analysis in number of iterations
online_analysis_target_error – Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
online_analysis_minimum_iterations – Minimum number of iterations needed before online analysis is run as int

Attributes:

n_replicas: The integer number of replicas (read-only).
iteration: The integer current iteration of the simulation (read-only).
mcmc_moves: A copy of the MCMCMoves list used to propagate the simulation.
sampler_states: A copy of the sampler states list at the current iteration.
metadata: A copy of the metadata dictionary passed on creation (read-only).
is_completed: Check if we have reached any of the stop target criteria (read-only)

class Status(iteration, target_error, is_completed)¶

count(value) → integer -- return number of occurrences of value¶

index(value[, start[, stop]]) → integer -- return first index of value.¶: Raises ValueError if the value is not present.

is_completed¶: Alias for field number 2

iteration¶: Alias for field number 0

target_error¶: Alias for field number 1

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)¶

Create new multistate sampler simulation.

Parameters:

thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional, default: None

Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution

If len(thermodynamic_states) > len(sampler_states): the first and last states are assigned first, then the remaining sampler_states are spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then only the first thermodynamic_state will be chosen

If len(thermodynamic_states) < len(sampler_states): each thermodynamic_state receives an equal number of sampler_states until too few sampler_states remain to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.

unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energies can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.

default_options()¶: dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)¶

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:	n_iterations : int Number of equilibration iterations. mcmc_moves : MCMCMove or list of MCMCMove, optional Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)¶

Extend the simulation by the given number of iterations.

Contrary to run(), this will extend the number of iterations past number_of_iterations if requested.

Parameters:

n_iterations : int

The number of iterations to run.

from_storage(storage)¶

Constructor from an existing storage file.

Parameters:	storage : str or Reporter If str: The path to the storage file. If `Reporter`: uses the `Reporter` options In the future this will be able to take a Storage class as well.
Returns:	sampler : MultiStateSampler A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.

is_completed¶: Check if we have reached any of the stop target criteria (read-only)

is_periodic¶: Return True if system is periodic, False if not, and None if not initialized

iteration¶

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves¶

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata¶: A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:	tolerance : simtk.unit.Quantity, optional Minimization tolerance (units of energy/mole/length, default is `1.0 * unit.kilojoules_per_mole / unit.nanometers`). max_iterations : int, optional Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas¶: The integer number of replicas (read-only).

n_states¶: The integer number of thermodynamic states (read-only).

options¶: dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)¶

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:	storage : str or Reporter The path to the storage file or the reporter object.
Returns:	status : ReplicaExchange.Status The status of the replica-exchange calculation. It has three fields: `iteration`, `target_error`, and `is_completed`.

run(n_iterations=None)¶

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:	n_iterations : int, optional If specified, only at most the specified number of iterations will be run (default is None).

sampler_states¶

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.replicaexchange.ReplicaExchangeAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶

The ReplicaExchangeAnalyzer is the analyzer for a simulation generated by a Replica Exchange sampler, implemented as a subclass of the MultiStateSamplerAnalyzer.

See also

PhaseAnalyzer, MultiStateSamplerAnalyzer

clear()¶

Reset all cached objects.

This must be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:

number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:

mixing_statistics : namedtuple

A namedtuple containing the following attributes:

transition_matrix: (nstates by nstates np.array)
eigenvalues: (nstates-dimensional np.array)
statistical_inefficiency: float
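To make the idea of a state transition matrix concrete, the sketch below counts observed state-to-state transitions across replica trajectories and normalizes each row. It is a simplified illustration under stated assumptions: the function name is hypothetical, and the estimator used by the analyzer itself (for example, any symmetrization of counts) may differ.

```python
def empirical_transition_matrix(replica_state_indices, n_states):
    """Row-normalized transition counts between thermodynamic states.

    replica_state_indices[k][n] is the state visited by replica k at
    iteration n (assumed layout).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for trajectory in replica_state_indices:
        # Count each consecutive pair of visited states.
        for before, after in zip(trajectory[:-1], trajectory[1:]):
            counts[before][after] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([count / total if total else 0.0 for count in row])
    return matrix


trajectories = [[0, 1, 1], [1, 0, 0]]
print(empirical_transition_matrix(trajectories, n_states=2))
```

Rows that are close to uniform indicate good mixing, while a nearly diagonal matrix indicates replicas that rarely change state.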

get_effective_energy_timeseries(energies=None, replica_state_indices=None)¶

Generate the effective energy (negative log deviance) timeseries for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:

energies : ndarray of shape (K,L,N), optional, Default: None: Energies from replicas K, sampled states L, and iterations N. If provided, then replica_state_indices must also be provided.
replica_state_indices : ndarray of shape (K,N), optional, Default: None: Integer indices of the sampled state (matching the L dimension in energies) that each replica K sampled at every iteration N. If provided, then energies must also be provided.

Returns:

u_n : ndarray of shape (N,): u_n[n] is the negative log deviance of the sample from iteration n. Timeseries used to determine equilibration time and statistical inefficiency.
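For a set of independent replicas, the joint density is a product of per-replica Boltzmann factors, so up to a constant the effective energy at each iteration is the sum of each replica's reduced potential at the state it currently occupies. The sketch below computes that sum with nested lists; the function name is hypothetical, and treating this sum as what the analyzer computes is an assumption.

```python
def effective_energy_timeseries(energies, replica_state_indices):
    """Sum each replica's reduced potential at its current state.

    energies[k][l][n]: replica k evaluated at state l, iteration n.
    replica_state_indices[k][n]: state occupied by replica k at iteration n.
    """
    n_replicas = len(energies)
    n_iterations = len(replica_state_indices[0])
    u_n = []
    for n in range(n_iterations):
        total = 0.0
        for k in range(n_replicas):
            state = replica_state_indices[k][n]
            total += energies[k][state][n]
        u_n.append(total)
    return u_n


energies = [[[1.0, 2.0], [3.0, 4.0]],
            [[5.0, 6.0], [7.0, 8.0]]]
indices = [[0, 1], [1, 0]]
print(effective_energy_timeseries(energies, indices))  # [8.0, 10.0]
```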

get_enthalpy()¶

Compute the difference in enthalpy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:	DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Difference in enthalpy from each state relative to each other state dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Error in the difference in enthalpy from each state relative to each other state

get_entropy()¶

Compute the difference in entropy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:	DeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Difference in entropy from each state relative to each other state dDeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Error in the difference in entropy from each state relative to each other state

get_free_energy()¶

Compute the free energy and error in free energy from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:	DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Difference in free energy from each state relative to each other state dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2) Error in the difference in free energy from each state relative to each other state

has_log_weights¶: Return True if the storage has log weights, False otherwise

kT¶

Quantity of Boltzmann constant times temperature of the phase, in units of energy per mole

Allows conversion between dimensionless energy and unit bearing energy
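As an illustration of that conversion, the sketch below computes kT in kJ/mol from a temperature and uses it to convert a dimensionless free energy. It uses plain floats rather than unit-bearing quantities, and the function names are hypothetical.

```python
# Molar gas constant (Boltzmann constant times Avogadro's number), in kJ/(mol K).
MOLAR_GAS_CONSTANT = 8.314462618e-3


def kT_in_kj_per_mol(temperature_kelvin):
    """Thermal energy per mole at the given temperature."""
    return MOLAR_GAS_CONSTANT * temperature_kelvin


def to_kj_per_mol(reduced_energy, temperature_kelvin):
    """Convert a dimensionless (reduced) energy to kJ/mol."""
    return reduced_energy * kT_in_kj_per_mol(temperature_kelvin)


print(round(kT_in_kj_per_mol(298.0), 4))  # 2.4777
print(round(to_kj_per_mol(-5.0, 298.0), 2))
```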

n_equilibration_iterations¶: int: The number of equilibration iterations.

n_iterations¶: int: The total number of iterations of the phase.

n_replicas¶: int: Number of replicas.

n_states¶: int: Number of sampled thermodynamic states.

name¶: User-readable string name of the phase

observables¶: List of observables that the instanced analyzer can compute/fetch.

read_energies()¶

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:

sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]: Potential energy matrix of the sampled states.
unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsampled_states, n_iterations]: Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If there are no unsampled states, this will be shape (0,N).
neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]: Boolean mask over the energy_matrix marking the neighborhood of states at which energies were computed.
replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]: States sampled by the replicas in the energy_matrix

read_logZ(iteration=None)¶

Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.

Parameters:	iteration : int or slice, optional, default=None If specified, iteration or slice of iterations to extract
Returns:	logZ : np.ndarray of shape [n_states, n_iterations] logZ[l,n] is the online logZ estimate for state `l` at iteration `n`

read_log_weights()¶

Extract log weights from the ncfile, if present. Returns ValueError if not present.

Returns:	log_weights : np.ndarray of shape [n_states, n_iterations] log_weights[l,n] is the log weight applied to state `l` during the collection of samples at iteration `n`

reference_states¶: Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension indexes the samplers, that the second dimension indexes the thermodynamic states at which energies were evaluated, and that an equal number of samples was drawn from each k’th sampler, UNLESS n_k is specified.

Parameters:

u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers, L = number of thermodynamic states, N’ = number of iterations from state k

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn. This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k

WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:

u_ln : np.ndarray of shape (L, N): Reduced, non-sparse data format. L = number of thermodynamic states, N = sum_k N_k. Note this is not N’.
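The reshaping described above can be sketched with nested lists: for each thermodynamic state, the samples from every sampler are concatenated, keeping only the first n_k[k] entries from sampler k. This is an illustrative stand-in rather than the library routine, and the function name `reformat_for_mbar` is hypothetical.

```python
def reformat_for_mbar(u_kln, n_k=None):
    """Flatten [sampler][state][iteration] data into [state][sample] data."""
    n_samplers = len(u_kln)
    n_states = len(u_kln[0])
    if n_k is None:
        # Assume every sampler contributed all of its stored iterations.
        n_k = [len(u_kln[k][0]) for k in range(n_samplers)]
    u_ln = [[] for _ in range(n_states)]
    for k in range(n_samplers):
        for l in range(n_states):
            # Trailing entries beyond n_k[k] are ignored.
            u_ln[l].extend(u_kln[k][l][:n_k[k]])
    return u_ln


u_kln = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 0], [9, 10, 0]]]
print(reformat_for_mbar(u_kln, n_k=[3, 2]))  # [[1, 2, 3, 7, 8], [4, 5, 6, 9, 10]]
```

Each output row has sum(n_k) entries, which matches the N = sum_k N_k noted in the return description.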

reporter¶: Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)¶

Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger

Parameters:

cutoff : float, optional, default=0.05: Only transition probabilities above cutoff will be printed.
number_equilibrated : int, optional, default=None: If specified, only samples number_equilibrated:end will be used in the analysis. If not specified, the best estimate from the internally held statistics is used.

statistical_inefficiency¶: float: The statistical inefficiency of the sampler.

use_online_data¶: Get the online data flag

ParallelTemperingSampler¶

Derived multi-thermodynamic state multistate class that exchanges configurations between replicas at different temperatures. This is a special case that accepts a single thermodynamic_state and a set of temperatures to sample. If you want to vary both temperatures and Hamiltonians, use ReplicaExchangeSampler with the temperatures pre-set.

class yank.multistate.paralleltempering.ParallelTemperingSampler(replica_mixing_scheme='swap-all', **kwargs)[source]¶

Parallel tempering simulation facility.

This class provides a facility for parallel tempering simulations. It is a subclass of ReplicaExchangeSampler, but provides efficiency improvements for parallel tempering simulations, so it should be preferred for this type of simulation. In particular, it makes use of the fact that the reduced potentials are linear in inverse temperature.

See also

MultiStateSampler, ReplicaExchangeSampler

Examples

Create the system.

>>> import os
>>> import tempfile
>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> from yank.multistate import ParallelTemperingSampler, MultiStateReporter
>>> testsystem = testsystems.AlanineDipeptideImplicit()

Create the reference thermodynamic state and the temperature range for an exponentially-spaced parallel tempering schedule.

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> reference_state = states.ThermodynamicState(system=testsystem.system, temperature=T_min)

Initialize simulation object with options. Run with a GHMC integrator.

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = ParallelTemperingSampler(mcmc_moves=move, number_of_iterations=2)

Create simulation with its storage file (in a temporary directory) and run.

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=10)
>>> simulation.create(reference_state,
...                   states.SamplerState(testsystem.positions),
...                   reporter, min_temperature=T_min,
...                   max_temperature=T_max, n_temperatures=n_replicas)
>>> simulation.run(n_iterations=1)

Clean up.

>>> os.remove(storage_path)

create(thermodynamic_state, sampler_states: list, storage, min_temperature=None, max_temperature=None, n_temperatures=None, temperatures=None, **kwargs)[source]¶

Initialize a parallel tempering simulation object.

Parameters:

thermodynamic_state : openmmtools.states.ThermodynamicState

Reference thermodynamic state that will be simulated at the given temperatures.

WARNING: This is a SINGLE state, not a list of states!

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. If a list of SamplerStates, they will be assigned to replicas in a round-robin fashion.

storage : str or Reporter

If str: path to the storage file; the default checkpoint options are used. If Reporter: an instanced Reporter class, from which the checkpoint information is read. In the future this will be able to take a Storage class as well.

min_temperature : simtk.unit.Quantity, optional

Minimum temperature (units of temperature, default is None).

max_temperature : simtk.unit.Quantity, optional

Maximum temperature (units of temperature, default is None).

n_temperatures : int, optional

Number of exponentially-spaced temperatures between min_temperature and max_temperature (default is None).

temperatures : list of simtk.unit.Quantity, optional

If specified, this list of temperatures will be used instead of min_temperature, max_temperature, and n_temperatures (units of temperature, default is None).

metadata : dict, optional

Simulation metadata to be stored in the file.

Notes

Either min_temperature, max_temperature, and n_temperatures must all be specified, or the list temperatures must be specified.
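As an illustration of what an exponentially-spaced schedule between min_temperature and max_temperature can look like, the sketch below reuses the spacing formula from the SAMSSampler example in this document. The function name `exponential_temperature_schedule` is hypothetical, temperatures are plain floats in kelvin rather than simtk quantities, and whether create() uses exactly this formula internally is an assumption.

```python
import math


def exponential_temperature_schedule(t_min, t_max, n_temperatures):
    """Return n_temperatures values from t_min to t_max with exponential spacing."""
    if n_temperatures < 2:
        raise ValueError("need at least two temperatures to span a range")
    return [
        t_min + (t_max - t_min)
        * (math.exp(i / (n_temperatures - 1)) - 1.0) / (math.e - 1.0)
        for i in range(n_temperatures)
    ]


temperatures = exponential_temperature_schedule(298.0, 600.0, 3)
for temperature in temperatures:
    print(f"{temperature:.2f} K")
```

The resulting list could then be passed through the temperatures argument (wrapped in the appropriate units) instead of the min_temperature, max_temperature, n_temperatures triple.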

class Status(iteration, target_error, is_completed)¶

count(value) → integer -- return number of occurrences of value¶

index(value[, start[, stop]]) → integer -- return first index of value.¶: Raises ValueError if the value is not present.

is_completed¶: Alias for field number 2

iteration¶: Alias for field number 0

target_error¶: Alias for field number 1
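The count, index, and "alias for field number" entries above are the members a Python namedtuple exposes. The sketch below uses a hypothetical stand-in with the same three field names to show how such a status object can be read by name or by position; that the real Status class is a namedtuple is inferred from this listing, not confirmed.

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields as the documented Status class.
Status = namedtuple("Status", ["iteration", "target_error", "is_completed"])

status = Status(iteration=250, target_error=None, is_completed=False)

# Field access by name and by position refer to the same values.
by_name = (status.iteration, status.target_error, status.is_completed)
by_position = (status[0], status[1], status[2])

# Inherited tuple helpers listed in the API above.
n_none = status.count(None)
where_false = status.index(False)
print(by_name == by_position, n_none, where_false)
```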

default_options()¶: dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)¶

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:

n_iterations : int: Number of equilibration iterations.
mcmc_moves : MCMCMove or list of MCMCMove, optional: Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)¶

Extend the simulation by the given number of iterations.

Unlike run(), this will extend the number of iterations past number_of_iterations if requested.

Parameters:

n_iterations : int: The number of iterations to run.

from_storage(storage)¶

Constructor from an existing storage file.

Parameters:

storage : str or Reporter: If str: the path to the storage file. If Reporter: uses the Reporter options. In the future this will be able to take a Storage class as well.

Returns:

sampler : MultiStateSampler: A new instance of MultiStateSampler (or subclass) in the same state as the last stored iteration.

is_completed¶: Check if we have reached any of the stop target criteria (read-only)

is_periodic¶: Return True if system is periodic, False if not, and None if not initialized

iteration¶

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves¶

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata¶: A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:

tolerance : simtk.unit.Quantity, optional: Minimization tolerance (units of energy/mole/length, default is `1.0 * unit.kilojoules_per_mole / unit.nanometers`).
max_iterations : int, optional: Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas¶: The integer number of replicas (read-only).

n_states¶: The integer number of thermodynamic states (read-only).

options¶: dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)¶

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:

storage : str or Reporter: The path to the storage file or the reporter object.

Returns:

status : ReplicaExchange.Status: The status of the replica-exchange calculation. It has three fields: `iteration`, `target_error`, and `is_completed`.

run(n_iterations=None)¶

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:

n_iterations : int, optional: If specified, at most this number of iterations will be run (default is None).
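The difference between run() and extend() comes down to how the iteration limit is treated. The toy class below models only the counters, with no simulation at all; `ToySampler` is hypothetical, and the way extend() raises the limit is an assumption based on the descriptions above.

```python
class ToySampler:
    """Hypothetical counter-only model of the run()/extend() iteration limits."""

    def __init__(self, number_of_iterations):
        self.number_of_iterations = number_of_iterations
        self.iteration = 0

    def run(self, n_iterations=None):
        # run() never goes past number_of_iterations.
        remaining = max(self.number_of_iterations - self.iteration, 0)
        if n_iterations is not None:
            remaining = min(remaining, n_iterations)
        self.iteration += remaining

    def extend(self, n_iterations):
        # extend() raises the limit when needed, then advances.
        target = self.iteration + n_iterations
        self.number_of_iterations = max(self.number_of_iterations, target)
        self.iteration = target


toy = ToySampler(number_of_iterations=2)
toy.run()
after_run = toy.iteration
toy.run(n_iterations=1)   # already at the limit, so nothing happens
after_second_run = toy.iteration
toy.extend(n_iterations=1)
after_extend = toy.iteration
print(after_run, after_second_run, after_extend)
```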

sampler_states¶

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.paralleltempering.ParallelTemperingAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶

The ParallelTemperingAnalyzer analyzes simulations generated by a ParallelTemperingSampler. It is implemented as a ReplicaExchangeAnalyzer because the sampler is a subclass of yank.multistate.ReplicaExchangeSampler.

See also

PhaseAnalyzer, ReplicaExchangeAnalyzer

clear()¶

Reset all cached objects.

This must be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:

number_equilibrated : int, optional, default=None: If specified, only samples `number_equilibrated:end` will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:

mixing_statistics : namedtuple: A namedtuple containing the following attributes:

- `transition_matrix`: (nstates by nstates `np.array`)
- `eigenvalues`: (nstates-dimensional `np.array`)
- `statistical_inefficiency`: float
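To make the transition matrix concrete, the sketch below estimates state-to-state transition probabilities from a small table of replica state indices. It is a simplified illustration: the function name is hypothetical, the symmetrized counting (each observed transition counted in both directions) is an assumption about how such an estimate may be built, and the eigenvalue and statistical inefficiency steps are omitted.

```python
def empirical_transition_matrix(replica_state_indices, n_states):
    """Estimate transition probabilities from replica trajectories.

    replica_state_indices[k][n] is the state index of replica k at iteration n.
    """
    counts = [[0.0] * n_states for _ in range(n_states)]
    for trajectory in replica_state_indices:
        for i, j in zip(trajectory[:-1], trajectory[1:]):
            # Count each transition in both directions (assumed symmetrization).
            counts[i][j] += 0.5
            counts[j][i] += 0.5
    matrix = []
    for row in counts:
        total = sum(row)
        # Normalize each row so that it sums to one.
        matrix.append([c / total if total > 0 else 0.0 for c in row])
    return matrix


replica_state_indices = [
    [0, 1, 1, 0],  # replica 0
    [1, 0, 0, 1],  # replica 1
]
t_ij = empirical_transition_matrix(replica_state_indices, n_states=2)
for row in t_ij:
    print(["%.3f" % p for p in row])
```

Off-diagonal entries close to zero would indicate that replicas rarely move between the corresponding pair of states.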

get_effective_energy_timeseries(energies=None, replica_state_indices=None)¶

Generate the effective energy (negative log deviance) timeseries for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:

energies : ndarray of shape (K,L,N), optional, Default: None: Energies from replicas K, sampled states L, and iterations N. If provided, then replica_state_indices must also be provided.
replica_state_indices : ndarray of shape (K,N), optional, Default: None: Integer indices of the sampled state (matching the L dimension in energies) that each replica K sampled at every iteration N. If provided, then energies must also be provided.

Returns:

u_n : ndarray of shape (N,): u_n[n] is the negative log deviance of the sample from iteration n. Timeseries used to determine equilibration time and statistical inefficiency.
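For a replica-exchange ensemble, the joint density of all replicas is proportional to the exponential of minus the sum of each replica's reduced potential at its current state, so u_n reduces to that sum up to the constant c. The sketch below shows this with nested lists; the function name is hypothetical, and samplers that apply per-state log weights would need an extra term that is not covered here.

```python
def effective_energy_timeseries(energies, replica_state_indices):
    """Sum, per iteration, each replica's reduced potential at its sampled state.

    energies[k][l][n]: reduced potential of replica k at state l, iteration n.
    replica_state_indices[k][n]: state sampled by replica k at iteration n.
    """
    n_replicas = len(energies)
    n_iterations = len(replica_state_indices[0])
    u_n = []
    for n in range(n_iterations):
        total = 0.0
        for k in range(n_replicas):
            state = replica_state_indices[k][n]
            total += energies[k][state][n]
        u_n.append(total)
    return u_n


energies = [
    [[1.0, 2.0], [10.0, 20.0]],  # replica 0: states 0 and 1, two iterations
    [[3.0, 4.0], [30.0, 40.0]],  # replica 1
]
replica_state_indices = [[0, 1], [1, 0]]
u_n = effective_energy_timeseries(energies, replica_state_indices)
print(u_n)
```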

get_enthalpy()¶

Compute the difference in enthalpy and error in that estimate from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in enthalpy from each state relative to each other state.
dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in enthalpy from each state relative to each other state.

get_entropy()¶

Compute the difference in entropy and error in that estimate from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in entropy from each state relative to each other state.
dDeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in entropy from each state relative to each other state.

get_free_energy()¶

Compute the free energy and error in free energy from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in free energy from each state relative to each other state.
dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in free energy from each state relative to each other state.

has_log_weights¶: Return True if the storage has log weights, False otherwise

kT¶

Quantity of the Boltzmann constant times the temperature of the phase, in units of energy per mole.

Allows conversion between dimensionless energy and unit bearing energy
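As a worked example of that conversion, the sketch below multiplies a dimensionless (reduced) energy by kT expressed in kJ/mol. It uses plain floats and a hard-coded molar gas constant instead of unit-bearing quantities; the helper names are hypothetical.

```python
# Molar gas constant R = k_B * N_A in kJ/(mol K), rounded to ten significant figures.
MOLAR_GAS_CONSTANT = 8.314462618e-3


def kT_in_kj_per_mol(temperature_kelvin):
    """Return kT per mole of particles, in kJ/mol."""
    return MOLAR_GAS_CONSTANT * temperature_kelvin


def to_kj_per_mol(reduced_energy, temperature_kelvin):
    """Convert a dimensionless energy (in units of kT) to kJ/mol."""
    return reduced_energy * kT_in_kj_per_mol(temperature_kelvin)


kT = kT_in_kj_per_mol(298.0)
delta_f = to_kj_per_mol(-5.0, 298.0)  # a free energy of -5 kT at 298 K
print(f"kT = {kT:.4f} kJ/mol, DeltaF = {delta_f:.3f} kJ/mol")
```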

n_equilibration_iterations¶: int: The number of equilibration iterations.

n_iterations¶: int: The total number of iterations of the phase.

n_replicas¶: int: Number of replicas.

n_states¶: int: Number of sampled thermodynamic states.

name¶: User-readable string name of the phase

observables¶: List of observables that the instanced analyzer can compute/fetch.

read_energies()¶

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:

sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]: Potential energy matrix of the sampled states.
unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsampled_states, n_iterations]: Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).
neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]: Neighborhoods in which energies were computed, given as a boolean mask over the energy_matrix.
replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]: States sampled by the replicas in the energy_matrix

read_logZ(iteration=None)¶

Extract logZ estimates from the ncfile, if present. Raises ValueError if not present.

Parameters:

iteration : int or slice, optional, default=None: If specified, iteration or slice of iterations to extract.

Returns:

logZ : np.ndarray of shape [n_states, n_iterations]: logZ[l,n] is the online logZ estimate for state `l` at iteration `n`.

read_log_weights()¶

Extract log weights from the ncfile, if present. Raises ValueError if not present.

Returns:

log_weights : np.ndarray of shape [n_states, n_iterations]: log_weights[l,n] is the log weight applied to state `l` during the collection of samples at iteration `n`.

reference_states¶: Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension indexes the samplers, that the second dimension indexes the thermodynamic states at which energies were evaluated, and that an equal number of samples was drawn from each k’th sampler, UNLESS n_k is specified.

Parameters:

u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers, L = number of thermodynamic states, N’ = number of iterations from sampler k.

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn. This allows you to have trailing entries on a given k’th row along the N’ index which do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k.

WARNING: N_k is the number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:

u_ln : np.ndarray of shape (L, N): Reduced, non-sparse data format, where L = number of thermodynamic states and N = sum_k N_k. Note that this is not N’.

reporter¶: Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)¶

Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger

Parameters:

cutoff : float, optional, default=0.05: Only transition probabilities above cutoff will be printed.
number_equilibrated : int, optional, default=None: If specified, only samples number_equilibrated:end will be used in the analysis. If not specified, the best estimate from the internally held statistics is used.

statistical_inefficiency¶: float: The statistical inefficiency of the sampler.

use_online_data¶: Get the online data flag

SAMSSampler¶

Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.

This implementation uses stochastic approximation to allow one or more replicas to sample the whole range of thermodynamic states for rapid online computation of free energies.

COPYRIGHT

Written by John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

LICENSE

This code is licensed under the latest available version of the MIT License.

class yank.multistate.sams.SAMSSampler(number_of_iterations=1, log_target_probabilities=None, state_update_scheme='global-jump', locality=5, update_stages='two-stage', flatness_threshold=0.2, weight_update_method='rao-blackwellized', adapt_target_probabilities=False, gamma0=1.0, logZ_guess=None, **kwargs)[source]¶

Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.

This class provides a facility for self-adjusted mixture sampling simulations. One or more replicas use the method of expanded ensembles [1] to sample multiple thermodynamic states within each replica, with log weights for each thermodynamic state adapted on the fly [2] to achieve the desired target probabilities for each state.

See also

ReplicaExchangeSampler

References

[1] Lyubartsev AP, Martsinovski AA, Shevkunov SV, and Vorontsov-Velyaminov PN. New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles. JCP 96:1776, 1992 http://dx.doi.org/10.1063/1.462133

[2] Tan, Z. Optimally adjusted mixture sampling and locally weighted histogram analysis, Journal of Computational and Graphical Statistics 26:54, 2017. http://dx.doi.org/10.1080/10618600.2015.1113975

Examples

SAMS simulation of alanine dipeptide in vacuum at different temperatures.

Create the system:

>>> import math
>>> import os
>>> import tempfile
>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> from yank.multistate import SAMSSampler, MultiStateReporter
>>> testsystem = testsystems.AlanineDipeptideVacuum()

Create thermodynamic states with an exponentially-spaced temperature schedule:

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(n_replicas-1)) - 1.0) / (math.e - 1.0)
...                 for i in range(n_replicas)]
>>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T)
...                         for T in temperatures]

Initialize simulation object with options. Run with a GHMC integrator:

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = SAMSSampler(mcmc_moves=move, number_of_iterations=2,
...                          state_update_scheme='restricted-range', locality=5,
...                          update_stages='two-stage', flatness_threshold=0.2,
...                          weight_update_method='rao-blackwellized',
...                          adapt_target_probabilities=False)

Create a single-replica SAMS simulation bound to a storage file and run:

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1)
>>> simulation.create(thermodynamic_states=thermodynamic_states,
...                   sampler_states=[states.SamplerState(testsystem.positions)],
...                   storage=reporter)
>>> simulation.run()  # This runs for a maximum of 2 iterations.
>>> simulation.iteration
2
>>> simulation.run(n_iterations=1)
>>> simulation.iteration
2

Resume a simulation from an existing storage file and extend it beyond the original number of iterations:

>>> del simulation
>>> simulation = SAMSSampler.from_storage(reporter)
>>> simulation.extend(n_iterations=1)
>>> simulation.iteration
3

You can extract several pieces of information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every iteration run so far.

>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1)
>>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4))
>>> len(sampler_states)
3
>>> sampler_states[-1].positions.shape  # Alanine dipeptide has 22 atoms.
(22, 3)

Clean up.

>>> os.remove(storage_path)

Attributes:

log_target_probabilities : array-like: log_target_probabilities[state_index] is the log target probability for state state_index
state_update_scheme : str: Thermodynamic state sampling scheme. One of [‘global-jump’, ‘local-jump’, ‘restricted-range’]
locality : int: Number of neighboring states on either side to consider for local update schemes
update_stages : str: Number of stages to use for update. One of [‘one-stage’, ‘two-stage’]
weight_update_method : str: Method to use for updating log weights in SAMS. One of [‘optimal’, ‘rao-blackwellized’]
adapt_target_probabilities : bool: If True, target probabilities will be adapted to achieve minimal thermodynamic length between terminal thermodynamic states.
gamma0 : float, optional, default=1.0: Initial weight adaptation rate.
logZ_guess : array-like of shape [n_states] of floats, optional, default=None: Initial guess for logZ for all states, if available.
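To give a feel for how log_target_probabilities, gamma0, and the logZ estimates interact, the sketch below performs a single stochastic-approximation step in the spirit of reference [2]. It is not the library's code: the function name, the exact update rule, and the choice of the first state as the zero of logZ are assumptions made for illustration.

```python
import math


def sams_weight_update(log_z, log_target_probabilities, state_probabilities, gamma):
    """One stochastic-approximation step for the logZ estimates (illustrative)."""
    updated = [
        z + gamma * p / math.exp(log_pi)
        for z, p, log_pi in zip(log_z, state_probabilities, log_target_probabilities)
    ]
    # Fix the gauge so that the first state always has logZ = 0.
    return [z - updated[0] for z in updated]


n_states = 3
log_target = [-math.log(n_states)] * n_states  # uniform target probabilities
log_z = [0.0] * n_states

# Indicator vector of the visited state (here state 2).
visited = [0.0, 0.0, 1.0]
log_z = sams_weight_update(log_z, log_target, visited, gamma=0.1)
print(log_z)
```

Passing an indicator vector for the visited state gives a binary update, while passing the conditional probabilities of all states given the current configuration gives a Rao-Blackwellized variant. How these map onto the weight_update_method options is an assumption.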

class Status(iteration, target_error, is_completed)¶

count(value) → integer -- return number of occurrences of value¶

index(value[, start[, stop]]) → integer -- return first index of value.¶: Raises ValueError if the value is not present.

is_completed¶: Alias for field number 2

iteration¶: Alias for field number 0

target_error¶: Alias for field number 1

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)¶

Create new multistate sampler simulation.

Parameters:

thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional, default: None

Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution.

If len(thermodynamic_states) > len(sampler_states): the first and last states are assigned first, and the remaining sampler_states are spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then only the first thermodynamic_state will be chosen.

If len(thermodynamic_states) < len(sampler_states): each thermodynamic_state receives an equal number of sampler_states until there are too few sampler_states remaining to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.

unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energies can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.

default_options()¶: dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)¶

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:

n_iterations : int: Number of equilibration iterations.
mcmc_moves : MCMCMove or list of MCMCMove, optional: Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)¶

Extend the simulation by the given number of iterations.

Unlike run(), this will extend the number of iterations past number_of_iterations if requested.

Parameters:

n_iterations : int: The number of iterations to run.

from_storage(storage)¶

Constructor from an existing storage file.

Parameters:

storage : str or Reporter: If str: the path to the storage file. If Reporter: uses the Reporter options. In the future this will be able to take a Storage class as well.

Returns:

sampler : MultiStateSampler: A new instance of MultiStateSampler (or subclass) in the same state as the last stored iteration.

is_completed¶: Check if we have reached any of the stop target criteria (read-only)

is_periodic¶: Return True if system is periodic, False if not, and None if not initialized

iteration¶

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves¶

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata¶: A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:

tolerance : simtk.unit.Quantity, optional: Minimization tolerance (units of energy/mole/length, default is `1.0 * unit.kilojoules_per_mole / unit.nanometers`).
max_iterations : int, optional: Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas¶: The integer number of replicas (read-only).

n_states¶: The integer number of thermodynamic states (read-only).

options¶: dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)¶

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:

storage : str or Reporter: The path to the storage file or the reporter object.

Returns:

status : ReplicaExchange.Status: The status of the replica-exchange calculation. It has three fields: `iteration`, `target_error`, and `is_completed`.

run(n_iterations=None)¶

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:

n_iterations : int, optional: If specified, at most this number of iterations will be run (default is None).

sampler_states¶

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.sams.SAMSAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶

The SAMSAnalyzer is the analyzer for a simulation generated from a SAMSSampler simulation.

See also

ReplicaExchangeAnalyzer, PhaseAnalyzer

clear()¶

Reset all cached objects.

This must be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:

number_equilibrated : int, optional, default=None: If specified, only samples `number_equilibrated:end` will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:

mixing_statistics : namedtuple: A namedtuple containing the following attributes:

- `transition_matrix`: (nstates by nstates `np.array`)
- `eigenvalues`: (nstates-dimensional `np.array`)
- `statistical_inefficiency`: float

get_effective_energy_timeseries(energies=None, replica_state_indices=None)¶

Generate the effective energy (negative log deviance) timeseries for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:

energies : ndarray of shape (K,L,N), optional, Default: None: Energies from replicas K, sampled states L, and iterations N. If provided, then replica_state_indices must also be provided.
replica_state_indices : ndarray of shape (K,N), optional, Default: None: Integer indices of the sampled state (matching the L dimension in energies) that each replica K sampled at every iteration N. If provided, then energies must also be provided.

Returns:

u_n : ndarray of shape (N,): u_n[n] is the negative log deviance of the sample from iteration n. Timeseries used to determine equilibration time and statistical inefficiency.

get_enthalpy()¶

Compute the difference in enthalpy and error in that estimate from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in enthalpy from each state relative to each other state.
dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in enthalpy from each state relative to each other state.

get_entropy()¶

Compute the difference in entropy and error in that estimate from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in entropy from each state relative to each other state.
dDeltaS_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in entropy from each state relative to each other state.

get_free_energy()¶

Compute the free energy and error in free energy from the MBAR object

The output shape depends on whether unsampled states are detected in the sampler.

Returns:

DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Difference in free energy from each state relative to each other state.
dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2): Error in the difference in free energy from each state relative to each other state.

has_log_weights¶: Return True if the storage has log weights, False otherwise

kT¶

Quantity of the Boltzmann constant times the temperature of the phase, in units of energy per mole.

Allows conversion between dimensionless energy and unit bearing energy

n_equilibration_iterations¶: int: The number of equilibration iterations.

n_iterations¶: int: The total number of iterations of the phase.

n_replicas¶: int: Number of replicas.

n_states¶: int: Number of sampled thermodynamic states.

name¶: User-readable string name of the phase

observables¶: List of observables that the instanced analyzer can compute/fetch.

read_energies()¶

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:

sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]: Potential energy matrix of the sampled states.
unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsampled_states, n_iterations]: Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).
neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]: Neighborhoods in which energies were computed, given as a boolean mask over the energy_matrix.
replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]: States sampled by the replicas in the energy_matrix
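To make the array layout concrete, the sketch below uses synthetic arrays with the shapes listed above and looks up, for every replica and iteration, the energy evaluated at the state that replica occupied. The random data are placeholders and nothing is read from an actual ncfile.

```python
import numpy as np

n_replicas, n_states, n_iterations = 3, 3, 5
rng = np.random.default_rng(0)

# Synthetic stand-ins for two of the arrays returned by read_energies().
sampled_energy_matrix = rng.normal(size=(n_replicas, n_states, n_iterations))
replica_state_indices = rng.integers(0, n_states, size=(n_replicas, n_iterations))

# For each (replica, iteration), select the energy at the occupied state:
# own_state_energy[k, n] == sampled_energy_matrix[k, replica_state_indices[k, n], n]
own_state_energy = np.take_along_axis(
    sampled_energy_matrix,
    replica_state_indices[:, np.newaxis, :],
    axis=1,
)[:, 0, :]

print(own_state_energy.shape)
```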

read_logZ(iteration=None)¶

Extract logZ estimates from the ncfile, if present. Raises ValueError if not present.

Parameters:

iteration : int or slice, optional, default=None: If specified, iteration or slice of iterations to extract.

Returns:

logZ : np.ndarray of shape [n_states, n_iterations]: logZ[l,n] is the online logZ estimate for state `l` at iteration `n`.

read_log_weights()¶

Extract log weights from the ncfile, if present. Raises ValueError if not present.

Returns:

log_weights : np.ndarray of shape [n_states, n_iterations]: log_weights[l,n] is the log weight applied to state `l` during the collection of samples at iteration `n`.

reference_states¶: Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension indexes the samplers, that the second dimension indexes the thermodynamic states at which energies were evaluated, and that every k’th sampler drew an equal number of samples, UNLESS n_k is specified.

Parameters:

u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers, L = number of thermodynamic states, N’ = number of iterations from sampler k

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn. This allows trailing entries along the N’ index of a given k’th row that do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k

WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:

u_ln : np.ndarray of shape (L, N): Reduced, non-sparse data format. L = number of thermodynamic states, N = sum_k N_k. Note that this is not N’.
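The conversion described above can be sketched with NumPy as follows. This is an illustrative re-implementation under stated assumptions, not the method's actual source: the function name flatten_replica_energies is hypothetical, and the column ordering (all samples from sampler 0, then sampler 1, and so on) is assumed.

```python
import numpy as np


def flatten_replica_energies(u_kln, n_k=None):
    """Collapse (K, L, N') energies into (L, sum_k N_k), sampler by sampler."""
    n_samplers, n_states, n_max = u_kln.shape
    if n_k is None:
        # Every sampler is assumed to have drawn all N' samples.
        n_k = np.full(n_samplers, n_max, dtype=int)
    n_k = np.asarray(n_k, dtype=int)
    # Keep only the first n_k[k] samples of each sampler and join along columns.
    blocks = [u_kln[k, :, :n_k[k]] for k in range(n_samplers)]
    return np.concatenate(blocks, axis=1)


# Toy input: K = 2 samplers, L = 3 states, N' = 4 iterations.
u_kln = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

u_ln_full = flatten_replica_energies(u_kln)               # shape (3, 8)
u_ln_trim = flatten_replica_energies(u_kln, n_k=[4, 2])   # shape (3, 6)
print(u_ln_full.shape, u_ln_trim.shape)
```

In the second call, the last two entries of sampler 1 are treated as trailing padding and dropped, which is the case n_k exists to handle.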

reporter¶: Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)¶

Print a summary of mixing statistics. Passes information off to generate_mixing_statistics, then prints the result to the logger.

Parameters:

cutoff : float, optional, default=0.05: Only transition probabilities above ‘cutoff’ will be printed.
number_equilibrated : int, optional, default=None: If specified, only samples number_equilibrated:end will be used in analysis. If not specified, the best internally held estimate is used.
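To illustrate how the two parameters act, the sketch below estimates an empirical state-to-state transition matrix from a small, made-up array of replica state indices, discards the first number_equilibrated iterations, and blanks out entries that are not above the cutoff. Both helper functions are hypothetical and are not the algorithm used by generate_mixing_statistics.

```python
import numpy as np


def empirical_transition_matrix(state_indices, n_states, number_equilibrated=0):
    """Row-normalized transition counts from [n_replicas, n_iterations] indices."""
    trimmed = state_indices[:, number_equilibrated:]
    counts = np.zeros((n_states, n_states))
    from_states = trimmed[:, :-1].ravel()
    to_states = trimmed[:, 1:].ravel()
    # Accumulate one count per observed (from, to) pair, including repeats.
    np.add.at(counts, (from_states, to_states), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def format_above_cutoff(t_ij, cutoff=0.05):
    """Render each row, showing only probabilities above the cutoff."""
    return [
        " ".join(f"{p:6.3f}" if p > cutoff else " " * 6 for p in row)
        for row in t_ij
    ]


# Made-up trajectory: 2 replicas, 5 iterations, 3 states.
state_indices = np.array([[0, 0, 1, 1, 2],
                          [2, 1, 1, 0, 0]])
t_ij = empirical_transition_matrix(state_indices, n_states=3, number_equilibrated=1)
for line in format_above_cutoff(t_ij, cutoff=0.05):
    print(line)
```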

statistical_inefficiency¶: float: The statistical inefficiency of the sampler.

use_online_data¶: Get the online data flag