MultiState Module

This is the API for the Multi State sub-module and its classes.

MultiState

Multistate Sampling simulation algorithms, specific variants, and analyzers

This module provides a general facility for running multiple thermodynamic state multistate simulations, both general as well as derived classes for special cases such as parallel tempering (in which the states differ only in temperature).

The classes also provide

Provided classes include:

  • yank.multistate.MultiStateSampler
    Base class for general, multi-thermodynamic state parallel multistate
  • yank.multistate.ReplicaExchangeSampler
    Derived class from MultiStateSampler which allows sampled thermodynamic states to swap based on Hamiltonian Replica Exchange
  • yank.multistate.ParallelTemperingSampler
    Convenience subclass of ReplicaExchange for parallel tempering simulations (one System object, many temperatures).
  • yank.multistate.SAMSSampler
    Single-replica sampler which samples through multiple thermodynamic states on the fly.
  • yank.multistate.MultiStateReporter
    Replica Exchange reporter class to store all variables and data

Analyzers

The MultiState module also provides analysis modules to analyze simulations and compute observables from data generated under any of the MultiStateSampler’s

Extending and Subclassing

Subclassing a sampler and analyzer is done by importing and extending any of the following:

  • The base MultiStateSampler from multistatesampler
  • The base MultiStateReporter from multistatereporter
  • The base MultiStateAnalyzer or PhaseAnalyzer and base ObservablesRegistry` from multistateanalyzer

LICENSE

This code is licensed under the latest available version of the MIT License.

MultistateSampler

Base multi-thermodynamic state multistate class

COPYRIGHT

Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.

LICENSE

This code is licensed under the latest available version of the MIT License.

class yank.multistate.multistatesampler.MultiStateSampler(mcmc_moves=None, number_of_iterations=1, online_analysis_interval=200, online_analysis_target_error=0.0, online_analysis_minimum_iterations=200, locality=None)[source]

Base class for samplers that sample multiple thermodynamic states using one or more replicas.

This base class provides a general simulation facility for multistate from multiple thermodynamic states, allowing any set of thermodynamic states to be specified. If instantiated on its own, the thermodynamic state indices associated with each state are specified and replica mixing does not change any thermodynamic states, meaning that each replica remains in its original thermodynamic state.

Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.

Parameters:
mcmc_moves : MCMCMove or list of MCMCMove, optional

The MCMCMove used to propagate the thermodynamic states. If a list of MCMCMoves, they will be assigned to the correspondent thermodynamic state on creation. If None is provided, Langevin dynamics with 2fm timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.

number_of_iterations : int or infinity, optional, default: 1

The number of iterations to perform. Both float('inf') and numpy.inf are accepted for infinity. If you set this to infinity, be sure to set also online_analysis_interval.

online_analysis_interval : None or Int >= 1, optional, default: 200

Choose the interval at which to perform online analysis of the free energy.

After every interval, the simulation will be stopped and the free energy estimated.

If the error in the free energy estimate is at or below online_analysis_target_error, then the simulation will be considered completed.

If set to None, then no online analysis is performed

online_analysis_target_error : float >= 0, optional, default 0.0

The target error for the online analysis measured in kT per phase.

Once the free energy is at or below this value, the phase will be considered complete.

If online_analysis_interval is None, this option does nothing.

Default is set to 0.0 since online analysis runs by default, but a finite number_of_iterations should also be set to ensure there is some stop condition. If target error is 0 and an infinite number of iterations is set, then the sampler will run until the user stop it manually.

online_analysis_minimum_iterations : int >= 0, optional, default 200

Set the minimum number of iterations which must pass before online analysis is carried out.

Since the initial samples likely not to yield a good estimate of free energy, save time and just skip them If online_analysis_interval is None, this does nothing

locality : int > 0, optional, default None

If None, the energies at all states will be computed for every replica each iteration. If int > 0, energies will only be computed for states range(max(0, state-locality), min(n_states, state+locality)).

Attributes:
n_replicas

The integer number of replicas (read-only).

n_states

The integer number of thermodynamic states (read-only).

iteration

The integer current iteration of the simulation (read-only).

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

sampler_states

A copy of the sampler states list at the current iteration.

metadata

A copy of the metadata dictionary passed on creation (read-only).

is_completed

Check if we have reached any of the stop target criteria (read-only)

:param number_of_iterations: Maximum number of integer iterations that will be run
:param online_analysis_interval: How frequently to carry out online analysis in number of iterations
:param online_analysis_target_error: Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
:param online_analysis_minimum_iterations: Minimum number of iterations needed before online analysis is run as int
classmethod from_storage(storage)[source]

Constructor from an existing storage file.

Parameters:
storage : str or Reporter

If str: The path to the storage file. If Reporter: uses the Reporter options In the future this will be able to take a Storage class as well.

Returns:
sampler : MultiStateSampler

A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.

class Status(iteration, target_error, is_completed)
count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

is_completed

Alias for field number 2

iteration

Alias for field number 0

target_error

Alias for field number 1

classmethod read_status(storage)[source]

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:
storage : str or Reporter

The path to the storage file or the reporter object.

Returns:
status : ReplicaExchange.Status

The status of the replica-exchange calculation. It has three fields: iteration, target_error, and is_completed.

n_states

The integer number of thermodynamic states (read-only).

n_replicas

The integer number of replicas (read-only).

iteration

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

sampler_states

A copy of the sampler states list at the current iteration.

This can be set only before running.

is_periodic

Return True if system is periodic, False if not, and None if not initialized

online_analysis_interval

interval to carry out online analysis

metadata

A copy of the metadata dictionary passed on creation (read-only).

is_completed

Check if we have reached any of the stop target criteria (read-only)

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)[source]

Create new multistate sampler simulation.

Parameters:
thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,

default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

  • If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution
  • If len(thermodynamic_states) > len(sampler_states): First and last state distributed first remaining sampler_states spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then the only first thermodynamic_state will be chosen
  • If len(thermodynamic_states) < len(sampler_states), each thermodynamic_state receives an equal number of sampler_states until there are insufficient number of sampler_states remaining to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.
unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)[source]

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:
tolerance : simtk.unit.Quantity, optional

Minimization tolerance (units of energy/mole/length, default is 1.0 * unit.kilojoules_per_mole / unit.nanometers).

max_iterations : int, optional

Maximum number of iterations for minimization. If 0, minimization continues until converged.

equilibrate(n_iterations, mcmc_moves=None)[source]

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:
n_iterations : int

Number of equilibration iterations.

mcmc_moves : MCMCMove or list of MCMCMove, optional

Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

run(n_iterations=None)[source]

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:
n_iterations : int, optional

If specified, only at most the specified number of iterations will be run (default is None).

extend(n_iterations)[source]

Extend the simulation by the given number of iterations.

Contrarily to run(), this will extend the number of iterations past number_of_iteration if requested.

Parameters:
n_iterations : int

The number of iterations to run.

classmethod default_options()[source]

dict of all default class options (keyword arguments for __init__ for class and superclasses)

options

dict of all class options (keyword arguments for __init__ for class and superclasses)

ReplicaExchangeSampler

Derived multi-thermodynamic state multistate class with exchanging configurations between replicas

COPYRIGHT

Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.

LICENSE

This code is licensed under the latest available version of the MIT License.

class yank.multistate.replicaexchange.ReplicaExchangeSampler(replica_mixing_scheme='swap-all', **kwargs)[source]

Replica-exchange simulation facility.

This MultiStateSampler class provides a general replica-exchange simulation facility, allowing any set of thermodynamic states to be specified, along with a set of initial positions to be assigned to the replicas in a round-robin fashion.

No distinction is made between one-dimensional and multidimensional replica layout. By default, the replica mixing scheme attempts to mix all replicas to minimize slow diffusion normally found in multidimensional replica exchange simulations (Modification of the ‘replica_mixing_scheme’ setting will allow the traditional ‘neighbor swaps only’ scheme to be used.)

Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.

Parameters:
mcmc_moves : MCMCMove or list of MCMCMove, optional

The MCMCMove used to propagate the states. If a list of MCMCMoves, they will be assigned to the correspondent thermodynamic state on creation. If None is provided, Langevin dynamics with 2fm timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.

number_of_iterations : int or infinity, optional, default: 1

The number of iterations to perform. Both float('inf') and numpy.inf are accepted for infinity. If you set this to infinity, be sure to set also online_analysis_interval.

replica_mixing_scheme : ‘swap-all’, ‘swap-neighbors’ or None, Default: ‘swap-all’

The scheme used to swap thermodynamic states between replicas.

online_analysis_interval : None or Int >= 1, optional, default None

Choose the interval at which to perform online analysis of the free energy.

After every interval, the simulation will be stopped and the free energy estimated.

If the error in the free energy estimate is at or below online_analysis_target_error, then the simulation will be considered completed.

online_analysis_target_error : float >= 0, optional, default 0.2

The target error for the online analysis measured in kT per phase.

Once the free energy is at or below this value, the phase will be considered complete.

If online_analysis_interval is None, this option does nothing.

online_analysis_minimum_iterations : int >= 0, optional, default 50

Set the minimum number of iterations which must pass before online analysis is carried out.

Since the initial samples likely not to yield a good estimate of free energy, save time and just skip them If online_analysis_interval is None, this does nothing

Examples

Parallel tempering simulation of alanine dipeptide in implicit solvent (replica exchange among temperatures). This is just an illustrative example; use ParallelTempering class for actual production parallel tempering simulations.

Create the system.

>>> import math
>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> testsystem = testsystems.AlanineDipeptideImplicit()

Create thermodynamic states for parallel tempering with exponentially-spaced schedule.

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(nreplicas-1)) - 1.0) / (math.e - 1.0)
...                 for i in range(n_replicas)]
>>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T)
...                         for T in temperatures]

Initialize simulation object with options. Run with a GHMC integrator.

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = ReplicaExchangeSampler(mcmc_moves=move, number_of_iterations=2)

Create simulation with its storage file (in a temporary directory) and run.

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1)
>>> simulation.create(thermodynamic_states=thermodynamic_states,
>>>                   sampler_states=states.SamplerState(testsystem.positions),
>>>                   storage=reporter)
>>> simulation.run()  # This runs for a maximum of 2 iterations.
>>> simulation.iteration
2
>>> simulation.run(n_iterations=1)
>>> simulation.iteration
2

To resume a simulation from an existing storage file and extend it beyond the original number of iterations.

>>> del simulation
>>> simulation = ReplicaExchangeSampler.from_storage(reporter)
>>> simulation.extend(n_iterations=1)
>>> simulation.iteration
3

You can extract several information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every run iteration.

>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1)
>>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4))
>>> len(sampler_states)
3
>>> sampler_states[-1].positions.shape  # Alanine dipeptide has 22 atoms.
(22, 3)

Clean up.

>>> os.remove(storage_path)
Parameters:
  • number_of_iterations – Maximum number of integer iterations that will be run
  • replica_mixing_scheme – Scheme which describes how replicas are exchanged each iteration as string
  • online_analysis_interval – How frequently to carry out online analysis in number of iterations
  • online_analysis_target_error – Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
  • online_analysis_minimum_iterations – Minimum number of iterations needed before online analysis is run as int
Attributes:
n_replicas

The integer number of replicas (read-only).

iteration

The integer current iteration of the simulation (read-only).

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

sampler_states

A copy of the sampler states list at the current iteration.

metadata

A copy of the metadata dictionary passed on creation (read-only).

is_completed

Check if we have reached any of the stop target criteria (read-only)

class Status(iteration, target_error, is_completed)
count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

is_completed

Alias for field number 2

iteration

Alias for field number 0

target_error

Alias for field number 1

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)

Create new multistate sampler simulation.

Parameters:
thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,

default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

  • If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution
  • If len(thermodynamic_states) > len(sampler_states): First and last state distributed first remaining sampler_states spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then the only first thermodynamic_state will be chosen
  • If len(thermodynamic_states) < len(sampler_states), each thermodynamic_state receives an equal number of sampler_states until there are insufficient number of sampler_states remaining to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.
unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.

default_options()

dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:
n_iterations : int

Number of equilibration iterations.

mcmc_moves : MCMCMove or list of MCMCMove, optional

Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)

Extend the simulation by the given number of iterations.

Contrarily to run(), this will extend the number of iterations past number_of_iteration if requested.

Parameters:
n_iterations : int

The number of iterations to run.

from_storage(storage)

Constructor from an existing storage file.

Parameters:
storage : str or Reporter

If str: The path to the storage file. If Reporter: uses the Reporter options In the future this will be able to take a Storage class as well.

Returns:
sampler : MultiStateSampler

A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.

is_completed

Check if we have reached any of the stop target criteria (read-only)

is_periodic

Return True if system is periodic, False if not, and None if not initialized

iteration

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata

A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:
tolerance : simtk.unit.Quantity, optional

Minimization tolerance (units of energy/mole/length, default is 1.0 * unit.kilojoules_per_mole / unit.nanometers).

max_iterations : int, optional

Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas

The integer number of replicas (read-only).

n_states

The integer number of thermodynamic states (read-only).

options

dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:
storage : str or Reporter

The path to the storage file or the reporter object.

Returns:
status : ReplicaExchange.Status

The status of the replica-exchange calculation. It has three fields: iteration, target_error, and is_completed.

run(n_iterations=None)

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:
n_iterations : int, optional

If specified, only at most the specified number of iterations will be run (default is None).

sampler_states

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.replicaexchange.ReplicaExchangeAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]

The ReplicaExchangeAnalyzer is the analyzer for a simulation generated from a Replica Exchange sampler simulation, implemented as an instance of the MultiStateSamplerAnalyzer.

See also

PhaseAnalyzer, MultiStateSamplerAnalyzer

clear()

Reset all cached objects.

This must to be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:
number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:
mixing_statistics : namedtuple

A namedtuple containing the following attributes: - transition_matrix: (nstates by nstates np.array) - eigenvalues: (nstates-dimensional np.array) - statistical_inefficiency: float

get_effective_energy_timeseries(energies=None, replica_state_indices=None)

Generate the effective energy (negative log deviance) timeseries that is generated for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:
energies : ndarray of shape (K,L,N), optional, Default: None

Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.

replica_state_indices : ndarray of shape (K,N), optional, Default: None

Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.

Returns:
u_n : ndarray of shape (N,)

u_n[n] is the negative log deviance of the same from iteration n Timeseries used to determine equilibration time and statistical inefficiency.

get_enthalpy()

Compute the difference in enthalpy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_entropy()

Compute the difference in entropy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_free_energy()

Compute the free energy and error in free energy from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in free energy from each state relative to each other state

dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in free energy from each state relative to each other state

has_log_weights

Return True if the storage has log weights, False otherwise

kT

Quantity of boltzmann constant times temperature of the phase in units of energy per mol

Allows conversion between dimensionless energy and unit bearing energy

n_equilibration_iterations

int: The number of equilibration interations.

n_iterations

int: The total number of iterations of the phase.

n_replicas

int: Number of replicas.

n_states

int: Number of sampled thermodynamic states.

name

User-readable string name of the phase

observables

List of observables that the instanced analyzer can compute/fetch.

read_energies()

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:
sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]

Potential energy matrix of the sampled states.

unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]

Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).

neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]

Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.

replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]

States sampled by the replicas in the energy_matrix

read_logZ(iteration=None)

Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.

Parameters:
iteration : int or slice, optional, default=None

If specified, iteration or slice of iterations to extract

Returns:
logZ : np.ndarray of shape [n_states, n_iterations]

logZ[l,n] is the online logZ estimate for state l at iteration n

read_log_weights()

Extract log weights from the ncfile, if present. Returns ValueError if not present.

Returns:
log_weights : np.ndarray of shape [n_states, n_iterations]

log_weights[l,n] is the log weight applied to state l during the collection of samples at iteration n

reference_states

Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.

Parameters:
u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k

WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:
u_ln : np.ndarray of shape (L, N)

Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’

reporter

Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)

Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger

Parameters:
cutoff : float, optional, default=0.05

Only transition probabilities above ‘cutoff’ will be printed

number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best

statistical_inefficiency

float: The statistical inefficiency of the sampler.

use_online_data

Get the online data flag

ParallelTemperingSampler

Derived multi-thermodynamic state multistate class with exchanging configurations between replicas of different temperatures. This is a special case which accepts a single thermodynamic_state and different temperatures to sample. If you want different temperatures and Hamiltonians, use ReplicaExchangeSampler with temperatures pre-set.

COPYRIGHT

Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.

LICENSE

This code is licensed under the latest available version of the MIT License.

class yank.multistate.paralleltempering.ParallelTemperingSampler(replica_mixing_scheme='swap-all', **kwargs)[source]

Parallel tempering simulation facility.

This class provides a facility for parallel tempering simulations. It is a subclass of ReplicaExchange, but provides efficiency improvements for parallel tempering simulations, so should be preferred for this type of simulation. In particular, this makes use of the fact that the reduced potentials are linear in inverse temperature.

See also

MultiStateSampler, ReplicaExchangeSampler

Examples

Create the system.

>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> import tempfile
>>> testsystem = testsystems.AlanineDipeptideImplicit()

Create thermodynamic states for parallel tempering with exponentially-spaced schedule.

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> reference_state = states.ThermodynamicState(system=testsystem.system, temperature=T_min)

Initialize simulation object with options. Run with a GHMC integrator.

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = ParallelTemperingSampler(mcmc_moves=move, number_of_iterations=2)

Create simulation with its storage file (in a temporary directory) and run.

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=10)
>>> simulation.create(reference_state,
...                   states.SamplerState(testsystem.positions),
...                   reporter, min_temperature=T_min,
...                   max_temperature=T_max, n_temperatures=n_replicas)
>>> simulation.run(n_iterations=1)

Clean up.

>>> os.remove(storage_path)
create(thermodynamic_state, sampler_states: list, storage, min_temperature=None, max_temperature=None, n_temperatures=None, temperatures=None, **kwargs)[source]

Initialize a parallel tempering simulation object.

Parameters:
thermodynamic_state : openmmtools.states.ThermodynamicState

Reference thermodynamic state that will be simulated at the given temperatures.

WARNING: This is a SINGLE state, not a list of states!

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. If a list of SamplerStates, they will be assigned to replicas in a round-robin fashion.

storage : str or Reporter

If str: path to the storage file, checkpoint options are default If Reporter: Instanced Reporter class, checkpoint information is read from In the future this will be able to take a Storage class as well.

min_temperature : simtk.unit.Quantity, optional

Minimum temperature (units of temperature, default is None).

max_temperature : simtk.unit.Quantity, optional

Maximum temperature (units of temperature, default is None).

n_temperatures : int, optional

Number of exponentially-spaced temperatures between min_temperature and max_temperature (default is None).

temperatures : list of simtk.unit.Quantity, optional

If specified, this list of temperatures will be used instead of min_temperature, max_temperature, and n_temperatures (units of temperature, default is None).

metadata : dict, optional

Simulation metadata to be stored in the file.

Notes

Either (min_temperature, max_temperature, n_temperatures) must all be specified or the list of ‘temperatures‘ must be specified.

class Status(iteration, target_error, is_completed)
count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

is_completed

Alias for field number 2

iteration

Alias for field number 0

target_error

Alias for field number 1

default_options()

dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:
n_iterations : int

Number of equilibration iterations.

mcmc_moves : MCMCMove or list of MCMCMove, optional

Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)

Extend the simulation by the given number of iterations.

Contrarily to run(), this will extend the number of iterations past number_of_iteration if requested.

Parameters:
n_iterations : int

The number of iterations to run.

from_storage(storage)

Constructor from an existing storage file.

Parameters:
storage : str or Reporter

If str: The path to the storage file. If Reporter: uses the Reporter options In the future this will be able to take a Storage class as well.

Returns:
sampler : MultiStateSampler

A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.

is_completed

Check if we have reached any of the stop target criteria (read-only)

is_periodic

Return True if system is periodic, False if not, and None if not initialized

iteration

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata

A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:
tolerance : simtk.unit.Quantity, optional

Minimization tolerance (units of energy/mole/length, default is 1.0 * unit.kilojoules_per_mole / unit.nanometers).

max_iterations : int, optional

Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas

The integer number of replicas (read-only).

n_states

The integer number of thermodynamic states (read-only).

options

dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:
storage : str or Reporter

The path to the storage file or the reporter object.

Returns:
status : ReplicaExchange.Status

The status of the replica-exchange calculation. It has three fields: iteration, target_error, and is_completed.

run(n_iterations=None)

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:
n_iterations : int, optional

If specified, only at most the specified number of iterations will be run (default is None).

sampler_states

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.paralleltempering.ParallelTemperingAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]

The ParallelTemperingAnalyzer is the analyzer for a simulation generated from a Parallel Tempering sampler simulation, implemented as an instance of the ReplicaExchangeAnalyzer as the sampler is a subclass of the yank.multistate.ReplicaExchangeSampler

See also

PhaseAnalyzer, ReplicaExchangeAnalyzer

clear()

Reset all cached objects.

This must to be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:
number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:
mixing_statistics : namedtuple

A namedtuple containing the following attributes: - transition_matrix: (nstates by nstates np.array) - eigenvalues: (nstates-dimensional np.array) - statistical_inefficiency: float

get_effective_energy_timeseries(energies=None, replica_state_indices=None)

Generate the effective energy (negative log deviance) timeseries that is generated for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:
energies : ndarray of shape (K,L,N), optional, Default: None

Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.

replica_state_indices : ndarray of shape (K,N), optional, Default: None

Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.

Returns:
u_n : ndarray of shape (N,)

u_n[n] is the negative log deviance of the same from iteration n Timeseries used to determine equilibration time and statistical inefficiency.

get_enthalpy()

Compute the difference in enthalpy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_entropy()

Compute the difference in entropy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_free_energy()

Compute the free energy and error in free energy from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in free energy from each state relative to each other state

dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in free energy from each state relative to each other state

has_log_weights

Return True if the storage has log weights, False otherwise

kT

Quantity of boltzmann constant times temperature of the phase in units of energy per mol

Allows conversion between dimensionless energy and unit bearing energy

n_equilibration_iterations

int: The number of equilibration interations.

n_iterations

int: The total number of iterations of the phase.

n_replicas

int: Number of replicas.

n_states

int: Number of sampled thermodynamic states.

name

User-readable string name of the phase

observables

List of observables that the instanced analyzer can compute/fetch.

read_energies()

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:
sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]

Potential energy matrix of the sampled states.

unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]

Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).

neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]

Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.

replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]

States sampled by the replicas in the energy_matrix

read_logZ(iteration=None)

Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.

Parameters:
iteration : int or slice, optional, default=None

If specified, iteration or slice of iterations to extract

Returns:
logZ : np.ndarray of shape [n_states, n_iterations]

logZ[l,n] is the online logZ estimate for state l at iteration n

read_log_weights()

Extract log weights from the ncfile, if present. Returns ValueError if not present.

Returns:
log_weights : np.ndarray of shape [n_states, n_iterations]

log_weights[l,n] is the log weight applied to state l during the collection of samples at iteration n

reference_states

Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.

Parameters:
u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k

WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:
u_ln : np.ndarray of shape (L, N)

Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’

reporter

Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)

Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger

Parameters:
cutoff : float, optional, default=0.05

Only transition probabilities above ‘cutoff’ will be printed

number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best

statistical_inefficiency

float: The statistical inefficiency of the sampler.

use_online_data

Get the online data flag

SamsSampler

Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.

This implementation uses stochastic approximation to allow one or more replicas to sample the whole range of thermodynamic states for rapid online computation of free energies.

COPYRIGHT

Written by John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.

LICENSE

This code is licensed under the latest available version of the MIT License.

class yank.multistate.sams.SAMSSampler(number_of_iterations=1, log_target_probabilities=None, state_update_scheme='global-jump', locality=5, update_stages='two-stage', flatness_threshold=0.2, weight_update_method='rao-blackwellized', adapt_target_probabilities=False, gamma0=1.0, logZ_guess=None, **kwargs)[source]

Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.

This class provides a facility for self-adjusted mixture sampling simulations. One or more replicas use the method of expanded ensembles [1] to sample multiple thermodynamic states within each replica, with log weights for each thermodynamic state adapted on the fly [2] to achieve the desired target probabilities for each state.

See also

ReplicaExchangeSampler

References

[1] Lyubartsev AP, Martsinovski AA, Shevkunov SV, and Vorontsov-Velyaminov PN. New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles. JCP 96:1776, 1992 http://dx.doi.org/10.1063/1.462133

[2] Tan, Z. Optimally adjusted mixture sampling and locally weighted histogram analysis, Journal of Computational and Graphical Statistics 26:54, 2017. http://dx.doi.org/10.1080/10618600.2015.1113975

Examples

SAMS simulation of alanine dipeptide in implicit solvent at different temperatures.

Create the system:

>>> import math
>>> from simtk import unit
>>> from openmmtools import testsystems, states, mcmc
>>> testsystem = testsystems.AlanineDipeptideVacuum()

Create thermodynamic states for parallel tempering with exponentially-spaced schedule:

>>> n_replicas = 3  # Number of temperature replicas.
>>> T_min = 298.0 * unit.kelvin  # Minimum temperature.
>>> T_max = 600.0 * unit.kelvin  # Maximum temperature.
>>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(nreplicas-1)) - 1.0) / (math.e - 1.0)
...                 for i in range(n_replicas)]
>>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T)
...                         for T in temperatures]

Initialize simulation object with options. Run with a GHMC integrator:

>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50)
>>> simulation = SAMSSampler(mcmc_moves=move, number_of_iterations=2,
>>>                          state_update_scheme='restricted-range', locality=5,
>>>                          update_stages='two-stage', flatness_threshold=0.2,
>>>                          weight_update_method='rao-blackwellized',
>>>                          adapt_target_probabilities=False)

Create a single-replica SAMS simulation bound to a storage file and run:

>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc'
>>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1)
>>> simulation.create(thermodynamic_states=thermodynamic_states,
>>>                   sampler_states=[states.SamplerState(testsystem.positions)],
>>>                   storage=reporter)
>>> simulation.run()  # This runs for a maximum of 2 iterations.
>>> simulation.iteration
2
>>> simulation.run(n_iterations=1)
>>> simulation.iteration
2

To resume a simulation from an existing storage file and extend it beyond the original number of iterations.

>>> del simulation
>>> simulation = SAMSSampler.from_storage(reporter)
>>> simulation.extend(n_iterations=1)
>>> simulation.iteration
3

You can extract several information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every run iteration.

>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1)
>>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4))
>>> len(sampler_states)
3
>>> sampler_states[-1].positions.shape  # Alanine dipeptide has 22 atoms.
(22, 3)

Clean up.

>>> os.remove(storage_path)
Attributes:
log_target_probabilities : array-like

log_target_probabilities[state_index] is the log target probability for state state_index

state_update_scheme : str

Thermodynamic state sampling scheme. One of [‘global-jump’, ‘local-jump’, ‘restricted-range’]

locality : int

Number of neighboring states on either side to consider for local update schemes

update_stages : str

Number of stages to use for update. One of [‘one-stage’, ‘two-stage’]

weight_update_method : str

Method to use for updating log weights in SAMS. One of [‘optimal’, ‘rao-blackwellized’]

adapt_target_probabilities : bool

If True, target probabilities will be adapted to achieve minimal thermodynamic length between terminal thermodynamic states.

gamma0 : float, optional, default=0.0

Initial weight adaptation rate.

logZ_guess : array-like of shape [n_states] of floats, optional, default=None

Initial guess for logZ for all states, if available.

class Status(iteration, target_error, is_completed)
count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

is_completed

Alias for field number 2

iteration

Alias for field number 0

target_error

Alias for field number 1

create(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)

Create new multistate sampler simulation.

Parameters:
thermodynamic_states : list of openmmtools.states.ThermodynamicState

Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.

sampler_states : openmmtools.states.SamplerState or list

One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.

storage : str or instanced Reporter

If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.

initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,

default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen, sampler_states are distributed between the thermodynamic_states following these rules:

  • If len(thermodynamic_states) == len(sampler_states): 1-to-1 distribution
  • If len(thermodynamic_states) > len(sampler_states): First and last state distributed first remaining sampler_states spaced evenly by index until sampler_states are depleted. If there is only one sampler_state, then the only first thermodynamic_state will be chosen
  • If len(thermodynamic_states) < len(sampler_states), each thermodynamic_state receives an equal number of sampler_states until there are insufficient number of sampler_states remaining to give each thermodynamic_state an equal number. Then the rules from the previous point are followed.
unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None

These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).

metadata : dict, optional, default=None

Simulation metadata to be stored in the file.

default_options()

dict of all default class options (keyword arguments for __init__ for class and superclasses)

equilibrate(n_iterations, mcmc_moves=None)

Equilibrate all replicas.

This does not increase the iteration counter. The equilibrated positions are stored at the end.

Parameters:
n_iterations : int

Number of equilibration iterations.

mcmc_moves : MCMCMove or list of MCMCMove, optional

Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.

extend(n_iterations)

Extend the simulation by the given number of iterations.

Contrarily to run(), this will extend the number of iterations past number_of_iteration if requested.

Parameters:
n_iterations : int

The number of iterations to run.

from_storage(storage)

Constructor from an existing storage file.

Parameters:
storage : str or Reporter

If str: The path to the storage file. If Reporter: uses the Reporter options In the future this will be able to take a Storage class as well.

Returns:
sampler : MultiStateSampler

A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.

is_completed

Check if we have reached any of the stop target criteria (read-only)

is_periodic

Return True if system is periodic, False if not, and None if not initialized

iteration

The integer current iteration of the simulation (read-only).

If the simulation has not been created yet, this is None.

mcmc_moves

A copy of the MCMCMoves list used to propagate the simulation.

This can be set only before creation.

metadata

A copy of the metadata dictionary passed on creation (read-only).

minimize(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)

Minimize all replicas.

Minimized positions are stored at the end.

Parameters:
tolerance : simtk.unit.Quantity, optional

Minimization tolerance (units of energy/mole/length, default is 1.0 * unit.kilojoules_per_mole / unit.nanometers).

max_iterations : int, optional

Maximum number of iterations for minimization. If 0, minimization continues until converged.

n_replicas

The integer number of replicas (read-only).

n_states

The integer number of thermodynamic states (read-only).

options

dict of all class options (keyword arguments for __init__ for class and superclasses)

read_status(storage)

Read the status of the calculation from the storage file.

This class method can be used to quickly check the status of the simulation before loading the full ReplicaExchange object from disk.

Parameters:
storage : str or Reporter

The path to the storage file or the reporter object.

Returns:
status : ReplicaExchange.Status

The status of the replica-exchange calculation. It has three fields: iteration, target_error, and is_completed.

run(n_iterations=None)

Run the replica-exchange simulation.

This runs at most number_of_iterations iterations. Use extend() to pass the limit.

Parameters:
n_iterations : int, optional

If specified, only at most the specified number of iterations will be run (default is None).

sampler_states

A copy of the sampler states list at the current iteration.

This can be set only before running.

class yank.multistate.sams.SAMSAnalyzer(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]

The SAMSAnalyzer is the analyzer for a simulation generated from a SAMSSampler simulation.

See also

ReplicaExchangeAnalyzer, PhaseAnalyzer

clear()

Reset all cached objects.

This must to be called if the information in the reporter changes after analysis.

generate_mixing_statistics(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple

Compute and return replica mixing statistics.

Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.

Parameters:
number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.

Returns:
mixing_statistics : namedtuple

A namedtuple containing the following attributes: - transition_matrix: (nstates by nstates np.array) - eigenvalues: (nstates-dimensional np.array) - statistical_inefficiency: float

get_effective_energy_timeseries(energies=None, replica_state_indices=None)

Generate the effective energy (negative log deviance) timeseries that is generated for this phase.

The effective energy for a series of samples x_n, n = 1..N, is defined as

u_n = - ln pi(x_n) + c

where pi(x) is the probability density being sampled, and c is an arbitrary constant.

Parameters:
energies : ndarray of shape (K,L,N), optional, Default: None

Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.

replica_state_indices : ndarray of shape (K,N), optional, Default: None

Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.

Returns:
u_n : ndarray of shape (N,)

u_n[n] is the negative log deviance of the same from iteration n Timeseries used to determine equilibration time and statistical inefficiency.

get_enthalpy()

Compute the difference in enthalpy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_entropy()

Compute the difference in entropy and error in that estimate from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in enthalpy from each state relative to each other state

dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in enthalpy from each state relative to each other state

get_free_energy()

Compute the free energy and error in free energy from the MBAR object

Output shape changes based on if there are unsampled states detected in the sampler

Returns:
DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Difference in free energy from each state relative to each other state

dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)

Error in the difference in free energy from each state relative to each other state

has_log_weights

Return True if the storage has log weights, False otherwise

kT

Quantity of boltzmann constant times temperature of the phase in units of energy per mol

Allows conversion between dimensionless energy and unit bearing energy

n_equilibration_iterations

int: The number of equilibration interations.

n_iterations

int: The total number of iterations of the phase.

n_replicas

int: Number of replicas.

n_states

int: Number of sampled thermodynamic states.

name

User-readable string name of the phase

observables

List of observables that the instanced analyzer can compute/fetch.

read_energies()

Extract energies from the ncfile and order them by replica, state, iteration.

Returns:
sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]

Potential energy matrix of the sampled states.

unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]

Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).

neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]

Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.

replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]

States sampled by the replicas in the energy_matrix

read_logZ(iteration=None)

Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.

Parameters:
iteration : int or slice, optional, default=None

If specified, iteration or slice of iterations to extract

Returns:
logZ : np.ndarray of shape [n_states, n_iterations]

logZ[l,n] is the online logZ estimate for state l at iteration n

read_log_weights()

Extract log weights from the ncfile, if present. Returns ValueError if not present.

Returns:
log_weights : np.ndarray of shape [n_states, n_iterations]

log_weights[l,n] is the log weight applied to state l during the collection of samples at iteration n

reference_states

Tuple of reference states i and j for MultiPhaseAnalyzer instances

reformat_energies_for_mbar(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)

Convert [replica, state, iteration] data into [state, total_iteration] data

This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.

Parameters:
u_kln : np.ndarray of shape (K,L,N’)

K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k

n_k : np.ndarray of shape K or None

Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.

If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k

WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.

Returns:
u_ln : np.ndarray of shape (L, N)

Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’

reporter

Sampler Reporter tied to this object.

show_mixing_statistics(cutoff=0.05, number_equilibrated=None)

Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger

Parameters:
cutoff : float, optional, default=0.05

Only transition probabilities above ‘cutoff’ will be printed

number_equilibrated : int, optional, default=None

If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best

statistical_inefficiency

float: The statistical inefficiency of the sampler.

use_online_data

Get the online data flag