MultiState Module¶
This is the API for the Multi State sub-module and its classes.
MultiState¶
Multistate Sampling simulation algorithms, specific variants, and analyzers
This module provides a general facility for running multiple thermodynamic state multistate simulations, both general as well as derived classes for special cases such as parallel tempering (in which the states differ only in temperature).
The classes also provide
Provided classes include:
yank.multistate.MultiStateSampler
- Base class for general, multi-thermodynamic state parallel multistate
yank.multistate.ReplicaExchangeSampler
- Derived class from MultiStateSampler which allows sampled thermodynamic states to swap based on Hamiltonian Replica Exchange
yank.multistate.ParallelTemperingSampler
- Convenience subclass of ReplicaExchange for parallel tempering simulations (one System object, many temperatures).
yank.multistate.SAMSSampler
- Single-replica sampler which samples through multiple thermodynamic states on the fly.
yank.multistate.MultiStateReporter
- Replica Exchange reporter class to store all variables and data
Analyzers¶
The MultiState module also provides analysis modules to analyze simulations and compute observables from data generated under any of the MultiStateSampler’s
Extending and Subclassing¶
Subclassing a sampler and analyzer is done by importing and extending any of the following:
- The base
MultiStateSampler
frommultistatesampler
- The base
MultiStateReporter
frommultistatereporter
- The base
MultiStateAnalyzer
orPhaseAnalyzer
and base ObservablesRegistry` frommultistateanalyzer
COPYRIGHT¶
Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.
Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.
LICENSE¶
This code is licensed under the latest available version of the MIT License.
MultistateSampler¶
Base multi-thermodynamic state multistate class
COPYRIGHT
Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.
Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.
LICENSE
This code is licensed under the latest available version of the MIT License.
-
class
yank.multistate.multistatesampler.
MultiStateSampler
(mcmc_moves=None, number_of_iterations=1, online_analysis_interval=200, online_analysis_target_error=0.0, online_analysis_minimum_iterations=200, locality=None)[source]¶ Base class for samplers that sample multiple thermodynamic states using one or more replicas.
This base class provides a general simulation facility for multistate from multiple thermodynamic states, allowing any set of thermodynamic states to be specified. If instantiated on its own, the thermodynamic state indices associated with each state are specified and replica mixing does not change any thermodynamic states, meaning that each replica remains in its original thermodynamic state.
Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.
Parameters: - mcmc_moves : MCMCMove or list of MCMCMove, optional
The MCMCMove used to propagate the thermodynamic states. If a list of MCMCMoves, they will be assigned to the correspondent thermodynamic state on creation. If None is provided, Langevin dynamics with 2fm timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.
- number_of_iterations : int or infinity, optional, default: 1
The number of iterations to perform. Both
float('inf')
andnumpy.inf
are accepted for infinity. If you set this to infinity, be sure to set alsoonline_analysis_interval
.- online_analysis_interval : None or Int >= 1, optional, default: 200
Choose the interval at which to perform online analysis of the free energy.
After every interval, the simulation will be stopped and the free energy estimated.
If the error in the free energy estimate is at or below
online_analysis_target_error
, then the simulation will be considered completed.If set to
None
, then no online analysis is performed- online_analysis_target_error : float >= 0, optional, default 0.0
The target error for the online analysis measured in kT per phase.
Once the free energy is at or below this value, the phase will be considered complete.
If
online_analysis_interval
is None, this option does nothing.Default is set to 0.0 since online analysis runs by default, but a finite
number_of_iterations
should also be set to ensure there is some stop condition. If target error is 0 and an infinite number of iterations is set, then the sampler will run until the user stop it manually.- online_analysis_minimum_iterations : int >= 0, optional, default 200
Set the minimum number of iterations which must pass before online analysis is carried out.
Since the initial samples likely not to yield a good estimate of free energy, save time and just skip them If
online_analysis_interval
is None, this does nothing- locality : int > 0, optional, default None
If None, the energies at all states will be computed for every replica each iteration. If int > 0, energies will only be computed for states
range(max(0, state-locality), min(n_states, state+locality))
.
Attributes: n_replicas
The integer number of replicas (read-only).
n_states
The integer number of thermodynamic states (read-only).
iteration
The integer current iteration of the simulation (read-only).
mcmc_moves
A copy of the MCMCMoves list used to propagate the simulation.
sampler_states
A copy of the sampler states list at the current iteration.
metadata
A copy of the metadata dictionary passed on creation (read-only).
is_completed
Check if we have reached any of the stop target criteria (read-only)
- :param number_of_iterations: Maximum number of integer iterations that will be run
- :param online_analysis_interval: How frequently to carry out online analysis in number of iterations
- :param online_analysis_target_error: Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
- :param online_analysis_minimum_iterations: Minimum number of iterations needed before online analysis is run as int
-
classmethod
from_storage
(storage)[source]¶ Constructor from an existing storage file.
Parameters: - storage : str or Reporter
If str: The path to the storage file. If
Reporter
: uses theReporter
options In the future this will be able to take a Storage class as well.
Returns: - sampler : MultiStateSampler
A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.
-
class
Status
(iteration, target_error, is_completed)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
is_completed
¶ Alias for field number 2
-
iteration
¶ Alias for field number 0
-
target_error
¶ Alias for field number 1
-
-
classmethod
read_status
(storage)[source]¶ Read the status of the calculation from the storage file.
This class method can be used to quickly check the status of the simulation before loading the full
ReplicaExchange
object from disk.Parameters: - storage : str or Reporter
The path to the storage file or the reporter object.
Returns: - status : ReplicaExchange.Status
The status of the replica-exchange calculation. It has three fields:
iteration
,target_error
, andis_completed
.
-
n_states
¶ The integer number of thermodynamic states (read-only).
-
n_replicas
¶ The integer number of replicas (read-only).
-
iteration
¶ The integer current iteration of the simulation (read-only).
If the simulation has not been created yet, this is None.
-
mcmc_moves
¶ A copy of the MCMCMoves list used to propagate the simulation.
This can be set only before creation.
-
sampler_states
¶ A copy of the sampler states list at the current iteration.
This can be set only before running.
-
is_periodic
¶ Return True if system is periodic, False if not, and None if not initialized
-
online_analysis_interval
¶ interval to carry out online analysis
-
metadata
¶ A copy of the metadata dictionary passed on creation (read-only).
-
is_completed
¶ Check if we have reached any of the stop target criteria (read-only)
-
create
(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)[source]¶ Create new multistate sampler simulation.
Parameters: - thermodynamic_states : list of openmmtools.states.ThermodynamicState
Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.
- sampler_states : openmmtools.states.SamplerState or list
One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.
- storage : str or instanced Reporter
If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.
- initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,
default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen,
sampler_states
are distributed between thethermodynamic_states
following these rules:- If
len(thermodynamic_states) == len(sampler_states)
: 1-to-1 distribution - If
len(thermodynamic_states) > len(sampler_states)
: First and last state distributed first remainingsampler_states
spaced evenly by index untilsampler_states
are depleted. If there is only onesampler_state
, then the only firstthermodynamic_state
will be chosen - If
len(thermodynamic_states) < len(sampler_states)
, eachthermodynamic_state
receives an equal number ofsampler_states
until there are insufficient number ofsampler_states
remaining to give eachthermodynamic_state
an equal number. Then the rules from the previous point are followed.
- If
- unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None
These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).
- metadata : dict, optional, default=None
Simulation metadata to be stored in the file.
-
minimize
(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)[source]¶ Minimize all replicas.
Minimized positions are stored at the end.
Parameters: - tolerance : simtk.unit.Quantity, optional
Minimization tolerance (units of energy/mole/length, default is
1.0 * unit.kilojoules_per_mole / unit.nanometers
).- max_iterations : int, optional
Maximum number of iterations for minimization. If 0, minimization continues until converged.
-
equilibrate
(n_iterations, mcmc_moves=None)[source]¶ Equilibrate all replicas.
This does not increase the iteration counter. The equilibrated positions are stored at the end.
Parameters: - n_iterations : int
Number of equilibration iterations.
- mcmc_moves : MCMCMove or list of MCMCMove, optional
Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.
-
run
(n_iterations=None)[source]¶ Run the replica-exchange simulation.
This runs at most
number_of_iterations
iterations. Useextend()
to pass the limit.Parameters: - n_iterations : int, optional
If specified, only at most the specified number of iterations will be run (default is None).
-
extend
(n_iterations)[source]¶ Extend the simulation by the given number of iterations.
Contrarily to
run()
, this will extend the number of iterations pastnumber_of_iteration
if requested.Parameters: - n_iterations : int
The number of iterations to run.
-
classmethod
default_options
()[source]¶ dict of all default class options (keyword arguments for __init__ for class and superclasses)
-
options
¶ dict of all class options (keyword arguments for __init__ for class and superclasses)
ReplicaExchangeSampler¶
Derived multi-thermodynamic state multistate class with exchanging configurations between replicas
COPYRIGHT
Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.
Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.
LICENSE
This code is licensed under the latest available version of the MIT License.
-
class
yank.multistate.replicaexchange.
ReplicaExchangeSampler
(replica_mixing_scheme='swap-all', **kwargs)[source]¶ Replica-exchange simulation facility.
This MultiStateSampler class provides a general replica-exchange simulation facility, allowing any set of thermodynamic states to be specified, along with a set of initial positions to be assigned to the replicas in a round-robin fashion.
No distinction is made between one-dimensional and multidimensional replica layout. By default, the replica mixing scheme attempts to mix all replicas to minimize slow diffusion normally found in multidimensional replica exchange simulations (Modification of the ‘replica_mixing_scheme’ setting will allow the traditional ‘neighbor swaps only’ scheme to be used.)
Stored configurations, energies, swaps, and restart information are all written to a single output file using the platform portable, robust, and efficient NetCDF4 library.
Parameters: - mcmc_moves : MCMCMove or list of MCMCMove, optional
The MCMCMove used to propagate the states. If a list of MCMCMoves, they will be assigned to the correspondent thermodynamic state on creation. If None is provided, Langevin dynamics with 2fm timestep, 5.0/ps collision rate, and 500 steps per iteration will be used.
- number_of_iterations : int or infinity, optional, default: 1
The number of iterations to perform. Both
float('inf')
andnumpy.inf
are accepted for infinity. If you set this to infinity, be sure to set alsoonline_analysis_interval
.- replica_mixing_scheme : ‘swap-all’, ‘swap-neighbors’ or None, Default: ‘swap-all’
The scheme used to swap thermodynamic states between replicas.
- online_analysis_interval : None or Int >= 1, optional, default None
Choose the interval at which to perform online analysis of the free energy.
After every interval, the simulation will be stopped and the free energy estimated.
If the error in the free energy estimate is at or below
online_analysis_target_error
, then the simulation will be considered completed.- online_analysis_target_error : float >= 0, optional, default 0.2
The target error for the online analysis measured in kT per phase.
Once the free energy is at or below this value, the phase will be considered complete.
If
online_analysis_interval
is None, this option does nothing.- online_analysis_minimum_iterations : int >= 0, optional, default 50
Set the minimum number of iterations which must pass before online analysis is carried out.
Since the initial samples likely not to yield a good estimate of free energy, save time and just skip them If
online_analysis_interval
is None, this does nothing
Examples
Parallel tempering simulation of alanine dipeptide in implicit solvent (replica exchange among temperatures). This is just an illustrative example; use
ParallelTempering
class for actual production parallel tempering simulations.Create the system.
>>> import math >>> from simtk import unit >>> from openmmtools import testsystems, states, mcmc >>> testsystem = testsystems.AlanineDipeptideImplicit()
Create thermodynamic states for parallel tempering with exponentially-spaced schedule.
>>> n_replicas = 3 # Number of temperature replicas. >>> T_min = 298.0 * unit.kelvin # Minimum temperature. >>> T_max = 600.0 * unit.kelvin # Maximum temperature. >>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(nreplicas-1)) - 1.0) / (math.e - 1.0) ... for i in range(n_replicas)] >>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T) ... for T in temperatures]
Initialize simulation object with options. Run with a GHMC integrator.
>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50) >>> simulation = ReplicaExchangeSampler(mcmc_moves=move, number_of_iterations=2)
Create simulation with its storage file (in a temporary directory) and run.
>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc' >>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1) >>> simulation.create(thermodynamic_states=thermodynamic_states, >>> sampler_states=states.SamplerState(testsystem.positions), >>> storage=reporter) >>> simulation.run() # This runs for a maximum of 2 iterations. >>> simulation.iteration 2 >>> simulation.run(n_iterations=1) >>> simulation.iteration 2
To resume a simulation from an existing storage file and extend it beyond the original number of iterations.
>>> del simulation >>> simulation = ReplicaExchangeSampler.from_storage(reporter) >>> simulation.extend(n_iterations=1) >>> simulation.iteration 3
You can extract several information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every run iteration.
>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1) >>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4)) >>> len(sampler_states) 3 >>> sampler_states[-1].positions.shape # Alanine dipeptide has 22 atoms. (22, 3)
Clean up.
>>> os.remove(storage_path)
Parameters: - number_of_iterations – Maximum number of integer iterations that will be run
- replica_mixing_scheme – Scheme which describes how replicas are exchanged each iteration as string
- online_analysis_interval – How frequently to carry out online analysis in number of iterations
- online_analysis_target_error – Target free energy difference error float at which simulation will be stopped during online analysis, in dimensionless energy
- online_analysis_minimum_iterations – Minimum number of iterations needed before online analysis is run as int
Attributes: n_replicas
The integer number of replicas (read-only).
iteration
The integer current iteration of the simulation (read-only).
mcmc_moves
A copy of the MCMCMoves list used to propagate the simulation.
sampler_states
A copy of the sampler states list at the current iteration.
metadata
A copy of the metadata dictionary passed on creation (read-only).
is_completed
Check if we have reached any of the stop target criteria (read-only)
-
class
Status
(iteration, target_error, is_completed)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
is_completed
¶ Alias for field number 2
-
iteration
¶ Alias for field number 0
-
target_error
¶ Alias for field number 1
-
-
create
(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)¶ Create new multistate sampler simulation.
Parameters: - thermodynamic_states : list of openmmtools.states.ThermodynamicState
Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.
- sampler_states : openmmtools.states.SamplerState or list
One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.
- storage : str or instanced Reporter
If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.
- initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,
default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen,
sampler_states
are distributed between thethermodynamic_states
following these rules:- If
len(thermodynamic_states) == len(sampler_states)
: 1-to-1 distribution - If
len(thermodynamic_states) > len(sampler_states)
: First and last state distributed first remainingsampler_states
spaced evenly by index untilsampler_states
are depleted. If there is only onesampler_state
, then the only firstthermodynamic_state
will be chosen - If
len(thermodynamic_states) < len(sampler_states)
, eachthermodynamic_state
receives an equal number ofsampler_states
until there are insufficient number ofsampler_states
remaining to give eachthermodynamic_state
an equal number. Then the rules from the previous point are followed.
- If
- unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None
These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).
- metadata : dict, optional, default=None
Simulation metadata to be stored in the file.
-
default_options
()¶ dict of all default class options (keyword arguments for __init__ for class and superclasses)
-
equilibrate
(n_iterations, mcmc_moves=None)¶ Equilibrate all replicas.
This does not increase the iteration counter. The equilibrated positions are stored at the end.
Parameters: - n_iterations : int
Number of equilibration iterations.
- mcmc_moves : MCMCMove or list of MCMCMove, optional
Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.
-
extend
(n_iterations)¶ Extend the simulation by the given number of iterations.
Contrarily to
run()
, this will extend the number of iterations pastnumber_of_iteration
if requested.Parameters: - n_iterations : int
The number of iterations to run.
-
from_storage
(storage)¶ Constructor from an existing storage file.
Parameters: - storage : str or Reporter
If str: The path to the storage file. If
Reporter
: uses theReporter
options In the future this will be able to take a Storage class as well.
Returns: - sampler : MultiStateSampler
A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.
-
is_completed
¶ Check if we have reached any of the stop target criteria (read-only)
-
is_periodic
¶ Return True if system is periodic, False if not, and None if not initialized
-
iteration
¶ The integer current iteration of the simulation (read-only).
If the simulation has not been created yet, this is None.
-
mcmc_moves
¶ A copy of the MCMCMoves list used to propagate the simulation.
This can be set only before creation.
-
metadata
¶ A copy of the metadata dictionary passed on creation (read-only).
-
minimize
(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶ Minimize all replicas.
Minimized positions are stored at the end.
Parameters: - tolerance : simtk.unit.Quantity, optional
Minimization tolerance (units of energy/mole/length, default is
1.0 * unit.kilojoules_per_mole / unit.nanometers
).- max_iterations : int, optional
Maximum number of iterations for minimization. If 0, minimization continues until converged.
-
n_replicas
¶ The integer number of replicas (read-only).
-
n_states
¶ The integer number of thermodynamic states (read-only).
-
options
¶ dict of all class options (keyword arguments for __init__ for class and superclasses)
-
read_status
(storage)¶ Read the status of the calculation from the storage file.
This class method can be used to quickly check the status of the simulation before loading the full
ReplicaExchange
object from disk.Parameters: - storage : str or Reporter
The path to the storage file or the reporter object.
Returns: - status : ReplicaExchange.Status
The status of the replica-exchange calculation. It has three fields:
iteration
,target_error
, andis_completed
.
-
run
(n_iterations=None)¶ Run the replica-exchange simulation.
This runs at most
number_of_iterations
iterations. Useextend()
to pass the limit.Parameters: - n_iterations : int, optional
If specified, only at most the specified number of iterations will be run (default is None).
-
sampler_states
¶ A copy of the sampler states list at the current iteration.
This can be set only before running.
-
class
yank.multistate.replicaexchange.
ReplicaExchangeAnalyzer
(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶ The ReplicaExchangeAnalyzer is the analyzer for a simulation generated from a Replica Exchange sampler simulation, implemented as an instance of the
MultiStateSamplerAnalyzer
.See also
PhaseAnalyzer
,MultiStateSamplerAnalyzer
-
clear
()¶ Reset all cached objects.
This must to be called if the information in the reporter changes after analysis.
-
generate_mixing_statistics
(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶ Compute and return replica mixing statistics.
Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.
Parameters: - number_equilibrated : int, optional, default=None
If specified, only samples
number_equilibrated:end
will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.
Returns: - mixing_statistics : namedtuple
A namedtuple containing the following attributes: -
transition_matrix
: (nstates by nstatesnp.array
) -eigenvalues
: (nstates-dimensionalnp.array
) -statistical_inefficiency
: float
-
get_effective_energy_timeseries
(energies=None, replica_state_indices=None)¶ Generate the effective energy (negative log deviance) timeseries that is generated for this phase.
The effective energy for a series of samples x_n, n = 1..N, is defined as
u_n = - ln pi(x_n) + c
where pi(x) is the probability density being sampled, and c is an arbitrary constant.
Parameters: - energies : ndarray of shape (K,L,N), optional, Default: None
Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.
- replica_state_indices : ndarray of shape (K,N), optional, Default: None
Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.
Returns: - u_n : ndarray of shape (N,)
u_n[n] is the negative log deviance of the same from iteration
n
Timeseries used to determine equilibration time and statistical inefficiency.
-
get_enthalpy
()¶ Compute the difference in enthalpy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_entropy
()¶ Compute the difference in entropy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_free_energy
()¶ Compute the free energy and error in free energy from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in free energy from each state relative to each other state
- dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in free energy from each state relative to each other state
-
has_log_weights
¶ Return True if the storage has log weights, False otherwise
-
kT
¶ Quantity of boltzmann constant times temperature of the phase in units of energy per mol
Allows conversion between dimensionless energy and unit bearing energy
-
n_equilibration_iterations
¶ int: The number of equilibration interations.
-
n_iterations
¶ int: The total number of iterations of the phase.
-
n_replicas
¶ int: Number of replicas.
-
n_states
¶ int: Number of sampled thermodynamic states.
-
name
¶ User-readable string name of the phase
-
observables
¶ List of observables that the instanced analyzer can compute/fetch.
-
read_energies
()¶ Extract energies from the ncfile and order them by replica, state, iteration.
Returns: - sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]
Potential energy matrix of the sampled states.
- unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]
Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).
- neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]
Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.
- replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]
States sampled by the replicas in the energy_matrix
-
read_logZ
(iteration=None)¶ Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.
Parameters: - iteration : int or slice, optional, default=None
If specified, iteration or slice of iterations to extract
Returns: - logZ : np.ndarray of shape [n_states, n_iterations]
logZ[l,n] is the online logZ estimate for state
l
at iterationn
-
read_log_weights
()¶ Extract log weights from the ncfile, if present. Returns ValueError if not present.
Returns: - log_weights : np.ndarray of shape [n_states, n_iterations]
log_weights[l,n] is the log weight applied to state
l
during the collection of samples at iterationn
-
reference_states
¶ Tuple of reference states
i
andj
forMultiPhaseAnalyzer
instances
-
reformat_energies_for_mbar
(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶ Convert [replica, state, iteration] data into [state, total_iteration] data
This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.
Parameters: - u_kln : np.ndarray of shape (K,L,N’)
K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k
- n_k : np.ndarray of shape K or None
Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.
If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k
WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.
Returns: - u_ln : np.ndarray of shape (L, N)
Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’
-
reporter
¶ Sampler Reporter tied to this object.
-
show_mixing_statistics
(cutoff=0.05, number_equilibrated=None)¶ Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger
Parameters: - cutoff : float, optional, default=0.05
Only transition probabilities above ‘cutoff’ will be printed
- number_equilibrated : int, optional, default=None
If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best
-
statistical_inefficiency
¶ float: The statistical inefficiency of the sampler.
-
use_online_data
¶ Get the online data flag
-
ParallelTemperingSampler¶
Derived multi-thermodynamic state multistate class with exchanging configurations between replicas of different temperatures. This is a special case which accepts a single thermodynamic_state and different temperatures to sample. If you want different temperatures and Hamiltonians, use ReplicaExchangeSampler with temperatures pre-set.
COPYRIGHT
Current version by Andrea Rizzi <andrea.rizzi@choderalab.org>, Levi N. Naden <levi.naden@choderalab.org> and John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.
Original version by John D. Chodera <jchodera@gmail.com> while at the University of California Berkeley.
LICENSE
This code is licensed under the latest available version of the MIT License.
-
class
yank.multistate.paralleltempering.
ParallelTemperingSampler
(replica_mixing_scheme='swap-all', **kwargs)[source]¶ Parallel tempering simulation facility.
This class provides a facility for parallel tempering simulations. It is a subclass of
ReplicaExchange
, but provides efficiency improvements for parallel tempering simulations, so should be preferred for this type of simulation. In particular, this makes use of the fact that the reduced potentials are linear in inverse temperature.See also
MultiStateSampler
,ReplicaExchangeSampler
Examples
Create the system.
>>> from simtk import unit >>> from openmmtools import testsystems, states, mcmc >>> import tempfile >>> testsystem = testsystems.AlanineDipeptideImplicit()
Create thermodynamic states for parallel tempering with exponentially-spaced schedule.
>>> n_replicas = 3 # Number of temperature replicas. >>> T_min = 298.0 * unit.kelvin # Minimum temperature. >>> T_max = 600.0 * unit.kelvin # Maximum temperature. >>> reference_state = states.ThermodynamicState(system=testsystem.system, temperature=T_min)
Initialize simulation object with options. Run with a GHMC integrator.
>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50) >>> simulation = ParallelTemperingSampler(mcmc_moves=move, number_of_iterations=2)
Create simulation with its storage file (in a temporary directory) and run.
>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc' >>> reporter = MultiStateReporter(storage_path, checkpoint_interval=10) >>> simulation.create(reference_state, ... states.SamplerState(testsystem.positions), ... reporter, min_temperature=T_min, ... max_temperature=T_max, n_temperatures=n_replicas) >>> simulation.run(n_iterations=1)
Clean up.
>>> os.remove(storage_path)
-
create
(thermodynamic_state, sampler_states: list, storage, min_temperature=None, max_temperature=None, n_temperatures=None, temperatures=None, **kwargs)[source]¶ Initialize a parallel tempering simulation object.
Parameters: - thermodynamic_state : openmmtools.states.ThermodynamicState
Reference thermodynamic state that will be simulated at the given temperatures.
WARNING: This is a SINGLE state, not a list of states!
- sampler_states : openmmtools.states.SamplerState or list
One or more sets of initial sampler states. If a list of SamplerStates, they will be assigned to replicas in a round-robin fashion.
- storage : str or Reporter
If str: path to the storage file, checkpoint options are default If Reporter: Instanced
Reporter
class, checkpoint information is read from In the future this will be able to take a Storage class as well.- min_temperature : simtk.unit.Quantity, optional
Minimum temperature (units of temperature, default is None).
- max_temperature : simtk.unit.Quantity, optional
Maximum temperature (units of temperature, default is None).
- n_temperatures : int, optional
Number of exponentially-spaced temperatures between
min_temperature
andmax_temperature
(default is None).- temperatures : list of simtk.unit.Quantity, optional
If specified, this list of temperatures will be used instead of
min_temperature
,max_temperature
, andn_temperatures
(units of temperature, default is None).- metadata : dict, optional
Simulation metadata to be stored in the file.
Notes
Either (
min_temperature
,max_temperature
,n_temperatures
) must all be specified or the list of ‘temperatures‘ must be specified.
-
class
Status
(iteration, target_error, is_completed)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
is_completed
¶ Alias for field number 2
-
iteration
¶ Alias for field number 0
-
target_error
¶ Alias for field number 1
-
-
default_options
()¶ dict of all default class options (keyword arguments for __init__ for class and superclasses)
-
equilibrate
(n_iterations, mcmc_moves=None)¶ Equilibrate all replicas.
This does not increase the iteration counter. The equilibrated positions are stored at the end.
Parameters: - n_iterations : int
Number of equilibration iterations.
- mcmc_moves : MCMCMove or list of MCMCMove, optional
Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.
-
extend
(n_iterations)¶ Extend the simulation by the given number of iterations.
Contrarily to
run()
, this will extend the number of iterations pastnumber_of_iteration
if requested.Parameters: - n_iterations : int
The number of iterations to run.
-
from_storage
(storage)¶ Constructor from an existing storage file.
Parameters: - storage : str or Reporter
If str: The path to the storage file. If
Reporter
: uses theReporter
options In the future this will be able to take a Storage class as well.
Returns: - sampler : MultiStateSampler
A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.
-
is_completed
¶ Check if we have reached any of the stop target criteria (read-only)
-
is_periodic
¶ Return True if system is periodic, False if not, and None if not initialized
-
iteration
¶ The integer current iteration of the simulation (read-only).
If the simulation has not been created yet, this is None.
-
mcmc_moves
¶ A copy of the MCMCMoves list used to propagate the simulation.
This can be set only before creation.
-
metadata
¶ A copy of the metadata dictionary passed on creation (read-only).
-
minimize
(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶ Minimize all replicas.
Minimized positions are stored at the end.
Parameters: - tolerance : simtk.unit.Quantity, optional
Minimization tolerance (units of energy/mole/length, default is
1.0 * unit.kilojoules_per_mole / unit.nanometers
).- max_iterations : int, optional
Maximum number of iterations for minimization. If 0, minimization continues until converged.
-
n_replicas
¶ The integer number of replicas (read-only).
-
n_states
¶ The integer number of thermodynamic states (read-only).
-
options
¶ dict of all class options (keyword arguments for __init__ for class and superclasses)
-
read_status
(storage)¶ Read the status of the calculation from the storage file.
This class method can be used to quickly check the status of the simulation before loading the full
ReplicaExchange
object from disk.Parameters: - storage : str or Reporter
The path to the storage file or the reporter object.
Returns: - status : ReplicaExchange.Status
The status of the replica-exchange calculation. It has three fields:
iteration
,target_error
, andis_completed
.
-
run
(n_iterations=None)¶ Run the replica-exchange simulation.
This runs at most
number_of_iterations
iterations. Useextend()
to pass the limit.Parameters: - n_iterations : int, optional
If specified, only at most the specified number of iterations will be run (default is None).
-
sampler_states
¶ A copy of the sampler states list at the current iteration.
This can be set only before running.
-
-
class
yank.multistate.paralleltempering.
ParallelTemperingAnalyzer
(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶ The ParallelTemperingAnalyzer is the analyzer for a simulation generated from a Parallel Tempering sampler simulation, implemented as an instance of the
ReplicaExchangeAnalyzer
as the sampler is a subclass of theyank.multistate.ReplicaExchangeSampler
See also
PhaseAnalyzer
,ReplicaExchangeAnalyzer
-
clear
()¶ Reset all cached objects.
This must to be called if the information in the reporter changes after analysis.
-
generate_mixing_statistics
(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶ Compute and return replica mixing statistics.
Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.
Parameters: - number_equilibrated : int, optional, default=None
If specified, only samples
number_equilibrated:end
will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.
Returns: - mixing_statistics : namedtuple
A namedtuple containing the following attributes: -
transition_matrix
: (nstates by nstatesnp.array
) -eigenvalues
: (nstates-dimensionalnp.array
) -statistical_inefficiency
: float
-
get_effective_energy_timeseries
(energies=None, replica_state_indices=None)¶ Generate the effective energy (negative log deviance) timeseries that is generated for this phase.
The effective energy for a series of samples x_n, n = 1..N, is defined as
u_n = - ln pi(x_n) + c
where pi(x) is the probability density being sampled, and c is an arbitrary constant.
Parameters: - energies : ndarray of shape (K,L,N), optional, Default: None
Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.
- replica_state_indices : ndarray of shape (K,N), optional, Default: None
Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.
Returns: - u_n : ndarray of shape (N,)
u_n[n] is the negative log deviance of the same from iteration
n
Timeseries used to determine equilibration time and statistical inefficiency.
-
get_enthalpy
()¶ Compute the difference in enthalpy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_entropy
()¶ Compute the difference in entropy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_free_energy
()¶ Compute the free energy and error in free energy from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in free energy from each state relative to each other state
- dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in free energy from each state relative to each other state
-
has_log_weights
¶ Return True if the storage has log weights, False otherwise
-
kT
¶ Quantity of boltzmann constant times temperature of the phase in units of energy per mol
Allows conversion between dimensionless energy and unit bearing energy
-
n_equilibration_iterations
¶ int: The number of equilibration interations.
-
n_iterations
¶ int: The total number of iterations of the phase.
-
n_replicas
¶ int: Number of replicas.
-
n_states
¶ int: Number of sampled thermodynamic states.
-
name
¶ User-readable string name of the phase
-
observables
¶ List of observables that the instanced analyzer can compute/fetch.
-
read_energies
()¶ Extract energies from the ncfile and order them by replica, state, iteration.
Returns: - sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]
Potential energy matrix of the sampled states.
- unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]
Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).
- neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]
Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.
- replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]
States sampled by the replicas in the energy_matrix
-
read_logZ
(iteration=None)¶ Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.
Parameters: - iteration : int or slice, optional, default=None
If specified, iteration or slice of iterations to extract
Returns: - logZ : np.ndarray of shape [n_states, n_iterations]
logZ[l,n] is the online logZ estimate for state
l
at iterationn
-
read_log_weights
()¶ Extract log weights from the ncfile, if present. Returns ValueError if not present.
Returns: - log_weights : np.ndarray of shape [n_states, n_iterations]
log_weights[l,n] is the log weight applied to state
l
during the collection of samples at iterationn
-
reference_states
¶ Tuple of reference states
i
andj
forMultiPhaseAnalyzer
instances
-
reformat_energies_for_mbar
(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶ Convert [replica, state, iteration] data into [state, total_iteration] data
This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.
Parameters: - u_kln : np.ndarray of shape (K,L,N’)
K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k
- n_k : np.ndarray of shape K or None
Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.
If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k
WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.
Returns: - u_ln : np.ndarray of shape (L, N)
Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’
-
reporter
¶ Sampler Reporter tied to this object.
-
show_mixing_statistics
(cutoff=0.05, number_equilibrated=None)¶ Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger
Parameters: - cutoff : float, optional, default=0.05
Only transition probabilities above ‘cutoff’ will be printed
- number_equilibrated : int, optional, default=None
If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best
-
statistical_inefficiency
¶ float: The statistical inefficiency of the sampler.
-
use_online_data
¶ Get the online data flag
-
SamsSampler¶
Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.
This implementation uses stochastic approximation to allow one or more replicas to sample the whole range of thermodynamic states for rapid online computation of free energies.
COPYRIGHT
Written by John D. Chodera <john.chodera@choderalab.org> while at Memorial Sloan Kettering Cancer Center.
LICENSE
This code is licensed under the latest available version of the MIT License.
-
class
yank.multistate.sams.
SAMSSampler
(number_of_iterations=1, log_target_probabilities=None, state_update_scheme='global-jump', locality=5, update_stages='two-stage', flatness_threshold=0.2, weight_update_method='rao-blackwellized', adapt_target_probabilities=False, gamma0=1.0, logZ_guess=None, **kwargs)[source]¶ Self-adjusted mixture sampling (SAMS), also known as optimally-adjusted mixture sampling.
This class provides a facility for self-adjusted mixture sampling simulations. One or more replicas use the method of expanded ensembles [1] to sample multiple thermodynamic states within each replica, with log weights for each thermodynamic state adapted on the fly [2] to achieve the desired target probabilities for each state.
See also
ReplicaExchangeSampler
References
[1] Lyubartsev AP, Martsinovski AA, Shevkunov SV, and Vorontsov-Velyaminov PN. New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles. JCP 96:1776, 1992 http://dx.doi.org/10.1063/1.462133
[2] Tan, Z. Optimally adjusted mixture sampling and locally weighted histogram analysis, Journal of Computational and Graphical Statistics 26:54, 2017. http://dx.doi.org/10.1080/10618600.2015.1113975
Examples
SAMS simulation of alanine dipeptide in implicit solvent at different temperatures.
Create the system:
>>> import math >>> from simtk import unit >>> from openmmtools import testsystems, states, mcmc >>> testsystem = testsystems.AlanineDipeptideVacuum()
Create thermodynamic states for parallel tempering with exponentially-spaced schedule:
>>> n_replicas = 3 # Number of temperature replicas. >>> T_min = 298.0 * unit.kelvin # Minimum temperature. >>> T_max = 600.0 * unit.kelvin # Maximum temperature. >>> temperatures = [T_min + (T_max - T_min) * (math.exp(float(i) / float(nreplicas-1)) - 1.0) / (math.e - 1.0) ... for i in range(n_replicas)] >>> thermodynamic_states = [states.ThermodynamicState(system=testsystem.system, temperature=T) ... for T in temperatures]
Initialize simulation object with options. Run with a GHMC integrator:
>>> move = mcmc.GHMCMove(timestep=2.0*unit.femtoseconds, n_steps=50) >>> simulation = SAMSSampler(mcmc_moves=move, number_of_iterations=2, >>> state_update_scheme='restricted-range', locality=5, >>> update_stages='two-stage', flatness_threshold=0.2, >>> weight_update_method='rao-blackwellized', >>> adapt_target_probabilities=False)
Create a single-replica SAMS simulation bound to a storage file and run:
>>> storage_path = tempfile.NamedTemporaryFile(delete=False).name + '.nc' >>> reporter = MultiStateReporter(storage_path, checkpoint_interval=1) >>> simulation.create(thermodynamic_states=thermodynamic_states, >>> sampler_states=[states.SamplerState(testsystem.positions)], >>> storage=reporter) >>> simulation.run() # This runs for a maximum of 2 iterations. >>> simulation.iteration 2 >>> simulation.run(n_iterations=1) >>> simulation.iteration 2
To resume a simulation from an existing storage file and extend it beyond the original number of iterations.
>>> del simulation >>> simulation = SAMSSampler.from_storage(reporter) >>> simulation.extend(n_iterations=1) >>> simulation.iteration 3
You can extract several information from the NetCDF file using the Reporter class while the simulation is running. This reads the SamplerStates of every run iteration.
>>> reporter = MultiStateReporter(storage=storage_path, open_mode='r', checkpoint_interval=1) >>> sampler_states = reporter.read_sampler_states(iteration=range(1, 4)) >>> len(sampler_states) 3 >>> sampler_states[-1].positions.shape # Alanine dipeptide has 22 atoms. (22, 3)
Clean up.
>>> os.remove(storage_path)
Attributes: - log_target_probabilities : array-like
log_target_probabilities[state_index] is the log target probability for state
state_index
- state_update_scheme : str
Thermodynamic state sampling scheme. One of [‘global-jump’, ‘local-jump’, ‘restricted-range’]
- locality : int
Number of neighboring states on either side to consider for local update schemes
- update_stages : str
Number of stages to use for update. One of [‘one-stage’, ‘two-stage’]
- weight_update_method : str
Method to use for updating log weights in SAMS. One of [‘optimal’, ‘rao-blackwellized’]
- adapt_target_probabilities : bool
If True, target probabilities will be adapted to achieve minimal thermodynamic length between terminal thermodynamic states.
- gamma0 : float, optional, default=0.0
Initial weight adaptation rate.
- logZ_guess : array-like of shape [n_states] of floats, optional, default=None
Initial guess for logZ for all states, if available.
-
class
Status
(iteration, target_error, is_completed)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
is_completed
¶ Alias for field number 2
-
iteration
¶ Alias for field number 0
-
target_error
¶ Alias for field number 1
-
-
create
(thermodynamic_states: list, sampler_states, storage, initial_thermodynamic_states=None, unsampled_thermodynamic_states=None, metadata=None)¶ Create new multistate sampler simulation.
Parameters: - thermodynamic_states : list of openmmtools.states.ThermodynamicState
Thermodynamic states to simulate, where one replica is allocated per state. Each state must have a system with the same number of atoms.
- sampler_states : openmmtools.states.SamplerState or list
One or more sets of initial sampler states. The number of replicas is taken to be the number of sampler states provided. If the sampler states do not have box_vectors attached and the system is periodic, an exception will be thrown.
- storage : str or instanced Reporter
If str: the path to the storage file. Default checkpoint options from Reporter class are used If Reporter: Uses the reporter options and storage path In the future this will be able to take a Storage class as well.
- initial_thermodynamic_states : None or list or array-like of int of length len(sampler_states), optional,
default: None. Initial thermodynamic_state index for each sampler_state. If no initial distribution is chosen,
sampler_states
are distributed between thethermodynamic_states
following these rules:- If
len(thermodynamic_states) == len(sampler_states)
: 1-to-1 distribution - If
len(thermodynamic_states) > len(sampler_states)
: First and last state distributed first remainingsampler_states
spaced evenly by index untilsampler_states
are depleted. If there is only onesampler_state
, then the only firstthermodynamic_state
will be chosen - If
len(thermodynamic_states) < len(sampler_states)
, eachthermodynamic_state
receives an equal number ofsampler_states
until there are insufficient number ofsampler_states
remaining to give eachthermodynamic_state
an equal number. Then the rules from the previous point are followed.
- If
- unsampled_thermodynamic_states : list of openmmtools.states.ThermodynamicState, optional, default=None
These are ThermodynamicStates that are not propagated, but their reduced potential is computed at each iteration for each replica. These energy can be used as data for reweighting schemes (default is None).
- metadata : dict, optional, default=None
Simulation metadata to be stored in the file.
-
default_options
()¶ dict of all default class options (keyword arguments for __init__ for class and superclasses)
-
equilibrate
(n_iterations, mcmc_moves=None)¶ Equilibrate all replicas.
This does not increase the iteration counter. The equilibrated positions are stored at the end.
Parameters: - n_iterations : int
Number of equilibration iterations.
- mcmc_moves : MCMCMove or list of MCMCMove, optional
Optionally, the MCMCMoves to use for equilibration can be different from the ones used in production.
-
extend
(n_iterations)¶ Extend the simulation by the given number of iterations.
Contrarily to
run()
, this will extend the number of iterations pastnumber_of_iteration
if requested.Parameters: - n_iterations : int
The number of iterations to run.
-
from_storage
(storage)¶ Constructor from an existing storage file.
Parameters: - storage : str or Reporter
If str: The path to the storage file. If
Reporter
: uses theReporter
options In the future this will be able to take a Storage class as well.
Returns: - sampler : MultiStateSampler
A new instance of MultiStateSampler (or subclass) in the same state of the last stored iteration.
-
is_completed
¶ Check if we have reached any of the stop target criteria (read-only)
-
is_periodic
¶ Return True if system is periodic, False if not, and None if not initialized
-
iteration
¶ The integer current iteration of the simulation (read-only).
If the simulation has not been created yet, this is None.
-
mcmc_moves
¶ A copy of the MCMCMoves list used to propagate the simulation.
This can be set only before creation.
-
metadata
¶ A copy of the metadata dictionary passed on creation (read-only).
-
minimize
(tolerance=Quantity(value=1.0, unit=kilojoule/(nanometer*mole)), max_iterations=0)¶ Minimize all replicas.
Minimized positions are stored at the end.
Parameters: - tolerance : simtk.unit.Quantity, optional
Minimization tolerance (units of energy/mole/length, default is
1.0 * unit.kilojoules_per_mole / unit.nanometers
).- max_iterations : int, optional
Maximum number of iterations for minimization. If 0, minimization continues until converged.
-
n_replicas
¶ The integer number of replicas (read-only).
-
n_states
¶ The integer number of thermodynamic states (read-only).
-
options
¶ dict of all class options (keyword arguments for __init__ for class and superclasses)
-
read_status
(storage)¶ Read the status of the calculation from the storage file.
This class method can be used to quickly check the status of the simulation before loading the full
ReplicaExchange
object from disk.Parameters: - storage : str or Reporter
The path to the storage file or the reporter object.
Returns: - status : ReplicaExchange.Status
The status of the replica-exchange calculation. It has three fields:
iteration
,target_error
, andis_completed
.
-
run
(n_iterations=None)¶ Run the replica-exchange simulation.
This runs at most
number_of_iterations
iterations. Useextend()
to pass the limit.Parameters: - n_iterations : int, optional
If specified, only at most the specified number of iterations will be run (default is None).
-
sampler_states
¶ A copy of the sampler states list at the current iteration.
This can be set only before running.
-
class
yank.multistate.sams.
SAMSAnalyzer
(*args, unbias_restraint=True, restraint_energy_cutoff='auto', restraint_distance_cutoff='auto', **kwargs)[source]¶ The SAMSAnalyzer is the analyzer for a simulation generated from a SAMSSampler simulation.
See also
ReplicaExchangeAnalyzer
,PhaseAnalyzer
-
clear
()¶ Reset all cached objects.
This must to be called if the information in the reporter changes after analysis.
-
generate_mixing_statistics
(number_equilibrated: typing.Union[int, NoneType] = None) → typing.NamedTuple¶ Compute and return replica mixing statistics.
Compute the transition state matrix, its eigenvalues sorted from greatest to least, and the state index correlation function.
Parameters: - number_equilibrated : int, optional, default=None
If specified, only samples
number_equilibrated:end
will be used in analysis. If not specified, automatically retrieves the number from equilibration data or generates it from the internal energy.
Returns: - mixing_statistics : namedtuple
A namedtuple containing the following attributes: -
transition_matrix
: (nstates by nstatesnp.array
) -eigenvalues
: (nstates-dimensionalnp.array
) -statistical_inefficiency
: float
-
get_effective_energy_timeseries
(energies=None, replica_state_indices=None)¶ Generate the effective energy (negative log deviance) timeseries that is generated for this phase.
The effective energy for a series of samples x_n, n = 1..N, is defined as
u_n = - ln pi(x_n) + c
where pi(x) is the probability density being sampled, and c is an arbitrary constant.
Parameters: - energies : ndarray of shape (K,L,N), optional, Default: None
Energies from replicas K, sampled states L, and iterations N. If provided, then states input_sampled_states must also be provided.
- replica_state_indices : ndarray of shape (K,N), optional, Default: None
Integer indices of each sampled state (matching L dimension in input_energy). that each replica K sampled every iteration N. If provided, then states input_energies must also be provided.
Returns: - u_n : ndarray of shape (N,)
u_n[n] is the negative log deviance of the same from iteration
n
Timeseries used to determine equilibration time and statistical inefficiency.
-
get_enthalpy
()¶ Compute the difference in enthalpy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_entropy
()¶ Compute the difference in entropy and error in that estimate from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in enthalpy from each state relative to each other state
- dDeltaH_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in enthalpy from each state relative to each other state
-
get_free_energy
()¶ Compute the free energy and error in free energy from the MBAR object
Output shape changes based on if there are unsampled states detected in the sampler
Returns: - DeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Difference in free energy from each state relative to each other state
- dDeltaF_ij : ndarray of floats, shape (K,K) or (K+2, K+2)
Error in the difference in free energy from each state relative to each other state
-
has_log_weights
¶ Return True if the storage has log weights, False otherwise
-
kT
¶ Quantity of boltzmann constant times temperature of the phase in units of energy per mol
Allows conversion between dimensionless energy and unit bearing energy
-
n_equilibration_iterations
¶ int: The number of equilibration interations.
-
n_iterations
¶ int: The total number of iterations of the phase.
-
n_replicas
¶ int: Number of replicas.
-
n_states
¶ int: Number of sampled thermodynamic states.
-
name
¶ User-readable string name of the phase
-
observables
¶ List of observables that the instanced analyzer can compute/fetch.
-
read_energies
()¶ Extract energies from the ncfile and order them by replica, state, iteration.
Returns: - sampled_energy_matrix : np.ndarray of shape [n_replicas, n_states, n_iterations]
Potential energy matrix of the sampled states.
- unsampled_energy_matrix : np.ndarray of shape [n_replicas, n_unsamped_states, n_iterations]
Potential energy matrix of the unsampled states. Energy from each drawn sample n, evaluated at unsampled state l. If no unsampled states were drawn, this will be shape (0,N).
- neighborhoods : np.ndarray of shape [n_replicas, n_states, n_iterations]
Neighborhood energies were computed at, uses a boolean mask over the energy_matrix.
- replica_state_indices : np.ndarray of shape [n_replicas, n_iterations]
States sampled by the replicas in the energy_matrix
-
read_logZ
(iteration=None)¶ Extract logZ estimates from the ncfile, if present. Returns ValueError if not present.
Parameters: - iteration : int or slice, optional, default=None
If specified, iteration or slice of iterations to extract
Returns: - logZ : np.ndarray of shape [n_states, n_iterations]
logZ[l,n] is the online logZ estimate for state
l
at iterationn
-
read_log_weights
()¶ Extract log weights from the ncfile, if present. Returns ValueError if not present.
Returns: - log_weights : np.ndarray of shape [n_states, n_iterations]
log_weights[l,n] is the log weight applied to state
l
during the collection of samples at iterationn
-
reference_states
¶ Tuple of reference states
i
andj
forMultiPhaseAnalyzer
instances
-
reformat_energies_for_mbar
(u_kln: numpy.ndarray, n_k: typing.Union[numpy.ndarray, NoneType] = None)¶ Convert [replica, state, iteration] data into [state, total_iteration] data
This method assumes that the first dimension are all samplers, the second dimension are all the thermodynamic states energies were evaluated at and an equal number of samples were drawn from each k’th sampler, UNLESS n_k is specified.
Parameters: - u_kln : np.ndarray of shape (K,L,N’)
K = number of replica samplers L = number of thermodynamic states, N’ = number of iterations from state k
- n_k : np.ndarray of shape K or None
Number of samples each _SAMPLER_ (k) has drawn This allows you to have trailing entries on a given kth row in the n’th (n prime) index which do not contribute to the conversion.
If this is None, assumes ALL samplers have the same number of samples such that N_k = N’ for all k
WARNING: N_k is number of samples the SAMPLER drew in total, NOT how many samples were drawn from each thermodynamic state L. This method knows nothing of how many samples were drawn from each state.
Returns: - u_ln : np.ndarray of shape (L, N)
Reduced, non-sparse data format L = number of thermodynamic states N = sum_k N_k. note this is not N’
-
reporter
¶ Sampler Reporter tied to this object.
-
show_mixing_statistics
(cutoff=0.05, number_equilibrated=None)¶ Print summary of mixing statistics. Passes information off to generate_mixing_statistics then prints it out to the logger
Parameters: - cutoff : float, optional, default=0.05
Only transition probabilities above ‘cutoff’ will be printed
- number_equilibrated : int, optional, default=None
If specified, only samples number_equilibrated:end will be used in analysis If not specified, it uses the internally held statistics best
-
statistical_inefficiency
¶ float: The statistical inefficiency of the sampler.
-
use_online_data
¶ Get the online data flag
-