Samplers Header for YAML Files¶
The samplers section tells YANK how it should sample multiple thermodynamic states in order to estimate free energies between states of interest.
Together with the mcmc_moves section, this section provides a flexible way to control how thermodynamic states are efficiently sampled.
This block is fully optional, both for those who do not wish to adjust these settings and to support backwards-compatible YAML scripts. If no sampler is given to experiment, then a default one is used.
Samplers Syntax¶
samplers:
    {UserDefinedSamplerName}:
        type: {Sampler}
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        {SamplerOptions}
The {Sampler} can be one of the following:
- MultiStateSampler: Independent simulations at distinct thermodynamic states
- ReplicaExchangeSampler: Replica exchange among thermodynamic states (also called Hamiltonian exchange if only the Hamiltonian is changing)
- SAMSSampler: Self-adjusted mixture sampling (also known as optimally-adjusted mixture sampling)
The above block is the minimum syntax needed to define a sampler of any type.
The {MCMCName} denotes the name of an MCMC scheme block to be used to update the replicas at fixed thermodynamic state.
See MCMC defaults for more information about MCMC schemes.
The {NumberOfIterations} is a non-negative integer that denotes the maximum number of iterations to be run.
When this option is not given and default_number_of_iterations is set in the main Options for YAML files block, then that number is used instead. See that description for more information.
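The fallback described above can be sketched in a few lines of Python. This is only an illustration of the lookup order, not YANK's own code: the helper name `resolve_number_of_iterations` and the two dictionary arguments are hypothetical. This section does not say what happens when neither value is set, so the sketch simply returns `None` in that case.

```python
def resolve_number_of_iterations(sampler_block, global_options):
    """Pick the iteration limit for one sampler (hypothetical helper, not YANK's API)."""
    if "number_of_iterations" in sampler_block:
        # A value set on the sampler itself takes precedence.
        value = sampler_block["number_of_iterations"]
    else:
        # Otherwise fall back to the global option, if it was set.
        value = global_options.get("default_number_of_iterations")
    if value is not None and (not isinstance(value, int) or value < 0):
        raise ValueError("number of iterations must be a non-negative integer")
    return value  # None means neither value was provided (assumption)


sampler = {"type": "MultiStateSampler", "mcmc_moves": "langevin"}
print(resolve_number_of_iterations(sampler, {"default_number_of_iterations": 1000}))
```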
{SamplerOptions} denotes an optional set of sampler-specific options that can be specified individually if the user wants to override the defaults. The MultiStateSampler is the base class for all other samplers, and the options that can be provided to it are global to all other samplers. Below the MultiStateSampler options are the individual options available for every other choice of type.
MultiStateSampler options¶
The MultiStateSampler carries out independent simulations at multiple distinct thermodynamic states.
samplers:
    {UserDefinedSamplerName}:
        type: MultiStateSampler
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        {MoreGlobalSamplerOptions}
The {MoreGlobalSamplerOptions} are detailed below and are valid for any other type of Sampler.
locality¶
samplers:
    {UserDefinedSamplerName}:
        type: MultiStateSampler
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        locality: {Locality}
Specifies the number of states around the currently sampled state at which energies are computed.
By default this is set to null for global locality, and the energies of all samples are computed in all states.
To restrict the states at which energies are evaluated to a neighborhood [k-locality, k+locality] around the current state k, specify an integer. This is a non-wrapping locality; e.g. for 10 states, state 0 (the first state) with locality: 2 will include states 1 and 2 but NOT 9 and 8. If locality is greater than or equal to the number of states, then the behavior is the same as null.
Valid Options: [null]/int > 0
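The non-wrapping neighborhood described above can be sketched as follows. This is a minimal illustration based only on the description in this section, not the sampler's actual implementation; the function name `energy_neighborhood` is hypothetical.

```python
def energy_neighborhood(k, n_states, locality=None):
    """Return the state indices at which energies are evaluated for current state k.

    Hypothetical helper: mirrors the documented [k-locality, k+locality]
    window, clipped at the ends instead of wrapping around.
    """
    if locality is not None and locality <= 0:
        raise ValueError("locality must be null or an integer > 0")
    if locality is None or locality >= n_states:
        # Global locality: every state is included.
        return list(range(n_states))
    lowest = max(0, k - locality)               # clip at the first state
    highest = min(n_states - 1, k + locality)   # clip at the last state
    return list(range(lowest, highest + 1))


print(energy_neighborhood(0, 10, locality=2))  # [0, 1, 2]
print(energy_neighborhood(5, 10, locality=2))  # [3, 4, 5, 6, 7]
```

Clipping rather than wrapping is what keeps states 9 and 8 out of the neighborhood of state 0 in the example from the text.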
Todo
Later, we want to allow more complex neighborhoods to be specified via lists of lists.
ReplicaExchangeSampler options¶
The ReplicaExchangeSampler carries out simulations at multiple thermodynamic states, allowing pairs of replicas to periodically exchange thermodynamic states. If locality is specified (i.e. not null), then replica_mixing_scheme must be swap-neighbors.
with this scheme, you must use in replica exchange because there exists one replica per thermodynamic state, and global locality is required for replica exchange to work.
samplers:
    {UserDefinedSamplerName}:
        type: ReplicaExchangeSampler
        mcmc_moves: {MCMCName}
        replica_mixing_scheme: {ReplicaMixingScheme}
A simple example:
samplers:
    replica-exchange:
        type: ReplicaExchangeSampler
        mcmc_moves: langevin
        replica_mixing_scheme: swap-all
replica_mixing_scheme¶
samplers:
    replica-exchange:
        type: ReplicaExchangeSampler
        mcmc_moves: langevin
        replica_mixing_scheme: swap-all
Specifies how the Hamiltonian replica exchange attempts swaps between replicas. swap-all will attempt to exchange every state with every other state. swap-neighbors will attempt exchanges only between adjacent states. If null is specified, no mixing is done, which effectively disables all replica exchange functionality.
Valid Options: [swap-all]/swap-neighbors/null
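To make the difference between the schemes concrete, the sketch below enumerates which pairs of states are eligible for an exchange attempt under each setting. It is an illustration under stated assumptions, not the sampler's implementation: the function name `candidate_swap_pairs` is hypothetical, and the sketch says nothing about how many swaps are attempted per iteration or how they are accepted.

```python
from itertools import combinations


def candidate_swap_pairs(n_states, scheme="swap-all"):
    """List the state pairs eligible for exchange (hypothetical helper)."""
    if scheme is None:
        # null: no mixing at all, so no pair is ever eligible.
        return []
    if scheme == "swap-all":
        # Every state may be exchanged with every other state.
        return list(combinations(range(n_states), 2))
    if scheme == "swap-neighbors":
        # Only adjacent states may be exchanged.
        return [(i, i + 1) for i in range(n_states - 1)]
    raise ValueError(f"unknown replica_mixing_scheme: {scheme!r}")


print(candidate_swap_pairs(4, "swap-neighbors"))  # [(0, 1), (1, 2), (2, 3)]
print(len(candidate_swap_pairs(4, "swap-all")))   # 6
```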
SAMSSampler options¶
Like ReplicaExchangeSampler, the SAMSSampler carries out simulations at one or more thermodynamic states, but state updates are performed independently, which can allow for more rapid exploration of the entire set of thermodynamic states.
If multiple replicas are used, all replicas contribute to the update of the log weights for each state, in principle accelerating convergence at a rate proportional to the number of replicas.
The defaults for most of this sampler's options should be acceptable, so you should not need to set them manually; the ability to do so is available if needed.
Todo
Provide a way to specify multiple replicas.
samplers:
    {UserDefinedSamplerName}:
        type: SAMSSampler
        mcmc_moves: {MCMCName}
        state_update_scheme: {JumpScheme}
        gamma0: {GammaValue}
        flatness_threshold: {FlatnessThreshold}
        log_target_probabilities: {LogTargetProbabilities}
A simple example:
samplers:
    sams:
        type: SAMSSampler
        mcmc_moves: langevin
        state_update_scheme: global-jump
        flatness_threshold: 2.0
        number_of_iterations: 10000
        gamma0: 10.0
state_update_scheme¶
samplers:
    sams:
        type: SAMSSampler
        mcmc_moves: langevin
        state_update_scheme: global-jump
Determines how SAMS chooses to jump between sampled thermodynamic states. The behavior depends on which scheme is chosen:
- global-jump (default): The sampler can jump to any thermodynamic state (RECOMMENDED)
- restricted-range-jump: The sampler can jump to any thermodynamic state within the specified local neighborhood (EXPERIMENTAL)
- local-jump: Only proposals within the specified neighborhood are considered, but rejection rates may be high
Valid Options: global-jump (others are experimental and disabled for now)
gamma0¶
samplers:
    sams:
        type: SAMSSampler
        mcmc_moves: langevin
        gamma0: 1.0
Controls the rate at which the initial heuristic stage accumulates log weight.
Valid Options (1.0): float > 0
flatness_threshold¶
samplers:
    sams:
        type: SAMSSampler
        mcmc_moves: langevin
        flatness_threshold: 0.2
Controls the fractional log weight that must be accumulated for each thermodynamic state before the weight adjustment scheme switches from the initial heuristic adjustment scheme to the asymptotically optimal scheme.
By default the log target probabilities are all equal, resulting in SAMS attempting to adjust the log weights to equally sample all thermodynamic states.
Valid Options (0.2): float > 0
Online Analysis Parameters¶
YANK’s samplers also support an online free energy analysis framework, which allows running simulations up to some target error in the free energy. Note that the simulation is paused while this analysis runs, and the longer the simulation gets, the slower the analysis becomes. This is available for all samplers.
online_analysis_interval¶
samplers:
    {UserDefinedSamplerName}:
        type: {SamplerOfChoice}
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        online_analysis_interval: 100
Serves as both the toggle for online analysis and the number of iterations between online analysis operations. Every interval, the Multistate Bennett Acceptance Ratio estimate for the free energy is calculated and its error is computed. Some data is preserved each iteration to speed up future calculations, but this operation will still slow down as more iterations are added. We recommend choosing an interval of at least 100, if not more.
If set to checkpoint, then the online analysis is run every checkpoint_interval. If set to null, then online analysis is not run.
Valid Options (checkpoint): checkpoint, null, or <Int >= 1>
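The three kinds of value accepted by this option can be summarized with a short sketch. It is an illustration only: the function name `resolve_online_analysis_interval` is hypothetical, and `checkpoint_interval` is passed in as a plain argument rather than read from a real configuration.

```python
def resolve_online_analysis_interval(setting, checkpoint_interval):
    """Translate the option value into an iteration count, or None if disabled.

    Hypothetical helper; not part of YANK's API.
    """
    if setting is None:
        # null: online analysis is switched off.
        return None
    if setting == "checkpoint":
        # Follow the checkpoint schedule.
        return checkpoint_interval
    if isinstance(setting, int) and setting >= 1:
        return setting
    raise ValueError("expected 'checkpoint', null, or an integer >= 1")


print(resolve_online_analysis_interval("checkpoint", checkpoint_interval=50))
```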
online_analysis_target_error¶
samplers:
    {UserDefinedSamplerName}:
        type: {SamplerOfChoice}
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        online_analysis_target_error: 1.0
The target error for the online analysis, measured in kT per phase. Once the error in the free energy estimate is at or below this value, the phase will be considered complete. This value should be a number greater than 0, even though 0 is a valid option: the error in the free energy estimate between states is never zero except in very rare cases, so your simulation may never converge if you set this to 0.
If online_analysis_interval is null, this option does nothing.
Valid Options (0.0): <Float >= 0>
online_analysis_minimum_iterations¶
samplers:
    {UserDefinedSamplerName}:
        type: {SamplerOfChoice}
        mcmc_moves: {MCMCName}
        number_of_iterations: {NumberOfIterations}
        online_analysis_minimum_iterations: 50
Number of iterations that are skipped at the beginning of the simulation before online analysis is attempted. This is a speed option, since most of the initial iterations will be either equilibration or undersampled. We recommend choosing an initial number that is at least one or two times online_analysis_interval for speed's sake.
This number is only the threshold above which online analysis is run; the iterations at which analysis is performed are still determined by the current iteration modulo online_analysis_interval. E.g. if you have online_analysis_interval: 100 and online_analysis_minimum_iterations: 150, the first online analysis would happen at iteration 200, not iteration 250.
If online_analysis_interval is null, this option does nothing.
Valid Options (200): <Int >= 1>
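The interaction between the interval and the minimum can be sketched as below. This is a minimal illustration of the rule as described in this section, not the sampler's code: the function name `online_analysis_iterations` is hypothetical, and treating an iteration exactly equal to the minimum as eligible is an assumption.

```python
def online_analysis_iterations(n_iterations, interval, minimum_iterations):
    """List the iterations at which online analysis would run (hypothetical helper)."""
    if interval is None:
        # online_analysis_interval: null disables analysis, so the minimum is irrelevant.
        return []
    return [
        iteration
        for iteration in range(1, n_iterations + 1)
        # Assumption: the threshold is inclusive (iteration >= minimum).
        if iteration >= minimum_iterations and iteration % interval == 0
    ]


# Interval 100 with a minimum of 150: the first analysis lands on 200, not 250.
print(online_analysis_iterations(500, interval=100, minimum_iterations=150))
# [200, 300, 400, 500]
```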