

Practicing Metadynamics on the Müller-Brown Potential Energy Surface

1. Introduction to Metadynamics

Based on my interest in molecular dynamics (MD)1, my advisor recommended that I learn the PLUMED2 plugin to deepen my understanding of what MD can do and to become familiar with state-of-the-art enhanced sampling techniques.

After hearing this advice, I did some research and discovered the PLUMED Masterclasses3, which gather experts to explain cutting-edge methods and how to solve challenging problems. In this post, I will explain some parts of Masterclass 9. In a future post I will present some of the challenges in building a state-of-the-art pipeline for preclinical drug testing; that will be our eventual focus. Across several posts I will try to lay out a path toward understanding highly accurate metadynamics-based calculations4 of protein-protein and protein-ligand binding potencies.

As a motivation, I was inspired by recent research on glycine formation. In that work, glycine was hypothesized to form stepwise from smaller molecules and was investigated using exploratory CPMD5 and free energy simulations6. Ab Initio Study of Glycine Formation in the Condensed Phase: Carbon Monoxide, Formaldimine, and Water Are Enough provides the basis for much of this blog post. This study served as my entry point into the field.

Figure 1
Fig. 1 Stepwise reactions observed in the glycine formation study (Francisco Carrascoza et al. 2023). This diagram from the article shows a sequence of reactions (system S3V0) starting from carbon monoxide and formaldimine in water and ultimately forming glycine. (We will not delve into these specific reactions here; Figure 1 is meant to illustrate the kind of complex pathways metadynamics can discover.)

When I came across the lecture on reaction pathways7, I was eager to put this idea into practice, but first I needed to work through some simpler examples.


2. Enhanced‐Sampling Tasks

2.1 Key Concepts

Path-metadynamics (PMD)8 is an enhanced sampling method that simultaneously converges a transition pathway and computes the free-energy profile along that path. It achieves this by performing metadynamics on an adaptive reaction path connecting two stable states – thereby yielding either an average transition path or the minimum free-energy path (MFEP)9, depending on parameters.

To simplify computation and visualization, we will employ a basic model system. We'll use PLUMED's PESMD module standalone, without coupling it to any MD engine10 or full molecular system, so GROMACS won't be required for this task. These enhanced-sampling techniques can be applied to existing studies with minimal modifications, which bodes well for investigations of glycine formation.

We don't want to wait forever for something interesting to happen, so we describe the transition with just a few key descriptive degrees of freedom, commonly called collective variables11 (CVs).

The choice of CVs is a task of great importance; for the glycine system I would have to come up with them myself. In this exercise, however, the free energy surface12 (FES) is fully known and described by a set of \(N\) CVs, \(z_{i}(q)\) with \(i = 1..N\), which are all functions of the system's particle positions \(q(t)\). We will also assume that these particle positions \(q(t)\) evolve in time with a canonical equilibrium distribution13 at temperature \(T\) under a potential \(V(q)\).

For example, in that study the authors explicitly describe several CVs that were biased or restrained during their metadynamics/CPMD runs:

- In the exploration stage, to drive the system from reactants toward glycine, they used the path CVs \(s\) and \(z\): \(s\) measures progress along a reference pathway built from reactant and product structures, while \(z\) is the orthogonal deviation from that path. The distance metric for the path was the distance-mean-square deviation (DMSD) in exploration and the MSD in the FES runs.
- In the free-energy-surface (FES) calculations, the same path CV \(z\) was kept within \(0.14\) Angstrom of the path.
- Cartesian inter-atomic distances between selected atoms (e.g., the two carbon atoms that form the new C-C bond, and the C of CO with the O of a nearby water). This is very similar to the Masterclass example, which uses comparable distances and angles between atoms.
- In the proton-transfer sub-reaction (water to formaldimine), a common motif in molecular dynamics, the \(N-H\) Cartesian distance was measured between the nitrogen of formaldimine and a proton from a water molecule. Only two reference points (neutral and protonated) were needed; the bias was applied directly to this distance.

Figure 2

Fig. 2 Representation of the timescales of the "rare events" that we describe in terms of changes during a chemical reaction. For example, we are interested in the nanosecond scale. Slide from the PLUMED Masterclass 9 presentation.

2.2 Rare Events in MD

In molecular systems, different processes occur on very different experimentally observed timescales, and each regime is painful to simulate with straightforward MD for its own reasons. Each of these timescales gives access to different properties.

- Electronic excitation and non-radiative decay (photo-excitation, internal conversion) occur on a timescale of \(1 - 100\ \text{fs}\) \((10^{-15} - 10^{-13}\ \text{s})\). Nuclear motion is slower than electronic motion, so classical MD cannot follow the electrons and would need sub-fs steps. In the example simulations in my GitHub repository, the timestep \(dt\) is given in picoseconds (\(\text{ps}\)); dt = 0.0002 ps corresponds to \(0.0002\ \text{ps} \times \frac{1000\ \text{fs}}{1\ \text{ps}} = 0.2\ \text{fs}\). For these processes, the key simulation challenges are a full quantum treatment of electrons and nuclei (nonadiabatic dynamics), tiny time steps, and the high cost of excited-state potential energy surfaces. There are mixed quantum-classical techniques for such problems, like surface hopping, but that is not our topic for now.

The last CV was contact control during the FES runs. A coordination-number CV between heavy-atom groups (\(G_{1},G_{2}\)) was used: they restrained or monitored the number of contacts, defined with the standard smooth switching function14 in PLUMED.
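As a side note, a coordination-number CV is built by summing a smooth switching function over atom pairs. Below is a small Python sketch of a rational switching function of the kind PLUMED uses; the cutoff r0 and the exponents are illustrative choices of mine, not the values from the paper.

```python
def rational_switch(r, r0=0.3, d0=0.0, n=6, m=12):
    """Smooth switching function s(r) = (1 - x^n) / (1 - x^m), with x = (r - d0)/r0.

    s(r) ~ 1 for r << r0 (atoms 'in contact') and decays smoothly to 0 for r >> r0,
    so summing it over atom pairs gives a differentiable coordination number.
    """
    x = (r - d0) / r0
    return (1.0 - x ** n) / (1.0 - x ** m)

def coordination_number(distances, **kwargs):
    """Sum the switching function over all G1-G2 pair distances (here in nm)."""
    return sum(rational_switch(r, **kwargs) for r in distances)

# smooth count of close contacts for four hypothetical pair distances
print(coordination_number([0.25, 0.31, 0.45, 0.80]))
```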

For metadynamics we are interested in exploring the space spanned by the CVs \(z_{i}\) and quantifying the free energy. To do so, metadynamics exerts a history-dependent repulsive potential built by summing Gaussian kernels15 over time:

\(V_{bias}(z,t) = \sum_{k \tau <t} H(k \tau) \exp\left( -\sum_{i=1}^{N} \frac{(z_{i}-{z}_{i}(q(k \tau)))^2}{2W^2_{i}} \right)\)
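To make the formula concrete, here is a minimal NumPy sketch (my own illustration, not PLUMED's implementation) that evaluates this history-dependent bias at a point in CV space, given the centres, heights, and widths of the hills deposited so far:

```python
import numpy as np

def metad_bias(z, centers, heights, widths):
    """Evaluate V_bias(z) as a sum of Gaussian hills.

    z        : (N,) point in CV space where the bias is evaluated
    centers  : (K, N) CV values z_i(q(k*tau)) at each past deposition time
    heights  : (K,) hill heights H(k*tau)
    widths   : (N,) Gaussian widths W_i, one per CV
    """
    z = np.asarray(z, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # exponent: sum_i (z_i - z_i(q(k*tau)))^2 / (2 W_i^2), one value per hill
    exponents = np.sum((z - centers) ** 2 / (2.0 * np.asarray(widths) ** 2), axis=1)
    return np.sum(np.asarray(heights) * np.exp(-exponents))

# toy usage: two CVs, three hills of equal height
centers = [[0.0, 0.0], [0.1, 0.05], [0.2, 0.1]]
print(metad_bias([0.15, 0.05], centers, heights=[0.1, 0.1, 0.1], widths=[0.1, 0.1]))
```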

2.3 Gaussian Kernel: Derivation & Interpretation

Let's start with an easy example. Think of a Gaussian kernel as a smooth, bell-shaped hill that you drop at a specific point in space (or along a variable). Here's a breakdown: the "bell curve" shape of a Gaussian is just the familiar bell curve you've seen in statistics class. It's highest at its centre and falls off smoothly as you move away. In many algorithms (like metadynamics, or machine learning methods such as kernel density estimation), we place ("kernel") a little bell-shaped hill wherever the system visits. Each hill nudges the system away from places it's already been, encouraging exploration. Mathematically, in one dimension it looks like $$K(x) = \exp\left( -\frac{(x-x_{0})^2}{2 \sigma^2} \right) $$ where \(x_{0}\) is the centre of the hill and \(\sigma\) controls the width (a larger \(\sigma\) means a fatter, wider hill).
Figure 3
Fig. 3 With the centre almost on the origin (\(x_{0} \approx 0\)) and a small width \(\sigma \approx 0.77\), the Gaussian forms a tall, needle-like peak that falls to near-zero within just a few units of \(x\). This setting illustrates a kernel with very local influence: it strongly weights points right at \(x_{0}\) while rapidly discounting anything farther away.

Figure 4
Fig. 4 Here the bell's centre is moved to \(x_{0} \approx -2.7\) and the width is widened to \(\sigma \approx 2.7\), producing a lower-slope mound that stays above zero across much of the axis. The larger \(\sigma\) spreads the kernel's influence over a wide range, while the shift in \(x_{0}\) simply translates the entire curve without altering its shape.

If we have several variables (let's say \(x\) and \(y\)), we just combine two such bells into one 2D hill $$ K(x,y) = \exp\left( -\frac{(x-x_{0})^2 + (y-y_{0})^2}{2\sigma^2} \right) $$
Figure 5
Fig. 5 The \(3D\) surface shows a smooth, gently rounded Gaussian that tops out near \((x_{0},y_{0}) \approx (-0.14,0.79)\). With a width of \(\sigma \approx 1.3\) the peak is neither needle-thin nor extremely flat, so the kernel still dominates close to its centre but fades only gradually as you move a few units away. The concentric rings in the contour plot confirm this: their spacing widens slowly, signalling a fairly broad region of influence around the red centre point.

Figure 6 Fig. 6 Here the kernel is pushed to \((x_{0}​,y_{0}​)\approx(1.8,−2.0)\) and squeezed to a narrow \(\sigma \approx 0.48\). The \(3D\) view reveals a steep, slender spike whose base occupies just a small patch of the plane. Correspondingly, the contour plot displays only a handful of closely packed rings around the red dot, illustrating how quickly the kernel’s weight plummets: it grants strong emphasis to points very near \((1.8,−2.0)\) and almost none elsewhere.
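For reference, here is a small Python sketch of the 1D and 2D kernels just described; the default centres and widths simply mirror the values quoted in Figs. 3 and 5.

```python
import numpy as np

def gaussian_1d(x, x0=0.0, sigma=0.77):
    """Bell-shaped hill centred at x0 with width sigma (cf. Fig. 3)."""
    return np.exp(-(x - x0) ** 2 / (2.0 * sigma ** 2))

def gaussian_2d(x, y, x0=-0.14, y0=0.79, sigma=1.3):
    """Isotropic 2D hill centred at (x0, y0) (cf. Fig. 5)."""
    return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))

x = np.linspace(-5, 5, 11)
print(gaussian_1d(x))            # tall, needle-like peak near the origin
print(gaussian_2d(0.0, 0.0))     # value of the broader 2D hill evaluated at the origin
```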

The hill is infinitely smooth, with no sharp corners, so it won't jerk16 the system around. The bias only affects the nearby region (because it decays exponentially), so we only push the system out of the regions it has already sampled well.

Now when we know what the Gaussian kernel is, we can take the natural next step in the metadynamics formalism: first we recall that in the long-time limit the accumulated bias exactly cancels out the underlying FES. In other words, if you keep dropping these Gaussian hills forever, the sum of all those hills fills in every free‐energy well so that $$ \lim_{ t \to \infty } V_{\text{bias}} (z,t) = -F(z) + const. $$ where \(F(z)\) is the true free energy as a function of the collective variables \(z\)

So, putting it all together:

1. Deposit Gaussians every \(\tau\) time units at the current CVs \(z(q(k\tau))\), each with height \(H\) and widths \(W_{i}\).
2. Bias builds up because you avoid revisiting already-sampled regions - the system is "pushed" into new territory.
3. A free-energy estimate emerges in the long-time limit, since $$V_{bias}(z,t) = \sum_{k \tau <t} H(k \tau) \exp\left( -\sum_{i=1}^{N} \frac{(z_{i}-{z}_{i}(q(k \tau)))^2}{2W^2_{i}} \right) \to -F(z)$$

In standard metadynamics, all hills have the same height \(H\). In well-tempered metadynamics21, one lets the effective hill height decay as more bias accumulates: $$H_{k} = H_{0} \exp\left( - \frac{V_{\text{bias}}(z(q(k\tau)),k\tau)}{k_{B}\Delta T} \right), $$ so that deep wells get filled more slowly. The resulting bias converges to a fraction of the true free energy: $$\lim_{ t \to \infty } V_{\text{bias}}(z,t) = - \frac{\Delta T}{T+\Delta T}F(z) + \text{const.} $$ so by choosing the "bias temperature" \(\Delta T\), you control the trade-off between exploration speed and accuracy of \(F(z)\). Metadynamics also has several extensions to accelerate convergence and improve computational performance, such as the multiple-walker22 and the aforementioned well-tempered variants.
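As a toy illustration of how the well-tempered hill height decays (my own sketch in reduced units with \(k_{B}=1\), not the PLUMED implementation), consider depositing hills along a fake 1D CV trajectory:

```python
import numpy as np

def well_tempered_height(H0, bias_at_center, delta_T, kB=1.0):
    """Effective hill height H_k = H0 * exp(-V_bias(z_k) / (kB * DeltaT))."""
    return H0 * np.exp(-bias_at_center / (kB * delta_T))

# toy deposition loop on a single CV trajectory z(k*tau)
H0, delta_T, width = 0.1, 9.0, 0.1           # DeltaT plays the role of the bias temperature
centers, heights = [], []
trajectory = np.random.default_rng(1).normal(0.0, 0.2, size=200)  # fake CV samples

for z_k in trajectory:
    # bias already accumulated at the new hill centre
    v = sum(h * np.exp(-(z_k - c) ** 2 / (2 * width ** 2))
            for c, h in zip(centers, heights))
    heights.append(well_tempered_height(H0, v, delta_T))
    centers.append(z_k)

print(f"first hill height: {heights[0]:.3f}, last: {heights[-1]:.3f}")  # heights decay over time
```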

Metadynamics can handle a few CVs simultaneously in a trivial manner, which spares us from having to find a single perfect CV. Instead, an appropriate set of CVs allows us to converge an insightful multidimensional free energy landscape, in which (meta)stable states and connecting MFEPs can be identified. In practice however, especially when investigating complex transitions, the number of CVs is limited to \(N \approx 3\), because of the exponential growth of computational cost with CV dimensionality.

For glycine that would not be enough; we need to find another way.

2.4 Path‐Metadynamics

In order to tackle complex transitions that require many CVs, path-based and related methods were developed (parallel-tempering metadynamics, bias-exchange metadynamics, transition-tempered metadynamics, infrequent metadynamics, funnel metadynamics, adaptive-Gaussian metadynamics, on-the-fly probability enhanced sampling).

For example, suppose we want to model an SN2 nucleophilic substitution in solution. The key chemical events are concerted bond formation to the nucleophile (Nu) and bond breaking from the leaving group (LG), plus solvent reorganization around both. A typical CV set (\(N \approx 5-8\)) for that reaction would be:

1. \((d_{C})_{LG}\), the distance between the central carbon and LG
2. \((d_{C})_{NU}\), the distance between the central carbon and Nu
3. \(CN_{LG}\), the coordination number of solvent oxygens around LG
4. \(CN_{NU}\), the coordination number of solvent oxygens around Nu
5. \(\theta_{LG-C-Nu}\), the Nu-C-LG angle (inversion coordinate)
6. A solvent-orientation collective coordinate (e.g. the projection of the water dipole along the reaction axis)
7. Optional dihedral(s) if the Nu or LG has internal rotors that gate the approach

All of these CVs are needed for SN2: the barrier and mechanism depend not just on the two bond distances but also on how the solvent shell reorganizes to stabilize the charge build-up on LG and Nu in the transition state.

Figure 7
Fig. 7 An early, "reactant-side" snapshot of the \(SN2\) system just after the nucleophile (blue) begins its approach. In the \(3D\) view the \(Nu - C\) distance is still long, while the \(C - LG\) bond (red) remains intact. The bar chart of collective variables (\(CVs\)) shows moderate solvent re-orientation and only partial solvent coordination around both \(Nu\) and \(LG\), consistent with an early stage. In the free-energy profile the marker sits near \(\sigma \approx 0.3\), on the rising limb that leads to the transition state, and the slider beneath confirms that reaction progress is still low (\(\approx 3\)%).
Figure 8
Fig. 8 Here the nucleophile has almost completed the substitution: the \(Nu - C\) bond is short, the \(LG - C\) bond is greatly elongated, and solvent molecules have reorganised to stabilise the emerging charges. The \(CV\) bars reflect this shift - large coordination to \(Nu\), diminished coordination to \(LG\), and an increased solvent-orientation value. On the free-energy plot the marker now lies far to the right (\(\sigma \approx 0.94\)) on the downhill path from the transition state toward products, matching the reaction-progress slider that reads \(\approx 94\)%.

In these schemes, a path-CV is introduced. A path is simply a continuous, one-dimensional curve through the high-dimensional CV space that links two end states \(A\) and \(B\). Concretely, a path is a parametrized mapping \(s(\sigma) : [0,1] \to R^N\) such that \(s(0)\) sits in basin \(A\) (reactant or initial state) and \(s(1)\) sits in basin \(B\) (product or final state).

Then we need an initial guess ("scheme"), which can come from:

1. Linear interpolation in CV-space: simply draw a straight line between the CV-vectors of A and B and discretize it into equal segments.
2. A string-method guess, i.e. a rough preliminary path from a coarse string-method run or from a previous metadynamics run.
3. The Nudged Elastic Band17 (NEB), a chain of replicas (images) connected by springs; you can take those images as your initial path nodes.
4. Committor-based sampling18: run short bursts of unbiased MD started along some CV-interpolated pathway and stitch together the average transition points to form a path.

Once we have picked an initial path \(s^0(\sigma)\), path-metadynamics iteratively "bends" it toward regions of high transition flux19 by sampling points \(z_{k}\) near the current path via a metadynamics bias on the progress coordinate \(\sigma(z)\), computing the average positions of those samples in each \(\sigma\) slice, and updating the discrete path nodes toward those averages (while reparametrizing to keep the nodes equidistant).

Why bother with all of this? The answer is clear: this technique collapses an \(N\)-dimensional problem onto a single coordinate \(\sigma\) along the path!

We now don't need to fill an exponentially large grid in CV-space23, only to refine a one dimensional curve. In the narrow “tube” limit, the same machinery recovers the minimum free-energy path (MFEP); in the wider limit, you get the average reactive flux path.

This progress parameter emulates the reaction coordinate and can in principle be connected to the committor value. Since the transition path curve is not known a priori, the path-CV must be adaptive. If we assume that, in the vicinity of the path, the iso-committor planes20 \(S_{\sigma}\) are perpendicular to \(s(\sigma)\), and that the configurational probability is a good indicator of the transition flux density, then we can converge the average transition path by iteratively adapting a guess path to the cumulative average density: \(s_{g}(\sigma_{g}) = \langle z \rangle_{\sigma_{g}}\). We slice our MD samples into "bins" according to their current value of the progress coordinate \(\sigma(z)\); for each bin (labelled \(\sigma_{g}\)) we compute the arithmetic mean of the z-vectors in that bin, \(\langle z \rangle_{\sigma_{g}}\), and then move the path node \(s_{g}(\sigma_{g})\) to that mean position.

As noted above, the key advantage of PMD is that the metadynamics biasing can be performed on the \(1D\) progress parameter along the adaptive path-CV instead of in the \(N\)-dimensional CV-space. PMD is able to converge a path, and the free energy along it, with a sublinear rise in computational cost with respect to the growth in CV dimensionality (instead of the exponential scaling without a path-CV).

How is it calculated? Numerically, the adaptive path-CV is implemented as a set of \(M\) ordered nodes (coordinates in CV-space) \(s_{g} (\sigma_{g}, t) \to s_{j}^{t_{i}}\) with \(j = 1, \dots,M\), where \(t_{i}\) is the discrete time of path updates. The projection of any point \(z\) onto the path is done considering the closest two nodes, and \(\sigma\) is obtained by interpolation. This approach requires equidistant nodes, which are imposed by a reparameterization step after each path update. The update step for the path nodes is $$s_{j}^{t_{i+1}} = s_{j}^{t_{i}} + \frac{\sum_{k} \xi^{t_{i}-k} w_{j,k}\,\left(z_{k}-s^{t_{i}}(\sigma(z_{k}))\right)}{\sum_{k} \xi^{t_{i}-k}w_{j,k}} $$ with weights $$ w_{j,k} = \max \left[0,\; 1- \frac{\lVert s_{j}^{t_{i}}-s^{t_{i}}(\sigma(z_{k})) \rVert }{\lVert s_{j}^{t_{i}}-s^{t_{i}}_{j+1}\rVert}\right]$$ where \(k\) is the current MD step and \(w_{j,k}\) is the weight of the adjustment, which is non-zero only for the two nodes closest to the average CV density.

This is very hard to unpack at first glance, but I will try. Let's go through the numerical update of the path nodes \(s_{j}^{t_{i}} \to s_{j}^{t_{i+1}}\) in a series of clear steps.

  1. Discretize the path into nodes. At update time \(t_{i}\), the continuous path \(s(\sigma)\) is represented by \(M\) discrete nodes \(\{s_{1}^{t_{i}},s_{2}^{t_{i}},\dots,s_{M}^{t_{i}}\}\). These are spaced evenly in the \(\sigma\)-parameter (after the previous reparameterization).
  2. Collect MD samples \(\{z_{k}\}\) since the last update. As the MD runs (with metadynamics bias on \(\sigma\)), you accumulate configurations \(z_{k}\) at each MD step \(k\).
  3. Project each sample onto the current path. For each sample \(z_{k}\), we find its projection \(s^{t_{i}}(\sigma(z_{k}))\) onto the discrete path (by locating the two nearest nodes and linearly interpolating). We then compute the vector displacement \(\Delta z_{k} = z_{k} - s^{t_{i}}(\sigma(z_{k}))\).
  4. Compute the local weight \(w_{j,k}\) of sample \(k\) on node \(j\). Only the two nodes bracketing the projection get nonzero weight. For node \(j\), $$ w_{j,k} = \max \left[0,\; 1- \frac{\lVert s_{j}^{t_{i}}-s^{t_{i}}(\sigma(z_{k})) \rVert }{\lVert s_{j}^{t_{i}}-s^{t_{i}}_{j+1}\rVert}\right]$$ Geometrically, this tapers linearly from 1 (if the sample projects exactly onto node \(j\)) down to 0 at the next node.
  5. Apply a "fade" factor for older samples. To let the path "forget" old history, each past sample at MD step \(k\) is further multiplied by $$ \xi^{t_{i}-k}, \xi = e^{-\ln(2)/\tau_{1/2}}$$ so that samples older than one half-life \(\tau_{1 / 2}\) count half as much.
  6. Accumulate weighted displacements and normalize. For each node \(j\), build the numerator and denominator $$ \text{numerator}_{j} = \sum_{k}\xi^{t_{i}-k}w_{j,k}\,\Delta z_{k}, \qquad \text{denominator}_{j} = \sum_{k} \xi^{t_{i}-k}w_{j,k}$$
  7. Update the node position. Shift each node by the average weighted displacement: $$ s_{j}^{t_{i+1}} = s_{j}^{t_{i}} + \frac{\text{numerator}_{j}}{\text{denominator}_{j}}$$

Finally, after those steps, the \(M\) updated nodes are "re-evened out" so that they again sit at equal \(\sigma\)-intervals along the new curve \(s(\sigma)\). In this way the discrete path steadily bends toward the true, flux-weighted transition tube.
Figure 9
Fig. 9 One adaptive-update step in a toy \(2D\) CV space. Blue crosses mark the MD configurations gathered since the last update; the blue line with dots is the original straight-line guess connecting reactant and product. Applying the node-update rule pulls each node toward the weighted mean of the samples in its neighbourhood, yielding the orange path. Notice how every node has shifted slightly upward, exactly where the samples are denser, illustrating how path-metadynamics bends an initial guess toward the true, flux-weighted transition tube while keeping the nodes evenly spaced along the progress coordinate σ.
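To make the update cycle tangible, here is a compact NumPy toy of steps 1-7 plus the reparameterization for a 2D CV space. It is my own simplified sketch (for instance, the endpoints are left free and the taper weights are simplified), not the actual PLUMED adaptive-path code:

```python
import numpy as np

def project_on_path(z, nodes):
    """Project point z onto the piecewise-linear path; return (segment index, fraction, projection)."""
    best = (None, 0.0, None, np.inf)
    for j in range(len(nodes) - 1):
        a, b = nodes[j], nodes[j + 1]
        t = np.clip(np.dot(z - a, b - a) / np.dot(b - a, b - a), 0.0, 1.0)
        p = a + t * (b - a)
        d = np.linalg.norm(z - p)
        if d < best[3]:
            best = (j, t, p, d)
    return best[:3]

def reparameterize(nodes):
    """Redistribute the nodes equidistantly along the updated curve."""
    seg = np.linalg.norm(np.diff(nodes, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, arc[-1], len(nodes))
    return np.array([np.interp(targets, arc, nodes[:, d]) for d in range(nodes.shape[1])]).T

def update_path(nodes, samples, steps, t_now, half_life=1000.0):
    """One adaptive-path update: weighted average displacement per node, then reparameterize."""
    xi = np.exp(-np.log(2.0) / half_life)            # fade factor per MD step
    num = np.zeros_like(nodes)
    den = np.zeros(len(nodes))
    for z, k in zip(samples, steps):
        j, t, proj = project_on_path(z, nodes)
        dz = z - proj                                 # displacement off the path
        fade = xi ** (t_now - k)                      # older samples count less
        for node_idx, w in ((j, 1.0 - t), (j + 1, t)):  # linear taper to the two bracketing nodes
            num[node_idx] += fade * w * dz
            den[node_idx] += fade * w
    moved = nodes + np.where(den[:, None] > 0, num / np.maximum(den, 1e-12)[:, None], 0.0)
    return reparameterize(moved)

# toy usage: a straight 2-CV path is bent toward samples lying above it
path = np.column_stack([np.linspace(0, 1, 6), np.zeros(6)])
samples = np.column_stack([np.random.rand(50), 0.2 + 0.05 * np.random.randn(50)])
print(update_path(path, samples, steps=np.arange(50), t_now=50).round(2))
```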

Additionally, we can improve the method by placing trailing nodes at both ends of the path to better capture the free-energy basins. For the basic code, the results look like this:
Figure 10
Fig. 10 A straight-ish polyline with no tube potential (or a very soft one): the path cuts straight across contour rings, even wandering over higher-energy regions. The algorithm was density-driven; it found a shortcut that balances flux coming from different directions - good for the average reactive current, but not optimal in free energy.

For the adaptive implementation using the tube mechanism, the results are as follows:
Figure 11
Fig. 11 A smooth arc hugging the blue valley, with the tube potential applied (probably fairly stiff): the path bends around the central barrier, consistently following the lowest-energy lane. The narrow tube forced sampling right onto the curve, so the updates followed the local \(\nabla F\), which recovered the MFEP.

The restraint we add is a harmonic spring on the distance between every instantaneous configuration \(z_{k}\) and its projection onto the current path \(s^{t_{i}}(\sigma(z_{k}))\): $$U_{\text{tube}}(z_{k}) = \frac{1}{2}k_{\text{tube}} \lVert z_{k} - s^{t_{i}}(\sigma(z_{k})) \rVert^2 $$ With a wide tube (small \(k_{\text{tube}}\)), samples are free to stray a fair distance from the path. The path-update rule therefore averages over a broad density of configurations, and you converge to the average transition path (sometimes called the reactive flux tube). With a narrow tube (large \(k_{\text{tube}}\), "infinitesimally narrow" in the limit), sampling is confined to a thin corridor around the current curve. The only points that matter sit very close to the path, so the update step is dominated by the local free-energy gradient rather than by a global density average. In this limit the algorithm chases the minimum-free-energy path (MFEP) - the steepest-descent trajectory that traces the valley floor between A and B.

So because we can tune \(k_{\text{tube}}\) we have a knob that continuously interpolates between “density-driven” (average path) and “gradient-driven” (MFEP) behaviour.
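As a sketch, that knob is literally just the spring constant of a harmonic penalty on the off-path distance (the numbers below are illustrative):

```python
import numpy as np

def tube_energy(z, path_projection, k_tube=100.0):
    """Harmonic tube restraint U = 1/2 * k_tube * ||z - s(sigma(z))||^2.

    Small k_tube -> wide tube -> density-driven average path;
    large k_tube -> narrow tube -> gradient-driven MFEP limit.
    """
    d = np.linalg.norm(np.asarray(z, dtype=float) - np.asarray(path_projection, dtype=float))
    return 0.5 * k_tube * d ** 2

# the same off-path excursion is penalised 10x harder in a 10x stiffer tube
print(tube_energy([0.3, 0.4], [0.3, 0.1], k_tube=10.0))    # 0.45
print(tube_energy([0.3, 0.4], [0.3, 0.1], k_tube=100.0))   # 4.5
```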

Then we would work on the trailing nodes at both ends. Those are just extra, flexible points tacked onto the fixed endpoints in basins A and B. Because the metadynamics bias keeps pushing the system out of the wells, the trailing nodes can slide down the funnel and settle exactly where the free-energy valley floor is, giving the main section of the path a cleaner launch/landing zone. Without them, the first movable node would often sit too high up the wall of the basin and the path would start with an awkward kink. Something we can indeed spot in the first plot but not in the second.

Metadynamics then steps in as the workhorse that breathes life into the adaptive path-CV: by depositing a time-dependent series of Gaussian hills directly on the progress coordinate \(\sigma\) , it repeatedly pushes the system back and forth along the curve, ensuring that every segment of the transition tube is visited many times.

Because each new hill partially overwrites the ones laid down earlier, the bias continually “self-heals,” and in the long-time limit the accumulated potential converges to the negative of the true one-dimensional free-energy profile, \(V_{\text{bias}}(\sigma, t\to \infty) = -F(\sigma)\) . In practice this means that a single PMD run yields both the optimal pathway (average path or MFEP, depending on how tight you make the tube) and the quantitative free-energy barrier along it.

All of the usual metadynamics refinements - well-tempered damping of hill heights, transition-tempered focusing on the barrier region, multiple-walker parallelisation - plug in without modification, and one can even swap metadynamics for umbrella sampling24, steered MD25, or perpendicular biasing whenever a particular segment of the mechanism demands finer control.

Thus the path algorithm and metadynamics form a self-consistent loop: metadynamics drives exploration and reconstructs the free energy, the path update bends the curve toward the physically relevant channel, and the process iterates until both geometry and energetics have converged.

The whole output now looks like this:
Figure 12
Fig. 12 Left: the final path (black) connecting basins A and B winds smoothly along the valley floor of the \(2D\) free-energy surface (contours and colour map). Green and red dots are the flexible trailing nodes that sit inside the basins, giving the main section a kink-free launch and landing.
Right: a time-trace of the path evolution: every faint poly-line is one intermediate node set during successive updates. Beginning with a crude straight guess, the nodes first fan out within the wide harmonic tube, then tighten and slide downhill as the tube is narrowed, ultimately collapsing onto the single, high-flux channel shown on the left. The two panels together illustrate how the spring-tube restraint and iterative updates steer the curve from a density-dominated average toward the minimum-free-energy (MFEP) trajectory.

Alright, that concludes our introduction to the topic. Let’s now move on to the examples presented in the PLUMED Masterclass.

2.5 Müller–Brown Potential Exploration

Figure 13 Fig. 13 Müller–Brown potential energy surface with contour shading reveals the rugged two–dimensional landscape used in the exercise: two deep minima (dark purple) sit in the lower-right and upper-left, separated by a broad saddle and flanked by a shallow mid-basin. Energies rise steeply toward the yellow plateau at the far right, creating \(\approx 5k_{B}T\) barriers between wells at the chosen simulation temperature.

In our first exercise, we ran a straightforward MD simulation on the classic Müller–Brown (MB)26 potential energy surface (PES), using PLUMED's pesmd engine. Below is a snapshot of our sampling:
Figure 14
Fig. 14 Plain-MD sampling footprint: the green trajectory remains trapped in the lower-right well, oscillating within a tight oval around \((x \approx 0.6, y \approx 0)\). No excursions reach the neighbouring basins or the ridge: with T = 0.5 and friction \(\gamma = 0.1\), the system never acquires enough thermal energy to surmount the Müller-Brown barriers.

We can see that the sampling is confined. The trajectory (green line) sits tightly around one minimum (around \(cv.x \approx 0.6\), \(cv.y \approx 0\)). You never see it wander into the upper-left basin or cross the ridge to the right; it's "stuck" in that well. We can also see that the cloud of points is narrow (\(\Delta x, \Delta y \leq 0.05\)), reflecting the low temperature \((T = 0.5)\) and moderate friction \((\gamma = 0.1)\): the thermal fluctuations are relatively small. With these plain MD settings the system doesn't gain enough energy to hop the \(5 k_{B}T\) barrier to the other wells.
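For readers who want to reproduce this behaviour without pesmd, here is a self-contained sketch of overdamped Langevin dynamics on the Müller-Brown surface. The potential parameters are the standard literature values; the 0.05 scaling factor, timestep, and run length are my own choices to mimic the feel of the exercise rather than its exact settings:

```python
import numpy as np

# Müller-Brown parameters (standard literature values), scaled down so the barriers
# are only a few kBT at T = 0.5 (the 0.05 factor is my own illustrative choice)
A  = np.array([-200.0, -100.0, -170.0, 15.0]) * 0.05
a  = np.array([-1.0, -1.0, -6.5, 0.7])
b  = np.array([0.0, 0.0, 11.0, 0.6])
c  = np.array([-10.0, -10.0, -6.5, 0.7])
X0 = np.array([1.0, 0.0, -0.5, -1.0])
Y0 = np.array([0.0, 0.5, 1.5, 1.0])

def mb_potential(x, y):
    dx, dy = x - X0, y - Y0
    return np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2))

def mb_gradient(x, y):
    dx, dy = x - X0, y - Y0
    e = A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)
    return np.array([np.sum(e * (2 * a * dx + b * dy)),
                     np.sum(e * (b * dx + 2 * c * dy))])

def langevin(pos, nstep=500_000, dt=1e-4, T=0.5, gamma=0.1, seed=1):
    """Overdamped Langevin (Euler-Maruyama): dq = -grad(V)/gamma dt + sqrt(2 kBT dt/gamma) dW."""
    rng = np.random.default_rng(seed)
    noise = np.sqrt(2.0 * T * dt / gamma)
    q = np.array(pos, dtype=float)
    traj = np.empty((nstep, 2))
    for i in range(nstep):
        q += -mb_gradient(*q) * dt / gamma + noise * rng.standard_normal(2)
        traj[i] = q
    return traj

traj = langevin([0.6, 0.0], nstep=50_000)      # start in the lower-right well
print(traj.mean(axis=0), traj.std(axis=0))     # stays near the starting well for these settings
```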

Figure 15 Fig. 15 Time traces of the collective variables. Purple \((cv.x)\) and teal \((cv.y)\) traces fluctuate around fixed mean values with amplitudes \(< 0.05\), confirming the visual impression from Fig. 14. Over half-a-million integration steps the coordinates stay confined, illustrating the fundamental limitation of plain MD on rough landscapes: without biasing, rare barrier-crossing events are exceedingly unlikely on affordable time scales.

2.5.1 Why Plain MD Gets “Stuck”?

(following the Müller-Brown exercise from the PLUMED Masterclass) The MB surface has multiple minima separated by significant barriers. In plain Langevin dynamics, the temperature controls the average kinetic energy, friction damps motion (making long "ballistic" hops rare), and the timestep and simulation length merely scale how many small fluctuations you collect. At \(T=0.5\), the barrier height of \(\approx5k_{B}T\) is prohibitive: crossings are rare events on MD timescales. If we want more exploration without invoking metadynamics or a path-CV, we can tweak the MD "knobs":

| Parameter | Effect | Suggested change |
| --- | --- | --- |
| Temperature | Increases barrier hops | Raise \(T\) to \(1.0\) or \(1.5\) |
| Friction | Reduces damping | Lower \(\gamma\) to \(0.01-0.05\) |
| Timestep | Speeds up exploration | Increase \(\Delta t\) to \(0.002-0.005\) ps |
| Simulation length | More sampling | Double or triple nstep |
| Random seed / start | Different noise / basin | Change idum, posx_init |
By increasing the temperature or reducing friction, the particle gains more kinetic energy and fewer dissipative losses27 - making barrier crossings more probable. Extending the run length or using a larger timestep further improves the odds of hopping to other minima.
These modifications are reflected in the input file.
nstep 5000000 ; # (was 500000)
temperature 1.5 ; # (was 0.5)
friction 0.05 ; # (was 0.1) 
Δt 0.002 ; # (slightly increased from 0.001) 
... (remaining inputs unchanged) ...

Figure 16
Fig. 16 "Pushed" plain-MD still hugs the same well: even after cranking the knobs \((T = 1.5, \gamma = 0.05, 10\times \text{longer run})\), the cyan trajectory merely fattens its oval around the lower-right Müller-Brown minimum; no points reach the mid-basin or the upper-left well. The hotter, less damped particle explores a wider patch of the valley floor, but the \(\approx 5k_{B}T\) ridge remains insurmountable without bias.

Figure 17
Fig. 17 Time traces confirm localisation. \(Cv.x\) (purple) now wanders by \(\pm 0.15\) and \(Cv.y\) (green) by \(\pm 0.5\) - roughly double the earlier amplitudes - yet both signals stay centred near their original means for five million steps. The absence of level shifts or large excursions underscores the lesson: simply raising the temperature or extending the simulation time yields diminishing returns against multi-kilojoule barriers28, highlighting the need for dedicated biasing techniques.

2.5.2 New basin escape approach

(following the Müller-Brown exercise from the PLUMED Masterclass) Well, we've learned something here: plain MD - even when you push it - struggles to overcome free-energy barriers that are several \(k_{B}T\) high. What I have demonstrated is that extending the simulation time or raising the temperature yields diminishing returns and can distort the physics29 if you go too far. To efficiently explore multiple basins, you need biasing methods that "fill in" barriers, rather than hoping for a lucky thermal kick.

The final method involves increasing the temperature tenfold. Let’s examine the results now, along with the corresponding step sizes.

Figure 18 Fig. 18 Barrier-crossing achieved at the price of “over-heating.” Raising the temperature by an order of magnitude unleashes the walker: the turquoise trajectory now shuttles back and forth between the lower-right well and the mid-basin at \((y \approx 1.5)\), repeatedly traversing the Müller–Brown saddle. Although the run finally escapes the original minimum, the path fans out into high-energy (yellow) regions that would be thermally inaccessible at physically realistic T, illustrating how brute-force heating trades accuracy for reach.

Figure 19 Fig. 19 Collective-variable traces reveal erratic, high-T dynamics. The purple \(Cv.x\) and green \(Cv.y\) series exhibit large, abrupt jumps and long dwell periods in two distinct value ranges, confirming frequent basin-to-basin hops. Between these plateaus the signals spike to extreme values, signalling excursions onto steep ridges of the surface. Such jagged, high-amplitude motion underscores the drawback of the temperature-boost strategy: enhanced exploration comes with distorted kinetics and a loss of fidelity to the true free-energy landscape.

2.5.3 Introducing Metadynamics to the problem

(following the Müller-Brown exercise from the PLUMED Masterclass) Let's take this insight forward by adding a simple metadynamics block to our plumed.dat file. Here's a minimal example that builds up bias on \(cv_{x}\) and \(cv_{y}\):

UNITS LENGTH=nm TIME=fs ENERGY=kcal/mol         # keep nm/fs because pesmd expects it

# ---- collective variables ---------------------------------------------------
cv:      DISTANCE ATOMS=1,2 COMPONENTS          # gives cv.x  cv.y  (= the 2-D coords)

# ---- harmonic “walls” to keep us on the grid -------------------------------
LOWER_WALLS ARG=cv.x,cv.y AT={-1.5,-0.5} KAPPA={1000,1000}
UPPER_WALLS ARG=cv.x,cv.y AT={ 1.5, 2.5} KAPPA={1000,1000}

# ---- well-tempered metadynamics --------------------------------------------
metad:  METAD ...
           ARG=cv.x,cv.y           # bias both coordinates
           SIGMA=0.10,0.10         # Gaussian widths      (hint says ≃0.1)
           HEIGHT=0.1              # Gaussian height      (≈1 kBT at T=0.5)
           PACE=500                # deposit every 500 steps
           BIASFACTOR=10           # well-tempered flavour
           FILE=HILLS              # where Gaussians are stored
        ... METAD

# ---- output -----------------------------------------------------------------
PRINT ARG=cv.x,cv.y,metad.bias STRIDE=100 FILE=colvar.out
We may ask at this point: how do we choose the metadynamics parameters? When selecting parameters for metadynamics, a good approach is to match the Gaussian width (SIGMA) to the natural fluctuation scale of your collective variables, which is usually around \(0.05-0.15 \text{ nm}\). If this width is too small, your energy surface becomes fragmented or "pixelated"; if it's too large, you lose important details. The Gaussian height (HEIGHT) should be about 1 \(k_{B}T\) (approximately 0.1 \(\frac{\text{kcal}}{\text{mol}}\) at room temperature), which is gentle enough not to distort your system's kinetics yet effective enough to speed up sampling. Depositing Gaussians every \(0.5 \text{ ps}\) (PACE = 500 steps at \(1 \text{ fs}\) per step) strikes a balance: it is frequent enough to fill energy basins efficiently without excessive overlap. A typical BIASFACTOR of around 10 keeps the simulation "well-tempered", meaning it gradually smooths out the free-energy landscape without losing critical features.

In practice, keep in mind that your Gaussians don't have to be spherical - if your energy landscape has more pronounced features in one direction, consider using different widths for each collective variable. Additionally, aim for Gaussian heights roughly equal to one contour interval of your energy map; you don't want them so small they're negligible, nor so large they overwhelm the basin. For example, during a typical \(5 \text{ ns}\) simulation, you'll deposit around \(10,000\) Gaussians, enough to thoroughly explore a two-dimensional landscape without over-smoothing it.

These parameter choices will help you achieve publication-quality results by accelerating exploration without compromising the fundamental dynamics of your system.

- SIGMA should match the natural fluctuation scale (\(\approx 0.05-0.15\) nm here)
- HEIGHT \(\approx k_{B}T\) biases gently without distorting kinetics
- PACE sets how often you add hills; \(500\) steps \(\approx 0.5\) ps with \(\Delta t = 1\) fs

Here we have the HILLS file, which shows where the Gaussians are located.
Figure 20
Fig. 20 First few HILLS records. The plain-text HILLS file lists every Gaussian that well-tempered metadynamics has deposited so far. For each entry you see the simulation time (steps), the centre in \((Cv.x, Cv.y)\), the widths \(\sigma_{x}, \sigma_{y} (=0.1)\), the Gaussian height (initially \(0.1 \frac{\text{kcal}}{\text{mol}}\)), and the running bias factor. The gradual fall-off of the "height" column confirms that the bias is being tempered as intended.

Next, we need to sum those hills to reconstruct the energy landscape.
Figure 21
Fig. 21 Reconstructing the bias with plumed sum_hills. The terminal log shows PLUMED reading the HILLS file, recognising the two collective variables, and serially adding the stored Gaussians. This command produces a grid-based free-energy surface that can be plotted or queried to monitor convergence.
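Conceptually, sum_hills just re-evaluates all deposited Gaussians on a grid. Here is a hedged Python sketch of the same idea; the column order assumes a 2-CV well-tempered HILLS file, so always check the '#! FIELDS' header of your own file:

```python
import numpy as np

def sum_hills(hills_file, xedges, yedges):
    """Rebuild the negative bias (a free-energy estimate) on a grid by summing the Gaussians.

    Assumed columns for a 2-CV well-tempered run: time, x, y, sigma_x, sigma_y, height, biasfactor.
    """
    data = np.loadtxt(hills_file)
    X, Y = np.meshgrid(xedges, yedges, indexing="ij")
    bias = np.zeros_like(X)
    for _, x0, y0, sx, sy, h, *_ in data:
        bias += h * np.exp(-((X - x0) ** 2) / (2 * sx ** 2) - ((Y - y0) ** 2) / (2 * sy ** 2))
    # in the long-time limit -V_bias tracks F up to a constant (and the well-tempered prefactor)
    return -bias

# usage (hypothetical file name and grid):
# fes = sum_hills("HILLS", np.linspace(-1.5, 1.5, 100), np.linspace(-0.5, 2.5, 100))
```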

The collective variables currently appear as follows. This is promising, as the Gaussians steadily push the system outward in circular patterns away from the minima, as shown in the plot.
Figure 22
Fig. 22 Early metadynamics push-out (\(Cv.x\) only). The purple trace starts around 0.6, oscillates narrowly, then - as Gaussians accumulate - undergoes larger excursions, first to negative values and finally back toward \(+0.9\). The widening envelope illustrates how the growing bias "forces" the system outward in concentric rings, steadily emptying the initial minimum.

The system escapes into the intermediate region. It spends time in both the initial and intermediate states, and after roughly 200,000 steps, it overcomes the bottleneck and reaches the product state. We observe one crossing and one recrossing.

Figure 23 Fig. 23 Two-coordinate time series reveal a successful barrier crossing. Both \(Cv.x\) (purple) and \(Cv.y\) (green) remain confined until \(\approx 2 \times 10^5\) steps, after which they jump to a new plateau that corresponds to the product basin, linger, and eventually recross. The clear plateaus separated by abrupt transitions demonstrate that the bias has lowered the \(\approx 5 k_{B}T\) ridge enough for repeated hopping.

And that’s the result. Figure 24 Fig. 24 Final sampling footprint under metadynamics. Scatter points now populate two elongated, low-energy troughs - the reactant and product valleys - while high-energy regions remain sparsely visited. Compared with the brute-force high-temperature run, metadynamics achieves thorough exploration of the kinetically relevant pathways without driving the system into thermodynamically irrelevant, yellow plateau regions.

What an efficient landscape search, right? It focuses sampling on the low-energy paths. It still scales exponentially with the number of CVs, but with umbrella sampling we'd end up gridding even the regions we're not interested in. Now we need to determine whether we've converged and sampled sufficiently. How can we assess this? One approach is to monitor the relative energy differences over time: even as we continuously add Gaussians, these differences should stabilize. We can compute a running average at each point - or simply evaluate the average once the simulation is complete - to check for convergence.
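One way to script such a check (a sketch with hypothetical inputs, not part of the Masterclass material) is to track the bias difference between two reference points, one per basin, as hills accumulate; the curve flattening out is the convergence signal:

```python
import numpy as np

def delta_f_trace(hills, z_a, z_b):
    """Running estimate of F(B) - F(A) as hills accumulate.

    hills: iterable of (center, sigma, height) for each deposited Gaussian (2-CV case);
    z_a, z_b: representative points in the two basins.  Uses -V_bias(z) as the F estimate.
    """
    bias_a = bias_b = 0.0
    trace = []
    for center, sigma, height in hills:
        c, s = np.asarray(center), np.asarray(sigma)
        bias_a += height * np.exp(-np.sum((np.asarray(z_a) - c) ** 2 / (2 * s ** 2)))
        bias_b += height * np.exp(-np.sum((np.asarray(z_b) - c) ** 2 / (2 * s ** 2)))
        trace.append(bias_a - bias_b)      # = F(B) - F(A) estimate after this hill
    return trace

# toy usage with three hand-written hills deposited near basin A at (0.6, 0.0)
hills = [((0.6, 0.0), (0.1, 0.1), 0.1),
         ((0.5, 0.1), (0.1, 0.1), 0.1),
         ((-0.5, 1.4), (0.1, 0.1), 0.1)]
print(delta_f_trace(hills, z_a=(0.6, 0.0), z_b=(-0.5, 1.5)))
```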

We can employ well-tempered metadynamics to converge the Gaussians dynamically - reducing their heights on the fly as the simulation progresses.

2.5.4 Well-tempered PMD

(following the Müller-Brown exercise from the PLUMED Masterclass)

When we try to get the first path on the map, we should start with the most naïve guess: draw a straight line through the Müller–Brown landscape by linearly interpolating the collective-variable space (Fig. 25). Each black dot is a node; between them the genpath routine simply connects the dots. At this point the path carries no physics - it is only a ruler-straight scaffold on which we will build something smarter.

Figure 25
Fig. 25 Naïve straight-line scaffold. The purple poly-line is nothing more than a linear interpolation between the two Müller-Brown minima. It slices straight across high-energy contour rings - geometrically shortest but thermodynamically absurd. The black dots are the equally spaced nodes that will later be allowed to move.

A quick diagnostic of how the path relaxes is the path-to-configuration distance plotted versus simulation time (Fig. 26). Every time the purple trace spikes, the current configuration has drifted far away from the straight-line guess; every dip means we have slid back onto it. Large, persistent oscillations here are a red flag - the system is telling us that our path fails to track the valley floor.
Figure 26
Fig. 26 Path-to-configuration distance before restraints. Each spike in the purple trace marks a moment when the walker has drifted far from the straight path; dips show brief returns. The large, erratic excursions warn that the initial path does not follow the energetic valley floor and offers no meaningful guidance to the dynamics.

Switching on the tube potential keeps the walker from drifting too far. We wrap the path in a harmonic "tube" and let PLUMED restrain the orthogonal displacement:

tube: RESTRAINT ARG=pcv.z AT=0.0 KAPPA=1000.0
Think of it as a soft pipe encasing the poly-line; configurations are allowed to rattle inside, but not to escape. With the tube active, the nodes (blue dots) begin to bow toward the high-density region (Fig. 28). Because the path is now adaptive, each node shift is weighted by the local population - z_avg.cv.x and z_avg.cv.z - so that crowded slices pull harder than sparse ones. The restraint itself is a parabola in the distance from the path, which keeps the sampled configurations close to where the nodes currently sit.

Figure 27 Fig. 27 First adaptive update table. This excerpt of PLUMED’s node log lists, for every path node, the weighted averages of the configurations that fell into its \(\sigma-\)slice (z_avg.cv.x, z_avg.cv.y) and the cumulative weight (wsum). Near the ends the weights remain zero - little sampling - whereas mid-path nodes have begun to pull toward populated regions, signalling the onset of density-driven bending.

In this method, the z_avg.cv.x and z_avg.cv.z variables are weighted, and during the adaptive-path procedure we update the density and flux using a simple distance/weight algorithm. To let the endpoints “breathe” and avoid an artificial kink where the free-energy basins meet the flexible path segment, we attach 20 trailing and 20 leading nodes that can slide into the minima, while the central 20 path nodes adjust the transition segment (Fig. 28). These extra nodes act like elastic shock-absorbers, anchoring the path smoothly inside each well.

The final setup thus comprises 20 trailing nodes, 20 leading nodes, and 20 core path nodes. Figure 28 Fig. 28 Path after tube restraint and trailing nodes. With a harmonic tube wrapped around it and 20 flexible trailing nodes at each end, the magenta path bows gently into the low-energy gutter. Turquoise sampling points now cling to a narrow ribbon along the curve, indicating that orthogonal wanderings are tamed while the transition segment smoothly links the basins.

If the path behavior becomes unstable (Fig. 29) - “thrashing” erratically between different shapes instead of converging - it usually means the tube restraint was set too wide or the biasing was too aggressive. In such cases, there is effectively a tug-of-war between entropy (the path taking shortcuts) and energy (the path trying to stay in the valley). A chronically jagged \(\sigma(t)\) trace is a clear sign that the Gaussian hills are too large for the landscape or that the path update interval is too short, causing the algorithm to overshoot.

Figure 29 Fig. 29 “Thrashing” diagnostic for an over-aggressive setup. Here the \(\sigma-\)coordinate flips wildly between plateaus and shows high-frequency jitter. Such behaviour means the bias hills or update stride are too large relative to the landscape: the path alternately follows entropic shortcuts and steepest-descent routes without settling, a cue to relax the parameters.

Finally, once the path has stabilized, we can take a slice through the free energy surface along the converged path. In practice, we project the walker's sampled density onto the (\(CV.x\), \(CV.z\)) plane and directly visualize the valley walls carved out by the adaptive path (Fig. 30).
Figure 30
Fig. 30 One-dimensional free-energy slice along the converged path \((t = 500 \text{ ps})\). Projected onto \(\sigma\), the profile now exhibits two \(\approx -200 \frac{\text{kJ}}{\text{mol}}\) wells separated by a \(70-80 \frac{\text{kJ}}{\text{mol}}\) barrier at \(\sigma \approx 0.5\). The deep, smooth trough confirms that the adaptive path tracks the true valley floor rather than cresting ridges.

Based on this, we can integrate to calculate the area under the valleys and quantify the difference - thanks to the trailing nodes.

And how many Gaussians are enough? In well-tempered mode the algorithm needed only 60 hills to fill the basin, cross the saddle, and recross back to the reactant state (Fig. 31). The running average of the bias height plateaus after roughly \(40\) hills, a convenient heuristic that the free-energy difference has converged within \(\approx 1 \frac{\text{ kJ}}{\text{mol}}\). If the running average still slopes after three or four bias half-lives, either widen SIGMA or lower HEIGHT - the simulation is painting the landscape with a brush that is too fine.

Figure 31
Fig. 31 Convergence of the well-tempered bias. Successive reconstructions of the PMD bias after 10, 20, and 60 Gaussians collapse onto one another beyond \(\approx 40\) hills, showing that the free-energy difference has stabilised within \(\approx 1 \frac{\text{kJ}}{\text{mol}}\). The plateau indicates that the chosen SIGMA, HEIGHT, and PACE fill the basins efficiently without over-smoothing, completing the landscape in barely sixty deposits.

Based on that, we can compute a running average to assess convergence. With the path now collapsed onto a single progress coordinate \(\sigma\), the one Gaussian width you specify acquires a very tangible role: it is literally the half-width of each hill along that axis. A sensible rule of thumb is to match it to the natural jiggle of \(\sigma\) during the interval \(\tau\) between hill depositions and to the spacing between adjacent transition nodes (\(\approx 0.5\) here). A wider width blurs neighbouring basins into one another, while a narrower one leaves gaps, so the summed bias becomes ragged and the free-energy estimate noisy.

The purple \(\sigma-\)trace in (Fig. 26) now doubles as a seismograph for the effective diffusion landscape. Long horizontal plateaus at \(\sigma \approx 0\) and \(\sigma \approx 1\) mark the two Müller-Brown minima; once the bias nudges the walker past \(\approx 0.2\) or \(0.8\) it shoots almost ballistically across the mid-section, because the underlying \(2D\) valley is broad and shallow there. In other words, the apparent diffusion coefficient varies with position: it is largest where the valley is wide (centre) and smallest where it narrows into the wells (ends).

Adding the tube restraint sharpens this picture. Even a modest spring constant squeezes the turquoise cloud that surrounds the magenta path (Fig. 27) into a narrow ribbon, forcing configurations to slide along the curve instead of short-cutting through CV space. The transverse spread collapses, \(\sigma\) diffusion slows just a touch, and - critically - the bias converges faster because no Gaussians are wasted off the path. Crank the force constant much beyond \(100 - 200 \frac{\text{kJ}}{\text{mol}}\), however, and you over-quench orthogonal fluctuations; recrossings stall and the whole algorithm becomes sluggish.

With 20 transition nodes bracketed by 20 trailing nodes on each side, the fixed-path metadynamics run produces a free-energy profile that is both smooth and fully converged: two wells at \(\sigma \approx 0\) and \(1\) with depths of about \(- 200 \frac{\text{kJ}}{\text{mol}}\), separated by a \(70-80 \frac{\text{kJ}}{\text{mol}}\) barrier centred at \(\sigma \approx 0.5\). Because the geometry of the path itself is locked, that slice is the exact line integral you would obtain by integrating the full Müller-Brown surface along the magenta poly-line - confirmation that the one-dimensional bias has recovered the thermodynamics of the original \(2D\) system without ever leaving the adaptive loop.

2.5.5 Adaptive PMD

The moment we let the nodes move we step into the world of adaptive PMD (Fig. 32). In this variant the curve is no longer a rigid ruler but a living, breathing object that keeps a time-weighted diary of every configuration it encounters. The single knob that controls that diary is HALFLIFE - the number of MD steps after which the influence of a data point decays to \(50\)%. Put differently, HALFLIFE sets how far into the past the path can “remember”.

A short memory (\(\approx 10^3\) steps) makes the path almost mercurial. It discards mistakes, chases density, and snaps into the first low-energy corridor the bias uncovers. That is perfect for the exploratory phase when you just want the curve to abandon a bad straight-line guess and latch onto any plausible transition tube.

Dial HALFLIFE up to \(\geq 10^4\) steps and the character flips. Now each node behaves like a cautious statistician, averaging over many recrossings before it agrees to shift. The jitter from individual Gaussians is ironed out, and what survives is the mean transition pathway weighted by true flux, not by fleeting noise.

The sweet spot is to combine both moods in a two-stage workflow:

1. Fast scouting. Start with generous Gaussians, a permissive or even absent tube restraint, and HALFLIFE \(\approx 1000\). In a few hundred thousand steps the bias will volley the walker back and forth until both basins are well-trodden and the path has bent into the right valley.
2. Precision carving. Freeze the simulation, read the last set of nodes with INFILE, shrink SIGMA, tighten the tube spring, and raise HALFLIFE by two orders of magnitude. From this point on the nodes drift only when multiple independent excursions agree, while well-tempered metadynamics quietly files the residual bumps in the one-dimensional free-energy profile down to below \(\approx k_{B}T\).

2.5.5.1 Dealing with multiple product basins

Real chemical reactions rarely follow a simple, linear path from state A \(\to\) B. Often, a reaction can end up in multiple distinct final states (for example, two crystallographically different forms of product B). If an adaptive path simulation begins with only one initial guess, it typically diverges early, settling into whichever final state it encounters first. To avoid this limitation, two straightforward methods can be used. First, you can introduce a second reaction coordinate, for instance by running two parallel paths starting from the same reactant but ending in separate product states, or by using an additional coordinate that identifies which final state you're approaching. Alternatively, you can initiate several independent simulations (walkers) on separate GPUs, each slightly biased toward different product states, and afterward merge their resulting data once both have reached convergence.

Either strategy keeps PMD agnostic and lets the free energy, not a lucky first hop, decide which channel prevails.

Figure 32 Fig. 32 First \(38 \text{ ps}\) of adaptive PMD. The magenta poly-line - now free to bend - has already abandoned the straight scaffold and is curling into the dark, low-energy gutter that arcs between the reactant well (large blue \(X\) at \(Cv.x \approx 0.6\) , \(Cv.y \approx 0\)) and the upper-left basin. Because HALFLIFE is still short, each newly visited configuration (turquoise dot) pulls strongly on the nearest node, giving the curve its kinked, exploratory shape. High-energy ridges (yellow) are skirted, while the path hugs the reddish trough, indicating that the algorithm has begun to recognize and follow the true transition tube within a few tens of picoseconds.

2.6 Multiple-Walker Path‐Metadynamics

That topic isn’t a concern for me right now, but I may explore it in the future.
These were all the exercises, accompanied by my explanations and insights from Professor Bernd Ensing, who - with his team - developed these methods and put them into practice.


3. Case Study: Glycine-1-13C

With the Müller–Brown example completed, we are ready to apply the adaptive path approach to a real chemical problem: the stepwise formation of glycine in water, as reported by Francisco Carrascoza et al. 2023. In a forthcoming post, I plan to import the reactant and product snapshots from that study, sketch an initial multi-CV path, and let our two-stage adaptive protocol reveal the molecular choreography - how formaldimine, carbon monoxide, and a proton stitch together into the simplest amino acid (glycine). I will also outline a three-module simulation scheme designed to mirror the radical reaction pathway illustrated in the article.
Figure 33
Fig. 33 Reaction pathway for glycine formation (from Francisco Carrascoza et al. 2023). The reaction starts from carbon monoxide and formaldimine in water and proceeds through several intermediates (not detailed here) to form glycine. (This figure illustrates the kind of multi-step reaction the adaptive path approach would need to capture.)


4. Appendix

Topics encountered while studying Prof. Ensing’s lectures / exercises

4.1 Path Collective Variables (Path-CV) & Reaction Coordinates

  1. Foundations - Cluster model; defining path collective variables - Defining reaction coordinates; different reactant states - Two stable states → path A → B → average transition path; different channels - From A to B in free-energy space: can I say where I am between A & B? (projection)
  2. Algorithmic workflow - Start from a guess path → bias dynamics along the path → move nodes to mean density (transition flux) → maintain equidistant nodes - Expressed in structures: compute RMSD for reactant structures, optimise those structures to refine the path - Method with torsion angles → density sampled → free-energy samples → bottleneck somewhere here
  3. Path-CV mathematics - s = position on the path (Euclidean projection) - z = distance from the path - Distance to mean density; two nodes + trailing nodes
  4. Simulation strategies - Single simulation → reaction path & free-energy profile - Step-wise path - 8 simultaneous simulations, each ≈ 10 ns - Torsion-angle rotation has little barrier → zipper-like mechanism - Leads to 3 factorial pathways - Sub-linear convergence - Repellers simulation: move paths in different directions
  5. Path-CV project tools - Path-map creation - Initialise Path-CV: equilibrate, compute CV-averages at endpoints, build initial path (linear interpolation or previous trajectory) - Choose biasing method (steered MD); enable multiple walker; monitor path & free-energy profile evolution; tube potential; scale Gaussians by hand
  6. Specialised methods - Path metadynamics (σ function; more CVs) - Single-walker PMD, multiple-walker PMD, multi-PMD - MuWaMuPaMetaDyn – Multiple-Walker Multiple-Path Metadynamics

4.2 Enhanced-Sampling & Rare-Event Techniques

  1. General concept - enhanced sampling - "All possible bonds are produced and broken" → explore the full free-energy landscape
  2. Temperature-based acceleration - Raising temperature to boost barrier crossing - Parallel tempering / replica exchange - Temperature-accelerated dynamics (mix low-T & high-T)
  3. Metadynamics family - No metadynamics ▸ plain metadynamics ▸ well-tempered metadynamics - Assessing metadynamics accuracy: decrease hills upon every recrossing for convergence - Flood the landscape with hills; localise minimum FE path, then switch to 1 D to converge further - “Six collective variables is not doable” ▸ biased six CVs (all-at-once vs independent) - 8-dimensional landscape - not just 2-D projections - Lowest free-energy pathway; find hidden profile (reaction coordinates)
  4. Rare-event frameworks - Transition-interface sampling - Rare-event methods (general) - Committor probability: where is the TS? - Halfway between A and B → iso-committor surface → hyperplane 0 → 0.5 → 1.0 - Average transition pathway (in CV space); symmetric channel; painfully expensive
  5. Accuracy & diagnostics - How to calculate the projection? - What if underlying landscape unknown → how to set σ? - Defining diffusivity in metadynamics

4.3 Reaction-Path Sampling & Free-Energy Methods

  1. Finding / sampling a reaction path; SN2 reaction example
  2. Thermodynamic integration
  3. Flooding vs path-based approaches
  4. Reaction channels & paths: good reaction coordinate often hard to define
  5. Rare events in real life; time-scales:
    • Proton transfer, enzymatic reactions, organic reactions, phase transitions, protein folding, polymer relaxation
    • Electronic excitation; intra-molecular vibrations; H₂O rotation in liquid water
  6. Ab-Initio, Force-Fields & Electronic Processes
    • DFT / defining force fields
    • Optimisation strategies in ab initio methods
    • Electron-transfer / proton-transfer cascades → many CVs
    • Force-field simulations using Path-CV
  7. Collective-Variable Selection & Machine Learning
    • “As inputs you need good collective variables.”
    • ML for CV discovery:
      • Find CVs with ML → test with Path-CV → sample → better data
    • Typical CVs: distances within molecule; torsion angles (periodic, high-energy crossings)
    • Ramachandran plot for alanine peptide in vacuum (torsion angles periodic → high-energy crossings)
  8. Simulation Paradigms & Model Hierarchies
    • Quantum methods
    • MD / MC
    • Coarse-grained models
    • Fluid-mechanics & stochastic networks
  9. Practical Tips
    • Transition-interface sampling
    • Mimic environment for single particle (Langevin bath)
      • Two-particle distance; κ = force constant
    • Lennard-Jones units (restrain everything); defining temperature in energy units
    • PLUMED-Nest example: polyproline (Alberto) – ID of the egg
    • Using Gnuplot on Ubuntu – full command syntax for plotting

  1. Molecular dynamics (MD) is a computational simulation technique that models the physical movements of atoms and molecules over time using Newtonian mechanics. It allows researchers to study the structural, thermodynamic, and dynamic properties of molecular systems at the atomic level. 

  2. PLUMED is an open-source plugin that can be interfaced with many molecular dynamics engines (e.g., GROMACS, AMBER, LAMMPS). It provides advanced methods for enhanced sampling, free-energy calculations, and analysis of molecular simulations. 

  3. Masterclasses are intensive workshops or seminars led by experts, designed to deliver in-depth training on specialized topics; here, PLUMED masterclasses focus on advanced sampling methods, free-energy calculations and analysis techniques in molecular dynamics. 

  4. Metadynamics is an enhanced sampling method in molecular dynamics that adds a history-dependent bias to overcome energy barriers, enabling exploration of rare events and accurate free-energy landscapes. 

  5. CPMD (Car–Parrinello Molecular Dynamics) is an ab initio molecular dynamics method that integrates electronic structure calculations (via density functional theory) “on the fly” with nuclear motion, allowing for reactive simulations without predefined force fields. 

  6. Free energy simulations are computational techniques (e.g., umbrella sampling, metadynamics, thermodynamic integration) that estimate the free energy differences between molecular states or along reaction coordinates, providing quantitative insights into reaction mechanisms and stabilities. 

  7. Reaction pathways are sequences of molecular configurations that connect reactants to products through a potential energy surface, often identified by minimum‐energy paths or transition states, and used to understand the mechanism and kinetics of chemical reactions. 

  8. Path-metadynamics (PMD) is an enhanced-sampling method that applies a metadynamics bias to the progress coordinate along a path connecting two states, while the path itself is iteratively updated, allowing the system to simultaneously optimize the transition pathway and sample the free energy along it. 

  9. Minimum free-energy path (MFEP) is the trajectory on the multidimensional free-energy surface that requires the least work to move from reactants to products, often passing through saddle points (transition states). 

  10. MD engine (molecular dynamics engine) is a software package that numerically integrates Newton’s equations of motion for atoms and molecules - handling force evaluations, neighbor lists, thermostats/barostats, and trajectories - to simulate the time evolution of molecular systems. 

  11. Collective variables (CVs) are low-dimensional descriptors - functions of the full atomic coordinates - that capture the essential progress of a molecular process (e.g., a bond distance, an angle, or a combination thereof), allowing enhanced-sampling methods to focus on the most relevant motions. 

  12. Free energy surface (FES) is a multidimensional landscape defined over collective variables that assigns to each point \(z\) the free energy \(F(z) = -k_BT \ln P(z)\), where \(P(z)\) is the probability of observing the system at \(z\). It summarizes the thermodynamic stability and possible transition pathways between states. 

  13. Canonical equilibrium distribution describes the probability density \(P(q) = \frac{1}{Z} e^{-\beta V(q)},\)where \(V(q)\) is the potential energy, \(\beta = 1/(k_B T)\), \(k_B\) is Boltzmann’s constant, \(T\) the temperature, and \(Z\) the partition function ensuring normalization. 

  14. Standard smooth switching function is a continuous, differentiable function
    \(s(r) = \frac{1-\left( \frac{r-r_{0}}{r_{c}-r_{0}} \right)^n}{1-\left( \frac{r-r_{0}}{r_{c}-r_{0}} \right)^m}\) used to count contacts by smoothly switching from “on” (\(s \approx 1\)) when the interatomic distance \(r\) is at or below \(r_0\) to “off” as \(r\) grows past the cutoff \(r_c\), with the exponents \(n<m\) controlling the sharpness of the switch. 

  15. Gaussian kernels in metadynamics are bias potentials of the form \(G(z; t') = w \exp\!\bigl(-\tfrac{\|z - z(t')\|^2}{2\sigma^2}\bigr),\) where \(w\) is the height, \(\sigma\) the width of each Gaussian deposited at past CV value \(z(t')\), ensuring smooth, localized biasing of the free-energy surface. 

  16. Jerk in the context of molecular dynamics refers to the rate of change of acceleration (third derivative of position with respect to time), quantifying sudden changes in forces that can introduce numerical instability. 

  17. Nudged elastic band is a method for finding saddle points and minimum energy paths between known reactants and products. The method works by optimizing a number of intermediate images along the reaction path. Each image finds the lowest energy possible while maintaining equal spacing to neighbouring images. This constrained optimization is done by adding spring forces along the band between images and by projecting out the component of the force due to the potential perpendicular to the band. 

  18. Committor-based sampling relies on the committor function \(p_{B}(q)\), the probability that a configuration \(q\) will reach product state \(B\) before reactant state \(A\). By launching multiple short unbiased MD trajectories from points along a trial path and estimating their committor values, one identifies configurations with \(p_{B}(q) \approx 0.5\) (transition states) to refine the reaction pathway. 

  19. High transition flux refers to regions along the progress coordinate where the rate (or probability per unit time) of reactive trajectories crossing a given hypersurface is maximal, indicating the most probable transition pathways between metastable states. 

  20. Iso-committor planes are hypersurfaces in configuration space where the committor function \(p_{B}(q)\) (the probability of reaching product before reactant) is constant. They are orthogonal to the ideal reaction coordinate and partition configurations by equal “commitment” to product or reactant basins. 

  21. Well-tempered metadynamics is a variant in which the deposited bias height at time \(t\) scales as
    \(H_{\text{eff}}(t) = H\exp\left( -\frac{V(s,t)}{k_{B}\Delta T} \right)\) where \(V(s,t)\) is the bias already added at collective variable \(s\), \(\Delta T\) is a “bias-temperature” parameter, and \(k_{B}\) is Boltzmann’s constant - ensuring gradual convergence of the free-energy estimate. 

  22. Multiple-walker metadynamics is an approach where several independent simulations (“walkers”) run in parallel, each depositing bias on a shared global history-dependent potential. This coordination speeds up exploration of the free-energy surface by combining sampling efforts across all walkers. 

  23. Grid in CV-space refers to a discretized mesh covering the high-dimensional space of collective variables (CVs), where each grid point corresponds to a specific set of CV values. Sampling or biasing on this full grid becomes infeasible as the number of CVs grows, due to the “curse of dimensionality.” 

  24. Umbrella sampling is a biased-simulation technique where the system is restrained to sample overlapping “windows” along a chosen reaction coordinate by applying harmonic potentials (umbrellas). By combining the histograms from each window via the Weighted Histogram Analysis Method (WHAM) or similar, one reconstructs the unbiased free-energy profile. 

  25. Steered MD (SMD) is a method where an external time-dependent bias (e.g., a harmonic spring moving at constant velocity) is applied to selected atoms or collective variables, “pulling” the system along a predetermined path; analysis of the work performed yields insights into free-energy barriers and mechanical response. 

  26. Müller–Brown potential is a two-dimensional analytic model PES defined as a sum of Gaussian wells. It features multiple minima and saddle points, making it a standard testbed for exploring and validating enhanced-sampling and path-finding algorithms. 

  27. Dissipative loss refers to the loss of mechanical energy (kinetic and potential) due to frictional or viscous forces in the system, which is converted into heat and removed from the sampled degrees of freedom, thus damping the motion. 

  28. Multi-kilojoule barriers are energy barriers on the order of tens of kilojoules per mole (e.g., 15–30 kJ/mol). For comparison, at room temperature \(k_BT\approx2.5\,\)kJ/mol, so a 20 kJ/mol barrier is roughly eight times the thermal energy, making spontaneous crossing extremely rare without enhanced sampling. 

  29. Distorting the physics refers to altering the system’s natural ensemble and dynamics - elevated temperature or reduced friction can change reaction pathways, violate detailed balance, and sample configurations that are not representative of the target thermodynamic conditions, leading to inaccurate kinetics and equilibrium properties. 

Exploring Water's Phase Space with TIP4P and Umbrella Sampling

How can we capture the freezing and melting of water in a computer simulation? In this post, I dive into how I mapped out the phase diagram of water using molecular dynamics simulations (with the TIP4P water model in GROMACS) and what I learned about free energy, enthalpy–entropy compensation, and advanced sampling techniques like Umbrella Sampling. The journey revealed subtle thermodynamic tricks that water plays and how we can coax our simulations to explore all the nooks and crannies of phase space.

Gibbs Free Energy and Phase Stability – A Quick Refresher

When studying phase transitions (like ice melting into liquid water), the Gibbs free energy is our guiding star. The Gibbs free energy (often just called the free energy; strictly it is \(G\), and \(F\) usually denotes the Helmholtz free energy, but I will write it as \(F\) throughout this post) combines a system's energy and entropy in one value:

\(F = H - TS\)

where \(H\) is enthalpy (total heat content or internal energy + \(PV\) work) and \(S\) is entropy (degree of disorder). At constant pressure and temperature, a lower \(\text{F}\) means a more stable state. In fact, a fundamental thermodynamic principle tells us that the Gibbs free energy of a system must be at a minimum at equilibrium – in other words, nature favors states that minimize \(\text{F}\).

Let's break down those terms in plain language:

  • Free Energy (F) – Think of this as the happiness metric for a state of the system. It balances energy vs. disorder. A state with lower free energy is more thermodynamically comfortable and stable under given conditions.
  • Enthalpy (H) – This is the internal energy of the system plus the energy related to pressure-volume work (\(H =U + PV\)). Low enthalpy often means a tightly bound, low-energy configuration (e.g. a solid crystal with strong bonds). High enthalpy means the system has absorbed more energy (e.g. heating up or breaking bonds).
  • Entropy (S) – This measures disorder or freedom in the system. Multiply it by temperature (\(TS\)) and you get the "entropy term" in the free energy. A high entropy (especially at high \(T\)) gives a favorable contribution (since it's subtracted, \(H - TS\)) – it's like a reward for disorder. More entropy can sometimes compensate for having a higher enthalpy.

At a phase transition, what typically happens is that two different states have the same free energy – that's why the system can flip between them. For example, at 0°C and 1 atm, ice and liquid water have equal Gibbs free energy, so they can coexist (that's essentially the melting/freezing point). If conditions change slightly, whichever phase has lower free energy will be the stable one.
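
As a quick numerical illustration (using the textbook enthalpy of fusion of ice, about \(6.01\ \text{kJ/mol}\) – a literature value, not something computed here): requiring \(\Delta G_{\text{fus}} = \Delta H_{\text{fus}} - T_m\Delta S_{\text{fus}} = 0\) at the melting point \(T_m = 273.15\ \text{K}\) gives \(\Delta S_{\text{fus}} = \Delta H_{\text{fus}}/T_m \approx 22\ \text{J/(mol·K)}\). The entropy gained on melting exactly balances the enthalpy cost at 0°C, which is precisely why the two phases can coexist there.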

Enthalpy–Entropy Compensation: Hiding Big Changes in Plain Sight

During my simulations, I noticed a fascinating phenomenon known as enthalpy–entropy compensation. In simple terms, whenever the system's enthalpy \(H\) increased, the entropy term \(TS\) tended to increase as well, almost offsetting the enthalpy change. It's like the system finds a way to pay for an energy expense with a disorder coupon!

Why is this interesting? Imagine heating ice: you put in energy (raising enthalpy) to break the rigid hydrogen-bond network. If that were the whole story, the free energy \(F = H - TS\) would shoot up (because you added a lot of energy). But breaking the ice structure also frees the water molecules to wiggle around more – entropy goes up, which subtracts from \(F\). The result is that the free energy doesn't change as drastically as the enthalpy alone would suggest. In my simulation data, increases in enthalpy were largely canceled out by increases in \(TS\). This means just looking at enthalpy changes can be misleading – the system might absorb a bunch of energy, but if it gains enough entropy in return, the net free energy change is small. Water's melting is a perfect example: it requires energy (heat) to melt ice, but once melted, the water has higher entropy which makes the liquid state thermodynamically competitive with solid.

Bottom line: water cleverly balances energy and disorder. So, to identify a phase change, we can't just watch the energy (enthalpy) – we have to consider entropy and free energy as well.

Free Energy as a Phase Transition Indicator

Since free energy \(F\) encapsulates the competition between enthalpy and entropy, it's a natural thing to look at for signs of a phase transition. In our simulations, I analyzed how free energy changed with different conditions (like temperature and pressure). If you see a sudden shift or non-linearity in \(F\) as you vary temperature or pressure, that's a clue that the system might be transitioning to a different phase. For example, if water goes from solid to liquid, the free energy curve might show a kink or an abrupt change at the melting point.

From this part of the analysis, I gleaned a few practical heuristics:

  1. Sudden changes in the \(F\)-\(H\) relationship signal a phase transition: If you plot free energy vs. enthalpy (or observe how they change together) and see a sharp jump or change in trend, it often means the system moved into a new phase. In my case, around the freezing point the relationship between \(F\) and \(H\) shifted noticeably, reflecting the ice-to-liquid change.
  2. Enthalpy and entropy jumping together suggests a melting (liquid) phase: When I saw enthalpy rise accompanied by a rise in entropy (remember, more entropy offsets some enthalpy in \(F\)), it was a strong hint that the system had entered a liquid-like state. This makes sense: melting ice requires absorbing energy (higher H) but gives molecules more freedom (higher S).
  3. Free energy alone isn't a foolproof phase identifier – corroborate with other signs: It's important to stress that while free energy analysis gives hints, I needed additional evidence to conclusively identify the phase transition. In practice, researchers will also check density changes (liquids and solids have different densities), look at radial distribution functions (RDFs) (which show how structured the molecular arrangement is), or even visually inspect the simulation trajectory (is it liquid water or ice?). In my case, I went on to analyze heat capacity and a structural order parameter, which we'll get to shortly.

In short, a free energy vs. enthalpy plot or \(\text{F(T,P)}\) analysis is a great starting point: if something interesting (like a phase change) is happening, it will usually leave a footprint there. But to be confident, we should cross-check with other metrics.

Mapping the Phase Diagram from Simulation Data

One exciting outcome of this project was a phase diagram I constructed from the simulation data. Picture a graph with temperature on the x-axis and pressure on the y-axis, and each point colored by the free energy of the system (blue for low \(F\) = very stable, red for high \(F\) = less stable). What emerged was super cool (no pun intended):

  • Ice region: At lower temperatures and relatively higher pressures, there was a broad blue region – that's the stability domain of the ice Ih structure in the simulation. The system has low free energy there, meaning solid water (ice) is the favored state in those cold, high-pressure conditions.
  • Liquid region: Near 0°C (around 273 K) and around 1 atm pressure (roughly 0.1 MPa, moderate on our plot), a band of low free energy indicated the liquid water phase. This corresponds exactly to the conditions where we expect liquid water to be stable.
  • Metastable liquid (supercooled/superlow-pressure water): Interestingly, the diagram showed a region of relatively low free energy extending to below 0 pressure (yes, negative pressure – meaning the water is under tension) at sub-zero temperatures. This suggests the presence of a metastable liquid state – basically supercooled water that exists below 0°C without freezing, aided by the fact that it's also under tension (which discourages crystallization and boiling). In reality, water can indeed remain liquid below its normal freezing point under the right conditions (like in tiny droplets or if pressure is lowered to avoid nucleating ice or vapor). Our simulation seems to capture a slice of that weird state.
  • The triple point: Perhaps the most gratifying part was spotting a point where ice, liquid, and gas-like states all seemed to converge. In the phase diagram, this showed up near 0°C and very low pressure (just above 0 MPa). This is consistent with water's known triple point (0.01°C and about 0.0006 MPa, i.e. roughly 0.6 kPa, in real life). In that vicinity, the free energy landscape is such that solid, liquid, and gas phases can coexist. Seeing a hint of the triple point in a simulation (especially one using a specific water model like TIP4P) is a nice validation that the model isn't completely off base!

Figure 1: Phase diagram of TIP4P water showing stability regions for different phases

To summarize visually: the simulation's free energy contour map delineated the stable regions for ice and liquid water, and even indicated the coexistence line between them (where free energy barriers appear as red ridges). The fact that it lines up with real-world expectations (ice stable at cold temps, liquid around 0°C at 1 atm, etc.) is encouraging. It tells us our approach of using free energy data to map phase boundaries is working.
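
For readers who want to build this kind of map themselves, here is a minimal numpy/matplotlib sketch of the idea. The array names `temps` and `pressures` are placeholders for the parsed trajectory data; the actual analysis scripts live in the repository linked at the end of this post.

```python
# Minimal sketch: turn sampled instantaneous temperatures (K) and pressures (MPa)
# into a free-energy map via F = -kB * T_ref * ln P(T, P), as described above.
import numpy as np
import matplotlib.pyplot as plt

kB = 0.0019872          # kcal/(mol*K)
T_ref = 273.0           # single reference temperature used for the conversion

def free_energy_map(temps, pressures, bins=60):
    counts, t_edges, p_edges = np.histogram2d(temps, pressures, bins=bins)
    prob = counts / counts.sum()
    F = np.full_like(prob, np.nan)
    mask = prob > 0
    F[mask] = -kB * T_ref * np.log(prob[mask])   # kcal/mol
    F -= np.nanmin(F)                            # zero at the most stable bin
    return F.T, t_edges, p_edges                 # transpose so T runs along x

# Placeholder data standing in for the real (T, P) trajectory:
rng = np.random.default_rng(0)
temps = rng.normal(273.0, 2.0, 50_000)
pressures = rng.normal(0.10, 0.05, 50_000)

F, t_edges, p_edges = free_energy_map(temps, pressures)
plt.contourf(t_edges[:-1], p_edges[:-1], F, levels=20, cmap="coolwarm")
plt.xlabel("T (K)"); plt.ylabel("P (MPa)")
plt.colorbar(label="F (kcal/mol)")
plt.show()
```

Unvisited bins come out as gaps (infinite apparent free energy), which is itself informative – those are exactly the regions the next section is about.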

The Sampling Challenge: Not All States Are Visited Equally

While constructing that phase diagram was exciting, it also revealed a challenge: our simulation doesn't naturally sample all these states evenly. In fact, it was clear that the system had favorite haunts (the free energy minima) and avoided other areas (the higher \(F\) regions) like the plague. This is completely expected behavior – after all, the simulation follows the laws of thermodynamics just like the real world: it wants to stay in low-energy (low free energy) states. If there's a significant energy barrier separating two states, the system might never hop over during the timescale of the simulation.

Figure 2: Scatter plot showing the non-uniform sampling of phase space

In practical terms, I observed that the trajectory in \((T,P)\) space was not uniform. There were dense clusters of points in certain regions (corresponding to, say, the system staying liquid around 1 atm and 273 K for a long time) and almost no points in others (like the middle of a phase transition where things are unstable). If I plotted a histogram of visited states or a scatter of the trajectory, you'd see blobs in the stable basins and sparse coverage elsewhere. This is a hallmark of free energy barriers: the system gets trapped in one basin (e.g., all liquid or all solid) and doesn't easily transition.

Figure 3: Trajectory plot in phase space demonstrating trapped behavior

This uneven sampling isn't just a data quirk – it's telling us something important: our simulation, as is, might miss important transitions because it's not exploring the full phase space freely. In other words, it's not ergodic over the whole space of interest on the timescale we simulated. For example, if we start with liquid water at 1 atm and 270 K, it might take a very long time (maybe longer than we actually simulated) for it to spontaneously freeze into ice, because that requires crossing a free energy hump (nucleating ice) which is statistically rare. Similarly, once it's ice, it might stay ice and not melt back unless we really wait or change conditions.

Recognizing this problem is the first step toward a solution: we need some tricks to help the simulation sample those rarely visited states (like the top of the barrier or the other side of it). Enter enhanced sampling methods – and a favorite of mine is Umbrella Sampling.

Umbrella Sampling to the Rescue: Enhancing Exploration of Phase Space

Umbrella Sampling is a clever technique to tackle exactly the issue we just described. The idea is to add a biasing "umbrella" potential that gently forces the simulation into regions it would normally avoid, and then later correct for that bias to recover the true free energy landscape. It's like putting an imaginary hand into the simulation box and nudging the system when it hesitates to go over a barrier.

Figure 4: Illustration of umbrella sampling technique

A classic analogy: imagine you have a solid at its melting point. There are two states it can be in – solid (very ordered, let's say an order parameter \(Q\) is high) and liquid (disordered, \(Q\) is low). Both states are actually pretty low in energy at the melting point (they're about equally stable). However, the halfway state (medium \(Q\) – kind of slushy, half-melted) is high in energy – it's unstable. So if you run a normal simulation at the melting point, what happens? The system might stay solid for a long time because it doesn't want to climb that hill to become liquid, or, vice versa, stay liquid because it doesn't want to form a solid nucleus. Umbrella sampling provides stepping stones over that hill.

In practice, how do we do it? We choose a collective variable (CV) – a parameter that describes progress along the transition. It could be something like that order parameter \(Q\) (for solid vs liquid structure), or even temperature or pressure or density, depending on what transition we're interested in. Then we run a series of simulations, each with a bias potential that restrains the system around a certain range of that CV. It's as if we place multiple umbrellas along the path over the mountain: each umbrella holds the system in a specific region that normally would be too "uphill" to linger in. By doing so, we force the simulation to explore states it would skip over.

For example, one umbrella might encourage the system to sample that slushy intermediate (preventing it from relaxing fully into solid or liquid), another umbrella might focus on slightly more liquid-like states, and another on slightly more solid-like states. Each such simulation on its own doesn't give the true picture because we've distorted the physics with the bias. But here's the magic: we can then use techniques like Weighted Histogram Analysis Method (WHAM) or other reweighting algorithms to stitch all these biased simulations together. The end result is a comprehensive free energy profile of the transition, as if the system had climbed up and over the barrier freely. We essentially remove the bias mathematically and recover the unbiased free energy landscape, having successfully sampled it via this patchwork approach.
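
To make the idea tangible, here is a self-contained toy sketch in Python: umbrella sampling of a one-dimensional double-well "order parameter" Q with a plain Metropolis sampler in each window, followed by a bare-bones WHAM iteration to recover the unbiased profile. None of this is the production setup from this post (the potential, force constant, and window spacing are invented for illustration), but the logic of the workflow is the same.

```python
# Toy umbrella sampling on a 1-D double well U(Q), plus a minimal WHAM step.
import numpy as np

kT = 0.54                                    # kcal/mol, roughly 273 K
U = lambda q: 5.0 * (q**2 - 1.0)**2          # two minima at Q = +/-1, ~5 kcal/mol barrier
kappa = 40.0                                 # umbrella force constant, kcal/mol
centers = np.linspace(-1.4, 1.4, 15)         # window centers along Q

def sample_window(q0, n_steps=20_000, step=0.1, seed=0):
    """Metropolis Monte Carlo under U(q) + harmonic umbrella centred at q0."""
    rng = np.random.default_rng(seed)
    q = q0
    e = U(q) + 0.5 * kappa * (q - q0) ** 2
    traj = np.empty(n_steps)
    for i in range(n_steps):
        q_new = q + rng.uniform(-step, step)
        e_new = U(q_new) + 0.5 * kappa * (q_new - q0) ** 2
        if e_new <= e or rng.random() < np.exp(-(e_new - e) / kT):
            q, e = q_new, e_new
        traj[i] = q
    return traj

trajs = [sample_window(c, seed=i) for i, c in enumerate(centers)]

# Bare-bones WHAM: combine the biased histograms into one unbiased profile.
edges = np.linspace(-1.6, 1.6, 81)
mids = 0.5 * (edges[:-1] + edges[1:])
hists = np.array([np.histogram(t, bins=edges)[0] for t in trajs])     # (windows, bins)
bias = 0.5 * kappa * (mids[None, :] - centers[:, None]) ** 2          # (windows, bins)
N = hists.sum(axis=1)                                                  # samples per window
f = np.zeros(len(centers))                                             # window free-energy shifts
for _ in range(500):                                                   # self-consistent iteration
    denom = (N[:, None] * np.exp((f[:, None] - bias) / kT)).sum(axis=0)
    p = hists.sum(axis=0) / denom
    p /= p.sum()
    f = -kT * np.log((p[None, :] * np.exp(-bias / kT)).sum(axis=1))
    f -= f[0]

F = -kT * np.log(np.where(p > 0, p, np.nan))
F -= np.nanmin(F)   # recovered profile: minima near Q = +/-1, barrier of about 5 kcal/mol at Q = 0
```

In the real system you would replace the toy potential with the simulation itself, the 1-D coordinate with a proper CV such as \(q_4\), and this hand-rolled WHAM with an established implementation, but the structure of the workflow is identical.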

In the context of my water phase simulation, I didn't fully implement umbrella sampling (I first tried analyzing the unbiased simulation data), but these findings strongly suggest it would be beneficial. For instance, the tetrahedral order parameter \(q_4\) (which I'll discuss soon) could serve as an excellent CV for umbrella sampling the ice to water transition. We could set up umbrellas to sample configurations from high \(q_4\) (ice-like) through intermediate \(q_4\) (partially melted) to low \(q_4\) (liquid-like). This would help the simulation visit those in-between states that it barely touched on its own. The beauty is that umbrella sampling can significantly improve sampling without an exorbitant cost – you run multiple shorter simulations under bias rather than one insanely long unbiased simulation hoping to catch a rare event.

To sum up: Umbrella sampling is our cheat code to overcome free energy barriers. It's not magic (you have to pick a good collective variable and do some work to combine data), but when done right, it can reveal the full phase space and free energy landscape that a straightforward simulation would miss.

From Probability to Free Energy: Boltzmann to the Rescue

Let's circle back to our original unbiased simulation and how I extracted free energy information from it. This involves a bit of statistical mechanics, but it's straightforward: if a system is in equilibrium, the probability of finding it in a given state is related to the free energy of that state. Mathematically, it follows the Boltzmann distribution. The relation is:

\(F(\text{state}) = -k_{B}T\ln P(\text{state}) + \text{constant}\),

where \(P(\text{state})\) is the probability of the system being in that state (at temperature \(T\)), and \(k_{B}\) is Boltzmann's constant. The constant just sets the reference point (we often choose the lowest free energy state as zero).

Figure 5: Free energy profiles derived from simulation data

What this means in plain terms: states that the simulation frequents are low in free energy, and states it rarely visits are high in free energy. A difference of, say, \(\Delta F = 1.36 \frac{\text{kcal}}{\text{mol}}\) corresponds to about a factor of 10 difference in probability at room temperature (since \(e^{-1.36/0.593} \approx 0.1\), using \(k_{B}T \approx 0.593 \frac{\text{kcal}}{\text{mol}}\) at 298 K; at 273 K, \(k_{B}T \approx 0.54 \frac{\text{kcal}}{\text{mol}}\)). So even small free energy differences can lead to big sampling biases.

In my analysis, I took histograms of the simulation data – for example, how often the system had a certain temperature or pressure – and converted those into free energy profiles using the formula \(F = -k_{B}T\ln P\). Since the simulation was roughly around 273 K, I used \(T \approx 273K\) (where \(k_{B}T \approx 0.542 \frac{kcal}{mol}\)). By doing this for temperature and pressure distributions, I got plots of "free energy vs temperature" and "free energy vs pressure." These are essentially like 1D slices of the free energy landscape.
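
As a sketch of that conversion (with `temps` standing in for the parsed data, and a hard-coded \(T \approx 273\ \text{K}\) as in the text):

```python
# Convert a histogram of a sampled quantity into a 1-D free-energy profile
# F(x) = -kB * T * ln P(x), dropping empty bins and zeroing the minimum.
import numpy as np

kB = 0.0019872                        # kcal/(mol*K); kB * 273 K ~ 0.54 kcal/mol

def free_energy_profile(samples, T=273.0, bins=50):
    counts, edges = np.histogram(samples, bins=bins, density=True)
    mids = 0.5 * (edges[:-1] + edges[1:])
    mask = counts > 0
    F = -kB * T * np.log(counts[mask])
    return mids[mask], F - F.min()

# Sanity check of the "factor of ten per 1.36 kcal/mol" rule quoted above:
print(np.exp(-1.36 / (kB * 298.0)))   # ~0.10
```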

Figure 6: Diagram of the Boltzmann distribution

Why do this? Because it can highlight where the stable states and barriers are. The trick is that a probability distribution peaked at some value will translate into a free energy minimum at that value (since \(\ln P\) will be highest there, making \(F\) lowest). Conversely, if there's a range of values that the system almost never visits (a dip in the histogram), that will show up as a free energy maximum (a barrier) in the profile.

This is a powerful way to interpret simulation data: rather than just saying "the system mostly stayed around 1 atm", we can quantify that as "the free energy minimum is at 1 atm, and it costs X \(\frac{kcal}{mol}\) to push the system to 2 atm, etc."

What the Free Energy Profiles Revealed

Now to the juicy results from those profiles:

  • Temperature Free Energy Profile: Even though the simulation thermostat was set around 273 K, the instantaneous temperature of the system fluctuated (as is normal in NPT simulations). When I plotted the free energy \(F(T)\) derived from the temperature histogram, I found a clear minimum around 272–273 K. In other words, the most probable (lowest \(F\)) temperatures during the run were right around the freezing point. That's a hint that the system was often sitting at a condition where ice and water were at equilibrium. Notably, the profile steeply rose on either side of that minimum, indicating it was unfavorable for the system to stray too far below or above that temperature for long. This aligns with the idea that if it got much colder, everything would lock into ice (and maybe the thermostat would then heat it back), or if it got much hotter, it would prefer to be liquid but the thermostat would cool it back – either way, 273 K was the comfy spot given the latent heat exchange during the phase change.

  • Pressure Free Energy Profile: Similarly, the pressure in the NPT simulation fluctuates around the setpoint (1 atm \(\approx\) 0.1013 MPa). The free energy as a function of pressure \(F(P)\) showed a minimum near ~1 atm, with steep rises if pressure deviated significantly. This tells us 1 atm was indeed the most stable pressure for the system (no surprise, that was the control pressure). But interestingly, the shape of \(F(P)\) can hint at phase behavior too. For example, if the system tried to enter a gas-like state (very low pressure) or a very dense state (higher pressure), those would register as high free energy (i.e. low probability) regions. My results showed clearly defined minima and then an increase in \(F\) as you move away, which is consistent with having a stable liquid phase at around 1 atm and barriers towards entering a different phase (like gas at lower pressure or some dense form at higher pressure).

  • Energy Barriers: Both the temperature and pressure free energy curves had sections that we can interpret as energy barriers. For instance, around the edges of the stable basin, \(F\) would ramp up. These correspond to the transition states – the conditions the system finds hard to occupy because it would mean partially melting or partially evaporating. The fact that we see these barriers empirically in the data is confirmation that the simulation indeed experienced the resistance to change phase. It's like seeing the silhouette of a wall by observing where people never walk on a field – the absence of data in those regions spoke volumes.

In numbers, I also calculated the Shannon entropy of the temperature and pressure distributions (this is different from thermodynamic entropy of water; it's more about the spread/uncertainty in those parameters during the sim). They were tiny: on the order of \(0.008\ \frac{\text{kcal}}{\text{mol}\cdot\text{K}}\). That essentially quantifies how tightly the simulation was sticking to its favorite state. The low values told me the system wasn't wandering far – reinforcing that, without enhanced sampling, it mostly stayed put in one phase. In a broad, flat free energy landscape you'd see a big spread (high entropy of distribution), but here it was sharply peaked.
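
For reference, the "spread" number quoted above can be obtained along these lines (a rough sketch; `samples` again stands in for the temperature or pressure series):

```python
# Shannon entropy of a sampled distribution, scaled by kB so it is reported in
# kcal/(mol*K); a sharply peaked histogram gives a small value.
import numpy as np

kB = 0.0019872                                 # kcal/(mol*K)

def shannon_entropy(samples, bins=50):
    p, _ = np.histogram(samples, bins=bins)
    p = p[p > 0] / p.sum()
    return -kB * np.sum(p * np.log(p))
```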

Key takeaway: The free energy profiles gave a consistent picture – our system has well-defined stable states (minima) and significant barriers (maxima) separating them, which matches what we expect for distinct phases like ice and water. This sets the stage for why we would consider umbrella sampling, and also it validates that our interpretation of the simulation behavior in terms of phases is reasonable.

But can we be even more confident that the "blip" we see at 273 K in free energy is really a phase transition? To solidify that (no pun intended), I turned to two additional analyses: heat capacity and a structural order parameter.

Heat Capacity: A Sharp Peak at the Phase Transition

One of the hallmark signatures of a first-order phase transition (like melting/freezing) is a spike in heat capacity (\(C_{p}\)). Heat capacity essentially measures how much heat the system needs to absorb to raise its temperature. At a melting point, if you add heat, instead of raising the temperature, the heat goes into changing phase (melting the ice). In an ideal infinite system, this shows up as an infinite spike (latent heat) at exactly the transition temperature. In a finite simulation, you'll see a very large but finite peak.

In my simulation analysis, I computed how the system's energy changed with temperature and from that got an effective heat capacity as a function of T. The result was clear as day: a sharp peak in \(C_{p}\) centered at \(\approx\) 273.2 K, with a maximum value around \(0.31 \frac{kJ}{mol \times K}\). On either side of that temperature, the heat capacity was much lower (basically a flat baseline away from the transition). This is exactly what we'd expect:

  • Below 272 K or so, the system is solid ice; heating it a little just warms the ice normally (no surprises, low heat capacity).
  • At 273.2 K, we hit the melting point: the curve shoots up – the system is gulping down energy to convert solid to liquid, absorbing latent heat. That's why the heat capacity skyrockets there.
  • After the peak, above \(\approx\) 274 K, the system is now fully liquid water; adding heat now just warms the liquid (again relatively steady, normal heat capacity).

Figure 7: Heat capacity analysis showing a phase transition peak

The shape of that \(C_{p}\) vs \(T\) plot is a textbook indicator of a phase change. And the fact that it peaks right around 0°C is gratifying because that's water's known melting point. In our case, the simulation's melting happened at 273.2 K, which is almost spot-on the experimental 273.15 K – a nice coincidence (or perhaps a sign that the TIP4P model we used is doing a good job for this property).

Thermodynamically, this behavior confirms a first-order phase transition: the system took in a bunch of energy without a significant rise in temperature – all that energy went into breaking the ice's hydrogen-bond structure (instead of increasing kinetic energy/temperature). Once the phase transition completed, any further energy just increased the temperature of the now-liquid water normally.

I also looked at derivatives of the free energy with respect to temperature and pressure. Those are related to entropy and volume changes. During the transition, those derivatives fluctuated wildly (crossing threshold values we set to identify a significant change). In simpler terms, the slope of the free energy curve vs \(T\) or \(P\) changed abruptly right at the transition, which is another mathematical confirmation that "yep, something big happened here (the phase transition)".
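
A rough sketch of how such a \(C_{p}(T)\) curve can be pulled out of the simulation output is shown below. The arrays `temps` and `enthalpies` are placeholders for the parsed energy file, and the enthalpy-fluctuation estimate \(C_p = (\langle H^2\rangle - \langle H\rangle^2)/(k_B T^2)\) used here is one standard route, not necessarily the exact recipe behind the figure above.

```python
# Estimate an effective Cp(T) by binning the trajectory in temperature and using
# the enthalpy-fluctuation formula Cp = Var(H) / (kB * T^2) inside each bin.
import numpy as np

kB = 0.0083145                                  # kJ/(mol*K)

def heat_capacity_vs_T(temps, enthalpies, bins=40, min_samples=20):
    edges = np.linspace(temps.min(), temps.max(), bins + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    cp = np.full(bins, np.nan)
    bin_index = np.digitize(temps, edges) - 1
    for i in range(bins):
        H = enthalpies[bin_index == i]
        if H.size >= min_samples:               # skip poorly sampled bins
            cp[i] = H.var() / (kB * mids[i] ** 2)   # kJ/(mol*K)
    return mids, cp                             # a melting event shows up as a spike
```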

Coupling Heat Capacity and Density: Two Sides of the Same Coin

To double-confirm the phase change, I compared the heat capacity behavior to another observable: density. In water's unusual case, when ice melts, the density increases (liquid water is denser than ice, which is why ice floats). I plotted the system's density alongside the heat capacity as the simulation temperature evolved. As expected, exactly at the temperature where \(C_{p}\) spiked, the density of the system showed a dramatic change. Specifically, around 273 K, density jumped up – indicating the collapse of the open ice structure into a more tightly packed liquid structure. After the transition, the density leveled off to the typical density of liquid water at that temperature.

Seeing these two properties change in sync is satisfying: the heat capacity peak (energy absorption) coincided with the structural collapse (density increase). Before the transition, the density was lower (characteristic of ice's open crystal lattice), and after, the density was higher and stable (liquid water). This is yet another piece of evidence that around 273 K the system underwent a phase transition from ice to liquid. It's always good when multiple indicators line up like this, because it gives a complete picture: thermodynamics (\(C_{p}\)) and structure (density) telling the same story.

Tetrahedral Order Parameter \(q_{4}\): A Molecule's-Eye View of Structure

Thus far, we've been talking about fairly macroscopic or bulk indicators (free energy, heat capacity, density). But what about the molecular structure? How do we know the water molecules rearranged themselves? For that, I turned to the tetrahedral order parameter, commonly denoted \(q_{4}\).

Water is famous for forming a tetrahedral arrangement in its hydrogen-bond network (each water molecule can ideally bond to four neighbors in a tetrahedral geometry – that's what happens in ice). The parameter \(q_{4}\) quantifies how tetrahedral the local environment of a water molecule is, on a scale from 0 to 1. If \(q_{4} = 1\), it means a perfect tetrahedron; if \(q_{4} = 0\), it means complete disorder (like an ideal gas where neighbors' positions are random relative to each other).
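
Concretely, one common definition (the Errington–Debenedetti form, which may differ in detail from the exact variant used in my scripts) can be evaluated for a single molecule like this, given its oxygen position and its four nearest-neighbour oxygens. The sketch ignores periodic boundaries and neighbour searching, which the full analysis in the repository has to handle.

```python
# Tetrahedral order parameter for one water molecule from its four nearest
# neighbours: q = 1 - 3/8 * sum over neighbour pairs of (cos(psi) + 1/3)^2.
import numpy as np
from itertools import combinations

def q_tet(center, neighbors):
    """q = 1 for a perfect tetrahedron, ~0 for uncorrelated neighbours."""
    vecs = [np.asarray(n, dtype=float) - np.asarray(center, dtype=float) for n in neighbors]
    vecs = [v / np.linalg.norm(v) for v in vecs]
    s = sum((np.dot(a, b) + 1.0 / 3.0) ** 2 for a, b in combinations(vecs, 2))
    return 1.0 - 3.0 / 8.0 * s

# Perfect tetrahedral arrangement -> q = 1.0
center = np.zeros(3)
tetra = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
print(q_tet(center, tetra))   # 1.0
```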

Figure 8: Distribution of tetrahedral order parameter values

I analyzed the distribution of \(q_{4}\) values in my simulation across different temperature regimes:

  • In the Ice phase (T < 273 K): The \(q_{4}\) values were consistently high. Most \(q_{4}\) were above 0.8, and a large fraction were around 0.9. This indicates a near-perfect tetrahedral arrangement – just what we expect for solid ice Ih. In fact, when I took the average \(q_{4}\) for clearly ice-state configurations, it was about 0.87, with a very narrow spread (standard deviation ~0.05). That tight distribution means the structure was uniformly ordered throughout – again, like a perfect crystal.
  • During the Transition (around 273–275 K): \(q_{4}\) plummeted. I saw \(q_{4}\) values dropping from ~0.9 down to around 0.5 within this window. That represents the breaking of the tetrahedral ice lattice. The distribution of \(q_{4}\) widened a lot too – some parts of the system were still ice-like (high \(q_{4}\)), while others became liquid-like (lower \(q_{4}\)), giving a broad mix. This is exactly what happens when ice is melting: some local structures are still intact, others are collapsing, so you get a wide range of order metrics.
  • In the Liquid phase (T > 280 K): \(q_{4}\) settled to a lower average value, roughly in the 0.55–0.60 range. In my data, fully liquid configurations averaged around 0.66, but with a larger variance (~0.09). The slight discrepancy in values (0.55 vs 0.66) can come from how we define the phase or averaging window, but the key point is: liquid water has a substantially lower \(q_{4}\) than ice, reflecting its more disordered (but not totally random) structure. An average \(q_{4} \approx 0.6\) means water still has a locally tetrahedral tendency (water isn't completely random; it forms fleeting structures), but it's far from the near-perfect \(q_{4} \approx 0.9\) of ice.

For completeness, if the system were to go to a gas phase, \(q_{4}\) would drop even further. In fact, any molecules that were essentially isolated (gas-like) in the simulation showed \(q_{4}\) near 0.2–0.3 on average, with huge variability (because with no structured neighbors, \(q_{4}\) becomes almost meaningless). We did see a few low-\(q_{4}\) outliers corresponding to very low-density states, which aligns with gas-phase behavior. But primarily, our focus was on the ice vs liquid distinction.

The tetrahedral order parameter turned out to be one of the most sensitive and insightful measures for the phase transition:

  • It provided a clear microscopic signature of the melting: a dramatic drop in \(q_{4}\).
  • It allowed us to distinguish phases quantitatively (I can literally put a threshold like "if \(q_{4} > 0.8\), it's ice; if \(q_{4} < 0.5\), it's likely gas; intermediate is liquid" – a tiny helper formalizing those cutoffs follows this list).
  • It reinforced the story told by free energy and heat capacity from a structural standpoint. We see the molecules go from ordered to disordered at the same temperature where thermodynamics indicated a phase change.
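
A tiny helper formalizing those cutoffs (the threshold values are just the rough ones quoted above, not universal constants):

```python
def classify_phase(q4: float) -> str:
    """Crude phase label from the per-molecule tetrahedral order parameter."""
    if q4 > 0.8:
        return "ice"
    if q4 < 0.5:
        return "gas-like / highly disordered"
    return "liquid"
```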

This structural perspective is super important because sometimes free energy or thermodynamic data alone might hint at something, but seeing the actual structural change confirms it. It also gives us confidence in choosing order parameters for something like umbrella sampling – clearly \(q_{4}\) is a good collective variable to distinguish ice and water.

Cross-Checking All the Clues: Consilience of Evidence

By now, you've seen multiple angles on this water phase transition – thermodynamic, structural, statistical. The beautiful thing is they all line up consistently:

  • The free energy analysis suggested that around 273 K, the system moves into a different state (with enthalpy-entropy compensation smoothing the change in free energy).
  • The phase diagram and probability distribution showed that ice and liquid regions are separated by a barrier (the system doesn't sample the in-between easily).
  • The heat capacity spiked at 273 K, screaming "phase change happening here!" in the thermodynamic language.
  • The density did its expected jump, confirming a macroscopic property change consistent with melting.
  • The tetrahedral order parameter \(q_{4}\) dropped sharply at the same point, providing the molecular-level confirmation of the structural transformation.

When you have this many pieces of evidence, you can be quite confident in your conclusions. So, what are the big takeaways from this simulation study?

Conclusions and Key Takeaways

  1. Multiple lines of evidence confirm the phase transition: The suspected ice-to-liquid transition in our \(\text{TIP4P}\) water simulation (around 0°C) was independently confirmed through free energy analysis, a heat capacity anomaly, and a structural order parameter shift. No single metric gives the full story, but together they painted a coherent picture of the phase change.
  2. Enthalpy–entropy compensation can mask transitions if you look at only one variable: We saw that \(H\) and \(TS\) often moved in tandem, meaning the free energy change was subtle. This highlights why you need sensitive indicators like \(C_{p}\) or \(q_{4}\) – they exposed the change that wasn't obvious from enthalpy alone. The \(q_{4}\) analysis, in particular, provided a molecular-level mechanism for what was happening: as ice melted, the loss of tetrahedral order (lower \(q_{4}\)) explained the macroscopic changes in energy and entropy.
  3. Simulation can match reality closely (with the right model and analysis): The phase transition we observed happened at ~273.2 K, which is remarkably close to real water's 273.15 K melting point. This quantitative precision lends credibility to our simulation setup (the \(TIP4P\) model and methods). It's encouraging that a molecular simulation can capture such an essential property of water so well.
  4. Advanced sampling methods are crucial for exploring phase space efficiently: Achieving ergodic sampling of both solid and liquid phases in one simulation is tough – the system gets stuck in one phase due to energy barriers. Techniques like Umbrella Sampling (or others like Metadynamics, Replica Exchange, etc.) are invaluable to overcome these barriers. By biasing along a smart collective variable (like \(q_{4}\) or something similar) we could map out the free energy landscape of the phase transition in detail without requiring impractically long simulations. In essence, we can reach equilibrium between phases faster by being clever in how we sample.
  5. Combining thermodynamic and structural analyses pays off: Using thermodynamic measurements (free energy, heat capacity) in tandem with structural metrics (order parameters, density) provides a powerful one-two punch in understanding phase behavior. One gives you the "when and how much" in terms of energy, and the other gives you the "how and why" in terms of molecular arrangement. Together, they offer a complete understanding of the phase transition phenomenon.

Finally, a broader perspective: Water is an amazing and quirky substance, and even with a relatively simple model like \(TIP4P\), we were able to capture some of its key phase behavior. The exercise of validating our simulation against known properties (like the melting point, density change, etc.) is a great way to build trust in our model before using it to explore unknown territory. And if we wanted to push further – say, simulate supercooling to extreme levels or nucleation of ice – we now know what tools and order parameters could be useful to make that happen.

In summary, analyzing free energy landscapes can indeed highlight phase transitions, but it's the combination of multiple analyses that really clinches the case. With those in hand, we can chart out phase diagrams and understand the underlying molecular story, even for something as complex and crucial as water.

Next Post: Diving Deeper into Liquid Water's Structure

In the next installment, I'll shift focus to examine the local structure in liquid water in more detail. We've already seen how the tetrahedral order parameter \(q_{4}\) characterizes the overall order/disorder, but there are other ways to describe what's going on in liquid water's hydrogen bond network. I'll discuss insights from a fascinating study, "Characterization of the Local Structure in Liquid Water by Various Order Parameters" (J. Phys. Chem. B 2015, DOI: 10.1021/acs.jpcb.5b03976), which uses a variety of order parameters to paint a nuanced picture of water's structure at the molecular level. How does water structure fluctuate? Are there hidden symmetries or patterns in liquid water? Stay tuned – we'll explore what these order parameters reveal and how it connects back to our simulation learnings!

Code and Data Availability

All the code, simulation data, parameters, and analysis scripts used in this study are available in my GitHub repository at https://github.com/kGorze/TIP4P-Water-Model. Feel free to explore, reproduce, or build upon this work. The repository contains the complete workflow for running TIP4P water simulations, analysis tools for extracting thermodynamic properties, and visualization scripts used to generate the figures in this post.