Single pH Microstate Analysis in Lysozyme

Background

What Is an MCCE Microstate?

In MCCE, a microstate is a complete specification of a system’s state, defining the charge and position of all residues and ligands, including their conformations.

What Is an MCCE Protonation Microstate?

A protonation microstate specifies the protonation (charge) state of every acidic and basic residue. Multiple protonation microstates can share the same total charge but differ in the distribution of protons among residues (tautomers).

Microstate analysis enables:

  • Identification of the possible charge states of each ionizable residue in a given set of microstates
  • Quantification of long-range electrostatic coupling

Parameter File (*.crgms) Overview

We ran the microstate analysis here using the default version of the params.crgms parameter file. In this file you can control which residues are analyzed, how correlations are computed, and which outputs are generated.

Key Parameters

Input ms analysis file name:

1. Microstate file:

The microstate file is located in the ms_out directory of the MCCE run:

ms_out/pH7.00eH0.00ms.txt

The filename typically encodes the pH and Eh values used in the MCCE run. For pH 7 and Eh 0, the file name is pH7.00eH0.00ms.txt. To analyze a different microstate file, change the following line in the params.crgms file. This is generally needed when you run the analysis at a different pH or Eh. For example, for pH 5 the line would be:

msout_file = pH5.00eH0.00ms.txt
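
The naming pattern seen in the two file names above can be sketched in Python. This is only an illustration: the helper name msout_filename is hypothetical and not part of MCCE, and it assumes that both pH and Eh are always written with two decimals, as in the examples.

```python
def msout_filename(ph: float, eh: float) -> str:
    """Build a microstate file name following the pattern pH<pH>eH<Eh>ms.txt.

    Assumption: both values are formatted with two decimals,
    as in pH7.00eH0.00ms.txt.
    """
    return f"pH{ph:.2f}eH{eh:.2f}ms.txt"


print(msout_filename(7.0, 0.0))  # pH7.00eH0.00ms.txt
print(msout_filename(5.0, 0.0))  # pH5.00eH0.00ms.txt
```

Always check the actual contents of the ms_out directory before setting msout_file, since the pattern is only typical.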

2. Output directory name: You can change the output directory name in the following line of the params.crgms file:

output_dir = crgms_corr

3. Residues included in correlation analysis or residues of interest

Explicit list of residues used for correlation analysis. Only residues listed here are considered for the correlation matrix. By default, all residues whose correlation meets the cutoff (>= 0.02 by default) are included.

correl_resids = [LYSA0001_, GLUA0007_, HISA0015_, ASPA0018_, TYRA0020_, GLUA0035_]

4. Correlation cutoff:

You can set the correlation cutoff by changing the following line; the default in the params file is 0.02.

cut_off = 0.02

5. Occupancy

You can set the minimum occupancy by changing the following line, which is set to 0.01 by default in the params file.

Occupancy = 0.01

6. n_top (optional)

Limits the number of most-populated unique protonation microstates returned. The following setting returns the 500 most populated microstates.

n_top = 500
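
To illustrate how a minimum occupancy and n_top interact, here is a minimal Python sketch. It is not the MCCE implementation: the function select_top and the toy counts are hypothetical, and applying the occupancy filter before the top-N cut is an assumption.

```python
def select_top(microstates, min_occupancy=0.01, n_top=500):
    """Keep states at or above min_occupancy, then return the n_top most populated.

    microstates: list of (charge_vector, count) pairs (toy data layout).
    Assumption: the occupancy filter is applied before the top-N cut.
    """
    total = sum(count for _, count in microstates)
    kept = [
        (vector, count, count / total)
        for vector, count in microstates
        if count / total >= min_occupancy
    ]
    # Sort by count, most populated first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:n_top]


# Toy example: three protonation microstates with made-up counts.
states = [((1, 0, -1), 700), ((1, 1, -1), 295), ((0, 0, -1), 5)]
top = select_top(states, min_occupancy=0.01, n_top=2)
```

In this toy case the third state has an occupancy of 0.005 and is dropped by the occupancy filter, so the two remaining states are returned.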

7. residue_kinds:

Filters for which residue types are included when constructing protonation microstates. If omitted, commented out, or empty, all ionizable residues are included.

residue_kinds = [ASP, PL9, LYS, GLU, HIS, TYR, NTR, CTR]
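
The parameter lines shown in this section all follow a simple "key = value" layout, with bracketed lists for multi-valued entries. The Python sketch below reads such lines into a dictionary. It is an illustration only: parse_params is a hypothetical helper rather than an MCCE function, and treating "#" as the comment character is an assumption.

```python
def parse_params(text):
    """Parse 'key = value' lines; bracketed values become lists of strings."""
    params = {}
    for raw in text.splitlines():
        # Assumption: '#' starts a comment (not confirmed by this tutorial).
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        params[key] = value
    return params


EXAMPLE_PARAMS = """
msout_file = pH5.00eH0.00ms.txt
output_dir = crgms_corr
cut_off = 0.02
n_top = 500
# residue_kinds = [ASP, GLU]
residue_kinds = [ASP, PL9, LYS, GLU, HIS, TYR, NTR, CTR]
"""

params = parse_params(EXAMPLE_PARAMS)
print(params["msout_file"])
print(params["residue_kinds"])
```

All values are kept as strings, so numeric entries such as cut_off still need to be converted before use.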

Understanding the Outputs

Data outputs: The following outputs will be in the output directory

all_crg_count_resoi.csv
all_res_crg_status.csv
corr.png
crg_count_res_of_interest.csv
crgms_logcount.png
enthalpy_dist.png
fixed_res_of_interest.csv

What each output file contains

1. Charge microstate file:

all_crg_count_resoi.csv

Shows the protonation statistics for all residues.

An example output CSV file for 4lzt, showing the top three microstates:

NTRA0001_	LYSA0001_	HISA0015_	TYRA0020_	GLUA0035_	ASPA0048_	ASPA0052_	TYRA0053_	ASPA0066_	ASPA0101_	LYSA0116_	GLUA0007_	LYSA0013_	ASPA0018_	TYRA0023_	LYSA0033_	ASPA0087_	LYSA0096_	LYSA0097_	ASPA0119_	CTRA0129_	Count	Occupancy	SumCharge
1	1	0	0	-1	-1	-1	0	-1	-1	1	-1	1	-1	0	1	-1	1	1	-1	-1	598417	0.498681	8
0	1	0	0	-1	-1	-1	0	-1	-1	1	-1	1	-1	0	1	-1	1	1	-1	-1	285831	0.238193	7
1	1	1	0	-1	-1	-1	0	-1	-1	1	-1	1	-1	0	1	-1	1	1	-1	-1	169041	0.140868	9

2. Residues of interest charge ms file:

crg_count_res_of_interest.csv

Shows the protonation statistics for the selected residues only. Restricting the analysis to a handful of residues reduces the number of distinct charge microstates, which makes the charge states easier to inspect and limits the combinatorial complexity of larger protein systems (e.g., Complex I; PDB ID: 4HEA).

An example output CSV file for 4lzt, showing the top three microstates:

LYSA0001_	HISA0015_	TYRA0020_	GLUA0035_	Count	Occupancy
1	0	0	-1	888068	0.740057
1	1	0	-1	263094	0.219247
1	0	0	0	32098	0.026749
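
A table like the one above can be reduced to per-residue charge populations by summing the occupancy of every row in which a residue carries a given charge. The Python sketch below does this for the three example rows. It is not part of MCCE: charge_populations is a hypothetical helper, the tab delimiter is an assumption based on how the example is displayed, and the sums cover only the rows listed, not the full ensemble.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the example above (assumed to be tab-separated).
EXAMPLE = (
    "LYSA0001_\tHISA0015_\tTYRA0020_\tGLUA0035_\tCount\tOccupancy\n"
    "1\t0\t0\t-1\t888068\t0.740057\n"
    "1\t1\t0\t-1\t263094\t0.219247\n"
    "1\t0\t0\t0\t32098\t0.026749\n"
)


def charge_populations(handle, delimiter="\t"):
    """Sum occupancy per (residue, charge) over the listed microstates."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    populations = defaultdict(float)
    for row in reader:
        occupancy = float(row["Occupancy"])
        for name, value in row.items():
            if name not in ("Count", "Occupancy"):
                populations[(name, int(value))] += occupancy
    return dict(populations)


pops = charge_populations(io.StringIO(EXAMPLE))
print(pops[("HISA0015_", 1)])  # occupancy of charged His15 in these rows
print(pops[("GLUA0035_", 0)])  # occupancy of neutral Glu35 in these rows
```

If the real file is comma-separated, pass delimiter="," instead.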

3. Correlation Heatmap

corr.png

This plot shows the weighted Pearson correlation between the charges of each pair of residues:

  • Positive correlation → protonation states rise and fall together
  • Negative correlation → one protonates while the other deprotonates

For lysozyme at near-neutral pH, Asp52 is deprotonated in the most populated microstates, whereas Glu35, which has an elevated pKa, retains a small protonated population; this reflects their catalytic roles.

An example heat map for lysozyme:

[Figure: corr.png]

4. Charge Microstate Distributions

crgms_logcount.png

This plot shows:

  • How many unique protonation microstates exist
  • How microstates are distributed by energy
  • Separation between the lowest-energy and most-probable states

An example dot plot for lysozyme:

[Figure: crgms_logcount.png]

5. Energy distribution plot:

enthalpy_dist.png

This plot shows the energy distribution of all microstates accepted during the MC simulation. An example figure from the 4lzt analysis:

[Figure: enthalpy_dist.png]

More about MCCE Microstate

What Is an MCCE Microstate?

In MCCE, a microstate defines the charge and position of every residue and ligand. It is one exact assignment of protonation, tautomer, and side-chain conformer states.

What Is an MCCE Protonation Microstate?

A protonation microstate defines the charge of every acidic and basic residue; each one can be realized by many conformational microstates. The charge state is the net (total) charge of the microstate.

Tautomers:

Protonation microstates with the same net charge but different proton locations. Proteins therefore exist not in a single protonation configuration but in an ensemble of protonation microstates whose distribution depends on pH, ligands, redox state, and local electrostatics.
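
As a small illustration of "same net charge, different proton locations", the Python snippet below compares two made-up charge assignments. The residue names are taken from the lysozyme example, but the two states themselves are hypothetical and chosen only to make the point.

```python
# Two hypothetical protonation microstates over the same two residues.
state_a = {"HISA0015_": 1, "GLUA0035_": -1}  # proton on His15, Glu35 ionized
state_b = {"HISA0015_": 0, "GLUA0035_": 0}   # His15 neutral, proton on Glu35

net_a = sum(state_a.values())
net_b = sum(state_b.values())

# Same net charge, but the proton sits on a different residue.
print(net_a == net_b, state_a != state_b)
```

A summary by net charge alone would merge these two states, which is why the full charge vector is kept for each protonation microstate.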

Why Protonation Microstates Matter?

  • Presence of low-probability but functionally relevant higher energy states
  • Coupling between protonation events at distant residues
  • Distinction between lowest-energy and highest-probability states

Microstate analysis enables:

  • Identification of the possible charge states of each ionizable residue in a given set of microstates
  • Mapping of proton transfer pathways
  • Quantification of long-range electrostatic coupling

How MCCE Generates Protonation Microstates

Degrees of Freedom

  • The protein backbone is fixed
  • Each titratable residue is assigned multiple conformers
  • Different proton positions (e.g., His tautomers)
  • Optional rotamers and explicit waters
  • Each conformer has a precomputed energy
  • A microstate selects one conformer per residue.
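
The rule that a microstate selects one conformer per residue can be sketched in Python with a Cartesian product. The conformer labels below are hypothetical and for illustration only; real systems have far too many combinations to enumerate, which is why sampling is used instead.

```python
from itertools import product

# Hypothetical conformer labels per residue (not actual MCCE conformer IDs).
conformers = {
    "HISA0015_": ["HIS01", "HIS02", "HIS+1"],
    "GLUA0035_": ["GLU01", "GLU-1"],
}

# Each microstate picks exactly one conformer for every residue.
microstates = list(product(*conformers.values()))

print(len(microstates))  # 3 * 2 = 6
for state in microstates:
    print(dict(zip(conformers, state)))
```

The number of microstates is the product of the conformer counts, so it grows exponentially with the number of residues.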

Monte Carlo Sampling: MCCE uses grand-canonical Monte Carlo (GCMC) sampling:

  • Randomly select residues and trial conformers
  • Accept or reject moves using the Metropolis–Hastings criterion
  • Millions of trial microstates are explored; only accepted microstates contribute to the Boltzmann ensemble.
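
The select, then accept-or-reject loop above can be sketched as a toy Metropolis sampler in Python. This is a simplified illustration and not the MCCE code: the function names are hypothetical, residues are treated as independent (no pairwise interactions), energies are in units of kT, and the current state is tallied after every step, which is standard Metropolis bookkeeping that may differ from how MCCE records counts.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Return True if a trial move with energy change delta_e is accepted."""
    if delta_e <= 0:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-delta_e / kT)


def sample(energies, n_steps, kT=1.0, seed=0):
    """Toy sampler: energies[i][j] is the energy of conformer j of residue i."""
    rng = random.Random(seed)
    state = [0] * len(energies)
    counts = {}
    for _ in range(n_steps):
        i = rng.randrange(len(energies))     # randomly select a residue
        j = rng.randrange(len(energies[i]))  # randomly select a trial conformer
        delta = energies[i][j] - energies[i][state[i]]
        if metropolis_accept(delta, kT, rng):
            state[i] = j
        key = tuple(state)
        counts[key] = counts.get(key, 0) + 1
    return counts


# One residue with a low-energy and a high-energy conformer.
counts = sample([[0.0, 5.0]], n_steps=2000)
```

With a 5 kT gap, the low-energy conformer dominates the tallies, while the high-energy conformer still appears occasionally.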

Storage of Microstates (ms_out Files): Because storing every microstate explicitly is infeasible, MCCE uses a ticker-tape representation.
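
One way to picture a ticker-tape representation is as a starting state followed by a stream of small change records, from which each microstate is rebuilt on demand. The Python sketch below follows that idea under stated assumptions: the record layout (energy, count, changes) and the conformer labels are hypothetical and do not describe the actual ms_out file format.

```python
def replay(initial_state, records):
    """Yield (state, energy, count) by applying each record's changes in turn.

    records: iterable of (energy, count, changes), where changes maps a
    residue name to its new conformer. This layout is hypothetical.
    """
    state = dict(initial_state)
    for energy, count, changes in records:
        state.update(changes)
        yield dict(state), energy, count


initial = {"HISA0015_": "HIS01", "GLUA0035_": "GLU-1"}
tape = [
    (-10.0, 3, {}),                       # initial state, visited 3 times
    (-9.5, 1, {"HISA0015_": "HIS+1"}),    # only His15 changes
    (-9.8, 2, {"GLUA0035_": "GLU01"}),    # only Glu35 changes
]

states = list(replay(initial, tape))
```

Only the changes are stored per record, yet the full state at any point can be recovered by replaying the tape from the start.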

Constructing Protonation Microstates: The ms_protonation tool performs the following reduction:

  • Fixed residues: Residues with a single conformer are assigned a constant charge
  • Free residues: Charge states extracted from conformer identity
  • Charge vectors: Each microstate is mapped to a vector of charges
  • Aggregation: Conformational microstates with identical charge vectors are grouped
  • Weighting: Each protonation microstate is weighted by its MC acceptance count
  • Result: an ensemble of unique protonation microstates, each with a net charge, a probability, and an underlying conformational degeneracy
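
The mapping and aggregation steps can be sketched in Python as follows. This is a minimal illustration under assumptions, not the ms_protonation code: the function aggregate, the conformer labels, and the counts are all hypothetical.

```python
from collections import Counter


def aggregate(conformer_microstates, conformer_charge):
    """Group conformer microstates by charge vector, summing their counts."""
    counts = Counter()
    degeneracy = Counter()
    for conformers, count in conformer_microstates:
        # Charge vector: one charge per residue, taken from the conformer identity.
        vector = tuple(conformer_charge[c] for c in conformers)
        counts[vector] += count
        degeneracy[vector] += 1
    total = sum(counts.values())
    return {
        vector: {
            "count": n,
            "probability": n / total,
            "net_charge": sum(vector),
            "degeneracy": degeneracy[vector],
        }
        for vector, n in counts.items()
    }


# Hypothetical conformer charges and sampled conformer microstates.
conformer_charge = {"HIS01": 0, "HIS02": 0, "HIS+1": 1, "GLU01": 0, "GLU-1": -1}
sampled = [
    (("HIS01", "GLU-1"), 40),
    (("HIS02", "GLU-1"), 30),
    (("HIS+1", "GLU-1"), 25),
    (("HIS01", "GLU01"), 5),
]

ensemble = aggregate(sampled, conformer_charge)
```

Here the two neutral His conformers paired with ionized Glu collapse into a single protonation microstate with charge vector (0, -1), a count of 70, and a degeneracy of 2.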

Weighted Correlation Analysis of Protonation States:

Protonation of residues is often not independent. Electrostatic coupling means that protonation at one site can stabilize or destabilize protonation at another:

  • Positive correlation: the residues protonate together
  • Negative correlation: protonation of one disfavors protonation of the other
  • Near-zero correlation: the residues behave independently
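
A weighted Pearson correlation between the charges of two residues can be computed by weighting each protonation microstate by its count. The Python sketch below uses the textbook formula; it is an illustration only, the function name weighted_pearson is hypothetical, and the MCCE implementation may differ in its details.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of charges x and y with weights w (counts)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        # A residue whose charge never changes has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Toy data: a base that is charged (+1) only when an acid is ionized (-1).
his_charge = [1, 0]
glu_charge = [-1, 0]
counts = [60, 40]

r = weighted_pearson(his_charge, glu_charge, counts)
print(r)  # close to -1: the proton sits on one residue or the other
```

Because adding a proton raises the charge of both acids and bases by one, the sign of the charge correlation matches the sign of the protonation correlation.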

Common Pitfalls

❌ Running ms_protonation without head3.lst

❌ Over-restricting residue_kinds in large systems