Single pH Microstate Analysis in Lysozyme
Background
What Is an MCCE Microstate?
In MCCE, a microstate is a complete specification of a system’s state, defining the charge and position of all residues and ligands, including their conformations.
What Is an MCCE Protonation Microstate?
A protonation microstate specifies the protonation (charge) state of every acidic and basic residue. Multiple protonation microstates can share the same total charge but differ in the distribution of protons among residues (tautomers).
Microstate analysis enables:
- Identification of the possible charge states of each ionizable residue in a given set of microstates
- Quantification of long-range electrostatic coupling
Parameter File (*.crgms) Overview
The microstate analysis shown here was run with the default version of the params.crgms parameter file. By editing params.crgms, you can control which residues are analyzed, how correlations are computed, and which outputs are generated.
Key Parameters
1. Input microstate file: The microstate file is located in the ms_out directory:
ms_out/pH7.00eH0.00ms.txt
The file name encodes the pH and Eh values used in the MCCE run; for pH 7 and Eh 0, the file name is pH7.00eH0.00ms.txt. To analyze a different microstate file, for example from a run at a different pH or Eh, change the following line in the params.crgms file. For pH 5, the line would be:
msout_file = pH5.00eH0.00ms.txt
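The naming pattern can be captured in a small helper. This is a minimal sketch that assumes the two-decimal format seen in the two file names above; the function name msout_filename is hypothetical and is not part of MCCE.

```python
def msout_filename(ph: float, eh: float = 0.0) -> str:
    """Build a microstate file name of the form pH<pH>eH<Eh>ms.txt.

    Assumption: both values are written with two decimals, as in
    pH7.00eH0.00ms.txt and pH5.00eH0.00ms.txt.
    """
    return f"pH{ph:.2f}eH{eh:.2f}ms.txt"


if __name__ == "__main__":
    # Print the line to place in params.crgms for a pH 5 run.
    print(f"msout_file = {msout_filename(5)}")
```

Check the result against the actual contents of your ms_out directory before using it.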
2. Output directory name: You can change the output directory name in the following line of the params.crgms file:
output_dir = crgms_corr
3. Residues included in correlation analysis or residues of interest
Explicit list of residues used for correlation analysis. Only residues listed here are included in the correlation matrix. By default, all residues whose correlation is at or above the cutoff (0.02 by default) are included.
correl_resids = [LYSA0001_, GLUA0007_, HISA0015_, ASPA0018_, TYRA0020_, GLUA0035_]
4. Correlation cutoff:
Set the correlation cutoff by changing the following line; the default in the params file is 0.02.
cut_off = 0.02
5. Occupancy:
Set the minimum occupancy by changing the following line; the default in the params file is 0.01.
Occupancy = 0.01
6. n_top (optional)
Limits the number of most-populated unique protonation microstates returned. The following setting returns the 500 most-populated microstates.
n_top = 500
7. residue_kinds:
Filters which residue types are included when constructing protonation microstates. If omitted, commented out, or empty, all ionizable residues are included.
residue_kinds = [ASP, PL9, LYS, GLU, HIS, TYR, NTR, CTR]
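To inspect the settings programmatically, the parameter lines shown above can be read with a few lines of Python. This is a minimal sketch, not the reader MCCE itself uses: it assumes one "key = value" entry per line, bracketed comma-separated lists, and "#" as the comment character. The function name parse_params is hypothetical.

```python
def parse_params(text: str) -> dict:
    """Parse 'key = value' lines; bracketed values become lists of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.startswith("[") and value.endswith("]"):
            items = value[1:-1].split(",")
            params[key] = [item.strip() for item in items if item.strip()]
        else:
            params[key] = value  # scalars are kept as strings
    return params


# Lines copied from the parameter descriptions above.
EXAMPLE = """
msout_file = pH5.00eH0.00ms.txt
output_dir = crgms_corr
n_top = 500
residue_kinds = [ASP, PL9, LYS, GLU, HIS, TYR, NTR, CTR]
"""

if __name__ == "__main__":
    for key, value in parse_params(EXAMPLE).items():
        print(key, "->", value)
```

Scalar values are left as strings on purpose, so the caller decides whether n_top should be converted to an integer.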
Understanding the Outputs
Data outputs: The following files are written to the output directory:
all_crg_count_resoi.csv
all_res_crg_status.csv
corr.png
crg_count_res_of_interest.csv
crgms_logcount.png
enthalpy_dist.png
fixed_res_of_interest.csv
What each output file contains
1. Charge microstate file:
all_crg_count_resoi.csv
Shows the protonation statistics for all residues.
An example for 4lzt, showing the top three microstates, is as follows:
NTRA0001_ LYSA0001_ HISA0015_ TYRA0020_ GLUA0035_ ASPA0048_ ASPA0052_ TYRA0053_ ASPA0066_ ASPA0101_ LYSA0116_ GLUA0007_ LYSA0013_ ASPA0018_ TYRA0023_ LYSA0033_ ASPA0087_ LYSA0096_ LYSA0097_ ASPA0119_ CTRA0129_ Count Occupancy SumCharge
1 1 0 0 -1 -1 -1 0 -1 -1 1 -1 1 -1 0 1 -1 1 1 -1 -1 598417 0.498681 8
0 1 0 0 -1 -1 -1 0 -1 -1 1 -1 1 -1 0 1 -1 1 1 -1 -1 285831 0.238193 7
1 1 1 0 -1 -1 -1 0 -1 -1 1 -1 1 -1 0 1 -1 1 1 -1 -1 169041 0.140868 9
2. Residues of interest charge ms file:
crg_count_res_of_interest.csv
Shows the protonation statistics for the selected residues only. Restricting the analysis to a few residues means that only a handful of charge microstates need to be examined, which reduces the combinatorial complexity of larger protein systems (e.g., Complex I; PDB ID: 4HEA).
An example for 4lzt, showing the top three microstates, is as follows:
LYSA0001_ HISA0015_ TYRA0020_ GLUA0035_ Count Occupancy
1 0 0 -1 888068 0.740057
1 1 0 -1 263094 0.219247
1 0 0 0 32098 0.026749
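A table like this can be reduced to one average charge per residue by weighting each row with its Count. The sketch below is only an illustration: it embeds the three rows shown above rewritten with commas (an assumption about the delimiter of the real file), and the function name mean_charges is hypothetical.

```python
import csv
import io

# The three example rows from above, rewritten as comma-separated text.
SAMPLE = """LYSA0001_,HISA0015_,TYRA0020_,GLUA0035_,Count,Occupancy
1,0,0,-1,888068,0.740057
1,1,0,-1,263094,0.219247
1,0,0,0,32098,0.026749
"""


def mean_charges(csv_text: str) -> dict:
    """Return the count-weighted mean charge of each residue column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = sum(int(r["Count"]) for r in rows)
    residues = [name for name in rows[0] if name not in ("Count", "Occupancy")]
    return {
        res: sum(int(r[res]) * int(r["Count"]) for r in rows) / total
        for res in residues
    }


if __name__ == "__main__":
    for residue, charge in mean_charges(SAMPLE).items():
        print(f"{residue} {charge:+.3f}")
```

The weights are normalized over the listed rows only. With just the top three microstates, the averages approximate those of the full ensemble rather than reproduce them.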
3. Correlation Heatmap
corr.png
This plot shows the weighted Pearson correlation between the charges of each pair of residues:
- Positive correlation → protonation states rise and fall together
- Negative correlation → one protonates while the other deprotonates
For lysozyme at near-neutral pH, Asp52 is typically deprotonated, whereas Glu35 has an elevated pKa and is protonated in a small fraction of the microstates, reflecting their catalytic roles.
An example heat map for lysozyme is given as follows:
4. Charge Microstate Distributions
crgms_logcount.png
This plot shows:
- How many unique protonation microstates exist
- How microstates are distributed by energy
- Separation between the lowest-energy and most-probable states
An example dot plot for lysozyme is given as follows:
5. Energy distribution plot:
enthalpy_dist.png
This is the energy distribution of all microstates accepted during the MC simulation. An example figure for the 4lzt analysis is given as follows:
More About MCCE Microstates
What Is an MCCE Microstate?
In MCCE, a microstate is one exact assignment of protonation, tautomer, and side-chain conformer states. It defines the charge and position of every residue and ligand.
What Is an MCCE Protonation Microstate?
A protonation microstate defines the charge of every acidic and basic residue. Each protonation microstate can be realized by many conformational microstates. Its charge state is the net (total) charge of the microstate.
Tautomers:
Protonation microstates with the same net charge but different proton locations. Proteins therefore exist not in a single protonation configuration but in an ensemble of protonation microstates whose distribution depends on pH, ligands, redox state, and local electrostatics.
Why Do Protonation Microstates Matter?
Protonation microstates reveal:
- Presence of low-probability but functionally relevant higher energy states
- Coupling between protonation events at distant residues
- Distinction between lowest-energy and highest-probability states
Microstate analysis enables:
- Identification of the possible charge states of each ionizable residue in a given set of microstates
- Mapping of proton transfer pathways
- Quantification of long-range electrostatic coupling
How Does MCCE Generate Protonation Microstates?
Degrees of Freedom
- The protein backbone is fixed
- Each titratable residue is assigned multiple conformers
- Different proton positions (e.g., His tautomers)
- Optional rotamers and explicit waters
- Each conformer has a precomputed energy
- A microstate selects one conformer per residue.
Monte Carlo Sampling: MCCE uses grand-canonical Monte Carlo (GCMC) sampling:
- Randomly select residues and trial conformers
- Accept or reject moves using the Metropolis–Hastings criterion
- Millions of trial microstates are explored; only accepted microstates contribute to the Boltzmann ensemble.
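The sampling loop described above can be illustrated with a toy model. This is a sketch of the Metropolis acceptance rule only, not MCCE's implementation: it assumes independent residues with made-up conformer energies and no pairwise interactions, and all names (KT, metropolis_accept, sample) are hypothetical.

```python
import math
import random

KT = 0.593  # kcal/mol near 298 K; assumed value for this toy example


def metropolis_accept(delta_e: float, rng: random.Random, kt: float = KT) -> bool:
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kt)


def sample(conformer_energies, n_steps, seed=0):
    """Sample microstates; conformer_energies holds one energy list per residue."""
    rng = random.Random(seed)
    state = [0] * len(conformer_energies)  # start with conformer 0 everywhere
    counts = {}
    for _ in range(n_steps):
        res = rng.randrange(len(conformer_energies))         # pick a residue
        trial = rng.randrange(len(conformer_energies[res]))  # pick a trial conformer
        energies = conformer_energies[res]
        if metropolis_accept(energies[trial] - energies[state[res]], rng):
            state[res] = trial
        key = tuple(state)  # the current microstate is counted at every step
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    toy = [[0.0, 1.0], [0.0, 2.0]]  # two residues, two conformers each
    for microstate, n in sorted(sample(toy, 2000).items(), key=lambda kv: -kv[1]):
        print(microstate, n)
```

Each step adds one count to the current microstate, so a rejected trial re-counts the state that was kept. These counts play the role of the weights used later when microstates are aggregated.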
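Storage of Microstates (ms_out Files): Because storing every microstate explicitly is infeasible, MCCE uses a ticker-tape representation: a starting microstate is recorded once, and each later entry records only the conformers that changed.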
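The idea behind a ticker-tape record can be illustrated with a small replay function that rebuilds full microstates from a starting state plus a list of changes. This is a conceptual sketch only: the data layout, the conformer labels, and the function name replay are hypothetical and do not reproduce the actual ms_out file format.

```python
def replay(initial_state, tape):
    """Rebuild full microstates from a starting state plus per-entry changes.

    initial_state: dict mapping residue name -> conformer label
    tape: list of (count, changes), where changes maps residue -> new conformer
    Yields one (state, count) pair per tape entry.
    """
    state = dict(initial_state)
    for count, changes in tape:
        state.update(changes)     # apply only what changed
        yield dict(state), count  # copy so later updates do not leak back


if __name__ == "__main__":
    # Hypothetical conformer labels, for illustration only.
    initial = {"GLUA0035_": "GLU-1", "HISA0015_": "HIS01"}
    tape = [
        (3, {}),                      # starting state kept for 3 counts
        (2, {"HISA0015_": "HIS+1"}),  # only the histidine changes
        (1, {"GLUA0035_": "GLU01"}),  # only the glutamate changes
    ]
    for state, count in replay(initial, tape):
        print(count, state)
```

Storing only the changes keeps each entry small, because most residues keep their conformer from one entry to the next.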
Constructing Protonation Microstates: The ms_protonation tool performs the following reduction:
- Fixed residues: Residues with a single conformer are assigned a constant charge
- Free residues: Charge states extracted from conformer identity
- Charge vectors: Each microstate is mapped to a vector of charges
- Aggregation: Conformational microstates with identical charge vectors are grouped
- Weighting: Each protonation microstate is weighted by its MC acceptance count
The result is an ensemble of unique protonation microstates, each with a net charge, a probability, and an underlying conformational degeneracy.
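The reduction steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ms_protonation source: the input layout, the conformer labels, and the function name protonation_microstates are hypothetical, and the net charge here covers only the listed free residues (fixed residues are left out).

```python
from collections import Counter


def protonation_microstates(conf_microstates, conf_charge, residues):
    """Group conformational microstates that share the same charge vector.

    conf_microstates: list of (state, count); state maps residue -> conformer
    conf_charge: maps conformer label -> integer charge
    residues: fixed column order of the charge vector
    """
    counts = Counter()      # summed MC counts per charge vector
    degeneracy = Counter()  # number of conformational microstates per vector
    for state, count in conf_microstates:
        vector = tuple(conf_charge[state[res]] for res in residues)
        counts[vector] += count
        degeneracy[vector] += 1
    total = sum(counts.values())
    return [
        {"charges": vec, "net_charge": sum(vec), "count": n,
         "occupancy": n / total, "n_conf_states": degeneracy[vec]}
        for vec, n in counts.most_common()
    ]


if __name__ == "__main__":
    # Hypothetical conformer labels and counts, for illustration only.
    charge = {"GLU_a": -1, "GLU_b": -1, "GLU_h": 0, "HIS_0": 0, "HIS_p": 1}
    order = ["HISA0015_", "GLUA0035_"]
    states = [
        ({"HISA0015_": "HIS_0", "GLUA0035_": "GLU_a"}, 5),
        ({"HISA0015_": "HIS_0", "GLUA0035_": "GLU_b"}, 3),
        ({"HISA0015_": "HIS_p", "GLUA0035_": "GLU_a"}, 2),
    ]
    for row in protonation_microstates(states, charge, order):
        print(row)
```

In the usage example, the first two conformational microstates differ only in the glutamate conformer, so they collapse into one protonation microstate with a combined count.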
Weighted Correlation Analysis of Protonation States:
Protonation of residues is often not independent. Electrostatic coupling means that protonation at one site can stabilize or destabilize protonation at another. A positive correlation means that the residues protonate together, a negative correlation means that protonation of one disfavors protonation of the other, and a correlation near zero means independent behavior.
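A weighted Pearson correlation between two charge columns can be computed as in the sketch below, where each protonation microstate is weighted by its count. The function name weighted_pearson is hypothetical, and the example uses only the three rows of the crg_count_res_of_interest.csv excerpt for 4lzt, so the value is illustrative and is not the result of the full analysis.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # a residue with constant charge has no correlation
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Columns taken from the three example rows for 4lzt.
    his15 = [0, 1, 0]
    glu35 = [-1, -1, 0]
    counts = [888068, 263094, 32098]
    print(f"HISA0015_ vs GLUA0035_: {weighted_pearson(his15, glu35, counts):.3f}")
```

A residue whose charge never changes in the selected microstates (LYSA0001_ in the excerpt) has zero variance, so the sketch returns NaN for it instead of dividing by zero.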
Common Pitfalls
❌ Running ms_protonation without head3.lst
❌ Over-restricting residue_kinds in large systems