Exercise #2: pKₐ Calculation

In this exercise, we will run your first MCCE4 simulation to perform a pKₐ calculation (pH titration + pKₐ fitting) using a real protein file!


Background

Lysozyme is a small enzyme that dissolves bacterial cell walls, thus killing bacteria. It was discovered as the first antibiotic to inhibit bacterial growth in food — before penicillin. For this example will be using a lysozyme structure from PDB ID: 4LZT.

Experimental pKₐ values of all lysozyme residues are provided at the bottom of this tutorial.
Here, we will focus on two residues in the active site with perturbed pKₐ values:

  • GLU 35 acts as a proton donor.
  • ASP 52 stabilizes the charged intermediate.

For lysozyme to attack the glucose molecule of the substrate:

  • GLU 35 needs a high pKₐ to remain protonated and donate a proton to the glycosidic oxygen.
  • ASP 52 needs a low pKₐ to remain deprotonated and stabilize the reaction intermediate.

Reference: Jens Erik Nielsen and J. Andrew McCammon, Protein Sci. 2003 Sep; 12(9): 1894–1901


0. Pre-requisite:

Ensure you have the conda environment for mc4 activated.

conda activate mc4

1. Prepare the Calculation

Enter the working directory for this exercise:

cd mcce_workflows
mkdir ex2; cd ex2

Download the PDB file for 4LZT:

getpdb 4lzt

A successful download should display the following message:

[ INFO ] Download completed: 4lzt.pdb

We strongly recommend to run p_info to inspect an unfamiliar PDB file and verify if it is compatiblity with MCCE4 prior to performing a simulation.

p_info 4lzt.pdb

2. Perform pKₐ calculation using run_mcce4

The easiest way to run a MCCE4 simulation is with run_mcce4. It is preset to run a full simulation (ending with a titration) and return the pKₐs of ionizable residues into one of its output files called pK.out upon successful completion.

run_mcce4 4lzt.pdb
  • The conformer occupancies are in file fort.38.
  • The net charge is in file sum_crg.out.
  • The calculated pKₐs are in file pK.out

See here for more details on what’s exactly happening when running this run_mcce4 or customizing runs!

3. Interpret pKₐ results

A trimmed version of the pKₐ/Eₘ report is in file pK.out, which contains the calculated pKₐ/Eₘ values for titratable side chains. You can see the full report in pK_extended.out.

  • pKa/Em : Calculated MCCE4 pKₐ/Eₘ values
  • n (slope) : Slope of titration curve (extrapolated from fort.38) and the Henderson-Hasselbalch equation.
  • 1000×chi2 : 1000 times the chi-squared value. Higher the number, the less accurate the result.
head -n 1 pK.out && tail -n +2 pK.out | sort -k1.1,1.14 
 pH             pKa/Em  n(slope) 1000*chi2
ARG+A0005_       12.839     0.887     0.079
ARG+A0014_       13.576     0.951     0.017
ARG+A0021_       13.333     0.762     0.286
ARG+A0045_       12.880     0.872     0.880
ARG+A0061_       13.879     0.818     0.003
ARG+A0068_        >14.0
ARG+A0073_       12.951     0.890     0.247
ARG+A0112_       13.099     0.905     0.025
ARG+A0114_       13.279     0.957     0.014
ARG+A0125_       13.155     0.891     0.079
ARG+A0128_       13.548     0.921     0.005
ASP-A0018_        2.264     0.915     0.033
ASP-A0048_        1.468     0.980     0.080
ASP-A0052_        2.389     0.987     0.209
ASP-A0066_        2.849     0.938     0.011
ASP-A0087_        2.105     0.981     0.013
ASP-A0101_        4.227     0.995     0.022
ASP-A0119_        3.684     0.995     0.087
CTR-A0129_        2.586     0.893     0.059
GLU-A0007_        3.319     0.961     0.004
GLU-A0035_        5.569     0.987     0.069
HIS+A0015_        6.488     1.010     0.012
LYS+A0001_        9.554     0.997     0.036
LYS+A0013_       11.194     0.958     0.033
LYS+A0033_       10.671     0.959     0.037
LYS+A0096_       10.758     0.786     0.399
LYS+A0097_       10.863     0.880     0.030
LYS+A0116_        9.544     0.953     0.087
NTR+A0001_        7.418     0.995     0.036
TYR-A0020_       13.615     0.592     0.249
TYR-A0023_       10.679     0.881     0.025
TYR-A0053_        >14.0

From the result we can see the MCCE4 calculated GLU 35 (pKₐ = 5.6) with a higher pKₐ than ASP 52 (pKₐ = 2.4).


So how do we calculate these pKₐ values?

Let’s take a look at the two other output files produced:

  1. fort.38
    • Table of the occupancy of each side-chain conformer over the course of the titration. From the titration curve of this lysine in 4lzt, the pKₐ can be determined by identifying the midpoint of the curve where the occupancy is 0.5. This is where the lysine is 50% ionized and 50% non ionized. Projecting this midpoint onto the pH axis yields the pKₐ value.

Screenshot 2026-01-16 at 10 56 39 AM

  1. sum_crg.out
    • Reports the net charge of every residue at each pH. Plotting the charges of this lysine against their respective pH’s and again drawing a line from the midpoint of the titration curve to the pH axis can determine it’s pKa.

Screenshot 2026-01-16 at 10 56 05 AM


Why are the leading factors that contribute to these calculated pKₐ values?

To analyze the ionization factors of an ionizable residue at it’s midpoint with a pairwise energy cutoff of 0.1, we can use the Mean-field energy (MFE) tool:

mfe.py ASP-A0052_ -c 0.1

To analyze the ionization factors of this residue at pH 7 with a pairwise energy cutoff of 0.1:

mfe.py ASP-A0052_  -p 7 -c 0.1
  • The -c flag is to set the minimum pairwise energy cutoff. Thus only terms >cutoff will be outputted.

To learn more about how pKa’s shift, see the mean field energy tutorial!


MFE tutorial: Extra Tool to help analyze pK.out

Continuing the anlaysis of lysozyme, we’re going to use the results of a pKa calculation to perform a more detailed analysis of a specific residue, GLUA35, in Lysozyme. The goal is to move beyond raw pKa values and examine the energetic factors that determine the protonation state of this residue and how the pkA shifts according to the environment inside the protien.

What does MFE do?

MFE (mean field energy) calculates the mean field ionization energy on an ionizable residue at a specific pH/eH. MFE provides the energy interctions of ionized residues versus its neutral form and quantifies the factors that determine which form is favored at a specific pH/eH.

0 - Files needed

Files needed to run this tool is fort.38, head3.lst, pK.out, and sum_crg.out.

1 - Usage

If you have succesfully installed the MCCE-Tools you should be able to call the tool from any directory. To use the program correctly you can look up the name of the residue of interest in the sum_charge.out and paste it in the command line

 mfe.py -p 7 -c 0.05 GLU-A0035_

The -p flags defines at what pH the analysis is done. The -c flag is the energy cutoff that defines the energy minimum for residues interaction.

2 - Output

Output for the analysis should look like this for GLUA35 in Lysozyme. Each term in the row has been defined in a short sentence:

 Residue GLU-A0035_ pKa/Em=5.862 (mcce calculated pKa of the residue interest)
=================================
Terms          pH     meV    Kcal
---------------------------------
vdw0        -0.02   -1.11   -0.03 (interactions of the residue with itself)
vdw1         0.07    3.90    0.09 (side chain interaction with the backbone)
tors        -0.02   -1.37   -0.03 (torsion energy)
ebkb        -1.32  -76.40   -1.80 (backbone energy)
dsol         2.64  153.26    3.60 (solvation energy loss)
offset      -0.22  -12.77   -0.30 (modifiable energy term)
pH&pK0      -2.25 -130.60   -3.07 (effect of pH in ionizatoin)
Eh&Em0       0.00    0.00    0.00 (effect of eH on redox reactions)
-TS          0.20   11.66    0.27 (entropy term)
residues    -0.18  -10.49   -0.25 (sum of pairwise energy interactoins)
*********************************
TOTAL       -1.10  -63.91   -1.50  sum_crg (ionizations state of neighboring residues)
*********************************
LYSA0033_   -0.12   -7.21   -0.17    1.00
SERA0036_   -0.14   -8.27   -0.19    0.00
ASNA0044_   -0.08   -4.67   -0.11    0.00
ARGA0045_   -0.05   -3.03   -0.07    1.00
ASPA0048_    0.19   11.02    0.26   -1.00
ASPA0052_    0.79   46.07    1.08   -1.00
GLNA0057_   -0.09   -5.16   -0.12    0.00
ASNA0059_   -0.05   -3.08   -0.07    0.00
ARGA0061_   -0.13   -7.51   -0.18    1.00
ASPA0066_    0.09    5.38    0.13   -1.00
ARGA0112_   -0.25  -14.62   -0.34    1.00
ARGA0114_   -0.21  -12.13   -0.28    1.00
=================================

Total is sum of all energy terms described above

3 - Understanding the Ouptut

MFE determines the energy interaction of the ionized and neutral form of the selected conformer (ionized form energy - neutral form energy).

Columns

Each column represents the same energy interaction but in different units (pH, meV and kcal/mol)

Conversion of units for each energy term

At 298 K (room temp):

Energy in meV (single proton): • ΔE(meV) ≈ 59.16 * ΔpH

Energy in kcal/mol: • ΔG(kcal/mol) ≈ 1.364 * ΔpH

In depth definitions of terms on rows

All these values are determined bu subtracting the

1) Vdw0: vdW interactions within the conformer (a side chain or ligand) + interaction with implicit solvent, obtained from head3.lst

2) Vdw1: vdw interaction of this conformer with the protein backbone, obtained from head3.lst

3) tors: torsion energy of the conformer, obtained from head3.lst

4) ebkb: energy of the backbone. obtained from head3.lst

5) dsol: Loss of solvation energy compared with this conformer in solution. It should be positive as it is a loss in energy, obtained from head3.lst

6) offset: energy term that can be freely modified, obtained from head3.lst

7) pH&pK0: Solution pH effect on ionization. It is the environmental pressure on residue ionization. For an acid, low solution pH makes ionization (releasing a proton) easy, so it contributes favorable energy. For a base, low pH makes ionization harder. When pH equals the residue’s solution pKa, the environment pH is at a balance point, where the contribution is 0. obtained from head3.lst and pK.out

8) Eh&Em0: Environment Eh effect on redox reaction. This works similarly to pHpK0. obtained from pK Out and head3.lst

9) -TS: Entropy term.The number of rotamers of neutral and ionized residues generated by MCCE may differ. The effect of different rotamer counts on the two ionization states acts like entropy. obtained from pk.out and head3.lst

10) residues: Total pairwise interaction from other residues. obtianed from from fort.38 and head3.lst Other residues may shift the ionization free energy depending on their dipole orientation and charge.

11) total:Total pairwise interaction from other residues.

4 - Interpretation of the data

At the top of the MFE output, the pKₐ of GLUA35 is reported as 5.862, whereas the intrinsic solution pKₐ from head3.lst is 4.75, indicating an upward pKₐ shift. This shift arises from environmental effects within the protein. The positive desolvation (dsol) term favors the neutral (protonated) form by penalizing charge burial. However, the pH-dependent and interaction energy terms favor the ionized state. As a result, the total free energy is negative, indicating that the deprotonated form is overall more stable in this environment.

A general rule of thumb is that if the total energy is negative the ionized conformer is favored at that pH, if positive the neutral conformer is favored.

5 - What if we look at another pH point?

Lets calculate the MFE for the same residue at pH 3 and see if the negative or the neutral form of GLUA35 is preffered

mfe.py -p 3 -c 0.05 GLU-A0035_

Residue GLU-A0035_ pKa/Em=5.846
=================================
Terms          pH     meV    Kcal
---------------------------------
vdw0        -0.02   -1.11   -0.03
vdw1         0.07    3.90    0.09
tors        -0.02   -1.36   -0.03
ebkb        -1.30  -75.46   -1.77
dsol         2.62  152.07    3.57
offset      -0.22  -12.77   -0.30
pH&pK0       1.75  101.57    2.39
Eh&Em0       0.00    0.00    0.00
-TS          0.20   11.45    0.27
residues    -0.69  -40.19   -0.94
*********************************
TOTAL        2.38  138.10    3.25 

Here, the pH favorability shifts from negative to positive, indicating stabilization of the neutral (protonated) form of Glu at this pH. At lower pH, where the proton concentration is higher, protonation of Glu is thermodynamically favored, leading to retention of the proton on the residue.

Benchmark pKas for Lysozyme

There are 20 experimentally measured pKas in hen white lysozyme.

Experimental pKas of residues in Lysozyme

Residue pKₐ
ASP 101 4.08
ASP 119 3.2
ASP 18 2.66
ASP 48 1.6
ASP 52 3.68
ASP 66 0.9
ASP 87 2.07
CTR 2.75
GLU 35 6.2
GLU 7 2.85
HIS 15 5.36
LYS 116 10.2
LYS 13 10.5
LYS 33 10.4
LYS 96 10.8
LYS 97 10.3
TYR 20 10.3
TYR 23 9.8
LYS 1 10.8

References:

How do the experimental pKa’s compare to the calculated theoretical pKa’s?

image

These experimental pKa values have been used to benchmark MCCE and other programs that calculate pKas. For example:


What if

If you want to run calculations with different parameters such as:

  • using an alternate MCCE executable
  • setting the protein dielectric constant to 8
  • retaining explicit water molecules

Check out in customizing MCCE4 simulations here! Customizing Runs!