Exercise #2: pKₐ Calculation
In this exercise, we will run your first MCCE4 simulation to perform a pKₐ calculation (pH titration + pKₐ fitting) using a real protein file!
Background
Lysozyme is a small enzyme that dissolves bacterial cell walls, thus killing bacteria. It was discovered as the first antibiotic to inhibit bacterial growth in food — before penicillin. For this example will be using a lysozyme structure from PDB ID: 4LZT.
Experimental pKₐ values of all lysozyme residues are provided at the bottom of this tutorial.
Here, we will focus on two residues in the active site with perturbed pKₐ values:
- GLU 35 acts as a proton donor.
- ASP 52 stabilizes the charged intermediate.
For lysozyme to attack the glucose molecule of the substrate:
- GLU 35 needs a high pKₐ to remain protonated and donate a proton to the glycosidic oxygen.
- ASP 52 needs a low pKₐ to remain deprotonated and stabilize the reaction intermediate.
Reference: Jens Erik Nielsen and J. Andrew McCammon, Protein Sci. 2003 Sep; 12(9): 1894–1901
0. Pre-requisite:
Ensure you have the conda environment for mc4 activated.
conda activate mc4
1. Prepare the Calculation
Enter the working directory for this exercise:
cd mcce_workflows
mkdir ex2; cd ex2
Download the PDB file for 4LZT:
getpdb 4lzt
A successful download should display the following message:
[ INFO ] Download completed: 4lzt.pdb
We strongly recommend to run
p_infoto inspect an unfamiliar PDB file and verify if it is compatiblity with MCCE4 prior to performing a simulation.p_info 4lzt.pdb
2. Perform pKₐ calculation using run_mcce4
The easiest way to run a MCCE4 simulation is with run_mcce4. It is preset to run a full simulation (ending with a titration) and return the pKₐs of ionizable residues into one of its output files called pK.out upon successful completion.
run_mcce4 4lzt.pdb
- The conformer occupancies are in file
fort.38. - The net charge is in file
sum_crg.out. - The calculated pKₐs are in file
pK.out
See here for more details on what’s exactly happening when running this run_mcce4 or customizing runs!
3. Interpret pKₐ results
A trimmed version of the pKₐ/Eₘ report is in file pK.out, which contains the calculated pKₐ/Eₘ values for titratable side chains. You can see the full report in pK_extended.out.
- pKa/Em : Calculated MCCE4 pKₐ/Eₘ values
- n (slope) : Slope of titration curve (extrapolated from
fort.38) and the Henderson-Hasselbalch equation. - 1000×chi2 : 1000 times the chi-squared value. Higher the number, the less accurate the result.
head -n 1 pK.out && tail -n +2 pK.out | sort -k1.1,1.14
pH pKa/Em n(slope) 1000*chi2
ARG+A0005_ 12.839 0.887 0.079
ARG+A0014_ 13.576 0.951 0.017
ARG+A0021_ 13.333 0.762 0.286
ARG+A0045_ 12.880 0.872 0.880
ARG+A0061_ 13.879 0.818 0.003
ARG+A0068_ >14.0
ARG+A0073_ 12.951 0.890 0.247
ARG+A0112_ 13.099 0.905 0.025
ARG+A0114_ 13.279 0.957 0.014
ARG+A0125_ 13.155 0.891 0.079
ARG+A0128_ 13.548 0.921 0.005
ASP-A0018_ 2.264 0.915 0.033
ASP-A0048_ 1.468 0.980 0.080
ASP-A0052_ 2.389 0.987 0.209
ASP-A0066_ 2.849 0.938 0.011
ASP-A0087_ 2.105 0.981 0.013
ASP-A0101_ 4.227 0.995 0.022
ASP-A0119_ 3.684 0.995 0.087
CTR-A0129_ 2.586 0.893 0.059
GLU-A0007_ 3.319 0.961 0.004
GLU-A0035_ 5.569 0.987 0.069
HIS+A0015_ 6.488 1.010 0.012
LYS+A0001_ 9.554 0.997 0.036
LYS+A0013_ 11.194 0.958 0.033
LYS+A0033_ 10.671 0.959 0.037
LYS+A0096_ 10.758 0.786 0.399
LYS+A0097_ 10.863 0.880 0.030
LYS+A0116_ 9.544 0.953 0.087
NTR+A0001_ 7.418 0.995 0.036
TYR-A0020_ 13.615 0.592 0.249
TYR-A0023_ 10.679 0.881 0.025
TYR-A0053_ >14.0
From the result we can see the MCCE4 calculated GLU 35 (pKₐ = 5.6) with a higher pKₐ than ASP 52 (pKₐ = 2.4).
So how do we calculate these pKₐ values?
Let’s take a look at the two other output files produced:
fort.38- Table of the occupancy of each side-chain conformer over the course of the titration. From the titration curve of this lysine in 4lzt, the pKₐ can be determined by identifying the midpoint of the curve where the occupancy is 0.5. This is where the lysine is 50% ionized and 50% non ionized. Projecting this midpoint onto the pH axis yields the pKₐ value.
sum_crg.out- Reports the net charge of every residue at each pH. Plotting the charges of this lysine against their respective pH’s and again drawing a line from the midpoint of the titration curve to the pH axis can determine it’s pKa.
Why are the leading factors that contribute to these calculated pKₐ values?
To analyze the ionization factors of an ionizable residue at it’s midpoint with a pairwise energy cutoff of 0.1, we can use the Mean-field energy (MFE) tool:
mfe.py ASP-A0052_ -c 0.1
To analyze the ionization factors of this residue at pH 7 with a pairwise energy cutoff of 0.1:
mfe.py ASP-A0052_ -p 7 -c 0.1
- The -c flag is to set the minimum pairwise energy cutoff. Thus only terms >cutoff will be outputted.
To learn more about how pKa’s shift, see the mean field energy tutorial!
MFE tutorial: Extra Tool to help analyze pK.out
Continuing the anlaysis of lysozyme, we’re going to use the results of a pKa calculation to perform a more detailed analysis of a specific residue, GLUA35, in Lysozyme. The goal is to move beyond raw pKa values and examine the energetic factors that determine the protonation state of this residue and how the pkA shifts according to the environment inside the protien.
What does MFE do?
MFE (mean field energy) calculates the mean field ionization energy on an ionizable residue at a specific pH/eH. MFE provides the energy interctions of ionized residues versus its neutral form and quantifies the factors that determine which form is favored at a specific pH/eH.
0 - Files needed
Files needed to run this tool is fort.38, head3.lst, pK.out, and sum_crg.out.
1 - Usage
If you have succesfully installed the MCCE-Tools you should be able to call the tool from any directory. To use the program correctly you can look up the name of the residue of interest in the sum_charge.out and paste it in the command line
mfe.py -p 7 -c 0.05 GLU-A0035_
The -p flags defines at what pH the analysis is done. The -c flag is the energy cutoff that defines the energy minimum for residues interaction.
2 - Output
Output for the analysis should look like this for GLUA35 in Lysozyme. Each term in the row has been defined in a short sentence:
Residue GLU-A0035_ pKa/Em=5.862 (mcce calculated pKa of the residue interest)
=================================
Terms pH meV Kcal
---------------------------------
vdw0 -0.02 -1.11 -0.03 (interactions of the residue with itself)
vdw1 0.07 3.90 0.09 (side chain interaction with the backbone)
tors -0.02 -1.37 -0.03 (torsion energy)
ebkb -1.32 -76.40 -1.80 (backbone energy)
dsol 2.64 153.26 3.60 (solvation energy loss)
offset -0.22 -12.77 -0.30 (modifiable energy term)
pH&pK0 -2.25 -130.60 -3.07 (effect of pH in ionizatoin)
Eh&Em0 0.00 0.00 0.00 (effect of eH on redox reactions)
-TS 0.20 11.66 0.27 (entropy term)
residues -0.18 -10.49 -0.25 (sum of pairwise energy interactoins)
*********************************
TOTAL -1.10 -63.91 -1.50 sum_crg (ionizations state of neighboring residues)
*********************************
LYSA0033_ -0.12 -7.21 -0.17 1.00
SERA0036_ -0.14 -8.27 -0.19 0.00
ASNA0044_ -0.08 -4.67 -0.11 0.00
ARGA0045_ -0.05 -3.03 -0.07 1.00
ASPA0048_ 0.19 11.02 0.26 -1.00
ASPA0052_ 0.79 46.07 1.08 -1.00
GLNA0057_ -0.09 -5.16 -0.12 0.00
ASNA0059_ -0.05 -3.08 -0.07 0.00
ARGA0061_ -0.13 -7.51 -0.18 1.00
ASPA0066_ 0.09 5.38 0.13 -1.00
ARGA0112_ -0.25 -14.62 -0.34 1.00
ARGA0114_ -0.21 -12.13 -0.28 1.00
=================================
Total is sum of all energy terms described above
3 - Understanding the Ouptut
MFE determines the energy interaction of the ionized and neutral form of the selected conformer (ionized form energy - neutral form energy).
Columns
Each column represents the same energy interaction but in different units (pH, meV and kcal/mol)
Conversion of units for each energy term
At 298 K (room temp):
Energy in meV (single proton): • ΔE(meV) ≈ 59.16 * ΔpH
Energy in kcal/mol: • ΔG(kcal/mol) ≈ 1.364 * ΔpH
In depth definitions of terms on rows
All these values are determined bu subtracting the
1) Vdw0: vdW interactions within the conformer (a side chain or ligand) + interaction with implicit solvent, obtained from head3.lst
2) Vdw1: vdw interaction of this conformer with the protein backbone, obtained from head3.lst
3) tors: torsion energy of the conformer, obtained from head3.lst
4) ebkb: energy of the backbone. obtained from head3.lst
5) dsol: Loss of solvation energy compared with this conformer in solution. It should be positive as it is a loss in energy, obtained from head3.lst
6) offset: energy term that can be freely modified, obtained from head3.lst
7) pH&pK0: Solution pH effect on ionization. It is the environmental pressure on residue ionization. For an acid, low solution pH makes ionization (releasing a proton) easy, so it contributes favorable energy. For a base, low pH makes ionization harder. When pH equals the residue’s solution pKa, the environment pH is at a balance point, where the contribution is 0. obtained from head3.lst and pK.out
8) Eh&Em0: Environment Eh effect on redox reaction. This works similarly to pHpK0. obtained from pK Out and head3.lst
9) -TS: Entropy term.The number of rotamers of neutral and ionized residues generated by MCCE may differ. The effect of different rotamer counts on the two ionization states acts like entropy. obtained from pk.out and head3.lst
10) residues: Total pairwise interaction from other residues. obtianed from from fort.38 and head3.lst Other residues may shift the ionization free energy depending on their dipole orientation and charge.
11) total:Total pairwise interaction from other residues.
4 - Interpretation of the data
At the top of the MFE output, the pKₐ of GLUA35 is reported as 5.862, whereas the intrinsic solution pKₐ from head3.lst is 4.75, indicating an upward pKₐ shift. This shift arises from environmental effects within the protein. The positive desolvation (dsol) term favors the neutral (protonated) form by penalizing charge burial. However, the pH-dependent and interaction energy terms favor the ionized state. As a result, the total free energy is negative, indicating that the deprotonated form is overall more stable in this environment.
A general rule of thumb is that if the total energy is negative the ionized conformer is favored at that pH, if positive the neutral conformer is favored.
5 - What if we look at another pH point?
Lets calculate the MFE for the same residue at pH 3 and see if the negative or the neutral form of GLUA35 is preffered
mfe.py -p 3 -c 0.05 GLU-A0035_
Residue GLU-A0035_ pKa/Em=5.846
=================================
Terms pH meV Kcal
---------------------------------
vdw0 -0.02 -1.11 -0.03
vdw1 0.07 3.90 0.09
tors -0.02 -1.36 -0.03
ebkb -1.30 -75.46 -1.77
dsol 2.62 152.07 3.57
offset -0.22 -12.77 -0.30
pH&pK0 1.75 101.57 2.39
Eh&Em0 0.00 0.00 0.00
-TS 0.20 11.45 0.27
residues -0.69 -40.19 -0.94
*********************************
TOTAL 2.38 138.10 3.25
Here, the pH favorability shifts from negative to positive, indicating stabilization of the neutral (protonated) form of Glu at this pH. At lower pH, where the proton concentration is higher, protonation of Glu is thermodynamically favored, leading to retention of the proton on the residue.
Benchmark pKas for Lysozyme
There are 20 experimentally measured pKas in hen white lysozyme.
Experimental pKas of residues in Lysozyme
| Residue | pKₐ |
|---|---|
| ASP 101 | 4.08 |
| ASP 119 | 3.2 |
| ASP 18 | 2.66 |
| ASP 48 | 1.6 |
| ASP 52 | 3.68 |
| ASP 66 | 0.9 |
| ASP 87 | 2.07 |
| CTR | 2.75 |
| GLU 35 | 6.2 |
| GLU 7 | 2.85 |
| HIS 15 | 5.36 |
| LYS 116 | 10.2 |
| LYS 13 | 10.5 |
| LYS 33 | 10.4 |
| LYS 96 | 10.8 |
| LYS 97 | 10.3 |
| TYR 20 | 10.3 |
| TYR 23 | 9.8 |
| LYS 1 | 10.8 |
References:
- Kuramitsu, S. and K. Hamaguchi, J Biochem (Tokyo), 1980. 87(4): p. 1215-9
- Takahashi, T., H. Nakamura, and A. Wada, Biopolymers, 1992. 32: p. 897-909
How do the experimental pKa’s compare to the calculated theoretical pKa’s?
These experimental pKa values have been used to benchmark MCCE and other programs that calculate pKas. For example:
- Sham, Y. Y., I. Muegge, and A. Warshel. 1999. Simulating proton trans- locations in proteins: probing proton transfer pathways in the Rhodobacter sphaeroides reaction center. Proteins. 36:484–500
- You, T. J., and D. Bashford. 1995. Conformation and hydrogen ion titration of proteins: a continuum electrostatic model with conformational flexi- bility. Biophys. J. 69:1721–1733
- Antosiewicz, J., J. A. McCammon, and M. K. Gilson. 1996. The determi- nants of pKa’s in proteins. Biochemistry. 35:7819–7833
What if
If you want to run calculations with different parameters such as:
- using an alternate MCCE executable
- setting the protein dielectric constant to 8
- retaining explicit water molecules
Check out in customizing MCCE4 simulations here! Customizing Runs!