Exercise #5: Obtaining H-bonding pairs and microstates data
In this exercise, we will run a custom simulation on 4LZT and process its microstates ensemble to produce hydrogen-bond networks using MCCE4.
0. Pre-requisite:
Ensure you have the conda environment for mc4 activated.
conda activate mc4
Ensure you have installed MCCE4-Tools. If not, please follow these steps.
Tool: ms_hbnets
This tools processes the msout file for H-bonding microstates and H-bonding pairs, outputting their count and occupancies in their respective files, e.g. hb_pairs_pH7eH0.csv and hb_states_pH7eH0.csv.
- H-bonding microstates
- The Monte Carlo sampling microstates that contain any of the structural Hydrogen-bonding pairs
- H-bonding pairs
- The pairs of Hydrogen- donor and acceptor conformers
Help on the tool: ms_hbnets -h
usage: ms_hbnets
ms_hbnets -ph 5
ms_hbnets -n_states 30000
Gather the H-bonding conformer pairs and states occupancies
from the microstates file given a mcce dir, pH & Eh.
options:
-h, --help show this help message and exit
-mcce_dir MCCE_DIR MCCE run directory; Default: .
-ph PH Titration pH; Default: 7
-eh EH Titration Eh; Default: 0
-n_states N_STATES Number of hb states to return, possibly; Default: 25000
-v, --verbose To output more details; Default: False
1. Prepare the directory:
Enter the working directory for this exercise:
cd mcce_workflows
mkdir ex5; cd ex5
2. Get the pdb file & run p_info:
getpdb 4lzt
p_info 4lzt.pdb
3. Run a full simulation at pH7:
run_mcce4 4lzt.pdb -initial 7 -n 1 --ms
4. Run detect_hbonds (another tool in MCCE4-Tools):
ms_hbnets uses the output of detect_hbonds, which works on a pdb (step2out.pdb by default), and therefore, returns all _structural H-bonding donor and acceptors pairs; its main output is ‘step2_out_hah.txt’.
detect_hbonds
5. Run ms_hbnets:
Since we are using the default options, the tool will look for the ‘step2_out_hah.txt’ file in order to process it for the current pH.
ms_hbnets # backbone atoms are included by default, add flag --no_bk to exclude them
The outputs of
ms_hbnetsare pH-dependent as some conformers may not be free at all pH points.
Main outputs of ms_hbnets:
Three csv files that retain the ‘pHeH’ string of the msout file name in use.
expanded_hah_pH7eH0.csv
This file is both a reduced and an expanded version of the ‘step2_out_hah.txt’ file.
Reduced
H-bonds of backbone conformer pairs, or of backbone and always fixed conformers are removed, as well as several columns from the master file are filtered out (though the pairs xyz coordinates are retained).
Expanded
Extra columns flag whether a conformer is free or not, which leads to a mapping of conformer indices to H-bonds matrix indices (‘Mi’, ‘Mj’). These two columns, also present in the pairs file are used to update the expanded file with the pairs data. Other columns provide a way to reconcile/verify the correctness of a H-bonding network; for instance, columns ‘dina’ (“donor in acceptor list”) or ‘aind’ (“acceptor in donor list”) with values of 1 mean that in a graph these conformers would have at least two nodes.
hb_states_pH7eH0.csv
This file lists each H-bonding microstate count and occupancy.
- Column names: ‘state_id’, ‘ms_count’, ‘ms_occ’.
- Column types: string, integer, float.
- Example (shortened):
state_id,ms_count,ms_occ "(HIS+1A0015_006,THR01A0089_003),(SER01A0050_003,ASP-1A0048_010),(THRBKA0051_000,SER01A0060_005)",1827,0.0015225So, ‘state_id’ is a string of conformer identifier pairs (tuples).
hb_pairs_pH7eH0.csv
This file lists the effective count and occupancy of each structural H-bonding pairs found in the H-bond microstate ensemble.
- Columns names: ‘Mi’,’Mj’, ‘donor’, ‘acceptor’, ‘res_d’, ‘res_a’, ‘ms_count’, ‘ms_occ’.
- Column types: integer, integer, string, string, string, string , integer, float.
- Example (shortened):
Mi,Mj,donor,acceptor,res_d,res_a,ms_count,ms_occ 22,25,SER01A0060_005,THR01A0069_003,S_A60,T_A69,1169166,0.974305 33,69,SER01A0100_003,LYSBKA0096_000,S_A100,K_A96,1008940,0.8407833333333333
ms_hbnets is ‘W.I.P’ (work in progress)
The ‘Mi’,’Mj’ columns provide a ‘key’ to, for example, retrieve the coordinates of donors and acceptors, if the positions were needed in a graph (network) analysis, which is not yet included in the tool.
TODO: Contribution from Jose to show how to obtain a graph using the hb_pairs file.