Standard Molecular Dynamics (MD) Simulations

Standard MD simulations are useful tools that act as a “computational microscope” for understanding the dynamic structure and function of biomolecular systems. Through MD simulations, static structures become time-evolved snapshots of how molecules behave and interact over time. Structural and energetic analyses can be combined to describe the MD simulated biomolecular systems, including, but not limited to, the determination of protein stability and identifying key interactions promoting ligand activity. SilcsBio offers the utilities needed to set up, run, and analyze proteins and protein–ligand complexes. The analyses available through the standard MD simulation utility of the CGenFF Suite are:

  • MD simulations of protein, ligand, or protein–ligand complex (MD Simulations of a Protein and Ligand)

    • Protein backbone root mean squared deviation (RMSD)

    • Per-residue root mean squared fluctuation (RMSF)

    • Protein radius of gyration

    • Protein solvent accessible surface area (SASA)

    • Ramachandran plot

    • Ligand RMSD

    • Ligand RMSF

    • Ligand center-of-mass (COM)

    • Protein–ligand interaction energy

    • Ligand SASA

  • MD simulations of a membrane protein or membrane protein–ligand complex (MD Simulation for Membrane Proteins)

    • Protein backbone root mean squared deviation (RMSD)

    • Per-residue root mean squared fluctuation (RMSF)

    • Protein radius of gyration

    • Protein solvent accessible surface area (SASA)

    • Ramachandran plot

    • Ligand RMSD

    • Ligand RMSF

    • Ligand center-of-mass (COM)

    • Protein–ligand interaction energy

    • Ligand SASA

  • MD simulations of a protein in the presence of probe solutes (MD Simulation with Probes)

    • Protein backbone root mean squared deviation (RMSD)

    • Per-residue root mean squared fluctuation (RMSF)

    • Protein radius of gyration

    • Protein solvent accessible surface area (SASA)

    • Ramachandran plot

    • Protein pocket identification

In addition to the above listed analyses, users may also apply their own, custom analyses to the simulation trajectories output from the standard MD simulation utility of the CGenFF Suite.

MD Simulations of a Protein and Ligand

MD Simulations Using the CLI

SilcsBio provides command line interface (CLI) utilities to set up, run, and analyze standard MD simulations. The following steps describe how to set up, run, analyze, and clean standard MD simulations of a protein, ligand, or protein–ligand complex.

  1. Set up the MD simulation system:

To set up a standard MD simulation with a protein target, use the following command:

${SILCSBIODIR}/md/1_setup_md_boxes prot=<Protein PDB>

Warning

The setup program internally uses the GROMACS utility pdb2gmx. You may encounter errors while processing the protein PDB file at this stage. The most common reasons for errors at this stage involve mismatches between the expected residue names or atom names in the input PDB file and those defined in the CHARMM36 force-field.

To fix this problem, run the pdb2gmx command manually from within the 1_setup directory to get a detailed error message. The setup script already prints out the exact commands you need to run to see this message. The commands are also shown below:

cd 1_setup
pdb2gmx –f <PROT PDB NAME>_4pdb2gmx.pdb –ff charmm36 –water tip3p –o test.pdb –p test.top -ter -merge all

Once all the errors in the input PDB are corrected, rerun the setup script.

By default, 1 independent simulation system will be created. The simulation system will comprise your protein or protein–ligand structure solvated in a waterbox neutralized by addition of counter ions. The size of the waterbox will be set such that there is a 10 angstrom margin on all sides of the protein.

The setup command offers a number of options as shown below. The full list of options can be reviewed by simply entering the command without any options. These options are listed below:

  • Number of independent simulation systems:

    numsys=<# of simulations; default=1>
    

    If additional independent simulation systems are needed, specify the number of systems with the numsys keyword. For example, if 6 simulation systems are needed, include numsys=6 in the command.

  • Presence of ligand bound to the target protein:

    lig=<location and name of ligand mol2; default=empty/NULL>
    

    If the simulation system comprises of a protein–ligand complex structure, specify the name and location of the ligand Mol2 file to be bound to the protein with the lig keyword. Note that the ligand structure must be aligned appropriately with the protein structure. For users with access to the SILCS-Small Molecule Suite, if the structure of the ligand originates from SILCS-MC Docking or Pose Refinement, please refer to Best Protein–Ligand Complex Retrieval for further instructions.

    Tip

    If you want to run MD with only your ligand then use prot=none.

  • Application of weak harmonic positional restraints:

    core_restraints_string=<restrained residues, e.g., "r 17-21 | r 168-174"; default=empty/NULL>
    

    This option allows the application of weak harmonic positional restraints to all or specified C-alpha atoms during the MD simulations. The force constant of the weak harmonic positional restraints is ~0.12 kcal/mol/Å2(~50 kJ/mol/nm2). By default, this option is set to empty/NULL, meaning no harmonic positional restraints are applied.

    To specify which C-alpha atoms are restrained, enter their corresponding residue numbers. For example, if core_restraints_string="r 17-160 | r 168-174" is supplied, then the C-alpha atoms in only residues 17 to 160 and residues 168 to 174 will be restrained.

  • Re-orientation of residue side chains:

    scramblesc=<true/false; default=false>
    

    When set to true the scramblesc keyword re-orients the side chain dihedrals of the protein residues.

  • Simulation box size:

    margin=<system margin (Å); default=10>
    

    By default, the simulation system size is determined by adding a 10 Å margin to the size of the input protein structure. The size of the simulation box can be customized by using the margin keyword.

    Tip

    A larger margin size than the default 10 Å is recommended if the conformation of the protein is expected to change during the simulation.

    Warning

    Do not use a margin size smaller than 10 Å. A smaller margin size will increase the likelihood of artifacts due to “self” interaction accross the periodic boundary conditions.

  • Concentration of ions:

    ions_conc=<concentration of salt (M); default=0>
    

    The concentration of the ions in the simulation system can be specified in M units with the ions_conc keyword. The number of ions is calculated based on the number of water molecules in the simulation system and the the user-defined ion concentration relative to 55 M water. By default, the concentration of ions is set to 0. Note that, if an ion concentration is specified, these ions will be added in addition to neutralizing ions. Additionally, note that neutralizing ions will be added even if the ions_conc=0 if the protein or protein–ligand structure is charged.

  • Type of ions added to the simulation system:

    ions_type=<type of salt; options: KCl, NaCl; default=KCl>
    

    The cation included in the simulation system can be specified to be either Na or K using the ions_type keyword. For example, add ions_type=NaCl if you want to add sodium chloride ions to the simulation system. The default value is KCl.

  • Skip the GROMACS utility pdb2gmx:

    skip_pdb2gmx=<true/false; default=false>
    

    If the pdb2gmx utility has already been successfully run, you may skip rerunning this utility with the skip_pdb2gmx keyword.

For covalent ligand, the following parameters should be set in addition to the lig keyword (For more details see CGenFF for Covalent Ligands):

  • covalent=true

  • linkatom1 : Atom ID for the parent atom in the covalent bond based on atom order in your ligand Mol2 file.

  • linkatom2 : Atom ID for the child atom in the covalent bond based on atom order in your ligand Mol2 file. This is the atom that will be deleted in a covalent bond.

  • linkresname : Amino acid residue name from you protein to which covalent bond will be formed. Currently only CYS is supported.

  • linkresid : Residue ID for the amino acid residue.

  1. Run the MD simulation:

To perform MD simulations, use the following command:

${SILCSBIODIR}/md/2a_run_md prot=<Protein PDB> cycles=<number of cycles to run>

The command to run MD simulations offers a number of options as shown below. The full list of options can be reviewed by entering the command without any options.

Optional parameters:

  • Duration of the MD simulation:

    cycles=<# of production cycles with 10 ns/cycle; default=50>
    

    By default, the MD simulations are run in continuous cycles of 10 ns. The duration of the simulation can be specified using the cycles keyword. The default value is set to 50, which will correspond to a continuous 500 ns simulation split into 50 trajectory files of 10 ns each.

  • Temperature of the MD simulations:

    temp=<temperature of simulation; default=310>
    

    By default the temperature of the simulation system is set to 310 K. The temperature of the simulation system can be specified with the temp keyword. For example, use temp=300 to set the temperature of the simulation system to 300 K.

  • Manually submit simulation jobs to the queuing system:

    batch=<true/false, true: generate files but do not submit jobs; default=false>
    

    To generate the simulation input files without submitting the simulation jobs, set the batch keyword to batch=true. This keyword may be benefitial to those who wish to modify the input files before performing the simulations.

  • Number of cores per simulation:

    nproc=<# of cores per simulation; default nproc=8>
    

    If the MD simulation will be performed using only CPUs, then the number of CPUs used to run the MD simulation can be specified using the nproc keyword. For GPU+CPU nodes, GROMACS performs best using the default settings (nproc=8). Set nproc if less than 8 CPUs are available for the simulation.

  • Continue the simulation from a previous cycle

    restart=<true/false, true:- restart all runs from last cycle; default=false>
    

    Set the restart keyword to restart=true if you want to restart all the simulation runs from the last cycle. To restart a specific run, the run can be specified. For example, if only simulation number 2 should be restarted, restart=#2 should be specified.

  1. Collect the simulation trajectory:

The resulting trajectory can be collected by using the following command.

${SILCSBIODIR}/md/2b_trjcat prot=<Protein PDB>

The above command will first create a concatenated trajectory for each independent simulation system. Each trajectory will be saved as 2b_trajcat/traj_X_fit.xtc file with X indicating the system number. In the case that only one simulation system was run, there will only be a 2b_trajcat/traj_1_fit.xtc.

Once the trajectory from independent simulations are concatenated, a single trajectory is created that concatenates the protein trajectories created from independent simulations for use in analysis. The trajectory will be translated and rotated to fit to the first frame of the trajectory. traj.pdb and traj.xtc will be your final output files.

Optional parameters:

  • Number of independent simulation systems:

    numsys=<# of simulations; default based on 2a_run_md step>
    

    If additional independent simulation systems need to be collected, specify the number of systems with the numsys keyword. For example, if 6 simulation systems are to be collected, include numsys=6 in the command. The default value is based on the numsys specified previously with 2a_run_md.

  • Number of cycles (Duration of the MD simulation):

    cycles=<# of production cycles with 10 ns/cycle; default based on 2a_run_md step>
    

    To change the number of cycles from each independent simulation systems to collect, specify the cycles keyword with the number of desired cycles. The default value is based on the cycles specified previously with 2a_run_md.

  • Periodic Boundary Condition:

    pbctype=<type of periodic boundary condition treatment; res/atom/nojump/cluster/whole; default=whole>
    

    By default, the trajectory will be collected with periodic boundary condition to make molecules whole. The periodic boundary conditions used to collect the simulation trajectories can be specified using the pbctype keyword.

    Tip

    For simulations of protein–ligand complex structures in which the ligand is not covalently bound to the protien, using pbctype=nojump may be benefitial. For simulations of multi-domain proteins, using pbctype=whole may be beneficial. Further processing of the trajectory can be performed using the trjconv utility from GROMACS if desired using the following commands:

    cp traj.xtc traj_bak.xtc
    $GMXDIR/gmx trjconv -s ref.tpr -f traj_bak.xtc -pbc ${pbctype} -center -o traj.xtc
    

Additional collection parameters:

  • Keep solvent molecules

    full=<true/false, true:- collect with all atoms; default=false>
    

    By default, the trajectories will be stripped of all solvent molecules. To retain the solvent molecules in the trajectories, set the full keyword to full=true.

  • Strip hydrogens from the simulation trajectory:

    nohyd=<true/false, true: collect with all atoms except hydrogens; default=false>
    

    By default, hydrogen atoms are retained when the simulation trajectory is collected. To strip hydrogens from the trajectories, set the nohyd keyword to nohyd=true.

  1. Perform basic trajectory analyses:

The traj.xtc trajectory file will be analyzed for a number of common properties such as RMSD, RMSF, RGYR and interaction energies.

${SILCSBIODIR}/md/3_analyze prot=<Protein PDB>

The command will perform the following analyses using GROMACS:

  • Backbone root mean squared deviation (RMSD):

    rmsdbb=<calc rmsd of protein backbone; true/false; default=true>
    

    Calculate the RMSD of the protein backbone. The resulting values are in units.

  • Per-residue root mean squared fluctuation (RMSF):

    rmsfprot=<calc rmsf of protein per residue; true/false; default=true>
    

    Calculate the per-residue RMSF of the simulated protein. The resulting values are in units.

  • Protein radius of gyration (RGYR):

    rgyr=<calc radius of gyration of protein; true/false; default=true>
    

    Calculate the RGYR of the simulated protein. The resulting values are in units.

  • Protein solvent accessible surface area (SASA):

    sasaprot=<sasa of protein residues; e.g., "1" or "r 1-9 | r 16-24 & 1"; default=none>
    

    Calculate the SASA of the simulated protein. The selection terms defined in the index groups generated by the gmx make_ndx command can be used to specify which residues to calculate SASA. For example, sasaprot=1 will select all protein residues. The resulting values are in units.

  • Ramachandran plot for Phi-Psi dihedrals:

    phipsi=<calc phi-psi angles and get Ramachandran plot; "all" or "2-9,34-66"; default=false>
    

    Calculate the phi-psi dihedral distributuion of the simulated protein and generate a Ramachandran plot. Default value is false. To consider all residues, use phipsi=all. To consider only specific residues, use , and - to define the range of residues. For example, phipsi=2-9,34-66 will consider residues 2 to 9 and 34 to 66.

If a ligand is present, the following analyses will be performed:

  • Ligand root mean squared deviation (RMSD)

    rmsdlig=<calc rmsd of ligand; true/false; default=true>
    

    Calculate the RMSD of the simulated ligand. The resulting values are in units.

  • Ligand root mean squared fluctuation (RMSF)

    rmsflig=<calc rmsf of ligand residue; true/false; default=true>
    

    Calculate the per-residue RMSF of the simulated ligand. The resulting values are in units.

  • Center-of-mass (COM) of the ligand:

    comlig=<calc center of mass of ligand; true/false; default=true>
    

    Calculate the COM the simulated ligand. The resulting values are in units.

  • Interaction energy between the protein and ligand:

    IEprotlig=<calc interaction energy between protein and ligand; true/false; default=false>
    

    Calculate the interaction energy between the protein and the ligand. The resulting values are in units.

    Note

    The interaction energy represents the strength of interaction between two molecules (in vacuum) and does not account for polar interaction solvation energy. Thus, the interaction energy should not be confused with binding affinity. Interaction energy and its decomposition into van der Waals and electrostatic components can give insights into the driving forces of protein–ligand binding.

  • Solvent accessible surface area (SASA) of the simulated ligand:

    sasalig=<calc sasa of ligand; true/false; default=false>
    

Additional Parameters:

  • Analysis output directory:

    anlyzdir=<name of analysis output directory; default=3_analyze>
    

    The output data and plots will be collected in the 3_analyze directory (or the user-specified directory). This directory can be downloaded to a local machine for visualization and analysis.

  • Python executable path:

    python=<path to python executable for plotting; default=>
    

    This script uses the Matplotlib python package for plotting the data. The path to the python executable with the matplotlib module installed must be specified. The default python path can be checked using the which python command. Please refer to Python 3 Requirement for more information.

Note

The plots use boxplots to show the distribution of data. The boxplot is a standardized way of displaying the distribution of data based on the five number summary: minimum, first quartile, median, third quartile, and maximum. More information can be found at https://en.wikipedia.org/wiki/Box_plot

  1. Clean and compress MD data:

After performing the analyses the output data may be cleaned to save disk space by removing unnecessary files and compressing the trajectories. SilcsBio provides the 4_cleanup utility to facilitate the data cleanup:

${SILCSBIODIR}/md/4_cleanup prot=<Protein PDB>

The command 4_cleanup will delete unnecessary files in the 2a_run_md directory and compress the raw trajectory data. Individual run directories can be deleted by re-running the command with the delete=true keyword included; the default value is delete=false:

${SILCSBIODIR}/md/4_cleanup prot=<Protein PDB> delete=true

MD Simulation for Membrane Proteins

MD Simulations for Membrane Proteins Using the CLI

SilcsBio provides the utility to run standard MD simulations of membrane proteins only in the command line interface (CLI). The following steps describe how to set up, run, analyze, and clean standard MD simulations of a membrane protein.

  1. Set up the MD simulation system:

To set up a standard MD simulation with a membrane protein target, you may use either a bare protein structure or your own pre-built protein/bilayer system with MD module.

If you are beginning with a bare protein, SilcsBio provides a command line utility to embed transmembrane proteins, such as GPCRs, in a bilayer of 9:1 POPC/cholesterol for subsequent MD simulations:

${SILCSBIODIR}/md/memb/1a_fit_protein_in_bilayer prot=<Protein PDB>

This command will align the first principal axis of the protein with the bilayer normal (Z-axis) and translate the protein center of mass to the center of the bilayer (Z=0).

If the protein is already oriented as desired, orient_principal_axis=false will suppress automatic principal axis-based alignment.

${SILCSBIODIR}/md/memb/1a_fit_protein_in_bilayer prot=<Protein PDB> orient_principal_axis=false

By default the system size is set to 120 Å along the X and Y dimensions. To change the system size, use the bilayer_x_size and bilayer_y_size options.

${SILCSBIODIR}/md/memb/1a_fit_protein_in_bilayer prot=<Protein PDB> bilayer_x_size=<X dimension> bilayer_y_size=<Y dimension>

By default, the script fits the bare protein into the lipid bilayer such that the center of mass of the protein is aligned with the center of mass of the bilayer. This may not be the best choice for your membrane proteins, and you may therefore want to adjust the position of your protein according to a recommendation from an external program such as the OPM server. In this case you can adjust the relative position of the protein with respect to the position of the bilayer by using the offset_z option to specify the distance by which the center of mass of the protein should be offset in Z-direction from the center of the bilayer.

${SILCSBIODIR}/md/memb/1a_fit_protein_in_bilayer prot=<Protein PDB> offset_z=<distance_in_Z-direction>

Note

To use the OPM server (https://opm.phar.umich.edu/ppm_server) output as a guide to build a transmembrane system for MD simulations, please see How do I fit my membrane protein in a bilayer as suggested by the OPM server?

The resulting protein/bilayer system will be output in a PDB file with suffix _popc_chol. It is important to use molecular visualization software at this stage to confirm correct orientation of the protein and size of the bilayer before using this output as the input in the next step.

Alternatively, you may use a protein/bilayer system that has been built using other software tools.

The following command will prepare the SILCS simulation box by adding water and neutralizing ions to the protein/bilayer system:

${SILCSBIODIR}/md/memb/1b_setup_md_with_prot_bilayer prot=<Protein/bilayer PDB>

By default, 1 independent simulation system will be created. The setup command offers a number of options. The full list of options can be reviewed by simply entering the command without any options. The options are same as described in step 1 for MD simulations of globular proteins MD Simulations Using the CLI.

Tip

You may use batch=false option to run this step on your head node. Note that it can sometimes take few hours to finish this step for very large systems.

  1. Run the MD simulation:

To perform MD simulations, use the following command.

${SILCSBIODIR}/md/2a_run_md prot=<Protein/bilayer PDB> cycles=<number of cycles to run>

By default, this command detects memb=true as this is membrane protein.

For other parameters, see step 2 of MD Simulations Using the CLI

  1. Collect the simulation trajectory:

The resulting trajectory can be collected by using the following command.

${SILCSBIODIR}/md/2b_trjcat prot=<Protein/bilayer PDB>

For parameters, see step 3 of MD Simulations Using the CLI

  1. Perform basic trajectory analyses:

The traj.xtc trajectory file will be analyzed for RMSD, RMSF, RGYR, Interaction energies and SASA.

${SILCSBIODIR}/md/3_analyze prot=<Protein/bilayer PDB>

For parameters, see step 4 of MD Simulations Using the CLI

The output data and plots will be collected in the 3_analyze directory (or the user-specified directory). This directory can be downloaded to a local machine for visualization and analysis.

  1. Clean and compress MD data:

After the analyses you can clean the output data for unnecessary files and compress the trajectories.

${SILCSBIODIR}/md/4_cleanup prot=<Protein/bilayer PDB>

The command will delete the un-necessary files in 2a_run_md directory and compress the raw trajectory data for you.

You can then delete the individual run directories by re-running the command with delete=true. Default value is false.

${SILCSBIODIR}/md/4_cleanup prot=<Protein/bilayer PDB> delete=true

MD Simulation with Probes

Background

MD simulations encompassing proteins in the presence of organic probes sample the interactions between small functional groups and proteins and are widely used in various computer-aided drug discovery methods.

The addition of hydrophobic probes in MD simulations can open and stabilize the openings of cryptic pockets that may otherwise collapse during a simulation. Cryptic pockets are a type of binding pocket that are typically not apparent in a crystal structure, but can be induced to open when a ligand is nearby. It is hypothesized that the hydrophobic side chains that often encompass the cryptic pocket and surround the opening of the pocket must undergo conformational changes to fully open the cryptic pocket. Such conformational changes can be facilitated in MD simulations through the inclusion of apolar organic solutes such as benzene or propane.

MD-with-Probe Simulations Using the CLI

SilcsBio provides the utility to run MD simulations of proteins in the presence of probe molecules. The following steps describe how perform MD simulations with probes using the command line interface (CLI).

  1. Set up the MD-with-probe simulation:

To set up a standard MD simulation with a protein target and probe molecules, use the following command.

${SILCSBIODIR}/md/probe/1_setup_md_boxes prot=<Protein PDB> probe=<probe to be added>

By default, 6 independent simulation systems will be created with 10 angstrom margin on all sides and neutralized by addition of counter ions.

The setup command offers same options as shown in step 1 of MD Simulations Using the CLI with additional options shown below. The full list of options can be reviewed by simply entering the command without any options.

  • The probe molecule. By default, benx (benzene) will be used as probe. SilcsBio also supports the use of prpx (propane), form (formamide), dmee (dimethylether), meoh (methanol), and imia (imidazole):

    probe=<probe to be added; default=benx>
    
  • Concentration of the probe (in M). The concentration of the probe within the simulation system can be specified using the conc keyword followed by the desired concentration in M. For example, add conc=0.25 for a probe concentration of 0.25 M. The default value is 1:

    conc=<concentration of probe; default=1 M>
    
  • Different concentration of the probe across independent simulations (in M). For MD simulations of a protein in the presence of different concentration of the probe, the diffconc=true flag will prompt the SilcsBio software to assign different concentrations to each simulation system. For example, including diffconc=true in conjunction with probe=benx, numsys=6, and conc="0 0.2 0.4 0.6 0.8 1" will result in 6 independent simulation systems of the protein in (1) the absence of benzene probes, and in the presence of (2) 0.2 M benzene, (3) 0.4 M benzene, (4) 0.6 M benzene, (5) 0.8 M benzene, and (6) 1 M benzene.

    diffconc=<true/false, true: numsys=6 & conc="0 0.2 0.4 0.6 0.8 1"; default=false>
    
  1. Run the MD simulation:

To perform MD simulations, use the following command.

${SILCSBIODIR}/md/2a_run_md prot=<Protein PDB> cycles=<number of cycles to run>

Optional parameter:

  • Duration of the MD simulation:

    cycles=<# of production cycles with 10 ns/cycle; default=20>
    

    By default, the MD simulations are run in continuous cycles of 10 ns. The duration of the simulation can be specified using the cycles keyword. The default value is set to 20, which will correspond to a continuous 200 ns simulation split into 20 trajectory files of 10 ns each.

For other parameters, see step 2 of MD Simulations Using the CLI. All parameters can also be viewed by entering ${SILCSBIODIR}/md/2a_run_md with no inputs.

  1. Collect the simulation trajectory:

The resulting MD simulation trajectory can be collected by using the following command.

${SILCSBIODIR}/md/2b_trjcat prot=<Protein PDB>

For parameters, see step 3 of MD Simulations Using the CLI. The parameters can also be viewed by entering ${SILCSBIODIR}/md/2b_trjcat with no inputs.

  1. Perform basic trajectory analyses:

The traj.xtc trajectory file will be analyzed to output the root mean squared deviation (RMSD) of the protein, per-residue root mean squared fluctuation (RMSF) of the protein, radius of gyration (RGYR) of the protein, interaction energies of the protein to the probe molecules, and solvent accessible surface area (SASA) of the protein.

${SILCSBIODIR}/md/3_analyze prot=<Protein PDB>

For information on the parameters and available trajectory analyses, see step 4 of MD Simulations Using the CLI. All available parameters and trajectory analysis can also be viewed by entering ${SILCSBIODIR}/md/3_analyze with no inputs.

The output data and plots will be collected in the 3_analyze directory (or the user-specified directory). This directory can be downloaded to a local machine for visualization and analysis.

  1. Identify pockets:

To quickly identify a pocket, use the mdpocket tool that comes with an open source project fpocket.

mdpocket --trajectory_file traj.xtc --trajectory_format xtc -f traj.pdb

The above command creates several files, however the output file, mdpout_dense_grid.dx, will contain all data necessary for identifying pockets. The mdpout_dense_grid.dx file contains the probability of an open cavity existing in each voxel. This information can conveniently be visualized using PyMOL or VMD.

For PyMOL, use the following commands:

load traj.pdb
load mdpout_dens_grid.dx
isosurface pocket, mdpout_dens_grid, <cutoff>

The higher the <cutoff> value is the volume becomes smaller as those voxels are more open during the simulation. Typically using 2 or 3 should give more transient pockets and 5 or 6 should give more open pocket.

  1. Clean and compress MD data:

To clean and compress MD data, see step 5 of MD Simulations Using the CLI.