SILCS Simulations ----------------- Background ^^^^^^^^^^ .. _fundamental_idea: Fundamental Idea Behind SILCS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The design of small molecules that bind with optimal specificity and affinity to their biological targets, typically proteins, is based on the idea of complementarity between the functional groups in a small molecule and the binding site of the target. Traditional approaches to ligand identification and optimization often employ the one binding site/one ligand approach. While such an approach might be straightforward to implement, it is limited by the resource-intensive nature of screening and evaluating the affinity of large numbers of diverse molecules. Functional group mapping approaches have emerged as an alternative, in which a series of maps for different classes of functional groups encompass the target surface to define the binding requirements of the target. Using these maps, medicinal chemists can focus their efforts on designing small molecules that best match the maps. Site Identification by Ligand Competitive Saturation (SILCS) offers rigorous free energy evaluation of functional group affinity pattern for the entire 3D space in and around a macromolecule (globular protein :cite:`Guvench:2009`, membrane protein, or RNA :cite:`Kognole:2022`). The SILCS method yields functional group free energy maps, or FragMaps, which are precomputed and then used to rapidly facilitate ligand design. FragMaps are generated by molecular dynamics (MD) simulations that include macromolecule flexibility and explicit solvent/solute representation, thus providing an accurate, detailed, and comprehensive set of data that can be used in database screening, fragment-based drug design, and lead optimization of small molecules. In the context of biological therapeutics, the comprehensive nature of the FragMaps is of utility for excipient design as all possible binding sites of all possible excipients and buffers can be identified and quantified. .. _intro_silcs_fragmap: .. figure:: images/intro_silcs_fragmap.png :width: 80% Illustration of fragment density maps. :numref:`intro_silcs_fragmap` illustrates FragMaps generation. Two MD simulations of a probe molecule (e.g., benzene) are performed in the presence of a protein and in aqueous solution (without protein), resulting in two occupancy maps, :math:`O^\mathrm{target}` and :math:`O^\mathrm{bulk}`, respectively. The GFE (grid free energy) FragMaps are then derived by the following formula :cite:`Raman:2011`. .. math:: \mathrm{GFE}^T_{x,y,z} = -RT \log \frac{O^\mathrm{target}_{x,y,z}}{\left< O^\mathrm{bulk} \right>} By generating FragMaps using various small solutes (probes) that include a diversity of atoms and chemical functional groups, it is possible to map the functional group affinity pattern of a protein. FragMaps encompass the entire protein such that all FragMap types are in all regions. Thus, information on the affinity of all the different types of functional groups is available in and around the full 3D space of the protein or other target molecule. A Standard SILCS simulation uses benzene, propane, methanol, imidazole, formamide, dimethylether, methylammonium, acetate, and water as probes to generate Standard SILCS FragMaps. A Halogen SILCS simulation uses fluorobenzene, chlorobenzene, bromobenzene, fluoroethane, chloroethane, and trifluoroethane as probes to generate Halogen SILCS FragMaps. Halogen SILCS FragMaps are used to augment the information provided by Standard SILCS FragMaps. FragMaps may be used in a qualitative fashion to facilitate ligand design by allowing the medicinal chemist to readily visualize regions where the ligand can be modified or functional groups added to improve affinity and specificity. As the FragMaps include protein flexibility, they indicate regions of the target protein that can "open" thereby identifying regions under the protein surface accessible for ligand binding. In addition to FragMaps, an Exclusion Map is generated based on regions of the system that are not sampled at all by any probe molecules, including water, during the MD simulations. Therefore, the Exclusion Map represents a strictly inaccessible surface. Many proteins contain binding sites that are partially or totally inaccessible to the surrounding solvent environment that may require partial unfolding of the protein for ligand binding to occur. SILCS sampling of such deep or inaccessible pockets is facilitated by the use of a Grand Canonical Monte Carlo (GCMC) sampling technique in conjunction with MD simulations :cite:`Lakkaraju:2014`. Thus, the SILCS technology is especially well-suited for targeting the deep or seemingly inaccessible binding sites found in targets like GPCRs and nuclear receptors. Once a set of ligand-independent SILCS FragMaps are produced, they can be used for various purposes that rapidly rank ligand binding in a highly computationally efficient manner for multiple ligands **WITHOUT** recalculating the FragMaps. Applications of SILCS methodology include: * Binding site identification * Binding pocket searching with known ligands * Binding pocket identification via pharmacophore generation * Binding pocket identification via fragment screening * Database screening * SILCS 3D pharmacophore models * Ligand posing using available techniques (Catalyst, etc) * Ligand ranking * Fragment-based ligand design * Identification of fragment binding sites * Estimation of ligand affinity following fragment linking * Expansion of fragment types * Ligand optimization * Qualitative ligand optimization by FragMaps visualization * Quantitative evaluation of ligand atom contributions to binding * Quantitative estimation of relative ligand affinities * Quantitative estimation of large numbers of ligand chemical transformations SILCS generates 3D maps of interaction patterns of functional group with your target molecule, called FragMaps. This unique approach simultaneously uses multiple different small solutes with various functional groups in explicit solvent MD simulations that include target flexibility to yield 3D FragMaps that encompass the delicate balance of target-functional group interactions, target desolvation, functional group desolvation and target flexibility. The FragMaps may then be used to both qualitatively direct ligand design and quantitatively to rapidly evaluate the relative affinities of large numbers of ligands following the precomputation of the FragMaps. This chapter goes over the workflow for generating SILCS FragMaps. .. _FragMap_names_and_underlying_probe_atoms: FragMap names and underlying probe atoms ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Each FragMap generated from a SILCS simulation is the result of computing occupancies for one or more kinds of atoms associated with the SILCS probes. The tables below provide the complete lists for Standard SILCS and for Halogen SILCS simulations. .. list-table:: Standard SILCS FragMaps :widths: 25 50 :header-rows: 1 * - FragMap name - generated from probe atom(s) * - acec - acetate carboxylate carbon * - apolar - generic apolar carbons (benc+prpc) * - benc - benzene carbons * - dmeo - dimethyl ether oxygen * - forn - formamide nitrogen * - foro - formamide oxygen * - hbacc - generic hydrogen bond acceptors (foro+dmeo+imin) * - hbdon - generic hydrogen bond donors (forn+iminh) * - imin - imidizole nitrogen (h-bond acceptor) * - iminh - imidizole nitrogen (h-bond donor) * - mamn - methylamine nitrogen * - meoo - methanol oxygen * - prpc - propane carbon * - tipo - water oxygen .. list-table:: Halogen SILCS FragMaps :widths: 25 50 :header-rows: 1 * - FragMap name - generated from probe atom * - brbx - bromobenzene bromine * - clbx - chlorobenzene chlorine * - clex - chloroethane chlorine * - fetx - fluoroethane fluorine * - flbc - fluorbenzene fluorine * - tfec - trifluoroethane carbon bonded to fluorines The FragMap with the name "excl" is a special case. It is the SILCS Exclusion Map, and it enumerates all voxels where no probe molecule (including water) sampling occurred. For Globular Proteins ^^^^^^^^^^^^^^^^^^^^^ SILCS Simulations Using the SilcsBio GUI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Please see :ref:`SILCS_simulation_from_the_GUI` as described in :doc:`../quickstart/quickstart_gui` for a step-by-step description for using the SilcsBio GUI for SILCS simulations. FragMap generation using SILCS begins with a well-curated protein structure file (without ligand) in PDB file format. We recommend keeping only those protein chains that are necessary for the simulation, removing all unnecessary ligands, renaming non-standard residues, filling in missing atomic positions, and, if desired, modeling in missing loops. Please contact support@silcsbio.com if you need assistance with protein preparation. A typical SILCS simulation can produce output totaling in excess of 100 GB, so please select a "Project Directory" folder on your server with appropriate free space. The default is determined by the "Project Directory" setting during :ref:`remote_server_setup`. The setup process automatically builds the topology of the simulation system, creating metal-protein bonds if metal ions are present, rotates side chain orientations to enhance sampling, and places probe molecules around the protein. If missing loops were not modeled into the input PDB file, the amino acids on either side of each missing loop will be automatically capped with neutral functional groups. .. TIP:: For users with access to both the SILCS-Small Molecule Suite and the CGenFF Suite, running standard MD simulation can be helpful to determine the stability of or refine the protein structure to be used as an initial structure for SILCS simulations. Specifically, this may be very helpful if the user has obtained the protein structure from homology modeling or alphafold. For more information on standard MD simulations with the CGenFF Suite, please refer to :doc:`../cgenff/md`. During setup, you will need to decide whether to run a Standard SILCS Simulation, Standard + Halogen SILCS Simulations, or a Halogen SILCS Simulation (with another project's Standard SILCS FragMaps). All functionality in the SILCS platform requires FragMaps from a Standard SILCS Simulation. FragMaps from Halogen SILCS Simulations can supplement Standard SILCS FragMaps for :doc:`../silcs/silcs-mc-optimization` and for :doc:`../silcs/silcs-mc-docking`. Visualization of Halogen SILCS FragMaps can be very useful for qualitatively informing ligand design (see :ref:`visualize_fragmaps`). See :ref:`FragMap_names_and_underlying_probe_atoms` for a complete listing of both Standard and Halogens SILCS probe molecules and the resulting FragMaps. .. note:: Running a Halogen SILCS simulation requires the same computation time as running a Standard SILCS simulation, so selecting "Standard + Halogen SILCS Simulations" will consume double the computational resources as selecting "Standard SILCS Simulation". The default is to generate and run 10 systems each for a Standard SILCS simulation and for a Halogen SILCS simulation. Therefore, if your computing resource has 10 nodes, it will take twice as much wall time to complete a "Standard + Halogen SILCS Simulations" job as to complete a "Standard SILCS Simulation". However, if you have access to 20 nodes simultaneously, the wall time will be the same, since each individual system is run independently of the others. While the same identical input protein coordinates are used for constructing each of these systems, each system is unique because the probe molecules are randomly placed. Additional diversity is added by default to initial coordinates by scrambling the sidechains of exposed amino acids. .. TIP:: Advanced Settings options during setup allow you to change the scrambling of sidechains (default: on), the number of systems (default: 10), the simulation temperature (default 298 K), and which residues are subjected to restraints on C-alpha atoms (default: all residues). We STRONGLY urge you to use these default settings, which have been selected based on extensive experience. It is also possible to choose an existing setup folder or an existing trajectory folder at this time. .. figure:: images/silcs-gui-new-project-with-pdb-advanced-settings_marked_border.png :width: 100% Once the systems are set up, you will need to start the SILCS simulations. There are separate tabs for running the Standard SILCS simulation and the Halogen SILCS simulation, and you will need to use each tab and the "Run..." button underneath that tab to run the corresponding simulation type. .. TIP:: Advanced Settings options allow you to change the number of systems that are run, as well as the number of processors used for each run and the length of each run. Leaving these fields blank will result in default values being used. We STRONGLY urge you to use the default settings. If you have concerns about the number of processors per run, please contact your system administrator. .. figure:: images/silcs-gui-simulation-advanced-settings_marked_border.png :width: 100% .. TIP:: If a simulation stops prematurely, the progress bar will turn red and show a "Restart" option. When the job is restarted, completed cycles will be automatically skipped and the progress will update to where the job had stopped. That is to say, restarting a failed or stopped job will *not* cause you to lose existing job progress. .. _SILCS_simulation_using_cli: SILCS Simulations Using the CLI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Please see :ref:`SILCS_simulation_from_the_command_line` as described in :doc:`../quickstart/quickstart_cli` for a step-by-step description for using the Command Line Interface (CLI) for SILCS simulations. This section covers additional parameters and options that may be included in your SILCS simulations. FragMap generation using SILCS begins with a well-curated protein structure file (without ligand) in PDB file format. We recommend keeping only those protein chains that are necessary for the simulation, removing all unnecessary ligands, renaming non-standard residues, filling in missing atomic positions, and, if desired, modeling in missing loops. Please contact support@silcsbio.com if you need assistance with protein preparation. 1. **Set up the SILCS simulations:** :: ${SILCSBIODIR}/silcs/1_setup_silcs_boxes prot= .. warning:: The setup program internally uses the GROMACS utility ``pdb2gmx``, which may have problems processing the protein PDB file. The most common reasons for errors at this stage involve mismatches between the expected residue name/atom names in the input PDB and those defined in the CHARMM force-field. To fix this problem: Run the ``pdb2gmx`` command manually from within the ``1_setup`` directory for a detailed error message. The setup script already prints out the exact commands you need to run to see this message. :: cd 1_setup pdb2gmx –f _4pdb2gmx.pdb –ff charmm36 –water tip3p –o test.pdb –p test.top -ter -merge all Once all the errors in the input PDB are corrected, rerun the setup script. The setup command offers a number of options. The full list of options can be reviewed by simply entering the command without any input. *Required parameter*: * Path and name of the input protein PDB file: :: prot= .. TIP:: For users with access to both the SILCS-Small Molecule Suite and the CGenFF Suite, running standard MD simulation can be helpful to determine the stability of or refine the protein structure to be used as an initial structure for SILCS simulations. Specifically, this may be very helpful if the user has obtained the protein structure from homology modeling or alphafold. For more information on standard MD simulations with the CGenFF Suite, please refer to :doc:`../cgenff/md`. *Optional parameters*: * Number of independent simulation systems: :: numsys=<# of simulations; default=10> SILCS simulations, by default, consist of 10 replicate simulations, each with different initial velocities, to enhance sampling. The number of simulations can be adjusted using the ``numsys`` keyword. * Skip the ``pdb2gmx`` GROMACS utility: :: skip_pdb2gmx= * Path and name of bound cofactor Mol2: :: lig= If the input structure contains a cofactor and you wish to compute your FragMaps in the presence of the cofactor, remove the cofactor from the input PDB file and provide a separate Mol2 file of the cofactor in the correct position and orientation with respect to the target protein. Provide the path and name of the cofactor Mol2 file using the ``lig`` keyword .. note:: If your input structure contains a metal ion, please see :ref:`faq_setup_metal` * Application of weak harmonic positional restraints: :: core_restraints_string= By default, a weak harmonic positional restraint, with a force constant of ~0.1 kcal/mol/Å\ :superscript:`2`\ , is applied to all C-alpha atoms during SILCS simulations. This option lets you modify this default. For example, if ``core_restraints_string="r 17-160 | r 168-174"`` is supplied, then only the C-alpha atoms in residues 17 to 160 and residues 168 to 174 will be restrained. * Re-orientation of residue side chains: :: scramblesc= When set to ``true`` the ``scramblesc`` keyword re-orients the side chain dihedrals of the protein residues. * Simulation box size: :: margin= By default, the simulation system size is determined by adding a 15 Å margin to the size of the input protein structure. The size of the simulation box can be customized by using the ``margin`` keyword. * SILCS probes with halogen-containing functional groups: :: halogen= SILCS simulations can be performed with SILCS probes that contain halogen-containing functional groups by including ``halogen=true`` in the commandline. See :ref:`silcs_fragmap_halogen` for more details. 2. **Submit the SILCS GCMC/MD jobs to the queue:** Following completion of the setup, run 10 GCMC/MD jobs using the following script: :: ${SILCSBIODIR}/silcs/2a_run_gcmd prot= This script will submit 10 jobs to the predefined queue. Each job runs out 100 ns of GCMC/MD with the protein and the solute molecules. This command offers a number of options. The full list of options can be reviewed by simply entering the command without any input. *Required parameter*: * Path and name of the input protein PDB file: :: prot= *Optional parameters*: * Simulation temperature: :: temp= The input simulation temperature should be entered in Kelvin units. By default, the temperature is set to 298 K. * Number of cores per simulation: :: nproc=<# of cores per simulation; default=8> * Generate input files without submitting jobs to the queue: :: batch * Use halogen probes: :: halogen= The SILCS simulation will be performed with halogen probes when ``halogen=true``. For more information, please refer to :ref:`silcs_fragmap_halogen`. .. TIP:: SILCS simulations are run at a temperature of 298 K by default, and most published work with SILCS has used this default temperature. It is possible to set a custom temperature by adding the ``temp=`` option to the ``2a_run_gcmd`` command. For example, ``temp=310`` would run the SILCS simulations at 310 K, which may enhance protein conformational sampling. Once the simulations are complete, the ``2a_run_gcmd/[1-10]`` directories will contain ``*.prod.100.rec.xtc`` trajectory files. If these files are not generated, then your simulations are either still running or stopped due to a problem. Look in the log files within these directories to diagnose problems. To check job progress, use the following command: :: ${SILCSBIODIR}/silcs/check_progress This will provide a summary list of the SILCS jobs consisting of the job path, the job number, the task ID, the job status, and the current/total number of GCMC/MD cycles. Job status values are: Q [queued], R [running], E [successfully ended], F [failed], NA [not submitted]. 3. **Generate FragMaps:** Once your simulations are done, generate the FragMaps using the following command: :: ${SILCSBIODIR}/silcs/2b_gen_maps prot= *Required parameter*: * Path and name of the input protein PDB file: :: prot= *Optional parameters*: * Simulation box size: :: margin= * Generate halogen maps: :: halogen= If the SILCS simulations were performed with halogen probes, then use ``halogen=true``. For more information, please refer to :ref:`silcs_fragmap_halogen`. * Reference structure: :: ref= By default, the FragMaps will be oriented around the protein structure that was used as the initial structure for the SILCS simulations. The FragMaps can also be oriented around a different structure by specifying it with ``ref=``. * Generate protein side chain maps: :: protmap= The probability distribution of protein side chains can also be calculated and stored in protein side chain maps. When ``protmap=true``, this information will be calculated and stored. * Spacing between voxels: :: spacing= The probabiliy of each functional group occupying a voxel is used to calculate GFE values. By default, the spacing between each voxel is 1.0 Å. * First cycle number to use for map generation: :: begin= * Last cycle number to use for map generation: :: end= * Collect and merge trajectory: :: traj= * Specification that v2023.x or older was used for ``2a_run_gcmd``: :: oldversion= * Rewrite ``.xtc`` files acording to selection: :: rewritextc= The above command will submit 10 single-core jobs that will build occupancy maps (FragMaps) spanning the simulation box for select probe molecule atoms representing different functional groups. For a ~50K atom simulation system, this step takes approximately 10-20 minutes to complete. The FragMaps have a default grid spacing of 1 Å. The next step is to combine the occupancy FragMaps generated from individual simulations and convert them into GFE FragMaps. Each voxel in a GFE FragMap contains a Grid Free Energy (GFE) value for the probe type that was used to create the corresponding occupancy FragMap. :: ${SILCSBIODIR}/silcs/2c_fragmap prot= *Required parameter*: - Path and name of the input protein PDB file: :: prot= *Optional parameters*: * Generate protein side chain maps: :: protmap= The probability distribution of protein side chains can also be calculated and stored in protein side chain maps. When ``protmap=true``, this information will be calculated and stored. * Skip overlap coefficient calculation: :: skipoc= * Generate halogen maps: :: halogen= If the SILCS simulations were performed with halogen probes, then use ``halogen=true``. For more information, please refer to :ref:`silcs_fragmap_halogen`. * Generate maps in CNS format: :: cns= * First cycle number to use for map generation: :: begin= * Last cycle number to use for map generation: :: end= GFE FragMaps will be created in the ``silcs_fragmap_/maps`` directory, and PyMOL and VMD scripts to load these file will be created in the ``silcs_fragmap_`` directory. Additionally, the convergence of the SILCS simulations will be assessed by calculating the overlap coefficient between the occupancy maps generated from the first and second halves of the runs. The output is written to the ``overlap_coeff.dat`` file. 4. **Cleanup:** Raw trajectory output files are large. To reduce disk usage due to these files, use the cleanup command: :: ${SILCSBIODIR}/silcs/2d_cleanup prot= *Required parameter*: - Path and name of input protein PDB file: :: prot= *Optional parameters*: - Number of simulation systems: :: numsys=<# of simulations; default=10> - Specification that halogen probes were used in ``2a_run_gcmd``: :: halogen= - Delete original files: :: delete= The ``2d_cleanup`` command deletes unnecessary files, such as those from the equilibration phase, and deletes hydrogen atoms from production trajectory files for each run directory, ``1/``, ``2/``, etc. It then creates a ``.tar.gz`` file of that directory. That is to say, the contents of ``1.tar.gz`` and ``1/`` after running cleanup are identical. You may choose to remove either one of those without loss of information, though we strongly recommend comparing the output of ``tar tvzf .tar.gz`` with the output of ``find -type f``, where you replace ```` with ``1``, ``2``, etc. and confirm the contents are identical before deleting anything. .. _silcs_fragmap_halogen: Setup with halogen probes ~~~~~~~~~~~~~~~~~~~~~~~~~ SILCS probes with halogen-containing functional groups can be useful for extending the diversity of chemical space covered by FragMaps. While a common approach is to consider halogen atoms as non-polar groups, research on non-covalent "halogen bonding" suggests halogen atoms warrant special treatment. The CHARMM General Force Field CGenFF v.2.3.0+ allows halogen atoms to be treated with lonepairs to achieve directionality in polar interactions. These force field improvements are included in SILCS simulations from version 2020.1 of the SilcsBio software. Halogen-protein interactions are determined using a set of halogenated probes in SILCS simulations: chlorobenzene, fluorobenzene, bromobenzene, chloroethane, fluoroethane, and trifluoroethane. To use these halogenated SILCS probes in your simulation instead of the standard SILCS probes, add the ``halogen=true`` keyword: :: ${SILCSBIODIR}/silcs/1_setup_silcs_boxes prot= halogen=true ${SILCSBIODIR}/silcs/2a_run_gcmd prot= halogen=true ${SILCSBIODIR}/silcs/check_progress_x ${SILCSBIODIR}/silcs/2b_gen_maps prot= halogen=true ${SILCSBIODIR}/silcs/2c_fragmap prot= halogen=true ${SILCSBIODIR}/silcs/2d_cleanup prot= halogen=true This will create folders ending with ``_x`` (i.e., ``1_setup_x/``, etc.) to denote SILCS-Halogen simulation systems, as opposed to standard SILCS simulations. The resulting halogen FragMaps will be included in the ``silcs_fragmaps_/maps`` folder. To use halogen FragMaps in SILCS-MC jobs, refer to the ``halogen=true`` option in :doc:`silcs-mc-docking`. .. note:: For SILCS-membrane, ``${SILCSBIODIR}/silcs/1_setup_silcs_boxes`` will be replaced with ``${SILCSBIODIR}/silcs-memb/1a_fit_protein_in_bilayer`` and ``${SILCSBIODIR}/silcs-memb/1b_setup_silcs_with_prot_bilayer``. If you encounter a problem, please contact support@silcsbio.com. Resuming stopped SILCS jobs ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Situations such as exceeding workload manager (job queue) walltime limits can cause SILCS jobs to stop before running to completion. Resuming such jobs from where they stopped is straightforward. On the server where the job was running, go to the directory ``/2a_run_gcmd/i``, where ``i`` is an integer from 1 to 10 and corresponds to the stopped job among the ten SILCS GCMC/MD jobs that were being run for that particular target. In that directory, directly use the file ``job_mc_md.cmd`` to submit the job to the workload manager, for example ``qsub job_mc_md.cmd`` for Sun Grid Engine (SGE) and ``sbatch job_mc_md.cmd`` for Slurm. To ensure a successful restart, make sure to issue any required extra setup commands (for example, as listed in the "Extra setup" field under :menuselection:`Settings and remote server configuration` from the SilcsBio GUI Home screen) such as ``export LD_LIBRARY_PATH=`` or ``module load `` before submitting the job. .. _silcs_gpcr_usage: For Membrane Proteins ^^^^^^^^^^^^^^^^^^^^^ SILCS-Membrane Simulations Using the SilcsBio GUI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Running SILCS-Membrane using the SilcsBio GUI requires a bare protein structure for best results. SilcsBio provides a utility to embed transmembrane proteins, such as GPCRs, in a bilayer of 9:1 POPC:cholesterol for subsequent SILCS simulations. If you wish to use your own pre-built protein–bilayer system with SILCS, please run SILCS with the CLI. 1. **Begin a New SILCS-Membrane project:** To begin a new SILCS-Membrane project, select :guilabel:`New SILCS project` from the Home page as described in :ref:`SILCS_simulation_from_the_GUI` and select "Membrane Protein". 2. **Enter a project name, select the remote server, and upload input files:** Enter a project name and select the remote server where the SILCS-Membrane simulation jobs will run. Additionally, provide the protein input file. You may choose the protein input file from the computer where you are running the SilcsBio GUI ("localhost") or from any server you have previously configured, as described in :ref:`file_and_directory_selection`. .. figure:: images/silcs-membrane-gui-new-project_marked_border.png :width: 100% 3. **Build the membrane protein system:** Once all information is entered correctly, press the "Build Membrane Protein" button at the bottom of the page. This will execute the utility to build the protein/bilayer system, with the bilayer consisting of 9:1 POPC:chelesterol. By default, this utility will align the first principal axis of the protein with the bilayer normal (Z-axis) and translate the protein center of mass to the center of the bilayer (Z=0). Additionally, the default system size is set to 120 Å along the X and Y dimensions. These settings can be changed under "Build Settings". After the protein is inserted into the bilayer, the resulting protein/bilayer system will be output in a PDB file with suffix ``_popc_chol``, and the page will update. Click on the green "Setup Successful" button to return to the GUI. .. TIP:: Build Settings options during setup allow you to change the alignment of the first principal axis of the protein to the Z-axis before fitting the protein to the bilayer (default: on), the bilayer size in the X dimension (default 120 Å), the bilayer size in the Y dimension (default 120 Å), and the offset of the relative position of the protein with respect to the position of the bilayer (default 0 Å). The offset of the protein may be selected in accordance with a recommendation from an external program such as the OPM server. Please see :ref:`faq_silcs-memb_offset_z` for additional information. .. figure:: images/silcs-membrane-initial-setup-successful_marked_border.png :width: 100% 4. **Review the built membrane protein:** After the protein is inserted into the bilayer, the GUI will prompt you to review the membrane protein. Click the "Review membrane protein" button to proceed. The GUI will then show you the modeled protein/bilayer system through a visualization window. Please confirm that the protein/bilayer system is correct. After reviewing the membrane protein, click "Setup" to proceed .. figure:: images/silcs-membrane-gui-review-membrane-protein_marked_border.png :width: 100% .. figure:: images/silcs-membrane-gui-review-membrane-protein-view_marked_border.png :width: 100% 5. **Set up the SILCS simulations:** After reviewing the protein/bilayer system, the GUI will allow you to finalize setting up the SILCS simulations. During this step, you will need to decide whether to run a Standard SILCS Simulation, Standard + Halogen SILCS Simulations, or a Halogen SILCS Simulation (with another project's Standard SILCS FragMaps). All functionality in the SILCS platform requires FragMaps from a Standard SILCS Simulation. FragMaps from Halogen SILCS Simulations can supplement Standard SILCS FragMaps for :doc:`../silcs/silcs-mc-optimization` and for :doc:`../silcs/silcs-mc-docking`. Visualization of Halogen SILCS FragMaps can be very useful for qualitatively informing ligand design (see :ref:`visualize_fragmaps`). See :ref:`FragMap_names_and_underlying_probe_atoms` for a complete listing of both Standard and Halogens SILCS probe molecules and the resulting FragMaps. .. note:: Running a Halogen SILCS simulation requires the same computation time as running a Standard SILCS simulation, so selecting "Standard + Halogen SILCS Simulations" will consume double the computational resources as selecting "Standard SILCS Simulation". The default is to generate and run 10 systems each for a Standard SILCS simulation and for a Halogen SILCS simulation. Therefore, if your computing resource has 10 nodes, it will take twice as much wall time to complete a "Standard + Halogen SILCS Simulations" job as to complete a "Standard SILCS Simulation". However, if you have access to 20 nodes simultaneously, the wall time will be the same, since each individual system is run independently of the others. To proceed to preparing the simulation box, click "Setup". This will submit a job on the server to prepare the SILCS simulation box by adding water and probe molecules to the protein/bilayer system. The GUI will indicate if the job was successfully launched on the server with a "Setup Successful" button. Click the "Setup Successful" button to proceed. .. figure:: images/silcs-membrane-gui-setup-view_marked_border.png :width: 100% .. figure:: images/silcs-membrane-gui-setup-submission-successful_marked_border.png :width: 100% The SILCS simulation setup progress from the server will be displayed on the GUI. Once the SILCS simulation box is successfully set up, and the job is complete, a green checkmark will appear next to the progress bar. Upon completion of the SILCS simulation setup, click the "Proceed for Standard SILCS" button to continue. .. figure:: images/silcs-membrane-gui-setup-jobcomplete_marked_border.png :width: 100% 6. **Launch SILCS simulations:** The SILCS GCMC/MD simulations can now be launched by clicking the "Run Standard SILCS Simulation" button. The SILCS GCMC/MD will begin with a 6-step pre-equilibration to slowly relax the bilayer followed by 10 ns of relaxation of the protein into the bilayer prior to the production simulation. The GUI will indicate when the SILCS jobs are successfully launched on the server with a green "SILCS Simulations Submitted" button. At this point, the jobs will continue running on the server, and it is safe to exit the GUI. .. TIP:: Membrane protein simulation systems are typically large and may require significant storage space (> 200 GB) and simulation time (>14 days with GPU). .. figure:: images/silcs-membrane-gui-runstandard_marked_border.png :width: 100% .. figure:: images/silcs-membrane-gui-submitted_marked_border.png :width: 100% FragMap generation from the GCMC/MD simulations data follows the same protocol as for non-bilayer systems (i.e., for globular proteins). Refer to previous instructions as described in :ref:`SILCS_simulation_from_the_GUI`. If you encounter a problem, please contact support@silcsbio.com. SILCS-Membrane Simulations Using the CLI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. **Prepare the input protein/bilayer system:** You may use either a bare protein structure or your own pre-built protein/bilayer system with SILCS. If you are beginning with a bare protein, SilcsBio provides a command line utility to embed transmembrane proteins, such as GPCRs, in a bilayer of 9:1 POPC/cholesterol for subsequent SILCS simulations: :: ${SILCSBIODIR}/silcs-memb/1a_fit_protein_in_bilayer prot= This command will align the first principal axis of the protein with the bilayer normal (Z-axis) and translate the protein center of mass to the center of the bilayer (Z=0). The full list of parameters can be reviewed by running the command without any arguments. *Required parameter*: - Path and name of the input PDB file: :: prot= *Optional parameters*: - Apply principal axis-based alignment: :: orient_principal_axis= When set to true, first principal axis of the protein is aligned to Z-axis before fitting the protein in bilayer. If the protein is already oriented as desired, ``orient_principal_axis=false`` will suppress the automatic principal axis-based alignment. - Simulation system size: :: bilayer_x_size= bilayer_y_size= By default the system size is set to 120 Å along the X and Y dimensions. To change the system size, use the ``bilayer_x_size`` and ``bilayer_y_size`` parameters. - Position of protein with respect to the bilayer: :: offset_z= By default, SILCS fits the bare protein into the lipid bilayer such that the center of mass of the protein is aligned with the center of mass of the bilayer. This may not be the best choice for your membrane proteins, and you may therefore want to adjust the position of your protein according to a recommendation from an external program such as the OPM server. In this case you can adjust the relative position of the protein with respect to the position of the bilayer by using the ``offset_z`` option to specify the distance by which the center of mass of the protein should be offset in Z-direction from the center of the bilayer. .. note:: For example, to use OPM server (https://opm.phar.umich.edu/ppm_server) output as your guide to build a transmembrane system for SILCS simulation, please see :ref:`faq_silcs-memb_offset_z` - Distance cutoff defining steric clashes: :: cutoff= To avoid unfavorable steric clashes between the protein and the modeled bilayer, bilayer molecules within the distance defined by the ``cutoff`` parameter of the protein atoms will be deleted. The resulting protein/bilayer system will be output in a PDB file with suffix ``_popc_chol``. It is important to use molecular visualization software at this stage to confirm correct orientation of the protein and size of the bilayer before using this output as the input in the next step. Alternatively, you may use a protein/bilayer system that has been built using other software tools. 2. **Prepare the SILCS simulation box:** The following command will prepare the SILCS simulation box by adding water and probe molecules to the protein/bilayer system: :: ${SILCSBIODIR}/silcs-memb/1b_setup_silcs_with_prot_bilayer prot= .. TIP:: You may use ``batch=false`` option to run this step on your head node. Note that it can sometimes take few hours to finish this step for very large systems. *Required parameter*: - Path and name of the input PDB file: :: prot= *Optional parameters*: - Number of independent simulation systems: :: numsys=<# of simulations; default=10> - Skip the ``pdb2gmx`` GROMACS utility: :: skip_pdb2gmx= - Path and name of bound cofactor Mol2: :: lig= If the input structure contains a cofactor and you wish to compute your FragMaps in the presence of the cofactor, remove the cofactor from the input PDB file and provide a separate Mol2 file of the cofactor in the correct position and orientation with respect to the target protein. Provide the path and name of the cofactor Mol2 file using the ``lig`` parameter. .. note:: If your input structure contains a metal ion, please see :ref:`faq_setup_metal` - Application of weak harmonic positional restraints: :: core_restraints_string= By default, a weak harmonic positional restraint, with a force constant of ~0.1 kcal/mol/Å2, is applied to all C-alpha atoms during SILCS simulations. This option lets you modify this default. For example, if ``core_restraints_string="r 17-160 | r 168-174"`` is supplied, then only the C-alpha atoms in residues 17 to 160 and residues 168 to 174 will be restrained. - Re-orientation of residue side chains: :: scramblesc= When set ``scramblesc=true``, the side chain dihedrals of the protein residues will be re-oriented. - SILCS probes with halogen-containing functional groups: :: halogen= SILCS simulations can be performed with SILCS probes that contain halogen-containing functional groups by including ``halogen=true``. See :ref:`silcs_fragmap_halogen` for more details. - Submit job to queuing system: :: batch= This step is submitted to the queuing system by default. To run this step on the head node, use ``batch=false``. 3. **Run SILCS-Membrane GCMC/MD simulations:** SILCS GCMC/MD simulations can now be launched with the following command: :: ${SILCSBIODIR}/silcs/2a_run_gcmd prot= memb=true *Required parameter*: * Path and name of the input protein PDB file: :: prot= * Follow SILCS-simulation protocol for membrane proteins: :: memb=true The ``memb`` option must be set to ``true`` for SILCS simulations of membrane proteins. *Optional parameters*: * Simulation temperature: :: temp= The input simulation temperature should be entered in Kelvin units. By default, the temperature is set to 298 K. * Number of cores per simulation: :: nproc=<# of cores per simulation; default=8> * Generate input files without submitting jobs to the queue: :: batch * Use halogen probes: :: halogen= The SILCS simulation will be performed with halogen probes when ``halogen=true``. For more information, please refer to :ref:`silcs_fragmap_halogen`. The SILCS GCMC/MD will begin with a 6-step pre-equilibration to slowly relax the bilayer followed by 10 ns of relaxation of the protein into the bilayer prior to the production simulation. .. TIP:: Membrane protein simulation systems are typically large and can require significant storage space (> 200 GB) and simulation time (> 14 days with GPUs). 3. **Generate FragMaps:** FragMap generation from the GCMC/MD simulations data follows the same protocol as for non-bilayer systems. Refer to previous instructions for ``$SILCSBIODIR/silcs/2b_gen_maps`` and ``$SILCSBIODIR/silcs/2c_fragmap``. If you encounter a problem, please contact support@silcsbio.com. For RNA ^^^^^^^ Background ~~~~~~~~~~ SILCS-RNA has been developed to tailor the SILCS methodology towards targeting RNA molecules with small molecules. The fundamental idea of complementarity between the functional groups in a small molecule and the binding site of the target, originally developed for protein targets, has been further developed and validated for RNA targets. Extensions to the method include an enhanced oscillating excess chemical potential protocol for the Grand Canonical Monte Carlo calculations and individual simulations of the neutral and charged solutes from which the SILCS functional group affinity maps (FragMaps) are calculated for subsequent binding site identification, pharmacophore, docking etc. calculations. Development and validation of SILCS-RNA has taken into consideration the complexities of RNA tertiary structure and the effects of the highly negatively charged phosphate backbone on sampling of neutral and charged probes. Full details have been published in :cite:`Kognole:2022`. SILCS-RNA Simulations Using the SilcsBio GUI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The steps for running SILCS-RNA with SilcsBio GUI are: 1. **Begin a new SILCS-RNA project:** To begin a new SILCS-RNA project, select :guilabel:`New SILCS project` from the Home page as described in :ref:`SILCS_simulation_from_the_GUI` and select "RNA". 2. **Enter a project name, select the remote server, and upload input files:** Enter a project name and select the remote server where the SILCS-RNA simulation jobs will run. Additionally, provide the RNA input file. You may choose the RNA input file from the computer where you are running the SilcsBio GUI ("localhost") or from any server you have previously configured, as described in :ref:`file_and_directory_selection`. If the GUI detects missing non-hydrogen atoms, non-standard residue names, or non-contiguous residue numbering, it will inform the user and provide a button labeled "Fix?". If this button is clicked a new PDB file with these problems fixed and with the suffix ``_fixed`` added to the base name will be created and used. The SilcsBio GUI will not model residues that are completely missing from the PDB or loops. .. warning:: SILCS-RNA requires a clean input RNA PDB file for the best results. The input RNA PDB file should meet the following criteria: * The input file should NOT include ion, solvent, or water molecules * The input file should only contain one MODEL. For example, PDB files from NMR studies may contain multiple models. In this case, the user must remove unwanted models and keep only one. * The input file should NOT contain any alternate conformations of residues. .. figure:: images/silcs-rna-gui-new-project-with-pdb_marked_border.png :width: 100% 3. **Set up SILCS simulations:** Click the "Setup" button at the bottom of the page. The GUI will contact the remote server and perform the SILCS GCMC/MD setup process. During setup, the program automatically performs several steps: The topology of the simulation system is built and probe molecules are inserted around the RNA target. This process will produce 20 PDB files required as initial structures for the simulations and can take up to 10 minutes depending of the system size. In contrast to standard SILCS with protein targets, SILCS-RNA simulations split the GCMC/MD simulations into two separate sets, one having neutral probes and the other having charged probes. A green "Setup Successful" button will appear once the process has been successfully completed. Click the green "Setup Successful" button to proceed. .. figure:: images/silcs-rna-gui-setup_marked_border.png :width: 100% 4. **Launch SILCS simulations:** Your SILCS-RNA GCMC/MD simulations can now be started by clicking the "Run SILCS Simulation" button. This will submit 20 compute jobs to the queueing system on the server. The GUI will indicate when the SILCS jobs are successfully launched on the server with a green "SILCS Simulations Submitted" button. At this point, the jobs will continue running on the server, and it is safe to exit the GUI. .. TIP:: Before running your SILCS simulations, you may wish to double check that you have selected the desired file folder on the remote server and that it has 100+ GB of storage space available. .. figure:: images/silcs-rna-gui-simulation_marked_border.png :width: 100% .. figure:: images/silcs-rna-gui-simulation-submitted_marked_border.png :width: 100% 5. **Generate FragMaps:** Once your SILCS-RNA simulations are finished, the GUI can be used to create FragMaps and visualize them. Green progress bars indicate that the simulations have been successfully completed. Once all progress bars are green, the "Prepare FragMap" button will appear. .. figure:: images/silcs-rna-gui-simulation-completed_marked_border.png :width: 100% Press the "Prepare FragMap" button, and on the next screen, confirm your "Reference PDB File". FragMaps will be created in the coordinate reference frame of this file. Generally, this reference PDB file is the same as the initial input PDB file. A different PDB file of the same RNA structure may be used to generate FragMaps relative to that orientation. Click "Generate FragMap" to generate FragMaps and download them to the local computer for visualization. .. figure:: images/silcs-rna-gui-generate-fragmaps_marked_border.png :width: 100% .. note:: SILCS-RNA currently does not support Halogen SILCS simulations. Processing the GCMC-MD trajectories will take 10-20 minutes, with the GUI providing updates on progress from the server during this process. Once completed, the GUI will automatically copy the files from the server onto the local computer and load them for visualization. Please see :ref:`visualize_fragmaps` for details on how to use the SilcsBio GUI, as well as external software (MOE, PyMol, VMD), to visualize SILCS FragMaps. As with FragMaps for protein targets, your SILCS-RNA FragMaps for RNA targets can be used for detecting hotspots and performing fragment-based drug design (:doc:`../silcs/silcs-hotspots`), creating pharmacophore models (:doc:`../silcs/silcs-pharm`), docking ligands and refining existing docked poses (:doc:`../silcs/silcs-mc-docking`), and optimizing lead compounds (:doc:`../silcs/silcs-mc-optimization`). SILCS-RNA Simulations Using the CLI ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. **Prepare the input RNA PDB file:** SILCS-RNA requires a clean input RNA PDB file for best results. The input RNA PDB file should meet the following criteria: * The input file should NOT include ion, solvent, or water molecules * The input file should only contain one MODEL. For example, PDB files from NMR studies may contain multiple models. In this case, the user must remove unwanted models and keep only one. * The input file should NOT contain any alternate conformations of residues. 2. **Set up the SILCS-RNA simulations:** :: ${SILCSBIODIR}/silcs-rna/1_setup_silcs_boxes prot= To determine if the setup is complete, check that the 20 PDB files required for the simulations are available using the command ``ls 1_setup_*/*_silcs.*.pdb`` It is also a good idea to visualize these files to confirm the presence of the RNA molecule and the SILCS solute molecules (probes) within a box of water molecules. In contrast to standard SILCS with protein targets, SILCS-RNA simulations split the GCMC/MD simulations into two separate sets, one having neutral probes (``1_setup_neutral``) and the other having charged probes (``1_setup_charged``). The setup command offers a number of options. The full list of parameters can be reviewed by running the command without any arguments. *Required parameter*: * Path and name of the input RNA PDB file: :: prot= *Optional parameters*: * Number of independent simulation systems per set of probes: :: numsys=<# of simulations; default=10> SILCS simulations, by default, consist of 10 replicate simulations per set of probes, each with different initial velocities, to enhance sampling. SILCS-RNA simulations consist of two sets of probes, one set of uncharged probes and one set of charged probes. Thus, with both sets of probes having 10 replicate simulations, by default, 20 simulation systems will be set up (2 set of probes * 10 simulations = 20 simulations). The number of simulations can be adjusted using the ``numsys`` keyword, and the resulting number of simulations will be 2 * ``numsys``. * Skip the ``pdb2gmx`` GROMACS utility: :: skip_pdb2gmx= * Sets of probe molecules: :: sets= Due to the highly negatively charged phosphate backbone of RNAs, SILCS-RNA uses two sets of probe molecules (one set of charged probes and one set of neutral probes) by default. The use of only one set of probe molecules may be specified with the ``sets`` keyword. * Path and name of bound ligand Mol2: :: lig= If the input structure contains a bound ligand and you wish to compute your FragMaps in the presence of the ligand, remove the ligand from the input PDB file and provide a separate Mol2 file of the ligand in the correct position and orientation with respect to the target protein. Provide the path and name of the ligand Mol2 file using the ``lig`` keyword. * Simulation box size: :: margin= By default, the simulation system size is determined by adding a 15 Å margin to the size of the input protein structure. The size of the simulation box can be customized by using the ``margin`` keyword. .. note:: As in standard SILCS simulations, a weak harmonic positional restraint, with a force constant of ~0.1 kcal/mol/Å\ :superscript:`2`\ , is applied to all MG, C1', URA-N3, CYT-N3, ADE-N1 and GUA-N1 atoms during simulations. **The option to change this default is not available in the current version of SILCS-RNA**. .. warning:: The setup program internally uses the GROMACS utility ``pdb2gmx``, which may have problems processing the RNA PDB file. The most common reasons for errors at this stage involve mismatches between the expected residue name/atom names in the input PDB and those defined in the CHARMM force-field. To fix this problem: Run the ``pdb2gmx`` command manually from within the ``1_setup_neutral`` AND the ``1_setup_charged`` directories for detailed error messages. For guidance on syntax, the setup script already prints out the exact commands you need to run: :: cd 1_setup_neutral pdb2gmx –f _4pdb2gmx.pdb –ff charmm36 –water tip3p –o test.pdb –p test.top -ter -merge all cd ../1_setup_charged pdb2gmx –f _4pdb2gmx.pdb –ff charmm36 –water tip3p –o test.pdb –p test.top -ter -merge all cd .. Once all errors in the input PDB are corrected, rerun the setup script with ``skip_pdb2gmx=true``. .. note:: SILCS-RNA currently does not support Halogen SILCS simulations. 3. **Submit the SILCS-RNA GCMC/MD simulations:** Following completion of the setup, start the 20 GCMC/MD jobs: :: ${SILCSBIODIR}/silcs-rna/2a_run_gcmd prot= *Required parameter*: - Path and name of the input RNA PDB file: :: prot= *Optional parameters*: - Simulation temperature: :: temp= The input simulation temperature should be entered in Kelvin units. By default, the temperature is set to 298 K. - Number of cores per simulation: :: nproc=<# of cores per simulation; default=8> - Generate input files without submitting jobs to the queue: :: batch This script will submit 20 jobs to the predefined queue. Each job runs 100 ns of GCMC/MD with the RNA and the solute molecules; 10 jobs are with neutral solute molecules (probes), and 10 are with charges solute molecules (probes). .. TIP:: SILCS simulations are run at a temperature of 298 K by default. It is possible to set a custom temperature by adding the ``temp=`` option to the ``2a_run_gcmd`` command. For example, ``temp=310`` would run the SILCS simulations at 310 K. Once the simulations are complete, the ``2a_run_gcmd_neutral/[1-10]`` AND ``2a_run_gcmd_charged/[1-10]`` directories will contain ``*.prod.100.rec.xtc`` trajectory files. If these files are not generated, then your simulations are either still running or stopped due to a problem. Look in the log files within these directories to diagnose problems. To check job progress, use the following command: :: ${SILCSBIODIR}/silcs-rna/check_progress This will provide a summary list of the SILCS jobs consisting of the job path, the job number, the task ID, the job status, and the current/total number of GCMC/MD cycles. Job status values are: Q [queued], R [running], E [successfully ended], F [failed], NA [not submitted]. 4. **Generate FragMaps:** Once your simulations are done, generate the SILCS FragMaps using the following command: :: ${SILCSBIODIR}/silcs-rna/2b_gen_maps prot= *Required parameter*: - Path and name of the input RNA PDB file: :: prot= *Optional parameters*: - Simulation box size: :: margin= - Reference structure: :: ref= By default, the FragMaps will be oriented around the protein structure that was used as the initial structure for the SILCS simulations. The FragMaps can also be oriented around a different structure by specifying it with ``ref=``. - Spacing between voxels: :: spacing= The probabiliy of each functional group occupying a voxel is used to calculate GFE values. By default, the spacing between each voxel is 1.0 Å. - First cycle number to use for map generation: :: begin= - Last cycle number to use for map generation: :: end= This script will submit 20 jobs to the predefined queue. Each job runs 100 ns of GCMC/MD with the RNA and the solute molecules; 10 jobs are with neutral solute molecules (probes), and 10 are with charges solute molecules (probes). This will submit 20 single-core jobs that will build occupancy maps (FragMaps) spanning the simulation box for select solute molecule atoms representing different functional groups. For a ~50K atom simulation system, this step takes approximately 10-20 minutes to complete. The FragMaps have a default grid spacing of 1 Å. The next step is to combine the occupancy FragMaps generated from individual simulations and convert them into GFE FragMaps. Each voxel in a GFE FragMap contains a Grid Free Energy (GFE) value for the probe type that was used to create the corresponding occupancy FragMap. :: ${SILCSBIODIR}/silcs-rna/2c_fragmap prot= *Required parameter*: - Path and name of the input RNA PDB file: :: prot= *Optional parameters*: - Simulation temperature: :: temp= If the simulation temperature was specified for ``2a_run_gcmd``, please specify the same temperature for this step using the ``temp`` parameter. The input simulation temperature should be entered in Kelvin units. By default, the temperature is set to 298 K. - Skip overlap coefficient calculation: :: skipoc= - Generate maps in CNS format: :: cns= - First cycle number to use for map generation: :: begin= - Last cycle number to use for map generation: :: end= GFE FragMaps will be created in the ``silcs_fragmap_/maps`` directory, and PyMOL and VMD scripts to load these file will be created in the ``silcs_fragmap_`` directory. Please see :ref:`visualize_fragmaps` for details on how to use the SilcsBio GUI, as well as external software (MOE, PyMol, VMD), to visualize SILCS FragMaps. As with FragMaps for protein targets, your SILCS-RNA FragMaps for RNA targets can be used for detecting hotspots and performing fragment-based drug design (:doc:`../silcs/silcs-hotspots`), creating pharmacophore models (:doc:`../silcs/silcs-pharm`), docking ligands and refining existing docked poses (:doc:`../silcs/silcs-mc-docking`), and optimizing lead compounds (:doc:`../silcs/silcs-mc-optimization`). 5. **Cleanup:** Raw trajectory output files are large. To reduce disk usage due to these files, use the cleanup command: :: ${SILCSBIODIR}/silcs-rna/2d_cleanup prot= *Required parameter*: - Path and name of input RNA PDB file: :: prot= *Optional parameters*: - Number of simulation systems: :: numsys=<# of simulations; default=10> - Delete original files: :: delete= The ``2d_cleanup`` command deletes unnecessary files, such as those from the equilibration phase, and deletes hydrogen atoms from production trajectory files for each run directory, ``1/``, ``2/``, etc. It then creates a ``.tar.gz`` file of that directory. That is to say, the contents of ``1.tar.gz`` and ``1/`` after running cleanup are identical. You may choose to remove either one of those without loss of information, though we strongly recommend comparing the output of ``tar tvzf .tar.gz`` with the output of ``find -type f``, where you replace ```` with ``1``, ``2``, etc. and confirm the contents are identical before deleting anything.