Protein Effective Charge Estimator

Protein charge contributes to a range of solution behaviors, including protein aggregation and solution viscosity. The charge of a protein is a system property, dependent on environmental factors such as pH, buffer composition, and ionic strength. Conventionally, protein charge is estimated computationally from sequence data and a set of pKa values for proton dissociation from ionizable groups. However, this approach does not account for the contributions of bound solute molecules and ions. The SilcsBio software provides a utility to estimate the effective charge of a protein while accounting for bound excipients, buffers, and/or ions. Currently, the monoatomic ions supported are sodium, potassium, chloride, fluoride, bromide, and iodide.

  1. Run SILCS-Hotspots calculations, using the ion-specific parameter file where required:

    To estimate the effective charge of a protein, the user must first identify binding sites of the excipients, buffers, and/or ions in the solution using the first two steps of SILCS-Hotspots (1_setup_silcs_hotspots and 2_collect_hotspots). Users may follow the instructions outlined in SILCS-Hotspots. Note that SILCS FragMaps of the protein of interest must be generated to run SILCS-Hotspots.

    If the ions under investigation include potassium, fluoride, bromide, iodide, nitrate, or thiocyanate, a different template parameters file is required. The custom parameter file can be specified by adding paramsfile=$SILCSBIODIR/utils/python/excipients/charge_est/params_ions.tmpl to the 1_setup_silcs_hotspots command.
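
    The rule above can be expressed as a small check. This is only an illustrative sketch: the helper name extra_setup_args and the lowercase ion spellings are assumptions made for this example and are not part of the SilcsBio software; only the list of ions and the paramsfile argument come from the text above.

```python
# Ions that, per the text above, need the ion-specific template parameter file.
# The lowercase spellings are an assumption made for this sketch.
IONS_NEEDING_CUSTOM_PARAMS = {
    "potassium", "fluoride", "bromide", "iodide", "nitrate", "thiocyanate",
}

# Argument quoted in the documentation; $SILCSBIODIR is left for the shell to expand.
PARAMS_ARG = (
    "paramsfile=$SILCSBIODIR/utils/python/excipients/charge_est/params_ions.tmpl"
)


def extra_setup_args(ions):
    """Return the extra 1_setup_silcs_hotspots argument, if any (hypothetical helper)."""
    needs_custom = any(ion.lower() in IONS_NEEDING_CUSTOM_PARAMS for ion in ions)
    return [PARAMS_ARG] if needs_custom else []


print(extra_setup_args(["sodium", "chloride"]))     # no listed ion present
print(extra_setup_args(["potassium", "chloride"]))  # potassium is a listed ion
```

    The returned argument would then be appended to the 1_setup_silcs_hotspots command as described above.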

  2. Calculate the estimated protein effective charge:

    After running 2_collect_hotspots, use the following command to calculate the protein effective charge for a given solution:

    python $SILCSBIODIR/utils/python/excipients/charge_est/charge_calc.py \
         --protein_charge <sequence-based protein charge> \
         --cation <cation name> --anion <anion name> \
         --cation_conc <cation concentration (mM)> --anion_conc <anion concentration (mM)>
    

    Required parameters:

    • Sequence-based charge of the protein:

      --protein_charge <sequence-based protein charge>
      

      The sequence-based charge of the protein may be calculated with external software such as PROPKA.

    • Names of the cation and anion:

      --cation <cation name>
      --anion <anion name>
      
    • Concentrations of the cation and anion in mM:

      --cation_conc <cation concentration (mM)>
      --anion_conc <anion concentration (mM)>
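
      For scripted workflows, the five required parameters can be assembled into the command shown above. The following is a minimal sketch, not part of the SilcsBio distribution: the function name build_charge_calc_command and the numeric inputs are hypothetical, and the ion names are left as placeholders because the accepted name strings are not listed here.

```python
import os
import shlex


def build_charge_calc_command(protein_charge, cation, anion,
                              cation_conc_mm, anion_conc_mm):
    """Assemble the charge_calc.py call from the five required parameters."""
    # Fall back to the literal variable name if SILCSBIODIR is not set,
    # so the printed command is still readable.
    silcsbio_dir = os.environ.get("SILCSBIODIR", "$SILCSBIODIR")
    script = f"{silcsbio_dir}/utils/python/excipients/charge_est/charge_calc.py"
    return [
        "python", script,
        "--protein_charge", str(protein_charge),
        "--cation", cation,
        "--anion", anion,
        "--cation_conc", str(cation_conc_mm),  # concentration in mM
        "--anion_conc", str(anion_conc_mm),    # concentration in mM
    ]


# Hypothetical inputs: sequence-based charge of +5 and 150 mM of each ion.
# Replace the placeholders with the ion names the script expects.
cmd = build_charge_calc_command(5, "<cation name>", "<anion name>", 150, 150)
print(shlex.join(cmd))
```

      Building the command as a list keeps each flag paired with its value and allows it to be passed directly to subprocess.run once the placeholders are filled in.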
      

    Optional parameters:

    • Name of the buffer molecule:

      --buf <buffer name>
      
    • Name(s) of the excipient(s):

      --excipient <excipient name>
      --excipient_2 <second excipient name>
      --excipient_3 <third excipient name>
      

      Note that the protein effective charge estimator can accommodate up to 3 excipients.

    • Concentration of the buffer molecule:

      --buffer_conc <buffer concentration>
      
    • Concentration of the excipient(s):

      --excipient_conc <excipient concentration>
      --excipient_2_conc <second excipient concentration>
      --excipient_3_conc <third excipient concentration>
      
    • Path and name of the SILCS-Hotspots directory:

      --hotspots_dir <path and name of hotspots directory>
      

      By default, --hotspots_dir will be set to 4_hotspots.

    • Charge of the cation:

      --cation_charge <cation charge>
      

      By default, --cation_charge will be set to 1.

    • Charge of the anion:

      --anion_charge <anion charge>
      

      By default, --anion_charge will be set to -1.

    • Charge of the buffer:

      --buffer_charge <buffer charge>
      

      By default, --buffer_charge will be set to 0.

    • Charge of the excipient(s):

      --excipient_charge <excipient charge>
      --excipient_2_charge <second excipient charge>
      --excipient_3_charge <third excipient charge>
      

      By default, --excipient_charge, --excipient_2_charge, and --excipient_3_charge will be set to 0.

    • Occupancy cutoff of the buffer:

      --bufocc <buffer occupancy cutoff>
      

      By default, --bufocc will be set to 0.9.

    • Occupancy cutoff of the excipient(s):

      --excocc <excipient occupancy cutoff>
      

      By default, --excocc will be set to 0.9. This cutoff will be applied to all excipients.

    • Occupancy cutoff of the cation:

      --catocc <cation occupancy cutoff>
      

      By default, --catocc will be set to 0.70 for sodium or 0.85 for potassium. If the cation is not sodium or potassium, then --catocc must be specified.

    • Occupancy cutoff of the anion:

      --anocc <anion occupancy cutoff>
      

      By default, --anocc will be set to 0.70 for chloride, 0.75 for fluoride, 0.67 for bromide, 0.60 for iodide, 0.67 for nitrate, and 0.60 for thiocyanate. If the anion is none of the listed anions, then --anocc must be specified.

    • Clustering radius for the cation:

      --catrad <cation clustering radius>
      

      By default, --catrad will be set to 2.75 for sodium or 3.25 for potassium. If the cation is not sodium or potassium, then --catrad must be specified.

    • Clustering radius for the anion:

      --anrad <anion clustering radius>
      

      By default, --anrad will be set to 3.75 for chloride, 3.50 for fluoride, 4.25 for bromide, 4.50 for iodide, 3.75 for nitrate, and 4.50 for thiocyanate. If the anion is none of the listed anions, then --anrad must be specified.
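
    The ion-specific defaults for the occupancy cutoffs and clustering radii listed above can be summarized as a lookup table. The sketch below only restates the documented defaults; the function resolve_ion_settings, the lowercase ion spellings, and the calcium values in the usage lines are hypothetical and do not describe how charge_calc.py is implemented.

```python
# Default (occupancy cutoff, clustering radius) pairs as listed in this section.
CATION_DEFAULTS = {
    "sodium": (0.70, 2.75),
    "potassium": (0.85, 3.25),
}
ANION_DEFAULTS = {
    "chloride": (0.70, 3.75),
    "fluoride": (0.75, 3.50),
    "bromide": (0.67, 4.25),
    "iodide": (0.60, 4.50),
    "nitrate": (0.67, 3.75),
    "thiocyanate": (0.60, 4.50),
}


def resolve_ion_settings(ion, defaults, occ=None, rad=None):
    """Return (occupancy cutoff, clustering radius), preferring explicit values."""
    default_occ, default_rad = defaults.get(ion.lower(), (None, None))
    occ = default_occ if occ is None else occ
    rad = default_rad if rad is None else rad
    if occ is None or rad is None:
        # Mirrors the rule that unlisted ions need both values given explicitly.
        raise ValueError(f"No defaults for '{ion}': specify both cutoff and radius")
    return occ, rad


print(resolve_ion_settings("bromide", ANION_DEFAULTS))
# Hypothetical values for an ion without documented defaults.
print(resolve_ion_settings("calcium", CATION_DEFAULTS, occ=0.8, rad=3.0))
```

    An ion that is absent from these tables has no default, which corresponds to the cases above in which --catocc, --catrad, --anocc, or --anrad must be specified on the command line.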