back to Table of Contents

5. Energy function files and modifiers

5.1 Overview

For a more extensive description of the EGAD energy function, please examine the JMB paper (Pokala and Handel 2005), as well as Chapters 3 and 4 of my thesis. The description in the JMB paper is more up-to-date. For most routine work, the default values or values in the supplied energy function files should suffice. Briefly, protein design attempts to find sequences that maximize the energy gap between the target structure and alternative, non-target structures such as the unfolded state, aggregates, and molten globules (Figure 5.1.1).

A FORCEFIELD_FILE must be defined in the inputfile. Energy function filenames other than FORCEFIELD_FILE may also be defined in the inputfile, but do not need to be, since the included forcefield files already contain these filenames. Energy function files defined in the inputfile supercede filenames listed within energy function files. If any energy function modifiers are listed in the inputfile, the modifiers supercede the forcefield file and/or default values.

There are minimally five files required for the EGAD energy function:
         forcefield/atomparam file - describes vdW parameters, torsion classification, and atom solvation parameters for transfer free energies for each atomtype. Also contains global energy function parameters like Born radius calculation factors, empirical scale factors, and dielectric constant. It may also contain filenames for the other required files.
         torsion file - contains parameters for calculating energies between atoms that are three bonds apart
         resparam (residue parameter) file - describes atom and bonding topology for each residuetype and defines charges for each atom. Also contains reference state energies and sidechain-backbone vdW energy filters for each residuetype. May also contain the filename for the rotamer library file (superceded by the rotamer filename in the forcefield file, if any).
         rotamer library file - contains the rotamer dihedral angles.
         SASA points file - binary file that contains the boolean masks lookup table for calculating solvent accessible surface areas . Must generate a unique file for each architecture/ operating system. If this file does not exist or is not defined, the points will be generated automatically.

Files, file formats, and options are described in greater detail in the Appendix: Energy function file formats.

5.2 Energy function modifiers

There are several inputfile modifiers that may be used to alter the energy function. In all cases, the inputfile value supercedes the default or energy function file value.

These values are defined in the forcefield file and likely do not need to be adjusted except during energy function development. The defaults are listed:
VDW_RADII_SCALE                       1.0   # scale factor on vdw radius)
VDW_CUTOFF                         1000.0   # E ≥ VDW_CUTOFF are truncated to this
VDW_CLASH_ENERGY                      2.75  # EvdW ≥ VDW_CLASH_ENERGY defined as a clash
VDW_ATTRACTIVE_FACTOR                 1.0   # scale EvdW < 0
VDW_REPULSIVE_FACTOR                  1.0   # scale EvdW > 0
VDW_LINEAR_START_POINT                      # if not defined, standard lennard-jones is used
Ep                                    0.0   # protein dielectric; must be defined in the
                                            # forcefield file or in the inputfile
CORESURF_MEA_SASA                 26.0   # core-surface-interface SASA definition
CORESURF_MEA_BORN                  3.0   # core-surface-interface Born radius definition          
GENERAL_ASP                        0.0   # general atomic solvation parameter)   
HYDROPHOBIC_ASP                    0.0   # atomic solvation parameter for hphob)
WEIGHT_VDW                         1.0   # empirical weight on vdW energies)
WEIGHT_ELECTROSTATICS              1.0   # empirical weight on electrostatic energies)
WEIGHT_1_4                         1.0   # empirical weight on 1,4 energies)
OVERALL_ENERGY_SCALE               1.0   # empirical slope for stability prediction)
INTRA_ROTAMER_FLAG                   1   # 1=measure intra-rotamer energy)
UNSAT_HBOND_REFSTATE (-VDW_CUTOFF)       # reference state for unsatisfied ligand hbond groups
N_HBOND_UNSATISFIED_REFSTATE     (UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn N
H_HBOND_UNSATISFIED_REFSTATE     (UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn H
O_HBOND_UNSATISFIED_REFSTATE     (UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn O

These files must be defined in the inputfile if they are not defined in the forcefield file:
RESPARAM_FILE
ROTAMER_LIBRARY           # may also be defined in the resparam file)
SASA_POINTS_FILE          # if not defined anywhere, or does not exist, will be generated)

These are flags used to turn various terms on(1) or off(0). The defaults are listed:
VDW_FLAG                           1        # vdW
COULOMB_FLAG                       1        # coulombic electrostatics
TORSION_FLAG                       1        # 1,4 energies
SASA_FLAG                          1        # solvent-accessible surface area based energies
GB_FLAG                            1        # Generalized Born model
GBSA_FLAG
or SOLVATION_FLAG           1        # if 0, then GB_FLAG=0 and SASA_FLAG=0
HBOND_FLAG                         0        # angle-dependent hydrogen bonding from Dreiding
HBOND_SPECIFICITY_FLAG             1        # specificity is considered for designs

By default, charges on atoms in ionizable groups are defined to be the pH-dependent boltzmann average of the conjugate acid and base forms. CHARGES_PH_DEPENDENT_FLAG 1 # for pK calculations, default CHARGES_PH_DEPENDENT_FLAG 0

For JOBTYPE pK however, both forms are automatically explicitly considered; equivalent to setting CHARGES_PH_DEPENDENT_FLAG 0

If inclusion of rotamers extracted from the template structure (default) is not wanted, setting
AVOID_NATIVE_ROTAMER_FLAG 1      # default 0; native rotamers permitted
will ensure that native (i.e.: from the template structure) rotamers are not permitted in the move space. However, energies that include native rotamers will be calculated for the lookup table. This permits using the same lookup table files for calculations that either include or exclude native rotamers.

5.3 Description of energy function datastructures and input functions

5.3.1 Overview of energy function datastructures

The three main data structures for energy function information are the RESPARAM (residue parameter), ATOMPARAM (atomic parameter), and ROTAMERLIB (rotamer library) arrays. In general, functions in EGAD access RESPARAM or its substituent ATOMRESPARAM (atom-in-a-residue parameter) array. Therefore, ATOMPARAM and ROTAMERLIB elements are linked up to the appropriate RESPARAM or ATOMRESPARAM elements. Constituent elements of all these structures are described in further detail in the energy function file formats appendix and in structure_types.h.

There is an ATOMPARAM element for each forcefield atomtype. An ATOMPARAM element contains information such as the vdW, torsion, and surface-area parameters.

A RESPARAM element contains information about a residuetype, such as reference state energies, number of rotatable bonds, overall charge, pKa, etc. It contains the ATOMRESPARAM atom array, which contains an element for each atom in the residue. Each atom element contains information about an atom within a residue, such as its name, charge, and bond connectivity. atom also contains a pointer atom_ptr to the relevant ATOMPARAM element. For energy calculations, the mini_pdbATOM struct for an atom contains a pointer atom_ptr to the relevant ATOMRESPARAM element. The minipdb[i].atom_ptr->atom_ptr motif is therefore a common one. There are also a few pointer arrays that point to atom elements; these are used for sidechain-sidechain energy calculations during lookup table generation.

A RESPARAM element also contains a pointer rotamerlib_ptr to the relevant ROTAMERLIB element. ROTAMERLIB contains information about a rotamer library, such as the number of rotamers, their frequencies, etc. It also contains the ROTAMER rotamer array. Each rotamer element contains information about a rotamer, such as its probability and dihedral angles. It also contains structures that store internal energy and desolvation components for that rotamer.

5.3.2 Description of read_forcefield.cpp: read_forcefield

The energy function data are stored in protein->atomparam, protein->resparam, and protein->rotamerlib, which are sent as arguments to read_forcefield.cpp: read_forcefield by input_stuff.cpp: input_stuff. read_forcefield parses the FORCEFIELD_FILE defined in the inputfile, and then calls functions to parse and process the other energy function files (read_torsiondata, read_residuedata, read_rotamers, and SASA.cpp: SASA_readfiles). Pointers are set up to link the appropriate elements of the ATOMPARAM, RESPARAM, and ROTAMERLIB arrays to each other.

Many energies are precalculated at this point. Functions are called to pre-calculate tables for atompair vdW and electrostatics calculations (calculate_charge_products). These tables (CHARGE_PRODUCT, VDW_WELLDEPTH, VDW_SIGMA6_SCALE, VDW_WELLDEPTH_14, VDW_SIGMA6_14) save time by avoiding redundant multiplications of charges and square roots of vdW parameter products. Components of desolvation due to bound atoms whose relative conformations will remain unchanged are calculated (volume_bonded, energy_functions.cpp: born_bonded). Finally, the internal energies of sidechain rotamers are precalculated (energy_functions.cpp: intrinsic_rotamer_energy).

back to Table of Contents