back to Table of Contents
5. Energy function files and
modifiers
5.1 Overview
For a more extensive description of the EGAD energy function, please
examine the JMB paper (Pokala
and Handel 2005), as well as Chapters 3 and 4 of my thesis.
The description in the JMB paper is more up-to-date. For most routine
work, the default values or values in the supplied energy function
files should suffice. Briefly, protein design attempts to find
sequences that maximize the energy gap between the target structure and
alternative, non-target structures such as the unfolded state,
aggregates, and molten globules (Figure 5.1.1).
A FORCEFIELD_FILE must be defined in the
inputfile. Energy function filenames other than FORCEFIELD_FILE
may also be defined in the inputfile, but do not need to be, since the
included forcefield files already contain these filenames. Energy
function files defined in the inputfile supercede filenames listed
within energy function files. If any energy function modifiers are
listed in the inputfile, the modifiers supercede the forcefield file
and/or default values.
There are minimally five files required for the EGAD energy function:
forcefield/atomparam file - describes vdW parameters, torsion
classification, and atom solvation parameters for transfer free
energies for each atomtype. Also contains global energy function
parameters like Born radius calculation factors, empirical scale
factors, and dielectric constant. It may also contain filenames for the
other required files.
torsion file - contains parameters
for calculating energies between atoms that are three bonds apart
resparam (residue parameter) file - describes atom and bonding
topology for each residuetype and defines charges for each atom. Also
contains reference state energies and sidechain-backbone vdW energy
filters for each residuetype. May also contain the filename for the
rotamer library file (superceded by the rotamer filename in the
forcefield file, if any).
rotamer library file - contains the rotamer dihedral angles.
SASA points file - binary file that contains the boolean masks
lookup table for calculating solvent accessible surface areas . Must
generate a unique file for each architecture/ operating system. If this
file does not exist or is not defined, the points will be generated
automatically.
Files, file formats, and options are described in greater detail in the
Appendix: Energy function file
formats.
5.2 Energy function modifiers
There are several inputfile modifiers that may be used to alter the
energy function. In all cases, the inputfile value supercedes the
default or energy function file value.
These values are defined in the forcefield file and likely do not need
to be adjusted except during energy function development. The defaults
are listed:
VDW_RADII_SCALE
1.0 #
scale factor on vdw radius)
VDW_CUTOFF
1000.0
# E ≥ VDW_CUTOFF are truncated to this
VDW_CLASH_ENERGY
2.75 # EvdW ≥ VDW_CLASH_ENERGY defined as a
clash
VDW_ATTRACTIVE_FACTOR
1.0 # scale EvdW < 0
VDW_REPULSIVE_FACTOR
1.0 # scale EvdW > 0
VDW_LINEAR_START_POINT
# if not defined, standard lennard-jones is used
Ep
0.0 # protein dielectric; must be defined in the
# forcefield file or in the inputfile
CORESURF_MEA_SASA
26.0 # core-surface-interface SASA definition
CORESURF_MEA_BORN
3.0 # core-surface-interface Born radius definition
GENERAL_ASP
0.0 # general atomic solvation parameter)
HYDROPHOBIC_ASP
0.0 # atomic
solvation parameter for hphob)
WEIGHT_VDW
1.0 # empirical weight on vdW energies)
WEIGHT_ELECTROSTATICS
1.0 # empirical
weight on electrostatic energies)
WEIGHT_1_4
1.0 # empirical weight on 1,4 energies)
OVERALL_ENERGY_SCALE
1.0 # empirical
slope for stability prediction)
INTRA_ROTAMER_FLAG
1 # 1=measure intra-rotamer energy)
UNSAT_HBOND_REFSTATE (-VDW_CUTOFF)
# reference state for unsatisfied ligand hbond groups
N_HBOND_UNSATISFIED_REFSTATE
(UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn N
H_HBOND_UNSATISFIED_REFSTATE
(UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn H
O_HBOND_UNSATISFIED_REFSTATE
(UNSAT_HBOND_REFSTATE) # ref.state for unsat bkbn O
These files must be defined in the inputfile if they are
not defined in the forcefield file:
RESPARAM_FILE
ROTAMER_LIBRARY
# may also be defined in the resparam file)
SASA_POINTS_FILE
# if not defined anywhere, or does not exist, will be generated)
These are flags used to turn various terms on(1) or off(0). The
defaults are listed:
VDW_FLAG
1
# vdW
COULOMB_FLAG
1
# coulombic electrostatics
TORSION_FLAG
1
# 1,4 energies
SASA_FLAG
1
# solvent-accessible surface area based energies
GB_FLAG
1
# Generalized Born model
GBSA_FLAG or SOLVATION_FLAG
1
# if 0, then GB_FLAG=0 and SASA_FLAG=0
HBOND_FLAG
0
# angle-dependent hydrogen bonding from Dreiding
HBOND_SPECIFICITY_FLAG
1
# specificity is considered for designs
By default, charges on atoms in ionizable groups are
defined to be the pH-dependent boltzmann average of the conjugate acid
and base forms. CHARGES_PH_DEPENDENT_FLAG
1 # for pK calculations, default
CHARGES_PH_DEPENDENT_FLAG 0
For JOBTYPE pK however, both forms are
automatically
explicitly considered; equivalent to setting CHARGES_PH_DEPENDENT_FLAG 0
If inclusion of rotamers extracted from the template structure
(default) is not wanted, setting
AVOID_NATIVE_ROTAMER_FLAG
1
# default 0; native rotamers permitted
will ensure that native (i.e.: from the template
structure) rotamers are not permitted in the move space. However,
energies that include native rotamers will be calculated for the lookup
table. This permits using the same lookup table files for calculations
that either include or exclude native rotamers.
5.3 Description of energy function datastructures
and input functions
5.3.1 Overview of energy function
datastructures
The three main data structures for energy function information are the RESPARAM (residue parameter), ATOMPARAM (atomic parameter), and ROTAMERLIB (rotamer library) arrays. In general,
functions in EGAD access RESPARAM or its
substituent ATOMRESPARAM
(atom-in-a-residue parameter) array. Therefore, ATOMPARAM
and ROTAMERLIB elements are linked up to
the appropriate RESPARAM or ATOMRESPARAM
elements. Constituent elements of all these structures are described in
further detail in the energy function file formats appendix and in structure_types.h.
There is an ATOMPARAM element for each
forcefield atomtype. An ATOMPARAM element
contains information such as the vdW, torsion, and surface-area
parameters.
A RESPARAM element contains
information about a residuetype, such as reference state energies,
number of rotatable bonds, overall charge, pKa, etc. It contains the ATOMRESPARAM atom
array, which contains an element for each atom in the residue. Each atom element contains information about an atom
within a residue, such as its name, charge, and bond connectivity. atom also contains a pointer atom_ptr to the relevant ATOMPARAM
element. For energy calculations, the mini_pdbATOM
struct for an atom contains a pointer atom_ptr
to the relevant ATOMRESPARAM element. The minipdb[i].atom_ptr->atom_ptr motif
is therefore a common one. There are also a few pointer arrays that
point to atom elements; these are used for sidechain-sidechain energy
calculations during lookup table generation.
A RESPARAM element also contains a pointer
rotamerlib_ptr to the relevant ROTAMERLIB element. ROTAMERLIB
contains information about a rotamer library, such as the number of
rotamers, their frequencies, etc. It also contains the ROTAMER rotamer
array. Each rotamer
element contains information about a rotamer, such as its probability
and dihedral angles. It also contains structures that store internal
energy and desolvation components for that rotamer.
5.3.2 Description of read_forcefield.cpp: read_forcefield
The energy function data are stored in protein->atomparam,
protein->resparam, and protein->rotamerlib, which are sent as
arguments to read_forcefield.cpp: read_forcefield
by input_stuff.cpp: input_stuff. read_forcefield parses the FORCEFIELD_FILE
defined in the inputfile, and then calls functions to parse and process
the other energy function files (read_torsiondata,
read_residuedata, read_rotamers,
and SASA.cpp: SASA_readfiles). Pointers
are set up to link the appropriate elements of the ATOMPARAM,
RESPARAM, and ROTAMERLIB
arrays to each other.
Many energies are precalculated at this point. Functions are called to
pre-calculate tables for atompair vdW and electrostatics calculations (calculate_charge_products). These tables (CHARGE_PRODUCT, VDW_WELLDEPTH, VDW_SIGMA6_SCALE,
VDW_WELLDEPTH_14, VDW_SIGMA6_14)
save time by avoiding redundant multiplications of charges and square
roots of vdW parameter products. Components of desolvation due to bound
atoms whose relative conformations will remain unchanged are calculated
(volume_bonded, energy_functions.cpp:
born_bonded). Finally, the internal energies of sidechain
rotamers are precalculated (energy_functions.cpp:
intrinsic_rotamer_energy).
back to Table of Contents