EGAD Library Manual

Energy Function

Vacuum Energies

The vacuum components of the EGAD forcefield model the energies of molecules without the presence of solvent. It is useful to separate these components from those that consider solvent effects because the physics of molecular interactions in vacuum is well understood and relatively easy to calculate.

EGAD employs the OPLS all-atom forcefield to model its vacuum energies. This is a field similar to the AMBER or CHARMM fields used in molecular mechanics simulations. It consists of a simple Lennard-Jones term for Van-der-Waals (VDW) interactions and a coulombic term for electorstatics. Both of these components are atom-wise: that is, they can be broken down to a sum of comparisons between each atom in the molecules being compared.

Lennard-Jones

VDW forces are present in any interaction between two atoms. They have both an attractive and a repulsive component. At long distances, the attractive component acts to bring atoms closer together. Conversely, at close distances, the repulsive component acts to push them apart. Each atom type has an equilibrium radius at which the forces balance in an energy minimum.

A common approximation of the VDW potential is the Lennard-Jones (or 6-12) potential:

E=4*epsilon*(sigma^12/r^12-sigma^6/r^6)

Where sigma is the equilibrium raidus, r is the current distance between atoms, and epsilon is an empirical value called the well-depth. Computationally, one advantage of this potential is that the 12th power term (repulsive) is easy to calculate once the 6th power term (attractive) has been determined.

It is immediately obvious that there a number of parameters that we will have to load to use this energy term. We need to know the equilibrium radii and well-depths for every atom type compared to every other type. This is data that is taken from OPLS and resides in an input file called the Forcefield File. We used this file earlier to load data about what atom types were available.

In EGAD Library, the data for this energy term is stored internally in a VDWParams class. After initializing this class off of the Forcefield File, we can use it immediately to create a LennardJones energy term:

Example 3.1: Setting up the Lennard-Jones energy term

// Continue from example 2.9

// Load the VDW parameters from the Forcefield File
VDWParams vdwParameters;
vdwParameters.Load(forcefield, atomTypes);

// Create the Lennard-Jones energy term
LennardJones energyLJ(&myProtein, &vdwParameters);

You might notice that this is pretty easy, compared to setting up the physical model. Most of the hard work has already been done! Creating an energy function basically consists of telling your client application which energy terms you want it to use to score your protein conformations, and where it can find the data to do that scoring.

Important

Take note of the constructor for LennardJones. It accepts parameters by pointer. This is a convention in EGAD Library. It means that LennardJones is not making an internal copy of myProtein or vdwParameters. Instead, it will refer directly to those objects when making energy calculations. This also means that changes in myProtein and vdwParameters will be reflected in the behavior of LennardJones.

Scaling Equilibrium Radii

We mentioned earlier that the physical model makes approximations like using rotamers and a fixed backbone and that these approximations would have implications on our energy function. The black curve below shows a typical Lennard-Jones calculation for two atoms moving towards each other:

Lennard-Jones Graph

The important thing to note is the steepness of the repulsive force as the atoms approach each other. If the rotamer is even a tenth of an angstrom off, it could result in a huge difference in energy that would not be physically realistic (due to relaxation of the side-chain's conformation). The same holds for errors due to the fixed backbone approximation.

There are two common solutions to this problem, illustrated by the red and green curves. The green curve shows the same potential when the equilibrium radius is scaled down, effectively making the atoms smaller so they don't bump against each other as much. This has the advantage of having a direct physical interpretation, but the disadvantage that it can make some interactions that were previously unfavorable appear favorable.

If the goal is simply to scale down the unfavorable energies, a linearized repulsive term can be used (red curve). In this case, when the function crosses from favorable to unfavorable, the 12th power term is changed to a linear one.

The first approximation is easy to implement in the context of our previous example, while the second one requires the use of a separate class.

Example 3.2: Setting the equilibrium radius scale factor

// Continue from example 3.1

// Scale back our radii to 95% of their original values
energyLJ.SetScaleFactor(0.95);

Scaling the equilibrium radius is trivial. The real art is in finding an appropriate raidus to scale to. There is no good answer to this question, though multiple labs have found success using values between 90 and 100%. Currently, the best way to find a good scaling factor is to try more than one of them and see which one produces the best output. The importance of this scaling factor will also vary based on the other components of the forcefield.

Linearized Lennard-Jones

EGAD Library also contains an implementation of a linearized Lennard-Jones potential. This potential is derived from the one used in the original EGAD application. It is set up in a similar fashion to the normal Lennard-Jones.

Example 3.3: Setting up a linearized Lennard-Jones

// Continue from example 2.9

// Load the VDW parameters from the Forcefield File
VDWParams vdwParameters;
vdwParameters.Load(forcefield, atomTypes);

// Create the linearized Lennard-Jones term
LJ_Linearized energyLJLinear(&protein, &vdwParameters);

Of course, there are a number of arbitrary parameters associated with the linearized Lennard-Jones potential. Finding "reasonable" values for these is difficult. For more information about them, see the documentation for the LJ_Linearized class.

Coulombic

Electrostatic forces are important in determining protein structure. The most basic model of electrostatic interactions between two atoms is Coulomb's law:

F=(q1*q2)/(4*pi*epsilon*r^2)

Where q1 and q2 are the charges on both atoms, the constants on the bottom boil down to one number called Coulomb's constant, and r is the distance between the atoms. By doing some simple calculus, can turn into an equation for work/energy:

E=(q1*q2)/(4*pi*epsilon*r)

Like any good physicist, we ignore the sign change because its an arbitrary choice. Unlike VDW, the Coulombic energy term does not have a lot of empirical values. The only time Coulombic interactions get complicated is when they are not in a vacuum. When there is a uniform medium separating the two charges, it shields the Coulombic interaction. This is represented by a dielectric constant, D, in the denominator:

E=(q1*q2)/(4*pi*epsilon*D*r)

The dielectric for water (which shields charges very well) is around 80, while the dielectric for the interior of a protein (which does not shield charges very well) varies around 4 or 8.

In EGAD Library, the 'vacuum' energy term for Coulombic interactions includes this dielectric and defaults to 8, for the interior of a protein. This is the only empirical parameter to set when constructing a Coulombic energy term.

Example 3.4: Setting up the Coulombic energy term

// Continue from example 3.2

// Create the Coulombic energy term
Coulombic energyC(&myProtein);

// Set the dielectric constant to 8.0, even though it already
// defaults to that value. (For practice, of course)
energyC.SetDielectricConst(8.0);

Since we have a dielectric constant, this term isn't truly a 'vacuum' energy term. However, this term ignores the fact that there is a difference in dielectric between the interior of the protein and its hydrated surface. This requires more complicated terms described in the solvation section.

For very simple physical models, this energy term may be sufficient to describe the electrostatic interactions of the system. For most current protein design applications, it has to be augmented by the additional terms mentioned.

Torsional Correction

Both the VDW and Coulombic potentials assume that the atoms they are comparing do not interact through molecular bonds. This is a valid assumption for most pairs of atoms in the protein, but atoms that are connected by three or fewer bonds do not satisfy this assumption.

Atoms with two or fewer bonds are too close to be modeled accurately by these potentials and are ignored. Atoms with exactly three bonds between them (with two intervening atoms) are close enough to interact through bonds, but far enough away to be accurately modeled. OPLS scales these torsional energies down to half of their unbonded values and adds an empirical energy correction to them.

The scaling by half is done by the terms we have already set up, but the empirical correction is found in a third term. This reads parameters from a torsion parameter file. The relevant atom types for torsion files is a subset of those found in the Forcefield File, so it also reads the Forcefield File to find this mapping.

Example 3.5: Setting up the Torsional Correction energy term

// Continue from example 3.4

// Load the torsion parameter file from disk
std::ifstream finTorsions("opls.torsions");
TorsionFile torsions;
torsions.Load(finTorsions);
finTorsions.close();

// Load the parameters from the input files
TorsionalParams torParams;
torParams.LoadTorsions(torsions, atomTypes);
torParams.LoadTypesFromForcefield(forcefield, atomTypes);

// Set up the torsional correction energy term
TorsionalCorrection energyT(&myProtein, &atomTypes, &torParams);