9. Pair energy lookup table generation
9.1 Overview
In order to make energy calculations for rotamer optimization
efficient, EGAD employs a pair energy lookup table. The EGAD energy
function has been parameterized for pairwise decomposition, permitting
approximation of intrinsically non-pairwise energies, such as continuum
electrostatics and surface-area dependent solvation. For rotamer optimization, rotamer-rotamer and
rotamer-backbone energies are pre-calculated and stored in a lookup
table. These pairwise partial energies are summed as needed during the
optimization to determine the total energy (Figure 9.1.1).
9.1.1 Rotamer complexity, disk and memory usage
Protein design is a complex combinatorial optimization problem. For
total designs, the log10(complexity) scales linearly with
the number of positions allowed to change amino acid identity (Figure 9.1.1.1a). The number of
combinations ranges from 10^60 for 26 fully
variable residues to 10^424 for 194 fully
variable residues. Despite these enormous numbers, as discussed below, the rotamer optimization
methods in EGAD are able to identify the lowest energy optimal sequence
for many of these problems.
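The complexity figures quoted above can be reproduced with a few lines of code. The following is a minimal sketch, not EGAD code: it assumes that complexity is the product of the number of rotamers available at each variable position, and the function name and input vector are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// log10 of the number of rotamer combinations. The product of the
// per-position rotamer counts is accumulated as a sum of logarithms,
// so it cannot overflow even for very large problems.
// rotamers_per_position is a hypothetical input: the total number of
// rotamers (over all permitted residuetypes) at each variable position.
double log10_complexity(const std::vector<long>& rotamers_per_position) {
    double total = 0.0;
    for (std::size_t k = 0; k < rotamers_per_position.size(); ++k) {
        assert(rotamers_per_position[k] > 0);  // each position needs a rotamer
        total += std::log10(static_cast<double>(rotamers_per_position[k]));
    }
    return total;
}
```

As an illustration with made-up counts, 26 positions with 200 rotamers each give a log10(complexity) of about 59.8, which is the same order as the 10^60 quoted for 26 fully variable residues.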
Memory and disk usage scale with the number of variable positions (and
thus with the log10(complexity)). In the worst case, storing the
pairwise lookup table should scale quadratically with the number of
moving positions. However, since many rotamer pairs are too far apart
to interact, the actual scaling exponent is ~1.3. The memory usage
scales from 93MB for 26 fully variable positions to just over 1GB for
194 fully variable positions; these values are well within the memory
range of inexpensive mass-produced computers, suggesting that even very
large problems can be addressed without specialized equipment (Figure 9.1.1.1b). The disk
usage for these problems is significantly larger (up to 5GB for 194
fully variable positions), but is a small fraction of standard
hard-drive sizes (Figure 9.1.1.1c).
9.2 Inputfile options
9.2.1 LOOKUP_TABLE_DIRECTORY
If LOOKUP_TABLE_DIRECTORY is not defined, it defaults to
temp_lookup_directory.pid/
(pid = process ID number).
If the program runs to completion successfully, this directory is
automatically removed. However, if there is a crash or premature exit,
this directory must be removed manually. For most situations, it is
recommended that the lookup table be saved to a defined location.
LOOKUP_TABLE_DIRECTORY directory_path/directory_name
The directory path directory_path must
exist, and must be write-accessible. However, the directory directory_name within directory_path
need not exist a priori; it can be created by the program.
For example, suppose the directory /homedir/joeuser/lookup_tables
exists. Then,
LOOKUP_TABLE_DIRECTORY /homedir/joeuser/lookup_tables/srcSH2
will create the directory srcSH2
within the directory /homedir/joeuser/lookup_tables/.
Since these directory trees can often become quite large (hundreds of
MB to more than a GB), it is recommended that the location be defined, and that
the write-accessible area be large enough for the problem at hand. Plots
of protein size vs. disk usage are shown for the worst case scenario of
total design. For most uses, however, the required disk space will be
much smaller.
If at all possible, it is advantageous to have the lookup table
directory on a disk local to the host performing the calculation (or
the master for some parallelized jobs). Although this is not necessary,
the network overhead required to access a remote disk can be
significant.
Saving to disk permits concurrent or subsequent processes to reuse the
same data, assuming that the forcefield and template pdb structure are
identical. This can save a tremendous amount of CPU time. Each process
loads the data it needs from the disk. If required data is not
available, a process can calculate it, and save it to disk, permitting
other processes to use it. This scheme permits straightforward
parallelization of lookup table generation and utilization, as
discussed later.
9.2.2 PRECALCULATION_LEVEL
For some jobs, it may not be necessary to calculate or load the entire
lookup table into memory at once. For these cases, set
PRECALCULATION_LEVEL 0
This will calculate and/or load only the sidechain-backbone
energies. The sidechain-sidechain energies (which scale between O(n) and O(n^2))
are calculated and/or loaded as needed. As in the default case
(complete precalculation), these sidechain-sidechain energies are saved
to disk.
For almost all non-parallelized jobs, all the sidechain-sidechain pair
energies are likely to be considered at some point during the run, so
there is no significant advantage to using this option. Therefore, this
option is used almost exclusively for launching parallel rotamer
calculation foremen (see multistate design and scanning mutagenesis
sections).
9.2.3 JOBTYPE LOOKUP_TABLE_SLAVE
The lookup table will be loaded/calculated for any rotamer optimization
job. However, there may be cases in which all that is wanted is the
lookup table. For these cases, set
JOBTYPE LOOKUP_TABLE_SLAVE
This causes the program to exit upon completion of lookup table
generation, instead of starting a rotamer optimization.
9.3 Brief description of lookup table generation
If LOGFILE_FLAG 1 (default), a file pid.lookup_log (pid=
process ID number) is created. It serves as a progress monitor. It
exists while the lookup table is being loaded/calculated, and is
automatically deleted upon completion. As the lookup table
reading/writing progresses, the files pid.lookup_log.0.25,
pid.lookup_log.0.5, and pid.lookup_log.0.75 are created, signaling that
the process has completed 25%, 50%, and 75% of the job respectively. If
multiple processes are using the same disk, the existence of pid.lookup_log files can signal that a process
is still in the disk-access intensive stage of lookup table generation,
and the relative state of progress; deletion of these files upon
completion can act as a signal to another process to start its own
disk-access intensive lookup table generation.
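A second process can use these marker files to decide when to start its own disk-intensive phase. The sketch below is an illustration only, not part of EGAD: it takes the set of file names already listed from the working directory and reports how far the process with a given pid has progressed. The function name and the convention of returning -1 when no marker exists are assumptions.

```cpp
#include <cassert>
#include <set>
#include <string>

// Infer the progress of process `pid` from the marker files present.
// Returns -1.0 when no pid.lookup_log* file exists (generation has not
// started, or has finished and the markers were deleted), otherwise the
// largest fraction signalled so far (0.0, 0.25, 0.5 or 0.75).
double lookup_progress(const std::set<std::string>& files,
                       const std::string& pid) {
    assert(!pid.empty());
    const std::string base = pid + ".lookup_log";
    double progress = (files.count(base) != 0) ? 0.0 : -1.0;
    const char* suffixes[] = {".0.25", ".0.5", ".0.75"};
    const double fractions[] = {0.25, 0.5, 0.75};
    for (int k = 0; k < 3; ++k) {
        if (files.count(base + suffixes[k]) != 0) progress = fractions[k];
    }
    return progress;
}
```

A waiting process could poll this value and begin its own lookup table generation once it returns -1.0 for the pid it is watching.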
If a pre-existing lookup table directory is being used, the LOOKUP_TABLE_DIRECTORY/forcefield_info file is
checked. Processes that share the same lookup table can have different
residuetypes at variable positions, or even completely different
variable and fixed positions. However, these processes must have
identical forcefield parameters and use the same template structure. If
there is any discrepancy at all, the program will exit with an ERROR message pointing out the problem. If a
process is creating a new table, it will create the directory tree and
write the forcefield info file.
This file, like all the lookup table files, is a binary file, and is
therefore architecture-dependent; one of the forcefield_info checks is
for architecture equivalence. Note that SGI and x86 machines cannot
share these files. While it might have been more flexible to save these
datafiles as text, the parsing would have been much slower, and the
utilization of disk space significantly greater.
The sidechain-backbone energies are loaded. If these energies are not
available for a particular residue at a given position, the coordinates
and energies are calculated and saved. Sidechain rotamers that have a
vdW interaction energy with the fixed backbone greater than the maximum
defined in the resparam file are filtered out, since it is unlikely
(but not guaranteed) that these are present in the lowest energy
solution. These data are all saved to disk.
At this point, reference state energies are subtracted from the
sidechain-backbone energies. Since it is possible that different
reference state models may be tested or used with the same template
structure and forcefield, these referenced energies are NOT saved to
disk, thus permitting the same lookup table on disk to be shared by
processes that employ different reference state models. The subtraction
of reference state energies adds trivially to the time.
The interaction list that describes what residuetypes at other
positions a given residuetype at a given position interacts with is
loaded. If such a list file does not exist, it is created. If such a
file exists, but does not contain information about new residuetypes at
other positions that were not present in the earlier process that
created it, the file is updated and saved. This interaction list
information is used to guide calculation and memory allocation for
sidechain-sidechain energies.
Sidechain-sidechain pair energy calculation is perhaps the most time- and
memory-intensive step. For many problems, it takes even longer than
the actual optimization itself. Be patient! After this is done, the pid.lookup_log files are deleted.
The data stored in these files are not significantly compressible;
running gzip -r on lookup table directories results in negligible changes (≤ 0.5%)
in disk usage (assayed by du -sk).
9.4 Parallelization of lookup table generation
The lookup table can be calculated in either a serial or a parallel
manner. Parallelization of lookup table generation can speed things up
tremendously for large problems, like total design. In general, for any
problem with more than 20 variable positions which have all
residuetypes as options, parallelization for lookup table generation is
a significant speedup. For smaller problems, such as single-sequence
rotamer optimization of a small protein, the speedup may be smaller due
to network overhead and built-in delays designed to prevent process
collisions. For small jobs (≤ 10^20
rotamer combinations), EGAD automatically switches to serial mode.
Details on running parallel jobs are given in the Parallelization of EGAD jobs section.
9.4.1 Brief description of parallelized lookup table generation
The master process creates a number of slave inputfiles (inputfilename.master_pid.slave.input, master_pid
= process ID number of master process) in the current working directory
(these will be deleted when the parallel phase is completed). The
master process launches (via ssh)
independent foremen processes to each of the defined slave hosts. Each
foreman process in turn launches child slave processes which perform
the actual calculations.
During the calculation, a slave inputfile (inputfilename.master_pid.slave.input)
is mv'd to inputfilename.master_pid.slave.working
to indicate that it is being worked on by a slave process. If the slave
job exits prematurely (but gracefully), the inputfilename.master_pid.slave.working
file is mv'd back to inputfilename.master_pid.slave.input,
putting it back in the queue. After a job corresponding to a slave
inputfile is finished, the slave working file is mv'd
to inputfilename.master_pid.slave.done.
The master process uses the existence of files with these suffixes as
progress meters. After all the slave inputfiles are done (in reality,
the master has timers to prevent hung slave jobs from stalling
everything else), the master deletes all the slave inputfiles, and
loads the entire lookup table into memory, filling missing parts (if
any). Finally, the rotamer optimization job defined in the inputfile is
run, as in the single-processor version. The foremen processes kill
themselves automatically after all the slave input files have been
deleted.
IO_ERRORs may pop up during this process.
These are usually benign, and are likely due to process collisions in
which one of the slave processes dies, or to master/foreman jobs trying to
rename or delete files that have already been renamed or deleted by another
process.
9.5 Description of lookup table generation code
9.5.1 Description of LOOKUP_ENERGY datastructures
A schematic of the organization of the lookup table is shown in Figure 9.5.1.1. Please also look at the
appropriate entries in structure_types.h.
The PROTEIN struct contains the LOOKUP_ENERGY lookupEnergy array, which contains
an element for each non-P/G position i. protein->lookupEnergy[i] contains within it
the LOOKUP_ENERGY_RESIDUE lookupRes array,
which contains an element for each residue i_res
permitted at i.
lookupEnergy[i].lookupRes[i_res] contains
information about residue choice i_res,
such as the average surface areas for the rotamers, and whether this
residuetype should be considered to be fixed for the purpose of adding
up pair energies (fixed_flag). It also
contains the roots for two trees. LOOKUP_ENERGY_RESIDUE_RESIDUE
lookupResRes contains information about what residues j_res at other positions j>i interact with residue i,i_res.
This tree exists only transiently, and is cleared from memory before
loading/calculating sidechain-sidechain pair energies. The LOOKUP_ENERGY_ROTAMER lookupRot array contains
an element for each rotamer i_res_rot.
lookupEnergy[i].lookupRes[i_res].lookupRot[i_res_rot]
(ERESROT) contains information about
rotamer i_res_rot. It has ROTAMER rotamer, which has information about
this rotamer. energy_var_fix contains the
actual energy between this rotamer and the fixed atoms, minus the
reference state energy. It also has the atomic coordinates for this
rotamer in the milli_pdbATOM sideAtoms and
ROTAMERLET rotamerlet (for local
minimization of vdW energies) arrays. Finally, it contains the LOOKUP_ENERGY_X lookupX array, which contains an
element for each other position j>i.
ERESROT.lookupX[j-i] contains the LOOKUP_ENERGY_RESIDUE_X lookupResX array for the
interaction of residues at position j>i with rotamer i,i_res,i_res_rot.
For each permitted residue j_res at
position j>i,
ERESROT.lookupX[j-i].lookupResX[j_res-1]
contains the LOOKUP_ENERGY_ROTAMER_X lookupRotX
array. For each rotamer j_res_rot for
residue j_res at position j>i, the
interaction energy with rotamer i,i_res,i_res_rot
is stored in
ERESROT.lookupX[j-i].lookupResX[j_res-1].lookupRotX[j_res_rot-1].energy_var_var
(ERESROT.JRESROT.energy_var_var).
For fixed positions and non-interacting residue and rotamer pairs,
special values are defined by lookup_table.cpp:
initialize_lookuptable_pointers. If residue i,i_res
is defined as fixed, lookupEnergy[i].lookupRes[i_res].lookupRot[1].energy_var_fix
is set to FIXED_POSITION_PTR. If rotamer i,i_res,i_res_rot does not interact with any
residuetype j_res at position j, ERESROT.lookupX[j-i].lookupResX
is set to NON_INTERACT_LOOKUP_RES_X. If
rotamer i,i_res,i_res_rot does not
interact with any rotamer for residuetype j_res
at j, then ERESROT.lookupX[j-i].lookupResX[j_res-1].lookupRotX
is set to NON_INTERACT_LOOKUP_ROT_X.
If residues i,i_res and j,j_res have at least one interacting rotamer
pair, but the energies have not yet been calculated or read, ERESROT.lookupX[j-i].lookupResX[j_res-1].lookupRotX is
set to NULL; the reading function must
calculate or read these values. If rotamer i,i_res,i_res_rot
does not interact with rotamer j,j_res,j_res_rot,
ERESROT.JRESROT.energy_var_var is set to NON_INTERACT_PTR. Functions that read energies
from the lookup table must recognize these special values and act
accordingly. For examples, see CHROMOSOME_to_lookupEnergy.cpp:
CHROMOSOME_to_lookupEnergy, dee_utilities.cpp:
initialize_lookuptable_for_dee and dee_utilities.cpp:
get_energy.
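The way a reader has to test for these special values can be sketched as follows. This is a simplified illustration, not the code in dee_utilities.cpp: the three structs are stripped-down stand-ins for the types in structure_types.h, the sentinel objects and the PairStatus enum are hypothetical, and the per-rotamer-pair NON_INTERACT_PTR check is omitted. Only the indexing (lookupX[j-i], lookupResX[j_res-1], lookupRotX[j_res_rot-1]) follows the text.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for the lookup table structs (assumed layout).
struct RotX { double energy_var_var; };
struct ResX { RotX* lookupRotX; };
struct PosX { ResX* lookupResX; };

// Hypothetical sentinel objects; only their addresses are meaningful.
static ResX non_interact_res_x;
static RotX non_interact_rot_x;
ResX* const NON_INTERACT_LOOKUP_RES_X = &non_interact_res_x;
RotX* const NON_INTERACT_LOOKUP_ROT_X = &non_interact_rot_x;

enum PairStatus { PAIR_OK, PAIR_NON_INTERACTING, PAIR_NOT_LOADED };

// Fetch the energy between the rotamer that owns `lookupX` (at position i)
// and rotamer (j, j_res, j_res_rot), with 1-based j_res and j_res_rot.
// Sentinels are checked before any array is dereferenced.
PairStatus pair_energy(const PosX* lookupX, int i, int j,
                       int j_res, int j_res_rot, double* energy) {
    assert(j > i && j_res >= 1 && j_res_rot >= 1);
    *energy = 0.0;
    const PosX& x = lookupX[j - i];
    if (x.lookupResX == NON_INTERACT_LOOKUP_RES_X) return PAIR_NON_INTERACTING;
    RotX* rots = x.lookupResX[j_res - 1].lookupRotX;
    if (rots == NON_INTERACT_LOOKUP_ROT_X) return PAIR_NON_INTERACTING;
    if (rots == nullptr) return PAIR_NOT_LOADED;  // caller must read or calculate
    *energy = rots[j_res_rot - 1].energy_var_var;
    return PAIR_OK;
}
```

Returning a status instead of a bare energy keeps the "not yet loaded" case distinct from a genuine zero interaction, so the caller knows when to calculate or read the missing block.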
9.5.2 Discussion and rationale for the implemented strategy, thoughts on improvement
The lookup table is stored on disk as a directory tree that resembles
the organization of the data in memory.
All the coordinates for residues at position i
are stored in the LOOKUP_TABLE_DIRECTORY/coordinates/i/
directory. The coordinates for the max_vdw-filtered
rotamers (including the native rotamer) for residuetype XYZ at position i
are saved in the file LOOKUP_TABLE_DIRECTORY/coordinates/i/XYZ.i.structure.
The coordinates for just the native rotamer of residuetype XYZ at position i
are stored in LOOKUP_TABLE_DIRECTORY/coordinates/i/XYZ.i.structure.fixed.
This file is read/written if position i is
defined as fixed.
The sidechain-backbone energies for all residues at position i are stored in the LOOKUP_TABLE_DIRECTORY/var_fix/i/
directory. Elements of the LOOKUP_ENERGY_RESIDUE
and the LOOKUP_ENERGY_ROTAMER for
residuetype XYZ at position i (including the native rotamer) are saved in
the file LOOKUP_TABLE_DIRECTORY/var_fix/i/XYZ.i.var_fix_energy.
The LOOKUP_ENERGY_RESIDUE and LOOKUP_ENERGY_ROTAMER for just the native
rotamer of residuetype XYZ at position i are stored in LOOKUP_TABLE_DIRECTORY/coordinates/i/XYZ.i.var_fix_energy.fixed.
This file is read/written if position i is
defined as fixed.
The interaction tables (LOOKUP_ENERGY_RESIDUE_RESIDUE)
for all residues at position i are stored
in the LOOKUP_TABLE_DIRECTORY/interaction_lists/i
directory. The table for residuetype XYZ
at position i is saved in the file LOOKUP_TABLE_DIRECTORY/interaction_lists/i/XYZ.i.interaction_list.
These files are created only if position i
is not defined as fixed.
The sidechain-sidechain interaction energies for all residues at
position i with all residues j>i are stored in
the LOOKUP_TABLE_DIRECTORY/var_var/i/i.j
directory. The sidechain-sidechain interaction energies for residuetype XYZ at position i
with residuetype ABC at position j>i are stored in
the file LOOKUP_TABLE_DIRECTORY/var_var/i/i.j/XYZ.i.ABC.j.var_var_energy. If position i
is fixed, the energies with ABC at
position j are stored in LOOKUP_TABLE_DIRECTORY/var_var/i/i.j/XYZ.i.f.ABC.j.var_var_energy.
Similarly, if j is fixed, the data are
stored in LOOKUP_TABLE_DIRECTORY/var_var/i/i.j/XYZ.i.ABC.j.f.var_var_energy.
If both i and j
are fixed, the energy is NOT saved to disk. If i,XYZ
and j,ABC do not have at least one
interacting rotamer pair, no file is created; based on the LOOKUP_ENERGY_RESIDUE_RESIDUE interaction table,
the program will know not to look for or create the file.
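The naming rules for the sidechain-sidechain energy files can be condensed into one helper. The sketch below is illustrative and not taken from lookup_table_disk_stuff.cpp; the function name and signature are hypothetical, and only the path pattern comes from the description above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build the path of the file holding the energies between residuetype
// i_res at position i and residuetype j_res at position j > i.
// A ".f" marker follows the position number of a fixed position.
// Returns an empty string when both positions are fixed, because that
// energy is not saved to disk.
std::string var_var_path(const std::string& table_dir,
                         const std::string& i_res, int i, bool i_fixed,
                         const std::string& j_res, int j, bool j_fixed) {
    assert(j > i);
    if (i_fixed && j_fixed) return "";
    std::ostringstream path;
    path << table_dir << "/var_var/" << i << "/" << i << "." << j << "/"
         << i_res << "." << i << (i_fixed ? ".f" : "") << "."
         << j_res << "." << j << (j_fixed ? ".f" : "")
         << ".var_var_energy";
    return path.str();
}
```

For example, LEU at position 5 and ALA at position 9, both variable, in a table directory named srcSH2 map to srcSH2/var_var/5/5.9/LEU.5.ALA.9.var_var_energy.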
This scheme results in the creation of many small files. This uses up
disk blocks/inodes less efficiently than an alternative scheme that
uses fewer larger files, or even one large file. However, the small
file strategy does have some advantages over the large file scheme. If
larger files were used, the process that created the file initially
would need to know a priori the requirements for subsequent jobs (or
simply assume the worst case), and place spacer material that could be
overwritten later. Indeed, this scheme is used for the storage of
interaction lists. However, for sidechain-sidechain pair energy
storage, this strategy would result in large files with a lot of filler
material. An alternative strategy would be to use some sort of table of
contents system so information could be appended as needed and found
later as needed (see below for some thoughts on this).
Parallelization is most efficiently accomplished if the job can be
broken into small parts. For this problem, the natural job block units
are the residue-backbone and residue-pair. If a few large files were
used, parallelization would have been less efficient, since a given
file can be written to by only one process at a time. For many
problems, explicit parallelization is not employed or necessary.
However, there is implicit parallelization if succeeding
processes follow each other. If these processes have some lookup table
elements in common, this scheme makes it very straightforward for each
process to read only what it needs, and write anything that it needed
to calculate, making that data available for succeeding processes.
A compromise strategy would be to write data as individual files, thus
avoiding the process-collision problem during parallelization, and then
use tar to concatenate these files into larger files, freeing up space
in partially filled disk blocks. When data is needed, the directory
containing the files of interest is untarred, the files are read, and
the directory re-tarred. Unfortunately, running tar dynamically is slow
(this scheme was tested in an older version of the program). An even
better method would be to read the data directly from the tar file. Tar
files have a table of contents that lists the location of individual
files within the large file, making direct access to files possible
without searching. Another option would be to create a custom, highly
optimized table-of-contents system for accessing files within a large
concatenated file. In any case, since these lookup table directories
are not meant to be kept for long-term storage, and since disk
capacities keep increasing, attempts to further improve this presently
functional implementation were abandoned. As mentioned above, since
these files are not significantly compressible, there is little to be
gained by compression.
9.5.3 Description of lookup table generation functions
The function lookup_table.cpp: generate_lookup_table
manages the generation of protein->lookupEnergy
from the information stored in protein->var_pos.
lookup_table.cpp: initialize_lookuptable_pointers
is called to initialize special pointers and values used for flagging
fixed positions and non-interacting residue and rotamer pairs (see
above). The function lookup_table_disk_stuff.cpp:
check_lookup_table creates the lookup table directory and writes
the forcefield_info file. If the directory
already exists, forcefield parameters are compared to those in the
existing forcefield_info file. If there is
any discrepancy, the program will exit with an explanatory ERROR message.
lookup_table_disk_stuff.cpp:
load_lookupRes_from_disk is called to read sidechain coordinates
and sidechain-backbone energies from disk. If the information for a
residuetype at a given position is not available, the coordinates must
be generated, and the sidechain-backbone energies calculated.
In order to calculate approximate surface areas and born radii,
pseudoatom coordinates are generated and placed in the mini_pdbATOM fixed_atoms array. The surface
areas and associated energies for proline and glycine atoms are
calculated, and then later added to the energy of every rotamer at the
first variable position.
For a given residue choice at a position, the coordinates for each
rotamer are generated, and its energies calculated in turn. These
coordinates and energies are stored temporarily in the PSEUDO_LOOKUP_ENERGY_ROTAMER pseudo_lookupRotamer
array. PSEUDO_LOOKUP_ENERGY_ROTAMER is a
datatype local to lookup_table.cpp. The
information in this struct is condensed into LOOKUP_ENERGY_ROTAMER
for longer-term storage. The born radii and sidechain-backbone energies
are calculated by calling pairwise_energy_calc.cpp:
var_fixed_energy_calc. Once the approximate born radii are
calculated by this function, the internal born energies within the
rotamer are calculated by traversing the COULOMBIC
coulombic linked-list that was generated by energy_functions.cpp:
intrinisic_rotamer_energy during energy function input (see
energy function input section).
If the vdW energy between a rotamer and backbone exceeds resparam.max_vdw for this
residuetype, the
rotamer is flagged for rejection. Acceptable rotamers have their
surface areas calculated. After all the rotamers for a given
residuetype at a given position are measured, Boltzmann probabilities
based on the sidechain-backbone vdW energy (apparent T = 500
K, so that RT = 1) are calculated for acceptable rotamers. Acceptable
rotamers with extremely low probabilities (< 10^-6) are
flagged for rejection. These probabilities are also used to calculate
the average surface area, hydrophobic surface area, and water-octanol
transfer free energy for a residuetype at a position; these are used
for solubility filters (see solubility design section). Finally,
coordinates and energies for un-rejected rotamers are condensed and
stored in LOOKUP_ENERGY_ROTAMER. These
newly calculated coordinates, energies, surface areas, and born radii
are saved to disk by the function lookup_table_disk_stuff.cpp:
save_lookupRes_to_disk.
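The two-stage rotamer filter described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in lookup_table.cpp: the function name is hypothetical, and it assumes the Boltzmann probabilities are normalized over the rotamers that survive the max_vdw cut, with RT = 1 as in the text.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns one flag per rotamer: true = keep, false = rejected.
// vdw_energy holds the sidechain-backbone vdW energy of each rotamer.
std::vector<bool> filter_rotamers(const std::vector<double>& vdw_energy,
                                  double max_vdw,
                                  double min_probability = 1e-6) {
    assert(min_probability >= 0.0);
    const std::size_t n = vdw_energy.size();
    std::vector<bool> keep(n, false);

    // Stage 1: reject rotamers whose vdW energy exceeds max_vdw, and track
    // the lowest surviving energy so that exp() stays well scaled.
    bool any = false;
    double e_min = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        if (vdw_energy[k] > max_vdw) continue;
        keep[k] = true;
        if (!any || vdw_energy[k] < e_min) e_min = vdw_energy[k];
        any = true;
    }
    if (!any) return keep;

    // Partition function over the surviving rotamers (RT = 1).
    double z = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        if (keep[k]) z += std::exp(-(vdw_energy[k] - e_min));
    }

    // Stage 2: reject survivors with a very low Boltzmann probability.
    for (std::size_t k = 0; k < n; ++k) {
        if (keep[k] && std::exp(-(vdw_energy[k] - e_min)) / z < min_probability) {
            keep[k] = false;
        }
    }
    return keep;
}
```

With RT = 1, a rotamer roughly 14 energy units above the best one already falls below the 10^-6 threshold, so the second stage mainly removes rotamers that pass the max_vdw cut but are far from competitive.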
To prevent errors due to collisions between independent processes, any
files that are created (or updated) by a process are named (or renamed)
as filename.PID (PID
= process ID number). After writing the file, the file is renamed to
the appropriate filename.
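The write-then-rename pattern can be sketched as below. It is a minimal illustration, not EGAD's I/O code: the function name is hypothetical, the pid is passed in as a string, and it relies on the POSIX behaviour of rename, which replaces an existing target in a single step so that readers see either the old file or the complete new one.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Write `data` to filename.<pid>, then rename it to filename.
// Returns true on success; the temporary file is removed on failure.
bool write_then_rename(const std::string& filename, const std::string& pid,
                       const std::string& data) {
    assert(!filename.empty() && !pid.empty());
    const std::string temp = filename + "." + pid;
    {
        std::ofstream out(temp.c_str(), std::ios::binary);
        if (!out) return false;
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) {
            out.close();
            std::remove(temp.c_str());
            return false;
        }
    }  // the stream is closed here, before the rename
    if (std::rename(temp.c_str(), filename.c_str()) != 0) {
        std::remove(temp.c_str());
        return false;
    }
    return true;
}
```

Because each process writes under its own pid-suffixed name, two processes that calculate the same block never write into the same file; whichever rename happens last simply wins.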
After all the sidechain-backbone energies are read/calculated,
reference state energies are subtracted from each rotamer's lookupRot[i_res_rot].energy_var_fix. Doing this
calculation after the sidechain-backbone energies have been loaded from
disk permits the same lookup table directory to be used with different
reference state models. In any case, since this step scales O(n)
and is simply a subtraction, only a trivial amount of time is spent.
The next step is to load and/or calculate the LOOKUP_ENERGY_RESIDUE_RESIDUE
interaction table for each residuetype at each position. lookup_table_disk_stuff.cpp: load_lookupResRes_from_disk
is used to read these from disk. If there is missing information,
sidechain pairs that have at least one rotamer pair that interacts
(atom pair distance ≤
FORCEFIELD_DISTANCE_CUTOFF, 10 Å) are
identified. This search is done in a hierarchical manner. First,
positions with interacting CB atoms are defined as interacting for all
residue/rotamer pairs. Remaining position pairs are assayed. These data
are then written to disk by lookup_table_disk_stuff.cpp:
save_lookupResRes_to_disk.
If parameters.disk_lookup_table_flag = 1 (PRECALCULATION_LEVEL 1; default), energies for
all interacting rotamers are read and/or calculated by lookup_table_disk_stuff.cpp: load_lookupResX_from_disk.
This function allocates memory, sets non-interacting sidechain pair
flags, loads, calculates, and writes sidechain pair energies to and
from disk as needed. The rotamer pair energies are calculated by
calling pairwise_energy_calc.cpp:
rotamer_pair_energy, which in turn calls pairwise_energy_calc.cpp:
var_var_energy_calc. Rotamer coordinates are freed from memory
after they are no longer needed.
If PRECALCULATION_LEVEL 0, then memory is
allocated for sidechain pairs, and non-interacting sidechain pair flags
are set. However, memory is not allocated for rotamer pairs; ERESROT.lookupX[j-i].lookupResX[j_res-1].lookupRotX is
set to NULL, signalling the reading
function to allocate memory and calculate or load these values as
needed with CHROMOSOME_to_lookupEnergy.cpp:
load_calc_save_lookupResX. In order to use this function, extern char LOOKUP_TABLE_DIRECTORY[] must be
defined to the proper directory (currently set in input_stuff.cpp:
input_stuff, as well as at the beginning of each rotamer
optimization algorithm control function to protein->parameters.lookup_energy_table_directory).
Positions with one and only one rotamer are marked as fixed for the
purposes of the rotamer optimization. Energies between fixed positions
and moving positions are calculated and added to ERESROT.energy_var_fix
for the moving positions. Energies between fixed positions are added up
and appended to parameters.fixedatoms_energy.
The energy_var_fix energies for the fixed
positions are also appended to parameters.fixedatoms_energy.
Finally,
ERESROT.energy_var_fix for these fixed
positions are set to FIXED_POSITION_PTR,
as described in the data-structure section, and the lookup table
branches rooted from this position are pruned.
9.5.4 Description of parallel lookup table generation functions
If a process is defined as a lookup_table_master
in the command-line, the inputfile is parsed by input_stuff.cpp:
input_stuff. The command-line is parsed by EGAD.cpp:
parse_command_line. If available_processors
are defined in the command-line, these are read and used to write the AVAILABLE_PROCESSORS_FILE /tmp/avail_processors.PID
(PID = process ID number).
protein and the name of the inputfile are
sent to parallel_egad.cpp: lookup_table_master.
This function reads the AVAILABLE_PROCESSORS_FILE,
and counts the number of available CPUs (num_processors).
The file inputfile_top.PID is created. It
is essentially the parameter section from the original inputfile,
except:
OTHER_RESIDUES none
LOGFILE_FLAG 0 # don't create and write logfiles
QUIET_FLAG 1 # don't print error messages
This file is used to generate the parameter section for
all the slave inputfiles.
num_processors job_frag files are created.
Each of these has a subset of the variable position lines
from the original inputfile (shortcuts are expanded). The file PID.n.job_frag has the lines for variable positions
starting at the nth line and continuing with the (num_processors + n)th, the (2*num_processors + n)th,
and so on.
These job_frag files are used, along with
the inputfile_top.PID file, to create the
slave inputfiles. The first num_processors
slave inputfiles PID.n.slave.input are
simply PID.n.job_frag appended to the
contents of inputfile_top.PID. When these
are run, the coordinates, sidechain-self and sidechain-backbone
energies for these positions are calculated and saved. The
sidechain-pair energies between these positions are calculated and
saved as well.
The remaining inputfiles are created by generating variable position
sections from all non-redundant pairs of job_frag
files, and appending these to the contents of inputfile_top.PID.
When these are run, energies between sidechains from the different job_frag blocks are calculated; the coordinates
and energies between sidechains within a given block, as well as their
self and backbone interaction energies, should have been calculated by
the first num_processors slave inputfiles
discussed earlier.
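The job decomposition described above can be sketched as follows. This is an illustration, not the code in parallel_egad.cpp: it assumes the variable position lines are dealt out round-robin, it uses 0-based indices where the text counts from n, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Deal the line indices 0..num_lines-1 round-robin into num_processors
// fragments: fragment n receives lines n, n + num_processors, and so on.
std::vector<std::vector<int> > make_job_frags(int num_lines, int num_processors) {
    assert(num_lines >= 0 && num_processors > 0);
    std::vector<std::vector<int> > frags(num_processors);
    for (int line = 0; line < num_lines; ++line) {
        frags[line % num_processors].push_back(line);
    }
    return frags;
}

// List the slave jobs as pairs of fragment indices. The first
// num_processors jobs cover a single fragment (a, a); the remaining jobs
// cover every non-redundant pair of different fragments (a, b) with a < b.
std::vector<std::pair<int, int> > make_slave_jobs(int num_processors) {
    assert(num_processors > 0);
    std::vector<std::pair<int, int> > jobs;
    for (int a = 0; a < num_processors; ++a) {
        jobs.push_back(std::make_pair(a, a));
    }
    for (int a = 0; a < num_processors; ++a) {
        for (int b = a + 1; b < num_processors; ++b) {
            jobs.push_back(std::make_pair(a, b));
        }
    }
    return jobs;
}
```

Under this reading, P processors yield P + P(P-1)/2 slave inputfiles, so every pair of variable positions is covered by exactly one job.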
The master process launches/forks foremen processes to the slaves by
calling io.cpp: launch_command. It then
waits until the slave inputfiles are all mv'd
to PID.n.slave.working. If only one slave
job remains to be launched, yet has not started after five minutes, the
master goes to the next step and waits for all the PID.n.slave.working
files to be moved to PID.n.slave.done. If
after five minutes, a slave process is still not done, the master
assumes that a job has crashed, and moves on. The master removes all
the temporary slave files, and then loads the lookup table into memory,
filling in missing pieces, if any, and then performs the rotamer
optimization job it was assigned to do.
A lookup table foreman process is defined in the command-line argument
that launched it. In practice, foremen jobs should only be launched by
a master EGAD process, not manually. When detected by main, the inputfile name, along with the master_PID and number of slave inputfiles, are
parsed and sent to parallel_egad.cpp:
lookup_table_foreman. This function checks for the existence of
slave inputfiles. If it finds one, it mv's
it from master_PID.n.slave.input to master_PID.n.slave.working. This renamed file is
used as input for a child (not forked) EGAD lookup_table_slave
process. If this slave process returns 0
(success), master_PID.n.slave.working is
renamed master_PID.n.slave.done. Otherwise,
master_PID.n.slave.working is returned to
the queue by mv'ing it back to master_PID.n.slave.input. The foreman keeps
checking for remaining slave inputfiles until all are working. Once all
the slave inputs are working, it waits until all are done. If one of
these other jobs crashed, it picks up the re-queued inputfile and runs
it. Finally, after all the jobs are done, or when the master removes
all the slave inputfiles, the foreman process exits.
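The foreman's queue handling amounts to claiming a job by renaming its file, then renaming it again according to the slave's exit status. The sketch below illustrates that idea only. The function names are hypothetical, `stem` stands for the master_PID.n.slave part of the file name, and it assumes a file system on which rename either succeeds for exactly one claimant or fails because the source file is already gone.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Try to claim a slave job by renaming stem.input to stem.working.
// Returns false if the inputfile is no longer there, for example because
// another foreman claimed it first.
bool claim_job(const std::string& stem) {
    assert(!stem.empty());
    return std::rename((stem + ".input").c_str(),
                       (stem + ".working").c_str()) == 0;
}

// After the slave process exits: exit status 0 marks the job as done,
// anything else puts the inputfile back in the queue.
bool release_job(const std::string& stem, int exit_status) {
    assert(!stem.empty());
    const std::string next = stem + (exit_status == 0 ? ".done" : ".input");
    return std::rename((stem + ".working").c_str(), next.c_str()) == 0;
}
```

A failed rename in this scheme is an expected outcome rather than a fault, which is consistent with the benign IO_ERRORs the manual mentions for process collisions.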