Concepts

To use LEAP effectively, it is necessary to understand the philosophy behind the program. This section develops that philosophy by exploring the concepts of LEAP commands, variables, and objects. Once users understand how commands, variables, and objects are defined and employed within LEAP, they will have learned the principles needed to use the program effectively. This section also addresses the use of external files and libraries with the program.

Commands

A researcher uses LEAP by entering commands that manipulate objects. An object is just a basic building block; some examples of objects are ATOMs, RESIDUEs, UNITs, and PARMSETs. The commands that are supported within LEAP are described throughout the manual and are defined in detail in the "Command Reference" section.

The heart of LEAP is a command-line interface that accepts text commands which direct the program to perform operations on objects. All LEAP commands have one of the following two forms:

command argument1 argument2 argument3 ...
variable = command argument1 argument2 ...

For example:

edit ALA
trypsin = loadPdb trypsin.pdb

Each command is followed by zero or more arguments that are separated by whitespace. Some commands return objects, which are then associated with a variable using an assignment (=) statement. Each command acts upon its arguments, and some commands modify their arguments' contents. The commands themselves are case-insensitive. That is, in the above example, edit could have been entered as Edit, eDiT, or any other combination of upper and lower case characters. Similarly, loadPdb could have been entered in a number of different ways, including loadpdb. In this manual, we frequently use mixed case for commands, both to make the commands easier to tell apart and as a mnemonic device. Thus, while we write createAtom, createResidue, and createUnit in the manual, the user can use any case when entering these commands into the program.

The arguments in the command text may be objects such as NUMBERs, STRINGs, or LISTs, or they may be variables. The following sections discuss variables and objects. It is important that the user be able to differentiate between variables and objects in order to properly use the LEaP command line interface.

Variables

A variable is a handle for accessing an object. A variable name can be any alphanumeric string whose first character is an alphabetic character. (Alphanumeric means that the characters of the name may be letters, numbers, or special symbols such as "*".) The following special symbols should not be used in variable names: dollar sign, comma, period, pound sign, equal sign, space, semicolon, double quote, or the list open and close characters { and }. LEaP commands should not be used as variable names. Variable names are case-sensitive: "ARG" and "arg" are different variables. Variables are associated with objects using an assignment statement, much as in conventional programming languages such as FORTRAN or C.

mole = 6.02E23
MOLE = 6.02E23
myName = "Joe Smith"
listOf7Numbers = { 1.2 2.3 3.4 4.5 6 7 8 }

In the above examples, both mole and MOLE are variable names, whose contents are the same (6.02E23). Despite the fact that both mole and MOLE have the same contents, they are not the same variable. This is due to the fact that variable names are case-sensitive. LEaP maintains a list of variables that are currently defined and this list can be displayed using the list command.

The contents of a variable can be printed using the desc command. Each variable is associated with one object. Variables may also be assigned to the object represented by some other variable. For example, suppose that you wanted to create a RESIDUE called AIB (aminoisobutyric acid) using LEaP. Since this amino acid differs from ALA by only one substituent, you might decide to start with the ALA UNIT loaded from the "lib/all_amino94.lib" file and edit this to add one methyl group to the ALA RESIDUE. To implement this idea, you might enter the command line:

AIB = ALA

After executing the above statement, AIB and ALA will "point to" the same (single) object. This means more than simply saying AIB and ALA have the same contents. At this point, there will only be one UNIT object; both AIB and ALA will represent that one object. If the contents of the object are changed at some later time (such as editing AIB to add a methyl substituent), then the change will be seen in both AIB and ALA. Clearly, this would not be a good idea for this specific example. Instead, in this case one could use the LEaP copy command to create a duplicate of the ALA object. Letting several variables refer to one object is mainly useful for avoiding unnecessary duplication of objects, some of which can be very large. NOTE: equivalencing for residues, effective only when loading PDB files, is also achieved by the addResidueNameMap command, and the use of alternate names ("AIB = ALA") may be discontinued in the future.
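The difference between a second variable for one object and a true duplicate can be modelled outside LEaP. The following Python sketch is only an analogy: the Unit class and its atoms field are hypothetical stand-ins rather than LEaP data structures, the atom names are illustrative, and copy.deepcopy plays the role that the LEaP copy command plays in the text.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for a LEaP UNIT; not LEaP's real data structure."""
    name: str
    atoms: list = field(default_factory=list)


ala = Unit("ALA", ["N", "CA", "CB", "C", "O"])

# Analogous to "AIB = ALA": a second handle to the same single object.
aib_alias = ala

# Analogous to the LEaP copy command: an independent duplicate.
aib_copy = copy.deepcopy(ala)

# Editing through the alias changes the one shared object ...
aib_alias.atoms.append("CB2")

# ... so the change is visible through ala, but not in the duplicate.
alias_edit_visible_in_ala = "CB2" in ala.atoms
copy_unaffected = "CB2" not in aib_copy.atoms
```

In this model, only the duplicate can be edited into a new residue without disturbing the original.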

Objects

The object is the fundamental entity in LEaP. Objects range from the simple objects NUMBERs and STRINGs to the complex objects UNITs, RESIDUEs, and ATOMs. Complex objects have properties that can be altered using the set command, and some complex objects can contain other objects. For example, RESIDUEs are complex objects that can contain ATOMs and have the properties residue name, connect atoms, and residue type.

NUMBERs

NUMBERs are simple objects and they are identical to double precision variables in FORTRAN and double in C.

STRINGs

STRINGs are simple objects that are identical to character arrays in C and similar to character strings in FORTRAN. STRINGs are represented by sequences of characters which may be delimited by double quote characters. STRINGs may also be represented by prefixing a sequence of characters by a dollar sign, where the delimiter is a comma, a space, a semicolon, or a list open or close character ({ or }). If a string does not contain a comma, a space, a semicolon, or a list open or close character, and there is no variable defined that is the same as that string, then it is not necessary to put the string in quotes or prefix it by a dollar sign. Double quote characters within the STRING may be represented by a pair of double quotes. Example strings are:

"Hello there"
"String with a "" (quote) character"
"Strings contain letters and numbers:1231232"
$noQuotes
$343noQuotesAgain
noQuotesOrDollarSign
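
The three ways of writing a STRING can be summarized in a short Python sketch. It is an illustration of the rules stated above, not LEaP's actual tokenizer: the function name unquote is hypothetical, and treating the dollar sign as a prefix that is not part of the resulting STRING is an assumption drawn from the wording of the text.

```python
def unquote(token: str) -> str:
    """Decode one STRING token following the rules described in the text.

    Illustrative only; this is not LEaP's own parsing code.
    """
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # Delimited form: a pair of double quotes inside stands for one quote.
        return token[1:-1].replace('""', '"')
    if token.startswith("$"):
        # Dollar-prefixed form: the prefix is assumed not to be part of the STRING.
        return token[1:]
    # Bare form: usable only when no variable of the same name is defined.
    return token


examples = [
    '"Hello there"',
    '"String with a "" (quote) character"',
    "$noQuotes",
    "noQuotesOrDollarSign",
]
decoded = [unquote(t) for t in examples]
```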

LISTs

LISTs are made up of sequences of other objects delimited by LIST open and close characters. The LIST open character is an open curly bracket ({) and the LIST close character is a close curly bracket (}). LISTs can contain other LISTs and be nested arbitrarily deep. Example LISTs are:

{ 1 2 3 4 }
{ 1.2 "string" $anotherString }
{ 1 2 3 { 1 2 } { 3 4 } }

LISTs are used by many commands to provide a more flexible way of passing data to the commands. The zMatrix command has two arguments, one of which is a LIST of LISTs where each subLIST contains between three and eight objects.
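
To make the nesting rule concrete, the following Python sketch reads a LIST written in the notation above into nested Python lists. It is a minimal model, not LEaP's parser: the names parse_list and convert are hypothetical, every NUMBER is read as a float, and the handling of quoted and dollar-prefixed STRINGs follows the description in the STRINGs section.

```python
import re


def convert(token: str):
    """Turn one non-brace token into a float or a str (illustrative only)."""
    try:
        return float(token)
    except ValueError:
        pass
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1].replace('""', '"')
    if token.startswith("$"):
        return token[1:]
    return token


def parse_list(text: str) -> list:
    """Parse one LIST literal such as '{ 1 2 { 3 4 } }' into nested lists."""
    # Tokens: a brace, a double-quoted string ("" is an embedded quote), or a bare word.
    tokens = re.findall(r'[{}]|"(?:[^"]|"")*"|[^\s{}]+', text)
    stack = [[]]                      # stack[0] collects top-level items
    for token in tokens:
        if token == "{":
            stack.append([])          # open a new, deeper LIST
        elif token == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced close character")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(convert(token))
    if len(stack) != 1:
        raise ValueError("unbalanced open character")
    if len(stack[0]) != 1 or not isinstance(stack[0][0], list):
        raise ValueError("expected exactly one LIST")
    return stack[0][0]


nested = parse_list("{ 1 2 3 { 1 2 } { 3 4 } }")
mixed = parse_list('{ 1.2 "string" $anotherString }')
```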

PARMSETs (Parameter Sets)

PARMSETs are objects that contain bond, angle, torsion, and nonbond parameters for AMBER force field calculations. They are normally loaded from e.g. parm94.dat and frcmod files.

ATOMs

ATOMs are complex objects that do not contain any other objects. The ATOM object is similar to the chemical concept of atoms. Thus, it is a single entity that may be bonded to other ATOMs and it may be used as a building block for creating molecules. ATOMs have many properties that can be changed using the set command. These properties are defined below.

name

This is a case-sensitive STRING property and it is the ATOM's name. The names for all ATOMs in a RESIDUE should be unique. The name has no relevance to molecular mechanics force field parameters; it is chosen arbitrarily as a means to identify ATOMs. Ideally, the name should correspond to the PDB standard, being 3 characters long except for hydrogens, which can have an extra digit as a 4th character.

type

This is a STRING property. It defines the AMBER force field atom type. It is important that the character case match the canonical type definition used in the appropriate "parm.dat" or "frcmod" file. For smooth operation, all atom types need to have element and hybridization defined by the addAtomTypes command. The standard AMBER force field atom types are added by the default "leaprc" file.

charge

The charge property is a NUMBER that represents the ATOM's electrostatic point charge to be used in a molecular mechanics force field.

element

The atomic element provides a simpler description of the atom than the type, and is used only for LEAP's internal purposes (typically when force field information is not available). The element names correspond to standard nomenclature; the character "?" is used for special cases.

position

This property is a LIST of NUMBERs. The LIST must contain three values: the (X, Y, Z) Cartesian coordinates of the ATOM.

Both the AMBER and SPASMS software packages support a type of calculation known as Free Energy Perturbation. During Free Energy Perturbation, one chemical species is slowly transformed into another and the energy change associated with the transformation is measured. In order to perform a Free Energy Perturbation, the properties of the perturbed ATOMs must also be set. These properties correspond to the ATOM properties described above, but the values represent the final state of the perturbed species, as described below. If a Free Energy Perturbation calculation is not to be performed, the following properties can be left as null. They are only used when the "PERTURB" property's value is "true" for that atom, when doing a saveAmberParmPert to save a perturbation topology file. (Note that mass is never perturbed.)

pertName

This property can either be null or a case-sensitive STRING. The property is a unique identifier for an ATOM in its final state during a Free Energy Perturbation calculation. If it is null then the perturbed ATOM will inherit the unperturbed name. The pertName has no effect on calculations and is mainly useful as a reminder of what was intended.

pertType

This property can either be null or a STRING. If the value is null then the ATOM type will not be perturbed in a perturbation calculation. If the pertType is a STRING, the STRING is the AMBER force field atom type of the perturbed ATOM. This property is case-sensitive.

pertCharge

The pertCharge property is a NUMBER. It represents the final electrostatic point charge on an ATOM during a Free Energy Perturbation.

RESIDUEs

RESIDUEs are complex objects that contain ATOMs. RESIDUEs are collections of ATOMs that are either molecules (e.g. formaldehyde) or are linked together to form molecules (e.g. amino acid monomers). RESIDUEs have several properties that can be changed using the set command.

One property of RESIDUEs is connection ATOMs. Connection ATOMs are ATOMs that are used to make linkages between RESIDUEs. For example, in order to create a protein, the N-terminus of one amino acid residue must be linked to the C-terminus of the next residue. This linkage can be made within LEaP by setting the N ATOM to be a connection ATOM at the N-terminus and the C ATOM to be a connection ATOM at the C-terminus. As another example, two CYX amino acid residues may form a disulfide bridge by crosslinking a connection atom on each residue.

When residues are read from AMBER PREP input files, LEAP creates a RESIDUE object for each residue read and defines the first main chain atom of the AMBER residue to be the connect0 ATOM of the RESIDUE. The last main chain atom of the AMBER residue becomes the connect1 ATOM of the RESIDUE. Any other atoms that would be used for cross links must be explicitly defined as connect ATOMs using the set command. The scripts in "leap/lib/" show how this is done for the standard force field residues.

There are several properties of RESIDUEs that can be modified using the set command. The properties are described below:

connect0

This defines an ATOM that is used in making links to other RESIDUEs. In UNITs containing single RESIDUEs, the RESIDUE's connect0 ATOM is usually defined as the UNIT's head ATOM. (This is how the standard library UNITs are defined.) For amino acids, the convention is to make the N-terminal nitrogen the connect0 ATOM.

connect1

This defines an ATOM that is used in making links to other RESIDUEs. In UNITs containing single RESIDUEs, the RESIDUE's connect1 ATOM is usually defined as the UNIT's tail ATOM. (This is done in the standard library UNITs.) For amino acids, the convention is to make the C-terminal carbonyl carbon (the C ATOM) the connect1 ATOM.

connect2

This is an ATOM property which defines an ATOM that can be used in making links to other RESIDUEs. In amino acids, the convention is that this is the ATOM to which disulfide bridges are made.

connect3

This is an ATOM property which defines an ATOM that can be used in making links to other RESIDUEs.

connect4

This is an ATOM property which defines an ATOM that can be used in making links to other RESIDUEs.

connect5

This is an ATOM property which defines an ATOM that can be used in making links to other RESIDUEs.

restype

This property is a STRING that represents the type of the RESIDUE. Currently, it can have one of the following values: "undefined", "solvent", "protein", "nucleic", or "saccharide". Some of the LEAP commands behave in different ways depending on the type of a residue. For example, the solvate commands require that the solvent residues be of type "solvent". It is important that the proper character case be used when defining this property.

name

The RESIDUE name is a STRING property. It is important that the proper character case be used when defining this property.

UNITs

UNITs are the most complex objects within LEAP, and the most important. UNITs, when paired with one or more PARMSETs, contain all of the information required to perform a calculation using AMBER or SPASMS. UNITs have the following properties which can be changed using the set command:

head
tail

These define the ATOMs within the UNIT that are connected when UNITs are joined together using the sequence command or when UNITs are joined together with the PDB or PREP file reading commands. The tail ATOM of one UNIT is connected to the head ATOM of the next UNIT in any sequence. (Note: a "TER card" in a PDB file causes a new UNIT to be started.)

box

This property can either be null, a NUMBER, or a LIST. The property defines the bounding box of the UNIT. If it is defined as null then no bounding box is defined. If the value is a single NUMBER then the bounding box will be defined to be a cube with each side being NUMBER of angstroms across. If the value is a LIST then it must be a LIST containing three numbers, the lengths of the three sides of the bounding box.

cap

This property can either be null or a LIST. The property defines the solvent cap of the UNIT. If it is defined as null then no solvent cap is defined. If the value is a LIST then it must contain four NUMBERs: the first three define the Cartesian coordinates (X, Y, Z) of the origin of the solvent cap in angstroms, and the fourth defines the radius of the solvent cap in angstroms.

Examples of setting the above properties are:

set dipeptide head dipeptide.1.N
set dipeptide box { 5.0 10.0 15.0 }
set dipeptide cap { 15.0 10.0 5.0 8.0 }

The first example makes the amide nitrogen in the first RESIDUE within "dipeptide" the head ATOM. The second example places a rectangular bounding box around the origin with the (X, Y, Z) dimensions of ( 5.0, 10.0, 15.0 ) in angstroms. The third example defines a solvent cap centered at ( 15.0, 10.0, 5.0 ) angstroms with a radius of 8.0 Å. Note: the "set cap" command does not actually solvate, it just sets an attribute. See the solvateCap command for a more practical case.
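
The three forms accepted by the box property (null, a single NUMBER, or a LIST of three NUMBERs) can be summarized as a small Python sketch. This models only the rule stated in the text; the function name box_dimensions is hypothetical, and Python's None stands in for null.

```python
def box_dimensions(box):
    """Return the (x, y, z) side lengths in angstroms, or None if no box is set.

    Illustrative model of the box property; not part of LEaP.
    """
    if box is None:
        # null: no bounding box is defined.
        return None
    if isinstance(box, (int, float)):
        # A single NUMBER: a cube with that side length.
        side = float(box)
        return (side, side, side)
    # A LIST: must hold exactly three side lengths.
    sides = tuple(float(value) for value in box)
    if len(sides) != 3:
        raise ValueError("a box LIST must contain three numbers")
    return sides


cube = box_dimensions(10.0)
rectangular = box_dimensions([5.0, 10.0, 15.0])
```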

UNITs are complex objects that can contain RESIDUEs and ATOMs. UNITs can be created using the createUnit command and modified using the set command. The contents of a UNIT can be modified using the add and remove commands.

UNITs also contain information about restraints. Users are encouraged to avoid applying such restraints in LEAP, and instead to use the more robust ones available in the simulation programs. Restraints are supported in LEAP only for backward compatibility.

Restraints can be modified using the LEAP commands: addBondRestraint, addAngleRestraint, addTorsionRestraint, and removeRestraint. Restraints are additional energy terms that can be placed between two, three or four ATOMs. There are three kinds of restraints: bond, angle, and torsion restraints. Bond restraints can be created between any two ATOMs, and they are defined by the two ATOMs, an equilibrium distance, and a force constant. Angle restraints are defined by three ATOMs, an equilibrium angle, and a force constant. Torsion restraints are defined by four ATOMs, an equilibrium torsion angle, a force constant and a multiplicity. See Appendix C, Parameter Development, for more details.

Complex objects and accessing subobjects

UNITs and RESIDUEs are complex objects. Among other things, this means that they can contain other objects. There is a loose hierarchy of complex objects and what they are allowed to contain. The hierarchy is as follows: UNITs contain RESIDUEs, and RESIDUEs contain ATOMs.

The hierarchy is loose because it does not forbid UNITs from containing ATOMs directly. However, the convention that has evolved within LEAP is to have UNITs directly contain RESIDUEs which directly contain ATOMs.

Objects that are contained within other objects can be accessed using dot "." notation. An example would be a UNIT which describes a dipeptide ALA-PHE. The UNIT contains two RESIDUEs, each of which contains several ATOMs. If the UNIT is referenced (named) by the variable dipeptide, then the RESIDUE named ALA can be accessed in two ways. The user may type one of the following commands to display the contents of the RESIDUE:

desc dipeptide.ALA
desc dipeptide.1

The first translates to "some RESIDUE named ALA within the UNIT named dipeptide". The second form translates as "the RESIDUE with sequence number 1 within the UNIT named dipeptide". The second form is more useful because every subobject within an object is guaranteed to have a unique sequence number. If the first form is used and there is more than one RESIDUE with the name ALA, then an arbitrary residue with the name ALA is returned. To access ATOMs within RESIDUEs, the notation to use is as follows:

desc dipeptide.1.CA
desc dipeptide.1.3

Assuming that the ATOM with the name CA has a sequence number 3, then both of the above commands will print a description of the α-carbon of RESIDUE dipeptide.ALA or dipeptide.1. The reader should keep in mind that dipeptide.1.CA is the ATOM, an object, contained within the RESIDUE named ALA within the variable dipeptide. This means that dipeptide.1.CA can be used as an argument to any command that requires an ATOM as an argument. However, dipeptide.1.CA is not a variable and cannot be used on the left-hand side of an assignment statement.
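
The lookup rules for dot notation can be modelled in a few lines of Python. This is a sketch under stated assumptions, not LEaP's implementation: the Node class and the resolve function are hypothetical, sequence numbers are assumed to be 1-based positions within the containing object, and the atom names other than CA are illustrative. A lookup by name returns the first match here, whereas the text only promises an arbitrary one.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical container standing in for a UNIT, RESIDUE, or ATOM."""
    name: str
    children: list = field(default_factory=list)

    def child(self, key: str) -> "Node":
        if key.isdigit():
            # Lookup by sequence number (assumed 1-based and unique).
            index = int(key)
            if not 1 <= index <= len(self.children):
                raise KeyError(key)
            return self.children[index - 1]
        # Lookup by name: the first subobject with a matching name.
        for candidate in self.children:
            if candidate.name == key:
                return candidate
        raise KeyError(key)


def resolve(variables: dict, path: str) -> Node:
    """Resolve a path such as 'dipeptide.1.CA' starting from a variable name."""
    head, *rest = path.split(".")
    obj = variables[head]
    for key in rest:
        obj = obj.child(key)
    return obj


ala = Node("ALA", [Node("N"), Node("H"), Node("CA")])   # CA has sequence number 3
phe = Node("PHE", [Node("N"), Node("H"), Node("CA")])
variables = {"dipeptide": Node("dipeptide", [ala, phe])}

by_name = resolve(variables, "dipeptide.1.CA")
by_number = resolve(variables, "dipeptide.1.3")
```

In this model both paths reach the same ATOM object, which mirrors the statement above that either form may be passed to a command expecting an ATOM.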

In order to further illustrate the concepts of UNITs, RESIDUEs, and ATOMs, we can examine the log file from a LEAP session. Part of this log file is printed below.

> loadOff all_amino94.lib
> desc GLY
UNIT name: GLY
Head atom: .R.A
Tail atom: .R.A
Contents:
R
> desc GLY.1
RESIDUE name: GLY
RESIDUE sequence number: 1
RESIDUE PDB sequence number: 0
Type: protein
Connection atoms:
Connect atom 0: A
Connect atom 1: A
Contents:
A
A
A
A
A
A
A
> desc GLY.1.3
ATOM
Normal Perturbed
Name: CA CA
Type: CT CT
Charge: -0.025 0.000
Element: C (not affected by pert)
Atom position: 3.970048, 2.845795, 0.000000
Atom velocity: 0.000000, 0.000000, 0.000000
Bonded to .R.A by a single bond.
Bonded to .R.A by a single bond.
Bonded to .R.A by a single bond.
Bonded to .R.A by a single bond.

In this example, command lines are prefaced by ">" and the LEAP program output has no such character preface. The first command,

> loadOff all_amino94.lib

loads an OFF library containing amino acids. The second command,

> desc GLY

allows us to examine the contents of the amino acid UNIT, GLY. The UNIT contains one RESIDUE which is named GLY and this RESIDUE is the first residue in the UNIT (R). In fact, it is also the only RESIDUE in the UNIT. The head and tail ATOMs of the UNIT are defined as the N- and C-termini, respectively. The box and cap UNIT properties are defined as "null". If these latter two properties had values other than "null", the information would have been included in the output of the desc command.

The next command line in the session,

> desc GLY.1

enables us to examine the first residue in the GLY UNIT. This RESIDUE is named GLY and its residue type is that of a protein. The connect0 ATOM (N) is the same as the UNIT's head ATOM and the connect1 ATOM (C) is the same as the UNIT's tail ATOM. There are seven ATOM objects contained within the RESIDUE GLY in the UNIT GLY.

Finally, let us look at one of the ATOMs in the GLY RESIDUE.

> desc GLY.1.3

The ATOM has a name (CA) that is unique among the atoms of the residue. The AMBER force field atom type for CA is CT. The type of element, atomic point charge, and Cartesian coordinates for this ATOM have been defined along with its bonding attributes. Other force field parameters, such as the van der Waals well depth, have been included in "lib/parm94.dat".


Updated on January 5, 2000. Comments to case@scripps.edu