This section works through in some detail setting up a protein simulation
in AMBER. The example is for plastocyanin in water, and contains a number
of things that experienced AMBER users know how to do, but which may be
far from obvious for others. In particular, there are a number of items that
go beyond a simple protein. The examples assume you have an environment
variable $AMBERHOME that points to the top of your AMBER distribution.
The files here can be found in $AMBERHOME/examples/pcy.
This will be a lot of work, but it's infinitely easier now in AMBER than it used to be. We will also use this example to show how to set up some constraints, such as might be found in a NMR refinement, and will illustrate how to carry out simulated annealing optimizations.
For plastocyanin, I will define two new types of residues: HIC, which will be a histidine coupled to a copper ion, and which will take the place of HIS 37 in the "real" sequence, and MEM, which is a modified methionine where the sulfur atom is of type "SM" rather than type "S". "SM" is a type I made up, and will use to create special force field parameters for MET 94, which is bonded to the copper ion with a fairly long bond.
Here are the input files for these two residues:
+-------------------------------------------------------------------------+ | hicu.in | +-------------------------------------------------------------------------+ | 0 0 2 | | | |HISTIDINE PLUS | |hicu.db94 | | HIC INT 1 | | CORR OMIT DU BEG | | 0.00000 | | 1 DUMM DU M 0 -1 -2 0.000 0.000 0.000 0.00000 | | 2 DUMM DU M 1 0 -1 1.449 0.000 0.000 0.00000 | | 3 DUMM DU M 2 1 0 1.522 111.100 0.000 0.00000 | | 4 N N M 3 2 1 1.335 116.600 180.000 -0.34790 | | 5 H H E 4 3 2 1.010 119.800 0.000 0.27470 | | 6 CA CT M 4 3 2 1.449 121.900 180.000 -0.13540 | | 7 HA H1 E 6 4 3 1.090 109.500 300.000 0.12120 | | 8 CB CT 3 6 4 3 1.525 111.100 60.000 -0.04140 | | 9 HB2 HC E 8 6 4 1.090 109.500 300.000 0.08100 | | 10 HB3 HC E 8 6 4 1.090 109.500 60.000 0.08100 | | 11 CG CC S 8 6 4 1.510 115.000 180.000 -0.00120 | | 12 ND1 NB B 11 8 6 1.390 122.000 180.000 -0.15130 | | 13 CU CU E 12 11 8 2.050 126.000 0.000 0.38660 | | 14 CE1 CR B 12 11 8 1.320 108.000 180.000 -0.01700 | | 15 HE1 H5 E 14 12 11 1.090 120.000 180.000 0.26810 | | 16 NE2 NA B 14 12 11 1.310 109.000 0.000 -0.17180 | | 17 HE2 H E 16 14 12 1.010 125.000 180.000 0.39110 | | 18 CD2 CW S 16 14 12 1.360 110.000 0.000 -0.11410 | | 19 HD2 H4 E 18 16 14 1.090 120.000 180.000 0.23170 | | 20 C C M 6 4 3 1.522 111.100 180.000 0.73410 | | 21 O O E 20 6 4 1.229 120.500 0.000 -0.58940 | | | |LOOP | | CG CD2 | | | |IMPROPER | | -M CA N H | | CA +M C O | | CE1 CD2 NE2 HE2 | | CG NE2 CD2 HD2 | | ND1 NE2 CE1 HE1 | | ND1 CD2 CG CB | | | |DONE | |STOP | +-------------------------------------------------------------------------+
+-------------------------------------------------------------------------+ | mem.in | +-------------------------------------------------------------------------+ | 0 0 2 | | | |METHIONINE, with SM atom type for the sulfur | |mem.db94 | | MEM INT 1 | | CORR OMIT DU BEG | | 0.00000 | | 1 DUMM DU M 0 -1 -2 0.000 0.000 0.000 0.00000 | | 2 DUMM DU M 1 0 -1 1.449 0.000 0.000 0.00000 | | 3 DUMM DU M 2 1 0 1.522 111.100 0.000 0.00000 | | 4 N N M 3 2 1 1.335 116.600 180.000 -0.41570 | | 5 H H E 4 3 2 1.010 119.800 0.000 0.27190 | | 6 CA CT M 4 3 2 1.449 121.900 180.000 -0.02370 | | 7 HA H1 E 6 4 3 1.090 109.500 300.000 0.08800 | | 8 CB CT 3 6 4 3 1.525 111.100 60.000 0.03420 | | 9 HB2 HC E 8 6 4 1.090 109.500 300.000 0.02410 | | 10 HB3 HC E 8 6 4 1.090 109.500 60.000 0.02410 | | 11 CG CT 3 8 6 4 1.525 109.470 180.000 0.00180 | | 12 HG2 H1 E 11 8 6 1.090 109.500 300.000 0.04400 | | 13 HG3 H1 E 11 8 6 1.090 109.500 60.000 0.04400 | | 14 SD SM S 11 8 6 1.810 110.000 180.000 -0.27370 | | 15 CE CT 3 14 11 8 1.780 100.000 180.000 -0.05360 | | 16 HE1 H1 E 15 14 11 1.090 109.500 60.000 0.06840 | | 17 HE2 H1 E 15 14 11 1.090 109.500 180.000 0.06840 | | 18 HE3 H1 E 15 14 11 1.090 109.500 300.000 0.06840 | | 19 C C M 6 4 3 1.522 111.100 180.000 0.59730 | | 20 O O E 19 6 4 1.229 120.500 0.000 -0.56790 | | | |IMPROPER | | -M CA N H | | CA +M C O | | | |DONE | |STOP | +-------------------------------------------------------------------------+
I made mem.in just by copying the relevant portions of the methionine entry from all_amino94.in in the database directory, changing the atom type of the sulfur, and adding appropriate first and last lines. Similar things were done for the histidine residue (starting from the library's HIP residue), except that I added a copper atom bonded to ND1. It is a good idea to read these files alongside the PREP documentation.
Several small changes need to be made to the input PDB file to make it work with Amber:
(An alternative is to simply strip out the "crystallographic" waters and not use them at all. This is most appropriate if you are planning an MD or free energy simulation that will go through an extensive equilibration period before the "real" simulation begins. One goal of equilibration is to minimize dependence upon the starting conditions, and certainly the individual water molecules will move around a lot during any decent equilibration. At that point, the fact that you went to some trouble to originally place the waters in some nice positions may be irrelevant. Or maybe not; opinions differ on this matter, which is why we try to provide flexible tools in Amber. For this tutorial, we will not use the "crystallographic" waters in our starting structure.)
One final change involves residue names. Brookhaven files do no distinguish between cysteine residue that are involved in bonds to other things (and hence which have no proton on the sulfur atom) and "free" residues that do have such a proton. Molecular mechanics studies need to make this sort of distinction. Since residue 84 in plastocyanin has the sulfur atom bonded to the copper ion, I changed its name from CYS to CYX. Similar comments apply to histidines: molecular mechanics studies need to know (or guess, or assume) whether the histidine has a proton bonded at the `delta` position (HID), at the `epsilon` position (HIE), or at both (HIP). This is pretty easy for plastocyanin, since the two histidine side chains are both bound to copper through the ND1 nitrogen. So we initially change both HIS residues to HIE, in order to tell AMBER to put the protons on the NE2 nitrogen. (Note that in many other proteins, it will often be reasonable to assign surface-accessible histidines to be protonated, residue name HIP.)
( protonate -k -d PROTON_INFO < 1plc.nowat.pdb
> 1plc.nowat.H.pdb) >& protonate.outprotonate -k -d PROTON_INFO < 1plc.nowat.pdb
> 1plc.nowat.H.pdb 2> protonate.out
We first need to make modifications to the AMBER force field to recognize the copper atom and the modified methionine residue. The best way to do this is not to edit the main force field file, but to create a frcmod file with changes specific to each project. Here is the one I created for plastocyanin:
+-----------------------------------------------------------+ | frcmod.pcy | +-----------------------------------------------------------+ |# modifications to force field for poplar plastocyanin | |MASS | |SM 32.06 | |CU 65.36 | | | |BOND | |NB-CU 70.000 2.05000 kludge by JRS | |CU-S 70.000 2.10000 kludge by JRS | |CU-SM 70.000 2.90000 for pcy | |CT-SM 222.000 1.81000 met(aa) | | | |ANGLE | |CU-NB-CV 50.000 126.700 JRS estimate | |CU-NB-CR 50.000 126.700 JRS estimate | |CU-NB-CP 50.000 126.700 JRS estimate | |CU-NB-CC 50.000 126.700 JRS estimate | |CU-SM-CT 50.000 120.000 JRS estimate | |CU-S -CT 50.000 120.000 JRS estimate | |CU-S -C2 50.000 120.000 JRS estimate | |CU-S -C3 50.000 120.000 JRS estimate | |NB-CU-NB 10.000 110.000 dac estimate | |NB-CU-SM 10.000 110.000 dac estimate | |NB-CU-S 10.000 110.000 dac estimate | |SM-CU-S 10.000 110.000 dac estimate | |CU-SM-CT 50.000 120.000 JRS estimate | |CT-CT-SM 50.000 114.700 met(aa) | |HC-CT-SM 35.000 109.500 | |H1-CT-SM 35.000 109.500 | |CT-SM-CT 62.000 98.900 MET(OL) | | | |DIHE | |X -NB-CU-X 1 0.000 180.000 3.000 | |X -CU-SM-X 1 0.000 180.000 3.000 | |X -CU-S -X 1 0.000 180.000 3.000 | |X -CT-SM-X 3 1.000 0.000 3.000 | | | |NONBON | | CU 2.20 0.200 | | SM 2.00 0.200 | | | +-----------------------------------------------------------+
Creating a frcmod file is a bit of an art, since one is often faced with unknown parameters (such as force constants from copper to its ligands in the above example). In this tutorial, I am just going over the mechanics of running AMBER, and make no claims about the validity or appropriateness of the above parameters. There is a big literature on parameter estimation, and users are encouraged to consult this when faced with unusual chemical environments. A starting point is Appendix C: Parameter Development, in this manual.
Now we are ready to run LEaP. Let's start by setting up an input file:
+-----------------------------------------------------------+ | leap.in | +-----------------------------------------------------------+ |frcmod = loadamberparams frcmod.pcy | |loadAmberPrep mem.in | |set MEM restype protein | |loadAmberPrep hicu.in | |set HIC restype protein | |plc = loadPdb 1plc.protein.pdb | |bond plc.87.ND1 plc.37.CU | |bond plc.84.SG plc.37.CU | |bond plc.92.SD plc.37.CU | |saveAmberParm plc prmtop.nowat prmcrd.nowat | +-----------------------------------------------------------+
By default, LEaP will read in the standard AMBER 94 force field
libraries (Cornell et al. 95). The first line in the file above
loads the extra force field parameters from the frcmod.pcy file
described above. The next four lines load the files describing the
modified residues HIC and MEM and set their residue types. Then a molecule,
named "plc" is created by reading in the appropriate pdb file. (LEaP
will often complain at this point if it finds something it doesn't
understand or doesn't like; a typical task would be to modify the pdb file
and try again.) Next, the three "cross-links" that connect residues 84, 87
and 92 to the copper atom are added via the bond command. Finally,
the saveAmberParm command is used to create the required output files.
Details of all of these commands can be found in the LEaP manual.
The file above is read into LEaP as follows; ">" is the prompt that LEaP provides:
tleap
> source leap.in
> quit
We start with a very simple minimization with restraints to keep the protein from moving too far. In the examples below, do not include the comments in parenthesis in your actual files.
+----------------------------------------------------------------+ | min1.in | +----------------------------------------------------------------+ | | |Minimization with Cartesian restraints | | &cntrl | | imin=1, maxcyc=200, (invoke minimization) | | scee=1.2, igb=3, cut=12.0, (force field options) | | ntpr=5, (print frequency) | | ntr=1, ntb=0, (turn on cartesian restraints) | | &end | |Group input for restrained atoms | | 1.0 (force constant for restraint) | |RES 1 99 (all atoms in residues 1-99) | |END | |END | | | +----------------------------------------------------------------+
To run this file, use the following command:
sander -i min1.in -c prmcrd.nowat -p prmtop.nowat -ref prmcrd.nowat
-o min1.out -r min1.xyz
At this point, one often would want to carry out a more robust optimization, using a dynamical simulated annealing protocol. One also might add some sort of constraints at this point. In this example, we will illustrate how NMR-derived constraints might be incorporated; constraints from other sources of information would be handled in a similar fashion.
+---------------------------------------------------------------------------+ | simul_anneal.in | +---------------------------------------------------------------------------+ || | 15ps simulated annealing protocol | | &cntrl | | nstlim=15000, ntt=1, (time limit, temp. control) | | scee=1.2, igb=3, ntb=0, (scee holds the 1-4 scale factor) | | ntpr=500, pencut=0.1, (control of printout) | | ipnlty=1, nmropt=1, (NMR penalty function options) | | vlimit=10, (prevent bad temp. jumps) | | &end | |# | |# Simple simulated annealing algorithm: | |# | |# from steps 0 to 1000: raise target temperature 10->1200K | |# from steps 1000 to 3000: leave at 1200K | |# from steps 3000 to 15000: re-cool to low temperatures | |# | | &wt type='TEMP0', istep1=0,istep2=1000,value1=10., | | value2=1200., &end | | &wt type='TEMP0', istep1=1001, istep2=3000, value1=1200., | | value2=1200.0, &end | | &wt type='TEMP0', istep1=3001, istep2=15000, value1=0., | | value2=0.0, &end | |# | |# Strength of temperature coupling: | |# | |# steps 0 to 3000: tight coupling for heating and equilibration | |# steps 3000 to 11000: slow cooling phase | |# steps 11000 to 13000: somewhat faster cooling | |# steps 13000 to 15000: fast cooling, like a minimization | |# | | &wt type='TAUTP', istep1=0,istep2=3000,value1=0.2, | | value2=0.2, &end | | &wt type='TAUTP', istep1=3001,istep2=11000,value1=4.0, | | value2=2.0, &end | | &wt type='TAUTP', istep1=11001,istep2=13000,value1=1.0, | | value2=1.0, &end | | &wt type='TAUTP', istep1=13001,istep2=14000,value1=0.5, | | value2=0.5, &end | | &wt type='TAUTP', istep1=14001,istep2=15000,value1=0.05, | | value2=0.05, &end | | | | (continued on next page) | +---------------------------------------------------------------------------+
+---------------------------------------------------------------+ | simul_anneal.in (continued) | +---------------------------------------------------------------+ |# | |# "Ramp up" the restraints over the first 3000 steps: | |# | | &wt type='REST', istep1=0,istep2=3000,value1=0.1, | | value2=1.0, &end | | &wt type='REST', istep1=3001,istep2=15000,value1=1.0, | | value2=1.0, &end | | | | &wt type='END' &end | |LISTOUT=POUT (get restraint violation list) | |DISANG=RST.f (file containing NMR restraints) | +---------------------------------------------------------------+
The restraint file referenced above (RST.f) would ordinarily hold hundreds to thousands of constraints based on experimental information. The constraints would keep the protein near its native conformation even though we have heated to a very high temperature. Examples of such constraint files are given in the SANDER section of the Users' Manual. Since a refinement like this can take a long time to run, we have not actually included files for it in the tutorial directory: the example given above can serve as a guide for real calculations that you might want to carry out.
Next, we turn from "vacuum" simulations to consider how to set up and carry out molecular dynamics simulations in a box of solvent water, using periodic boundary conditions. This example is typical of many molecular dynamics simulations begin carried out with AMBER.
This requires just a modified version of the LEaP input file we used
above, calling the solvateBox command as well as addIons:
+-----------------------------------------------------------+ | leap.in | +-----------------------------------------------------------+ |frcmod = loadamberparams frcmod.pcy | |loadAmberPrep mem.in | |set MEM restype protein | |loadAmberPrep hicu.in | |set HIC restype protein | |plc = loadPdb 1plc.protein.pdb | |bond plc.87.ND1 plc.37.CU | |bond plc.84.SG plc.37.CU | |bond plc.92.SD plc.37.CU | |addIons plc Na+ 0 | |solvateBox plc WATBOX216 10.0 0.8 | |saveAmberParm plc prmtop.wat prmcrd.wat | +-----------------------------------------------------------+
We will first run a simple minimization in water to remove initial bad contacts:
+--------------------------------------------------------+ | min2.in | +--------------------------------------------------------+ | | | minimization run | | &cntrl | | imin=1, | | ntb=1, ntc=2, | | maxcyc=500, ntpr=25, | | &end | | | +--------------------------------------------------------+
Here is the command to run:
sander -O -i min2.in -c prmcrd.wat -p prmtop.wat
-o min2.out -r min2.xyz &(This is run in background, since it will take a few minutes to run; Next, try out a short molecular dynamics run; actual "production" computations would include many, many more MD steps than is given here.
+----------------------------------------------------------------+ | md1.in | +----------------------------------------------------------------+ | | |molecular dynamics run | | &cntrl | | imin=0, irest=0, ntx=1, tempi=0., (no restart MD) | | ntt=1, temp0=300.0, tautp=0.2, (temperature control) | | ntp=2, taup=0.2, (pressure control) | | ntb=2, ntc=2, ntf=2, (SHAKE, periodic bc.) | | nstlim=500, (run for 0.0005 nsec) | | ntwe=100, ntwx=100, ntpr=25, (output frequency) | | &end | | | +----------------------------------------------------------------+
Here we will start from the output of the minimization step to carry out the dynamics:
sander -O -i md1.in -c min2.xyz -p prmtop.wat
-o md1.out -r md1.xyz -x md1.crd -e md1.ene &The output will include the md1.out file giving information about the progress of the trajectory, along with md1.crd and md1.ene files that contain the coordinates and energy information at every 100th step, respectively. Many of the analysis programs in AMBER can use these sorts of files.