Example 2. A more complicated protein example.


This section works through in some detail setting up a protein simulation in AMBER. The example is for plastocyanin in water, and contains a number of things that experienced AMBER users know how to do, but which may be far from obvious for others. In particular, there are a number of items that go beyond a simple protein. The examples assume you have an environment variable $AMBERHOME that points to the top of your AMBER distribution. The files here can be found in $AMBERHOME/examples/pcy.

  1. Plastocyanin contains a metal ion bound to four amino acids, and I also want to modify a methionine residue that is bound to the copper in such a way that it has a different type of sulfur than is found in the standard database.
  2. The Brookhaven crystallographic file (1PLC) contains crystallographic waters, which I might want to keep. Only the oxygen positions are provided, so I will need to try to figure out where to put protons.
  3. Somewhat unusually, this PDB file has proton positions for the protein, which I would like to keep. However, Brookhaven uses proton names that are different than what NMR spectroscopists use, and I would like to be able to use the latter to make easy contact with NMR results.
  4. Using the most probable ionization states of the protein (at neutral pH) results in a protein with a net charge of -8, so I would like to include mobile counterions in the solution to create an overall neutral system.

This will be a lot of work, but it's infinitely easier now in AMBER than it used to be. We will also use this example to show how to set up some constraints, such as might be found in a NMR refinement, and will illustrate how to carry out simulated annealing optimizations.

Make database file for the modified residues.

For plastocyanin, I will define two new types of residues: HIC, which will be a histidine coupled to a copper ion, and which will take the place of HIS 37 in the "real" sequence, and MEM, which is a modified methionine where the sulfur atom is of type "SM" rather than type "S". "SM" is a type I made up, and will use to create special force field parameters for MET 94, which is bonded to the copper ion with a fairly long bond.

Here are the input files for these two residues:

+-------------------------------------------------------------------------+
|				 hicu.in				  |
+-------------------------------------------------------------------------+
|    0	  0    2							  |
|									  |
|HISTIDINE PLUS								  |
|hicu.db94								  |
| HIC  INT     1							  |
| CORR OMIT DU	 BEG							  |
|   0.00000								  |
|   1  DUMM  DU	   M	0  -1  -2     0.000	0.000	  0.000	  0.00000 |
|   2  DUMM  DU	   M	1   0  -1     1.449	0.000	  0.000	  0.00000 |
|   3  DUMM  DU	   M	2   1	0     1.522   111.100	  0.000	  0.00000 |
|   4  N     N	   M	3   2	1     1.335   116.600	180.000	 -0.34790 |
|   5  H     H	   E	4   3	2     1.010   119.800	  0.000	  0.27470 |
|   6  CA    CT	   M	4   3	2     1.449   121.900	180.000	 -0.13540 |
|   7  HA    H1	   E	6   4	3     1.090   109.500	300.000	  0.12120 |
|   8  CB    CT	   3	6   4	3     1.525   111.100	 60.000	 -0.04140 |
|   9  HB2   HC	   E	8   6	4     1.090   109.500	300.000	  0.08100 |
|  10  HB3   HC	   E	8   6	4     1.090   109.500	 60.000	  0.08100 |
|  11  CG    CC	   S	8   6	4     1.510   115.000	180.000	 -0.00120 |
|  12  ND1   NB	   B   11   8	6     1.390   122.000	180.000	 -0.15130 |
|  13  CU    CU	   E   12  11	8     2.050   126.000	  0.000	  0.38660 |
|  14  CE1   CR	   B   12  11	8     1.320   108.000	180.000	 -0.01700 |
|  15  HE1   H5	   E   14  12  11     1.090   120.000	180.000	  0.26810 |
|  16  NE2   NA	   B   14  12  11     1.310   109.000	  0.000	 -0.17180 |
|  17  HE2   H	   E   16  14  12     1.010   125.000	180.000	  0.39110 |
|  18  CD2   CW	   S   16  14  12     1.360   110.000	  0.000	 -0.11410 |
|  19  HD2   H4	   E   18  16  14     1.090   120.000	180.000	  0.23170 |
|  20  C     C	   M	6   4	3     1.522   111.100	180.000	  0.73410 |
|  21  O     O	   E   20   6	4     1.229   120.500	  0.000	 -0.58940 |
|									  |
|LOOP									  |
| CG   CD2								  |
|									  |
|IMPROPER								  |
| -M   CA   N	 H							  |
| CA   +M   C	 O							  |
| CE1  CD2  NE2	 HE2							  |
| CG   NE2  CD2	 HD2							  |
| ND1  NE2  CE1	 HE1							  |
| ND1  CD2  CG	 CB							  |
|									  |
|DONE									  |
|STOP									  |
+-------------------------------------------------------------------------+

+-------------------------------------------------------------------------+
|				  mem.in				  |
+-------------------------------------------------------------------------+
|    0	  0    2							  |
|									  |
|METHIONINE, with SM atom type for the sulfur				  |
|mem.db94								  |
| MEM  INT     1							  |
| CORR OMIT DU	 BEG							  |
|   0.00000								  |
|   1  DUMM  DU	   M	0  -1  -2     0.000	0.000	  0.000	  0.00000 |
|   2  DUMM  DU	   M	1   0  -1     1.449	0.000	  0.000	  0.00000 |
|   3  DUMM  DU	   M	2   1	0     1.522   111.100	  0.000	  0.00000 |
|   4  N     N	   M	3   2	1     1.335   116.600	180.000	 -0.41570 |
|   5  H     H	   E	4   3	2     1.010   119.800	  0.000	  0.27190 |
|   6  CA    CT	   M	4   3	2     1.449   121.900	180.000	 -0.02370 |
|   7  HA    H1	   E	6   4	3     1.090   109.500	300.000	  0.08800 |
|   8  CB    CT	   3	6   4	3     1.525   111.100	 60.000	  0.03420 |
|   9  HB2   HC	   E	8   6	4     1.090   109.500	300.000	  0.02410 |
|  10  HB3   HC	   E	8   6	4     1.090   109.500	 60.000	  0.02410 |
|  11  CG    CT	   3	8   6	4     1.525   109.470	180.000	  0.00180 |
|  12  HG2   H1	   E   11   8	6     1.090   109.500	300.000	  0.04400 |
|  13  HG3   H1	   E   11   8	6     1.090   109.500	 60.000	  0.04400 |
|  14  SD    SM	   S   11   8	6     1.810   110.000	180.000	 -0.27370 |
|  15  CE    CT	   3   14  11	8     1.780   100.000	180.000	 -0.05360 |
|  16  HE1   H1	   E   15  14  11     1.090   109.500	 60.000	  0.06840 |
|  17  HE2   H1	   E   15  14  11     1.090   109.500	180.000	  0.06840 |
|  18  HE3   H1	   E   15  14  11     1.090   109.500	300.000	  0.06840 |
|  19  C     C	   M	6   4	3     1.522   111.100	180.000	  0.59730 |
|  20  O     O	   E   19   6	4     1.229   120.500	  0.000	 -0.56790 |
|									  |
|IMPROPER								  |
| -M   CA   N	 H							  |
| CA   +M   C	 O							  |
|									  |
|DONE									  |
|STOP									  |
+-------------------------------------------------------------------------+

I made mem.in just by copying the relevant portions of the methionine entry from all_amino94.in in the database directory, changing the atom type of the sulfur, and adding appropriate first and last lines. Similar things were done for the histidine residue (starting from the library's HIP residue), except that I added a copper atom bonded to ND1. It is a good idea to read these files alongside the PREP documentation.

Do some editing of the PDB file.

Several small changes need to be made to the input PDB file to make it work with Amber:

  1. First, we need to split of the HOH water residues into a separate file, say watpdb, and remove them from the main PDB file (call this modified file 1plc.nowat.pdb). Further, the remarks in this pdb file indicate that waters #183 and #187 are a disordered pair, and should not both be present. So, I arbitrarily choose to delete #187 and to keep #183. For some reason, these two disordered waters were both put in the PDB file, and not assigned as alternate conformers. Note that AMBER by default will also choose only the principal position for disordered side chains, i.e. the "A" conformation if there is more than one. But this is done automatically, and does not require editing. If you want to start a simulation from the "B" conformation of a side chain, you need to manually edit the PDB file to remove the "A" conformation and blank-out the alternate conformation flag for the atoms you want AMBER to use. Generally speaking, you have to look carefully at a Brookhaven file before really using it.

    (An alternative is to simply strip out the "crystallographic" waters and not use them at all. This is most appropriate if you are planning an MD or free energy simulation that will go through an extensive equilibration period before the "real" simulation begins. One goal of equilibration is to minimize dependence upon the starting conditions, and certainly the individual water molecules will move around a lot during any decent equilibration. At that point, the fact that you went to some trouble to originally place the waters in some nice positions may be irrelevant. Or maybe not; opinions differ on this matter, which is why we try to provide flexible tools in Amber. For this tutorial, we will not use the "crystallographic" waters in our starting structure.)

    One final change involves residue names. Brookhaven files do no distinguish between cysteine residue that are involved in bonds to other things (and hence which have no proton on the sulfur atom) and "free" residues that do have such a proton. Molecular mechanics studies need to make this sort of distinction. Since residue 84 in plastocyanin has the sulfur atom bonded to the copper ion, I changed its name from CYS to CYX. Similar comments apply to histidines: molecular mechanics studies need to know (or guess, or assume) whether the histidine has a proton bonded at the `delta` position (HID), at the `epsilon` position (HIE), or at both (HIP). This is pretty easy for plastocyanin, since the two histidine side chains are both bound to copper through the ND1 nitrogen. So we initially change both HIS residues to HIE, in order to tell AMBER to put the protons on the NE2 nitrogen. (Note that in many other proteins, it will often be reasonable to assign surface-accessible histidines to be protonated, residue name HIP.)

  2. Next, we need to work on the proton names in the main protein file. Most Brookhaven crystallographic files do not have protons, so the protonate program is used to add them. Even here, we want to change the names Brookhaven uses to NMR conventions as described above, so we will still use protonate. This program also does sanity and chirality checking, so it is generally a good idea to use it prior to putting any pdb file into Amber. Now run:

    csh:
    ( protonate -k -d PROTON_INFO < 1plc.nowat.pdb
    > 1plc.nowat.H.pdb) >& protonate.out

    sh:
    protonate -k -d PROTON_INFO < 1plc.nowat.pdb
    > 1plc.nowat.H.pdb 2> protonate.out

    The -k option changes the names but "keeps" the positions of the protons in the input pdb file. As in other examples, you need to use the location of the PROTON_INFO file on your machine in place the the location listed above.
  3. Next, I moved the copper ATOM card from the end of the pdb file into residue 37, changing its residue name to "HIC" and its residue number to 37. I also changed the residue name for the rest of atoms of residue 37 from "HIE" to "HIC", and changed the residue name for residue 92 from "MET" to "MEM" as described above. I call this file 1plc.protein.pdb.

Set up the system without solvent.

We first need to make modifications to the AMBER force field to recognize the copper atom and the modified methionine residue. The best way to do this is not to edit the main force field file, but to create a frcmod file with changes specific to each project. Here is the one I created for plastocyanin:

+-----------------------------------------------------------+
|			 frcmod.pcy			    |
+-----------------------------------------------------------+
|#  modifications to force field for poplar plastocyanin    |
|MASS							    |
|SM 32.06						    |
|CU 65.36						    |
|							    |
|BOND							    |
|NB-CU	  70.000   2.05000	 kludge by JRS		    |
|CU-S	  70.000   2.10000	 kludge by JRS		    |
|CU-SM	  70.000   2.90000	 for pcy		    |
|CT-SM	 222.000   1.81000	 met(aa)		    |
|							    |
|ANGLE							    |
|CU-NB-CV   50.000     126.700	  JRS estimate		    |
|CU-NB-CR   50.000     126.700	  JRS estimate		    |
|CU-NB-CP   50.000     126.700	  JRS estimate		    |
|CU-NB-CC   50.000     126.700	  JRS estimate		    |
|CU-SM-CT   50.000     120.000	  JRS estimate		    |
|CU-S -CT   50.000     120.000	  JRS estimate		    |
|CU-S -C2   50.000     120.000	  JRS estimate		    |
|CU-S -C3   50.000     120.000	  JRS estimate		    |
|NB-CU-NB   10.000     110.000	  dac estimate		    |
|NB-CU-SM   10.000     110.000	  dac estimate		    |
|NB-CU-S    10.000     110.000	  dac estimate		    |
|SM-CU-S    10.000     110.000	  dac estimate		    |
|CU-SM-CT   50.000     120.000	  JRS estimate		    |
|CT-CT-SM   50.000     114.700	   met(aa)		    |
|HC-CT-SM   35.000     109.500				    |
|H1-CT-SM   35.000     109.500				    |
|CT-SM-CT   62.000	98.900	  MET(OL)		    |
|							    |
|DIHE							    |
|X -NB-CU-X    1       0.000	 180.000       3.000	    |
|X -CU-SM-X    1       0.000	 180.000       3.000	    |
|X -CU-S -X    1       0.000	 180.000       3.000	    |
|X -CT-SM-X    3       1.000	   0.000       3.000	    |
|							    |
|NONBON							    |
|  CU	    2.20      0.200				    |
|  SM	    2.00      0.200				    |
|							    |
+-----------------------------------------------------------+

Creating a frcmod file is a bit of an art, since one is often faced with unknown parameters (such as force constants from copper to its ligands in the above example). In this tutorial, I am just going over the mechanics of running AMBER, and make no claims about the validity or appropriateness of the above parameters. There is a big literature on parameter estimation, and users are encouraged to consult this when faced with unusual chemical environments. A starting point is Appendix C: Parameter Development, in this manual.

Now we are ready to run LEaP. Let's start by setting up an input file:

+-----------------------------------------------------------+
|			  leap.in			    |
+-----------------------------------------------------------+
|frcmod = loadamberparams frcmod.pcy			    |
|loadAmberPrep mem.in					    |
|set MEM restype protein				    |
|loadAmberPrep hicu.in					    |
|set HIC restype protein				    |
|plc = loadPdb 1plc.protein.pdb				    |
|bond plc.87.ND1 plc.37.CU				    |
|bond plc.84.SG plc.37.CU				    |
|bond plc.92.SD plc.37.CU				    |
|saveAmberParm plc prmtop.nowat prmcrd.nowat		    |
+-----------------------------------------------------------+

By default, LEaP will read in the standard AMBER 94 force field libraries (Cornell et al. 95). The first line in the file above loads the extra force field parameters from the frcmod.pcy file described above. The next four lines load the files describing the modified residues HIC and MEM and set their residue types. Then a molecule, named "plc" is created by reading in the appropriate pdb file. (LEaP will often complain at this point if it finds something it doesn't understand or doesn't like; a typical task would be to modify the pdb file and try again.) Next, the three "cross-links" that connect residues 84, 87 and 92 to the copper atom are added via the bond command. Finally, the saveAmberParm command is used to create the required output files. Details of all of these commands can be found in the LEaP manual.

The file above is read into LEaP as follows; ">" is the prompt that LEaP provides:


tleap
> source leap.in
> quit

Run a simple minimization.

We start with a very simple minimization with restraints to keep the protein from moving too far. In the examples below, do not include the comments in parenthesis in your actual files.

+----------------------------------------------------------------+
|			     min1.in				 |
+----------------------------------------------------------------+
|								 |
|Minimization with Cartesian restraints				 |
|  &cntrl							 |
|    imin=1, maxcyc=200,	  (invoke minimization)		 |
|    scee=1.2, igb=3, cut=12.0,	  (force field options)		 |
|    ntpr=5,			  (print frequency)		 |
|    ntr=1, ntb=0,		  (turn on cartesian restraints) |
|  &end								 |
|Group input for restrained atoms				 |
| 1.0				  (force constant for restraint) |
|RES 1 99			  (all atoms in residues 1-99)	 |
|END								 |
|END								 |
|								 |
+----------------------------------------------------------------+

To run this file, use the following command:


sander -i min1.in -c prmcrd.nowat -p prmtop.nowat -ref prmcrd.nowat
-o min1.out -r min1.xyz

Run simulated annealing optimization.

At this point, one often would want to carry out a more robust optimization, using a dynamical simulated annealing protocol. One also might add some sort of constraints at this point. In this example, we will illustrate how NMR-derived constraints might be incorporated; constraints from other sources of information would be handled in a similar fashion.

+---------------------------------------------------------------------------+
|			      simul_anneal.in				    |
+---------------------------------------------------------------------------+
|

| | 15ps simulated annealing protocol | | &cntrl | | nstlim=15000, ntt=1, (time limit, temp. control) | | scee=1.2, igb=3, ntb=0, (scee holds the 1-4 scale factor) | | ntpr=500, pencut=0.1, (control of printout) | | ipnlty=1, nmropt=1, (NMR penalty function options) | | vlimit=10, (prevent bad temp. jumps) | | &end | |# | |# Simple simulated annealing algorithm: | |# | |# from steps 0 to 1000: raise target temperature 10->1200K | |# from steps 1000 to 3000: leave at 1200K | |# from steps 3000 to 15000: re-cool to low temperatures | |# | | &wt type='TEMP0', istep1=0,istep2=1000,value1=10., | | value2=1200., &end | | &wt type='TEMP0', istep1=1001, istep2=3000, value1=1200., | | value2=1200.0, &end | | &wt type='TEMP0', istep1=3001, istep2=15000, value1=0., | | value2=0.0, &end | |# | |# Strength of temperature coupling: | |# | |# steps 0 to 3000: tight coupling for heating and equilibration | |# steps 3000 to 11000: slow cooling phase | |# steps 11000 to 13000: somewhat faster cooling | |# steps 13000 to 15000: fast cooling, like a minimization | |# | | &wt type='TAUTP', istep1=0,istep2=3000,value1=0.2, | | value2=0.2, &end | | &wt type='TAUTP', istep1=3001,istep2=11000,value1=4.0, | | value2=2.0, &end | | &wt type='TAUTP', istep1=11001,istep2=13000,value1=1.0, | | value2=1.0, &end | | &wt type='TAUTP', istep1=13001,istep2=14000,value1=0.5, | | value2=0.5, &end | | &wt type='TAUTP', istep1=14001,istep2=15000,value1=0.05, | | value2=0.05, &end | | | | (continued on next page) | +---------------------------------------------------------------------------+

+---------------------------------------------------------------+
|		  simul_anneal.in (continued)			|
+---------------------------------------------------------------+
|#								|
|#  "Ramp up" the restraints over the first 3000 steps:		|
|#								|
| &wt type='REST', istep1=0,istep2=3000,value1=0.1,		|
|	     value2=1.0,  &end					|
| &wt type='REST', istep1=3001,istep2=15000,value1=1.0,		|
|	     value2=1.0,  &end					|
|								|
| &wt type='END'  &end						|
|LISTOUT=POUT		       (get restraint violation list)	|
|DISANG=RST.f		       (file containing NMR restraints) |
+---------------------------------------------------------------+

The restraint file referenced above (RST.f) would ordinarily hold hundreds to thousands of constraints based on experimental information. The constraints would keep the protein near its native conformation even though we have heated to a very high temperature. Examples of such constraint files are given in the SANDER section of the Users' Manual. Since a refinement like this can take a long time to run, we have not actually included files for it in the tutorial directory: the example given above can serve as a guide for real calculations that you might want to carry out.

Setup the system with counterions in a box of water.

Next, we turn from "vacuum" simulations to consider how to set up and carry out molecular dynamics simulations in a box of solvent water, using periodic boundary conditions. This example is typical of many molecular dynamics simulations begin carried out with AMBER.

Use LEaP to set up the prmtop file

This requires just a modified version of the LEaP input file we used above, calling the solvateBox command as well as addIons:

+-----------------------------------------------------------+
|			  leap.in			    |
+-----------------------------------------------------------+
|frcmod = loadamberparams frcmod.pcy			    |
|loadAmberPrep mem.in					    |
|set MEM restype protein				    |
|loadAmberPrep hicu.in					    |
|set HIC restype protein				    |
|plc = loadPdb 1plc.protein.pdb				    |
|bond plc.87.ND1 plc.37.CU				    |
|bond plc.84.SG plc.37.CU				    |
|bond plc.92.SD plc.37.CU				    |
|addIons plc Na+ 0					    |
|solvateBox plc WATBOX216 10.0 0.8			    |
|saveAmberParm plc prmtop.wat prmcrd.wat		    |
+-----------------------------------------------------------+

Run solvated molecular dynamics simulation.

We will first run a simple minimization in water to remove initial bad contacts:

+--------------------------------------------------------+
 |			  min2.in			  |
 +--------------------------------------------------------+
 |							  |
 |  minimization run					  |
 | &cntrl						  |
 |    imin=1,						  |
 |    ntb=1, ntc=2,					  |
 |    maxcyc=500, ntpr=25,				  |
 | &end							  |
 |							  |
 +--------------------------------------------------------+

Here is the command to run:


sander -O -i min2.in -c prmcrd.wat -p prmtop.wat
-o min2.out -r min2.xyz &

(This is run in background, since it will take a few minutes to run; Next, try out a short molecular dynamics run; actual "production" computations would include many, many more MD steps than is given here.

+----------------------------------------------------------------+
|			     md1.in				 |
+----------------------------------------------------------------+
|								 |
|molecular dynamics run						 |
| &cntrl							 |
|    imin=0, irest=0, ntx=1, tempi=0.,	   (no restart MD)	 |
|    ntt=1, temp0=300.0, tautp=0.2,	   (temperature control) |
|    ntp=2, taup=0.2,			   (pressure control)	 |
|    ntb=2, ntc=2, ntf=2,		   (SHAKE, periodic bc.) |
|    nstlim=500,			   (run for 0.0005 nsec) |
|    ntwe=100, ntwx=100, ntpr=25,	   (output frequency)	 |
| &end								 |
|								 |
+----------------------------------------------------------------+

Here we will start from the output of the minimization step to carry out the dynamics:


sander -O -i md1.in -c min2.xyz -p prmtop.wat
-o md1.out -r md1.xyz -x md1.crd -e md1.ene &

The output will include the md1.out file giving information about the progress of the trajectory, along with md1.crd and md1.ene files that contain the coordinates and energy information at every 100th step, respectively. Many of the analysis programs in AMBER can use these sorts of files.



[Contents] [Previous] [Next]
Updated on January 5, 2000. Comments to case@scripps.edu