Molecular Modeling for Biological Systems (Supercomputer Teleconference) Part 1
- 1990-Jan-24
These captions and transcript were generated by a computer and may contain errors. If there are significant errors that should be corrected, please let us know by emailing digital@sciencehistory.org.
Transcript
00:00:00 This program was made possible by support from Digital Equipment Corporation.
00:00:07 Additional support was provided by Tripos Associates.
00:00:30 Good morning, ladies and gentlemen, and welcome to another American Chemical Society satellite
00:00:47 TV course.
00:00:48 I am Sylvia Ware, Director of the ACS Education Division.
00:00:52 We are pleased to co-sponsor today's course in Molecular Modeling with San Diego State
00:00:57 University and the San Diego Supercomputer Center.
00:01:01 One of the primary objectives of the society is to bring up-to-date information about chemistry
00:01:07 to our members and other interested scientists.
00:01:10 In over 100 years, ACS has accomplished this objective in many ways.
00:01:15 ACS publishes numerous books and journals, conducts many meetings, and produces a variety
00:01:20 of short courses, audio, and videotape programs.
00:01:24 The satellite television courses are our newest means of communicating timely educational
00:01:29 material directly to you.
00:01:32 The capability to receive satellite programs is still in its infancy in the chemical community.
00:01:38 Receiving such programs is common in many universities and colleges.
00:01:42 However, most chemical companies do not presently own dishes to receive programs at their research
00:01:48 labs.
00:01:49 We hope to see this situation change rapidly during the next few years, so that tuning
00:01:54 in to chemical education via television will become just as easy as turning on the evening
00:02:00 news at home, and it will become commonplace to hear late-breaking scientific information
00:02:06 directly from the scientists doing the work.
00:02:10 Thank you for joining us in this program today and helping us make satellite communications
00:02:15 a reality for the American Chemical Society.
00:02:19 San Diego Supercomputer Center, San Diego State University, and the American Chemical
00:02:26 Society present Molecular Modeling for Biological Systems.
00:02:33 Welcome to Molecular Modeling for Biological Systems.
00:02:37 I'm your moderator, Whitney Mandel.
00:02:40 San Diego State University and San Diego Supercomputer Center have worked together on previous supercomputer
00:02:46 video conferences.
00:02:48 We are pleased by the American Chemical Society's sponsorship of today's program.
00:02:54 Supercomputers are simply computers.
00:02:56 They are qualitatively the same as standard computers, with the difference of being quantitatively
00:03:02 faster, so they are defined as leading-edge technology.
00:03:07 What is interesting is that this faster speed allows for a qualitative change in the type
00:03:13 of problems that can be solved.
00:03:15 In today's program, we will be looking at applications that can tax this technology.
00:03:21 Any researcher can gain access to the National Science Foundation-sponsored supercomputer
00:03:27 centers across the U.S.
00:03:29 The San Diego Supercomputer Center welcomes proposals from all researchers.
00:03:35 Researchers in industry are invited to become sponsors and use the resources here in San
00:03:40 Diego.
00:03:41 And here is Sid Karin, director of the San Diego Supercomputer Center.
00:03:47 Hello, and welcome to the San Diego Supercomputer Center.
00:03:53 Biochemists and molecular biologists have made extensive use of our supercomputer facility.
00:03:58 They are now able to calculate and visualize the solution of problems that just a few short
00:04:02 years ago they only dreamed of solving.
00:04:06 As the technical advisors for this teleconference, the chemistry staff at SDSC has assembled
00:04:11 a group of leading computational experts in this field.
00:04:15 Together with you, today they will work to push ahead the frontiers of computational
00:04:20 biochemistry.
00:04:22 I know that you'll feel the same excitement that I feel after seeing the results that
00:04:26 are now achievable.
00:04:30 Thank you, Sid Karin.
00:04:31 I want to take this opportunity to welcome the participants at over 50 sites throughout
00:04:36 the United States and Canada,
00:04:38 and to introduce Dr. Arthur Olson. Art is a member of the Research Institute of Scripps
00:04:44 Clinic here in La Jolla and director of its Molecular Graphics Laboratory.
00:04:49 His work is in molecular graphics and biomolecular interactions.
00:04:54 Art will provide a user's perspective on today's talks and facilitate the discussions during
00:04:59 the question and answer periods.
00:05:01 Art?
00:05:02 Hello, Whitney.
00:05:03 Hello.
00:05:04 These are exciting times for the field of biomolecular modeling since two rapidly evolving
00:05:14 technologies are converging in this area.
00:05:18 On the one hand, we have the advances in modern molecular biology, which are producing materials
00:05:24 and raising questions that weren't even thought about 10 years ago.
00:05:28 On the other hand, we have the evolving power of computers, which are really bringing the
00:05:35 capability to answer questions of the complexity raised by molecular biology to a wide variety
00:05:42 of scientists.
00:05:43 And I'm sure that the excitement that I feel will be echoed by our speakers today.
00:05:47 Thanks, Art.
00:05:48 There are a couple of things we need to cover before we hear from our speakers.
00:05:53 First, I want to point out that each of you should have received a copy of the course
00:05:58 notes prepared by our speakers.
00:06:01 Complete transcripts of the speakers' presentations and copies of their diagrams will be made
00:06:06 available to your site coordinator soon.
00:06:09 We are providing these transcripts to the local site coordinators as part of your registration
00:06:14 in this course.
00:06:15 Also, each of you should have received an evaluation form.
00:06:20 Please complete the evaluation and return it to the site coordinator when you leave.
00:06:24 We need your feedback to design programs that fit your needs and interests.
00:06:29 In our program today, we will hear from four speakers.
00:06:33 Dr. Peter Kollman will begin with methods in molecular modeling, an overview of types
00:06:39 of problems being addressed.
00:06:41 Next, we'll hear from Dr. William Jorgensen on structure and binding in bio-organic host-guest
00:06:48 chemistry.
00:06:49 Dr. Jeffrey Blaney is our next speaker.
00:06:52 He'll be covering a distance geometry approach to ligand-macromolecule docking.
00:06:59 Then Peter Kollman joins us again with a presentation of current simulations using molecular dynamics
00:07:05 and free energy perturbation applications to study large molecules.
00:07:10 And Dr. David Case will be our final speaker with a presentation on determining three-dimensional
00:07:16 solution structures of proteins from NMR data.
00:07:20 We will have question and answer periods after each of the presentations with a full panel
00:07:25 discussion at the end of the program.
00:07:27 I will give you the numbers to call now.
00:07:30 In California, the number is 1-800-942-1515.
00:07:36 In the U.S., the number is 1-800-972-1515.
00:07:41 And if you are calling from Canada, the number to call is 1-619-265-6429.
00:07:49 Our Canadian callers can call collect.
00:07:52 As we approach our question and answer periods, we will let you know that we're getting ready
00:07:57 for your calls.
00:07:58 Well, Art, we're ready to hear from Peter Kollman.
00:08:02 Even though Peter doesn't look old enough, he's certainly considered one of the founders
00:08:06 in the field of biomolecular modeling, and rightly so since the AMBER program comes out
00:08:13 of his lab and it's used now widely throughout the world.
00:08:17 In addition, Peter's application work in modeling DNA dynamics and ionophores is cited
00:08:23 throughout the literature.
00:08:26 I'm looking forward to Peter's comments.
00:08:32 Thank you, Art.
00:08:33 Perhaps the reason that I look young is because of all the gray hairs I've torn out while
00:08:39 we were developing the AMBER program.
00:08:43 My goal today is to give an overview of molecular modeling methods and a selected set of applications.
00:08:49 The focus of my remarks will be on organic and biological systems, but there are many
00:08:55 areas of modeling involving zeolites, metals, and polymers which I will not attempt to cover.
00:09:01 I will attempt to keep my remarks at a level that any well-educated chemist can understand
00:09:07 without boring the experts.
00:09:10 The molecular modeling methods that I will discuss in cursory detail because of time
00:09:14 limitation are listed in the first slide.
00:09:19 They include quantum mechanical calculations, molecular mechanics and dynamics, Monte Carlo
00:09:26 calculations, computer graphics, distance geometry, and pharmacophoric pattern matching
00:09:32 and QSAR.
00:09:34 The selected applications include some that, in my biased view, have been among
00:09:41 the most important and significant in the last decade.
00:09:46 These include calculating the free energies of activation of chemical reactions in solution
00:09:53 and in enzyme-active sites, calculating the free energies of non-covalent associations
00:09:59 of ligands to macromolecules, predictions of protein tertiary structures using homology
00:10:05 model building, the use of distance geometry in docking and drug design, and the use of
00:10:11 distance geometry or molecular dynamics in macromolecular structure determination.
00:10:18 We now proceed to describe the molecular modeling methods with the applications interspersed
00:10:24 in the discussion.
00:10:27 The goal of molecular modeling is to simulate real chemistry.
00:10:32 If the molecule to be studied is small and in the gas phase, one can apply quantum mechanical
00:10:37 methods.
00:10:39 The Schrödinger equation, shown in the third slide, is a many-particle differential equation
00:10:44 that cannot be solved analytically for more than one particle.
00:10:49 But within the non-relativistic Born-Oppenheimer approximation, in which one neglects relativistic
00:10:54 effects and assumes the electrons move in the field of fixed nuclei, one can approximate
00:11:00 the solution to the Schrödinger equation numerically.
00:11:04 There are three major classes of quantum mechanical calculations applied to molecules of chemical
00:11:09 and biological interest.
00:11:11 The first, the ab initio approach, makes no further approximations in solving the Schrödinger
00:11:16 equation.
00:11:17 In its implementation, it usually assumes that one can represent the wave function as
00:11:21 a linear combination of Slater determinants of one-electron functions, each of which is
00:11:26 a linear combination of atomic orbitals.
00:11:29 Using a single determinant for the wave function and solving for the coefficients in these
00:11:32 linear combinations is called the Hartree-Fock LCAO-MO (linear combination of atomic orbitals,
00:11:38 molecular orbital) approximation.
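For readers following without the slides, the single-determinant LCAO-MO picture just described can be written out as below. This is a standard textbook form, not a reproduction of the speaker's slide; the symbols (c for the expansion coefficients, χ for the atomic orbitals, F and S for the Fock and overlap matrices) are notation assumed here.

```latex
% Each one-electron molecular orbital is a linear combination of atomic orbitals
\psi_i(\mathbf{r}) = \sum_{\mu} c_{\mu i}\, \chi_{\mu}(\mathbf{r})

% The N-electron wave function is a single Slater determinant of occupied (spin) orbitals
\Psi_{\mathrm{HF}}(1,\dots,N) = \frac{1}{\sqrt{N!}}
  \det\!\left[\, \psi_1(1)\, \psi_2(2) \cdots \psi_N(N) \,\right]

% Solving for the coefficients leads to the Roothaan matrix equations
\mathbf{F}\,\mathbf{C} = \mathbf{S}\,\mathbf{C}\,\boldsymbol{\varepsilon}
```

Because the Fock matrix F itself depends on the coefficients C, these equations are solved iteratively until self-consistent.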
00:11:41 One can go beyond this single determinant approximation in various ways, including configuration
00:11:46 interaction and perturbation theory, for example, Møller-Plesset approaches.
00:11:51 Currently, such ab initio approaches can be applied to a wide variety of organic molecules.
00:11:57 However, for the non-expert, one should remember that one ab initio calculation can differ
00:12:02 dramatically from another in rigor and computer time.
00:12:06 For example, the results of a calculation in which one uses a minimal set of atomic orbitals
00:12:11 to represent the molecular orbitals in the LCAO-MO approximation can differ qualitatively
00:12:17 and quantitatively from those in which one uses a much larger set of atomic orbitals
00:12:23 and goes beyond the single determinant approximation.
00:12:27 Of course, the accuracy one needs to connect the calculations to real chemistry
00:12:33 depends on what one wishes to simulate, but for molecules of a few atoms in the first
00:12:38 rows of the periodic table, one can simulate most of their properties to experimental accuracy.
00:12:44 The size of the molecule for which this can be done is a very sensitive function of available
00:12:48 computer power.
00:12:51 The development of semi-empirical quantum mechanical methods in which one approximates
00:12:55 the solutions to the Schrödinger equation using semi-empirical adjustments to the Hamiltonian
00:13:00 has been catalyzed by the limitations in the size of the system to which ab initio methods
00:13:04 can apply.
00:13:06 Currently, the most highly developed of such approaches are those that originated from
00:13:10 the Dewar group in Texas, MINDO/3, MNDO, AM1, and PM3, whose focus is on molecules of organic
00:13:16 chemical interest, and ZINDO from Zerner in Florida, which can handle transition metal
00:13:21 systems.
00:13:23 Of course, there are other methods that are useful in some applications, such as X-alpha
00:13:26 approaches, which Case has contributed to developing for biological systems, extended
00:13:31 Hückel theory, applied in a useful qualitative way to organic and inorganic systems by Hoffmann
00:13:36 of Cornell, and an assortment of others that have a particular niche in chemistry.
00:13:40 Finally, there are valence bond approaches, and the EVB (empirical valence bond) approach
00:13:45 developed by the Warshel group has had many useful applications to chemical and biological
00:13:50 molecules.
00:13:52 Valence bond theory starts out with a different approach than molecular orbital theory in
00:13:55 the way the molecular wave function is constructed.
00:13:58 It has the virtue, when used in an empirical fashion as Warshel has, of allowing a computer-efficient
00:14:03 approximate solution to the Schrödinger equation.
00:14:07 The above methods solve for the energy of a collection of nuclei and electrons as a function
00:14:11 of the nuclear coordinates, and so can, in principle, determine the complete potential
00:14:15 surface of the molecule as a function of nuclear coordinates.
00:14:20 Methods to directly calculate the first and second derivatives of the quantum mechanical
00:14:23 energy as a function of nuclear coordinates have considerably increased the power of such
00:14:27 methods.
00:14:28 However, when one wants to consider the properties of condensed phases and the properties of
00:14:33 macromolecules, it is not appropriate or possible to use quantum mechanical methods alone.
00:14:39 For example, one can simulate liquid water using an analytical function that has been
00:14:43 developed by carrying out a large number of calculations on water dimer in various configurations
00:14:49 in order to determine the water-water interaction potential.
00:14:53 Liquid properties must be derived by taking an average of a very large number of thermally
00:14:57 accessible configurations, the energy of which must be evaluated.
00:15:02 A typical liquid water simulation uses 216 molecules and periodic boundary conditions
00:15:06 to represent the system.
00:15:08 One must then evaluate the energy of this 648-atom system and do so a million or so
00:15:13 times.
00:15:14 This clearly requires an energy function that is simple.
00:15:18 One can use either Monte Carlo or molecular dynamics methods to generate the configurations
00:15:22 of the system.
00:15:24 Monte Carlo methods typically move one molecule at a time and by comparing the new configuration
00:15:28 to the old either accept it or reject it based on the Boltzmann factor for the relative energies.
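The accept-or-reject step described here is usually implemented with the Metropolis criterion. The following is a minimal sketch under that assumption; the function name `metropolis_accept` and its parameters are illustrative and do not come from the talk. Energies and `kT` must be in the same units.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Decide whether to keep a trial configuration (Metropolis criterion).

    Downhill moves are always accepted. Uphill moves are accepted with
    probability equal to the Boltzmann factor exp(-(e_new - e_old) / kT).
    """
    delta = e_new - e_old
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / kT)
```

In a liquid simulation, one would displace a single molecule, recompute the energy, call this function, and restore the old coordinates whenever it returns `False`.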
00:15:34 Molecular dynamics methods evaluate the energy and its analytical derivatives and move all
00:15:39 the atoms at once using Newton's laws of motion.
00:15:42 Thus, each atom is characterized by both its position and velocity.
00:15:45 It is the kinetic energy of the atoms that allows them to sample various configurations
00:15:50 of the system.
00:15:52 As noted above, one can use quantum mechanical calculations to derive the analytical potentials
00:15:56 which must be used in Monte Carlo or molecular dynamics calculations on liquids.
00:16:01 However, these have a number of disadvantages.
00:16:03 First, when they are based on two-body interactions, they leave out many-body interaction effects
00:16:08 which are critical to the quantitative representation of the properties of polar liquids like water.
00:16:14 Second, they require large amounts of computer time to derive,
00:16:17 and this time goes up even more if many-body effects are derived in this fashion.
00:16:22 Thus, the currently most useful approaches to deriving energy functions for liquid simulations
00:16:27 have been empirical.
00:16:29 Both the TIPS and SPC models for liquid water were based on simulations that varied the
00:16:33 parameters to force agreement of the calculated density and enthalpy of vaporization of the
00:16:37 liquid with experiment.
00:16:39 These achieved this agreement by being effective two-body potentials with many-body effects
00:16:43 implicitly built in.
00:16:45 For example, the dipole moment for such water molecules is 2.3 to 2.4 Debye in contrast
00:16:50 to the gas phase value of 1.85 Debye.
00:16:54 In the same spirit of empiricism, one can derive functions that represent all the intra-
00:16:58 and intermolecular interactions of molecules.
00:17:01 And these are called molecular mechanical potential functions.
00:17:05 In the next slide, we describe the parameters in such a function.
00:17:10 They often come from a combination of quantum mechanical and empirical data.
00:17:15 Typically, they are derived from a test set of molecules and then assumed to be transferable
00:17:19 to a wide variety of others.
00:17:21 The reason these methods work at all is that, for example, a C-C bond is about 1.5 angstroms
00:17:27 in most molecules, and the deviations from this are small and can be analyzed by strain
00:17:31 effects.
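The slide with the potential function is not part of this transcript. As a stand-in, a commonly used form for a molecular mechanical energy function of this kind is sketched below; the exact terms and symbols are an assumption here, not a copy of the speaker's slide.

```latex
E_{\mathrm{total}} =
    \sum_{\mathrm{bonds}} K_r \,(r - r_{eq})^2
  + \sum_{\mathrm{angles}} K_\theta \,(\theta - \theta_{eq})^2
  + \sum_{\mathrm{dihedrals}} \frac{V_n}{2}\,\bigl[\,1 + \cos(n\phi - \gamma)\,\bigr]
  + \sum_{i<j} \left[ \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}
  + \frac{q_i q_j}{\epsilon R_{ij}} \right]
```

The parameters are the force constants and equilibrium values (K_r, r_eq, K_θ, θ_eq), the torsional terms (V_n, n, γ), the van der Waals coefficients (A_ij, B_ij), and the atomic partial charges (q_i). Transferability means that a value such as r_eq for a C-C bond is fitted once on a test set and then reused in other molecules.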
00:17:33 One can derive similar functions for peptides, proteins, nucleic acids, and other macromolecules
00:17:39 and use them in molecular mechanics and molecular dynamics calculations.
00:17:43 The most important sources of experimental data used to derive the
00:17:48 functions for such calculations include vibrational frequencies from IR and
00:17:52 Raman data, bond lengths, angles, and dihedral angles from microwave and x-ray structural
00:17:57 data, rotational barrier heights from microwave spectroscopy, and densities, structures, and enthalpies
00:18:02 of vaporization from liquids and crystals.
00:18:05 Arguably, the most important parts of these energy functions are the non-bonded interactions,
00:18:09 and these have been derived using crystal lattice simulations (Lifson, Hagler, and others)
00:18:14 or liquid simulations using Monte Carlo methods (the OPLS parameters derived by Jorgensen).
00:18:21 What is the difference between molecular mechanics and molecular dynamics?
00:18:25 Next two slides.
00:18:26 They both use an energy function E and its analytical gradient, where the force is the
00:18:31 negative gradient of the energy.
00:18:34 Molecular mechanics minimizes this function by moving down in potential energy to the
00:18:38 nearest local minimum.
00:18:41 Molecular dynamics, on the other hand, sets this force equal to the mass times the second derivative
00:18:46 of the position with respect to time, that's Newton's law, and numerically solves for a
00:18:52 trajectory of the system at a given temperature.
00:18:55 The temperature enters in because the velocities of the atoms are kept such that the average
00:18:59 kinetic energy can be related to the classical expression for temperature.
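The classical relation referred to here is presumably the equipartition expression. For N atoms with masses m_i and velocities v_i, and assuming no constraints so that there are 3N degrees of freedom, it reads:

```latex
\left\langle \sum_{i=1}^{N} \tfrac{1}{2}\, m_i\, \lvert \mathbf{v}_i \rvert^{2} \right\rangle
  = \tfrac{3}{2}\, N\, k_B\, T
```

Here k_B is Boltzmann's constant and the angle brackets denote a time average over the trajectory.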
00:19:03 Thus, molecular dynamics does not always decrease the energy and can cover much more of phase
00:19:07 space by surmounting barriers that are of modest size.
00:19:11 However, to solve the equations of motion, one must use numerical methods with time steps
00:19:16 of the order of femtoseconds.
00:19:17 Thus, to represent a single nanosecond trajectory of the system requires a million numerical
00:19:22 integrations.
00:19:23 For a macromolecule, each step is quite time-consuming, with the rate-limiting step being the evaluation
00:19:29 of non-bonded interactions.
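To make the numerical solution of Newton's equation concrete, here is a minimal sketch using the velocity Verlet scheme on a one-dimensional harmonic oscillator. Velocity Verlet is one common integrator and is chosen here for illustration; the talk does not say which integrator is used, and all function names below are hypothetical. The last lines reproduce the step count quoted above: a 1 nanosecond trajectory with a 1 femtosecond time step.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = F(x) for a single coordinate."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


def harmonic_force(x, k=1.0):
    """Force = -dE/dx for the toy energy E = 0.5 * k * x**2."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


def steps_for(trajectory_fs, timestep_fs):
    """Number of integration steps; both arguments in femtoseconds."""
    return trajectory_fs // timestep_fs


# Toy run in reduced units: start at x = 1 with zero velocity.
x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)

# 1 ns = 1,000,000 fs, so a 1 fs time step needs a million steps.
steps_per_ns = steps_for(1_000_000, 1)
```

Unlike a minimizer, the integrator keeps the total energy roughly constant, so the system can climb potential energy barriers by trading kinetic for potential energy.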
00:19:33 Given the difficulties in describing systems with so many degrees of freedom, how can one
00:19:36 study chemical reactions in solution?
00:19:39 One of the most important papers in this regard was by one of our speakers, Bill Jorgensen.
00:19:43 He studied the simple exchange reaction, methyl chloride plus chloride minus going to methyl
00:19:48 chloride plus chloride minus.
00:19:50 By using a high level of ab initio calculations, he was able to show that as the chloride minus
00:19:55 approaches the methyl chloride, the energy decreased as an ion-dipole complex was formed.
00:20:01 Then the energy rose until the formation of the transition state for the reaction, in
00:20:05 which both chlorines were equidistant from the carbon.
00:20:08 The gas phase barrier for the reaction was about 10 kcal per mole. The one-dimensional
00:20:13 energy surface for chloride approach along the three-fold axis of methyl chloride was
00:20:17 fit to an analytical function, and analytical potentials were derived for chloride-water
00:20:21 and methyl chloride-water interactions, based on quantum mechanical calculations on
00:20:26 these, with appropriate interpolations for the interactions with other species along
00:20:30 the reaction pathway.
00:20:34 Then Monte Carlo methods were used to move the system along the reaction pathway for
00:20:37 the reaction in a periodic box of water molecules.
00:20:41 Umbrella sampling was employed to force the complex to surmount the free energy barrier
00:20:46 and form products.
00:20:47 In this manner, a free energy of activation of 26 kcal per mole was calculated, in excellent
00:20:52 agreement with experiment.
00:20:54 The next slide shows a figure from the Jorgensen paper, with the free energy as a function
00:20:59 of reaction coordinate in the gas phase and solution.
00:21:02 Interestingly, in water, there was no ion-dipole complex formed at all.
00:21:07 In subsequent studies in other solvents, it was found that with less strongly interacting
00:21:12 solvents, an ion-dipole minimum, like the one in the gas phase, was present.
00:21:17 The calculations also gave insight into the reason for the increased activation free energy
00:21:22 in solution.
00:21:23 The water molecules interact much more strongly with the localized charge on the chloride
00:21:26 than they do with the delocalized charge in the transition state.
00:21:31 Using a very different approach, Warshel was able to simulate the proton transfer and
00:21:34 acylation attack on substrates of trypsin and subtilisin using a combination of empirical
00:21:39 valence bond and molecular dynamics methods.
00:21:42 The focus was on rationalizing site-specific mutagenesis effects on the activation free
00:21:46 energies in these enzymes, and the calculations were quite successful in this regard.
00:21:51 The next slide shows a figure from Warshel's paper in which he has simulated the reaction
00:21:56 profile in solution versus in the enzyme trypsin.
00:21:59 The reason for the 5 kcal per mole higher activation free energy for the Gly-216, Gly-226
00:22:05 to Ala-216, Ala-226 mutant was suggested to be structural distortions in the oxyanion hole
00:22:10 due to the two methyl groups.
00:22:12 Similarly, in subtilisin, replacing Asn-155 by a variety of other side chains led to increases
00:22:18 in calculated activation free energies comparable to those found experimentally.
00:22:23 In 1989, Åqvist and Warshel published a paper in Biochemistry describing catalysis
00:22:29 by staphylococcal nuclease, which was very successful in reproducing not only the change
00:22:33 in activation free energy due to enzyme catalysis, but also the site-specific mutation effect
00:22:38 of an aspartic acid to glutamic acid substitution.
00:22:42 One should not lose sight of the difficulties in the above studies and their limitations
00:22:45 to low-dimensional reaction coordinates and very well-defined reaction mechanisms.
00:22:49 But we see that one of the more useful applications of molecular modeling has been to the study
00:22:53 of complex reactions in solution.
00:22:56 In another set of studies, Bash et al. were able to calculate in good agreement with experiment
00:23:00 the absolute solvation free energy of a wide variety of functional groups relevant to protein
00:23:05 side chains using free energy perturbation theory with molecular dynamics.
00:23:10 This method and the closely related method, thermodynamic integration, allow the calculation
00:23:13 of relative free energies for a wide variety of processes, including the effect of changes in either
00:23:18 the ligand or the protein on the free energy of protein-ligand association, studies
00:23:23 of the relative stability of different DNA sequences and different structural forms,
00:23:27 and studies of the relative stability of a protein and its site-specific mutants.
00:23:31 We will discuss some of these examples in detail in our next lecture.
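For reference, the two methods named above are usually written as follows. These are the standard textbook expressions, given here as a sketch; the symbols (H_A and H_B for the Hamiltonians of the two states, λ for the coupling parameter that turns one into the other) are notation assumed for this note rather than taken from the talk.

```latex
% Free energy perturbation: average taken over configurations of state A
\Delta G_{A \to B} = G_B - G_A
  = -k_B T \,\ln \left\langle \exp\!\left[ -\frac{H_B - H_A}{k_B T} \right] \right\rangle_{A}

% Thermodynamic integration: H(\lambda) interpolates between H_A (\lambda = 0) and H_B (\lambda = 1)
\Delta G_{A \to B} = \int_{0}^{1}
  \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda} d\lambda
```

In practice the change from A to B is broken into many small λ windows so that each ensemble average converges, with the averages generated by molecular dynamics or Monte Carlo sampling.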
00:23:37 One of the most exciting developments in recent years has been the integration of molecular
00:23:41 dynamics with NMR or X-ray refinement methods, so that one optimizes a target function which
00:23:46 is a linear combination of the agreement with experiment and the molecular mechanics energy function.
00:23:52 By varying the relative weights of these two parts of the function, one can improve
00:23:55 the agreement with the experimental data while retaining a low energy structure, something
00:24:00 that is very difficult to do with standard refinement methods.
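The target function described here can be sketched as a weighted sum. The notation is assumed for illustration, and the NMR restraint term shown is one simple choice (a harmonic penalty on distance restraints), not necessarily the one used in the papers cited in the talk.

```latex
% Combined target function with an adjustable weight w on the experimental term
E_{\mathrm{target}} = E_{\mathrm{MM}} + w\, E_{\mathrm{exp}}

% One simple experimental term for NMR-derived distances r_{ij}^{0}
E_{\mathrm{exp}} = \sum_{(i,j)} k_{ij} \,\bigl( r_{ij} - r_{ij}^{0} \bigr)^{2}
```

Raising w pulls the structure toward the data, and lowering it favors a low molecular mechanics energy. For X-ray refinement the experimental term would instead measure the disagreement between observed and calculated structure factor amplitudes.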
00:24:04 These combined methods greatly increase the efficiency of refinement and reduce the number
00:24:08 of steps of manual model building required.
00:24:11 They have been of great use in a large number of refinement problems.
00:24:15 Some papers on MD refinement using NMR data include the first, by Kaptein et al., in the
00:24:22 next slide, which just shows the title, and a model calculation on crambin
00:24:29 by Brünger et al.
00:24:33 In the next slide, the radius of convergence of such calculations is illustrated, i.e.
00:24:37 their ability to start from an extended structure and end up with a native-like structure.
00:24:42 Chiche et al. have shown the usefulness of including explicit water in the simulations,
00:24:46 particularly when the NMR data is not very extensive.
00:24:50 In the next slide, one can see that the presence of water gives a structure that is in much
00:24:57 better agreement with the X-ray structure than any other, because the water causes the
00:25:01 burial of hydrophobic groups and exposure of hydrophilic side chains.
00:25:06 That RMD aqueous refers to the deviation from the X-ray structure when water is explicitly
00:25:12 included.
00:25:14 In an equally exciting application, Brünger et al. have shown how one can use MD in refinement
00:25:19 of X-ray structures.
00:25:20 In the next slide, it's just the title of that paper.
00:25:24 Up to now, we have focused on methods which evaluate the energy of the system and use
00:25:28 this energy to move the atoms or evaluate their properties.
00:25:31 There are a number of non-energy-based methods that have become powerful tools in modeling
00:25:36 molecules.
00:25:37 First and foremost are computer graphics methods.
00:25:40 These use molecular models based on energy calculations or experimental data, such as
00:25:44 X-ray crystallography or nuclear magnetic resonance, to construct a three-dimensional
00:25:48 representation of the atoms in the molecule.
00:25:50 New molecules can be constructed using stereochemical principles derived from the databases of known
00:25:55 structures.
00:25:56 Polymers can be built from monomer fragments and manipulated in color and stereo with real-time
00:26:01 rotation and translation of the molecules.
00:26:04 One of our moderators, Olson, is an expert in this area.
00:26:08 These representations are essential not only to do theoretical science on complex systems,
00:26:13 but to present the results of numerical simulations to the rest of the scientific community.
00:26:18 A few examples of computer graphic applications include the development of electrostatic potential
00:26:23 molecular surfaces, the representation of electrostatic potential gradients, and the
00:26:28 applications to superoxide dismutase, and the analysis of why a Hoogsteen base pair puts the
00:26:35 DNA oligomer in the conformation in which it exists.
00:26:40 The graphics make clear the results suggested by the crystallographers and confirmed by the
00:26:45 molecular mechanics calculations.
00:26:49 The previous slide showed the electrostatic potential representation
00:26:55 and the electrostatic potential and its gradient in the active site of superoxide dismutase.
00:27:02 In the next slide, we show two representations, one theoretical and one experimental, and
00:27:10 the one on the right shows why it exists, because one has a snug van der Waals interaction
00:27:16 between the DNA and its drug in that conformation and not in the one on the left.
00:27:23 The next slide will show the work by Blaney et al. back in 1982, where they used a hole
00:27:31 in the active site when a known ligand was bound to think about designing new molecules
00:27:38 that would interact more effectively with thyroxine analogs.
00:27:44 This slide illustrates the hole in the active site that Blaney et al. were able to fill with the
00:27:49 design of new analogs.
00:27:52 A second set of methods which do not use energy functions is distance geometry.
00:27:57 These begin with representing the system in terms of atom-atom or group-group distances
00:28:01 and then by using mathematical projection methods from many-dimensional space into three-dimensional
00:28:05 space, turning these distances into a set of three-dimensional structures.
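The projection step can be illustrated with the classical metric-matrix embedding: convert the distance matrix into a matrix of dot products about the centroid, then keep the three largest eigenvectors. The sketch below assumes a complete and exact distance matrix and uses NumPy; real distance geometry programs start from upper and lower distance bounds and sample trial distances between them, which is not shown. The function name `embed_from_distances` is hypothetical.

```python
import numpy as np


def embed_from_distances(d, dim=3):
    """Turn an n x n matrix of interatomic distances into n x dim coordinates."""
    d2 = np.asarray(d, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering converts squared distances into dot products
    # of position vectors measured from the centroid (the metric matrix).
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # eigh returns eigenvalues in ascending order; keep the largest `dim`.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues that arise from inconsistent distances.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale
```

The returned coordinates reproduce the input distances only up to rotation, translation, and mirror image, which is one reason such structures are refined afterwards.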
00:28:10 Although originally derived to simulate protein folding, such approaches have found use
00:28:15 in docking ligands to macromolecules, analyzing the way various small molecules could fit
00:28:19 into unknown receptors, deriving quantitative structure activity relationships for ligand
00:28:24 binding to an unknown receptor, and fitting nuclear magnetic resonance-derived distances
00:28:28 to a three-dimensional structure.
00:28:30 The strength of such methods is that they have a potential to be more unbiased and give
00:28:33 the chemist insight into what he knows and what he doesn't know about the structure in
00:28:37 question.
00:28:39 The weakness is the qualitative nature of the structures, which currently require further
00:28:42 refinement by energy-based methods to be fully realistic.
00:28:46 The next slide illustrates the use of a distance geometry method, DOCK, by DesJarlais and Kuntz
00:28:53 to dock some papain-binding ligands to the enzyme papain.
00:28:59 The idea is to examine a large database of ligands and to screen these on steric criteria, saving
00:29:05 the more subtle screen for electrostatic and hydrogen bond complementarity for later.
00:29:11 It is much more common in the pharmaceutical industry to have a number of biological activities
00:29:15 but no known structure of the relevant receptor.
00:29:18 Slides 19 and 20 are from a paper that applies ensemble distance geometry to nicotinic receptor
00:29:25 agonists with the idea of finding the best superposition of active analogs using distance
00:29:31 criteria.
00:29:34 A final set of non-energy-based methods are those in which one is attempting to fit some
00:29:38 biological activity or binding data of a number of small ligands to an unknown receptor.
00:29:44 One can use a variety of approaches here, including pharmacophoric pattern matching,
00:29:49 statistical methods, for example, QSAR, or a combination of these.
00:29:54 As mentioned above, Crippen's distance geometry QSAR falls in this category.
00:29:59 The next two slides are from a paper by Ghose and Crippen in which the structure-activity
00:30:05 relationships of inhibitors of dihydrofolate reductase are used to build a hypothetical
00:30:10 receptor shown in the next slide.
00:30:13 Illustrated near the molecule are site points, which have groups that can contribute positively
00:30:17 or negatively to binding.
00:30:19 Hansch and co-workers have used computer graphics visualization of known macromolecular structures
00:30:24 to rationalize QSAR equations.
00:30:27 A potentially powerful approach, CoMFA, combines features of pharmacophoric matching and QSAR.
00:30:32 In the next two slides are the title of the paper, and the following slides show areas
00:30:39 near a steroid which can have positive or negative interactions with the three-dimensional
00:30:48 structure of the receptor.
00:30:51 What is the prognosis for one of the most difficult and challenging problems in molecular
00:30:56 modeling?
00:30:57 The prediction of three-dimensional structure of proteins from amino acid sequence.
00:31:02 As noted just now, there are many biological systems for which there is no three-dimensional
00:31:06 structure for the receptor.
00:31:08 With the advent of gene-cloning techniques, the sequences of many of these receptors are
00:31:11 becoming known.
00:31:14 Can we turn these into structures?
00:31:16 The prognosis for this is in general poor because of the fundamental difficulty which
00:31:19 exists in simulating all complex molecular systems: sufficiently sampling and correctly
00:31:24 ranking the free energies of all the local minima of the system.
00:31:28 There are a large number of methods in the literature for conformational searching, but
00:31:31 these are mainly applicable to systems with tens rather than thousands of degrees of freedom.
00:31:37 These include Monte Carlo methods in Cartesian or internal coordinate space, methods based
00:31:41 on distance geometry, followed by molecular mechanics and dynamics, systematic search
00:31:46 techniques, high-temperature molecular dynamics, and methods that use cyclic boundary conditions
00:31:50 and Fourier analysis.
00:31:52 Whatever the method of generating conformations, it still faces the difficulty of evaluating
00:31:57 the free energies of these conformations, which is very difficult to do for many conformations
00:32:01 of polar or ionic molecules in solvent.
00:32:05 Predicting protein three-dimensional structures from amino acid sequences is probably most effectively
00:32:09 done with a pattern recognition approach and a docking approach to predict qualitative
00:32:13 secondary and tertiary structures, which can then be refined with energy-based methods.
00:32:18 But still, in a typical case, there are far too many possible solutions.
00:32:21 A more limited and feasible approach can be used if one is predicting the three-dimensional
00:32:25 structure for a protein when one knows the structure of a homologous protein.
00:32:29 Depending on the percent homology, one can use different techniques that impose similarity
00:32:33 in secondary and tertiary structure on the unknown protein.
00:32:37 A most exciting recent success in this area was the prediction of the structure of the
00:32:40 HIV protease by such methods, as shown in the next slides.
00:32:46 The next slide has the title of that paper, and the following slide is the predicted structure.
00:32:51 The subsequent X-ray structures by two groups have been amazingly consistent with the predicted
00:32:55 structure.
00:32:56 The next slide, though very small, gives the title of that paper.
00:33:01 Despite the very small percent homology, the structure used for the template was the known
00:33:06 aspartyl protease of length of more than 200 amino acids, and the unknown AIDS protease
00:33:10 was a dimer of 99 amino acids.
00:33:13 This shows an example of some steroids bound to the HIV site using the DOCK technique to
00:33:19 find some other inhibitors.
00:33:24 But by using secondary structure near the active site and the conserved active site
00:33:27 residues, including the catalytic aspartic acids, the model structure was built.
00:33:31 Even with its qualitative correctness, it's not clear that such methods will be useful
00:33:35 for inhibitor drug design.
00:33:38 In summary, one can see that there are a wide variety of methods and exciting applications
00:33:42 of molecular modeling to organic and biological systems.
00:33:46 I have attempted to give a broad brush overview of what I consider to be among the most interesting
00:33:50 and significant areas of recent research.
00:33:54 What will the next decade bring?
00:33:55 It is dangerous to predict, but I see the increased computer power brought about by
00:33:59 massive parallelism, allowing much progress to be made on solving the global minimum problem
00:34:04 for larger and larger systems, and increasing dramatically the ability to calculate free
00:34:09 energies of activation, association, and stability.
00:34:12 New and more accurate energy functions will also increase the accuracy of modeling methods,
00:34:17 and better methodologies for integrating quantum mechanics and molecular dynamics will appear.
00:34:22 The ability of theoretical and computer-based approaches for macromolecules to impact experiments
00:34:26 may not reach the predictive capability of quantum mechanical calculations on small molecules
00:34:31 in the gas phase, but molecular dynamics has become a nearly indispensable ingredient in
00:34:35 protein structure refinement, and its usefulness in this area is likely to increase.
00:34:40 Nonetheless, the success of these predictions has been most impressive.
00:34:52 It is now time to take your questions.
00:35:02 Once you have called, please stay on until we have asked for your question on air.
00:35:07 You will be able to hear the program over the phone, so don't hang up.
00:35:11 If we are not able to use your call, we'll tell you, but we won't hang up on you, so
00:35:16 please be patient.
00:35:17 Art, while we wait for that first call, why don't you go ahead?
00:35:21 Well, Peter, I guess part of your reputation that I didn't mention at the beginning was
00:35:28 your answer machine message, and so I want to give you the opportunity now to answer
00:35:34 probably the question that's on most people's minds.
00:35:37 Where were you during the earthquake?
00:35:39 Well, Art, I was looking forward to the World Series game in the Chicago airport, and I
00:35:44 was very lucky that I was not at my desk because a bookcase with tons of J. Med. Chem.
00:35:53 fell on my desk, so it was lucky that I didn't have an epitaph so I couldn't have appeared
00:35:59 here that said, killed by a thousand J. Med. Chems.
00:36:04 But what a beautiful way to go, right?
00:36:05 Yes, that would have been a wonderful way to go.
00:36:08 But actually, getting back to the purpose of this teleconference, this is the time for
00:36:15 people to interact with you, both the people here in the audience and the people that are
00:36:21 viewing at the remote sites.
00:36:24 But while we wait for them to contact us or come forth to the microphone, you people out
00:36:30 there, I'd like to ask at least one question to start out with, and that is that in the
00:36:35 overview you mentioned a large number of computing techniques, modeling techniques,
00:36:41 and there are a lot of people out here that have problems that they want to address with
00:36:47 these techniques.
00:36:48 Now, you and I are both in the fortunate position of being able to pick our problems for, maybe
00:36:57 they're well-behaved or maybe you know a lot about them or there's a lot of experimental
00:37:01 data.
00:37:02 The situation for most people is, well, I've got this problem, what technique do I use?
00:37:09 How much effort do I invest?
00:37:12 How many people do I commit?
00:37:15 How do they answer that kind of question?
00:37:17 Well, I think that that's a good general question that I think is appropriate to really spend
00:37:27 a lot of time thinking about it and talking to an expert about whether the current theoretical
00:37:35 techniques can say anything useful.
00:37:36 Of course, the expert may not know either, but I think we're often so carried away with
00:37:42 the beauty of the color pictures and the technology is so exciting that we will just throw resources
00:37:48 at a problem which has no, we don't have enough experimental data to be able to attack usefully.
00:37:56 I mean, that's one of the differences I try to emphasize in my talk, that we can take
00:38:02 small molecules in the gas phase and do everything with Schrödinger's equation and relate beautifully
00:38:06 to experiment, but when we have complicated systems in solution, a reasonable amount of
00:38:12 experimental data is often critical to guide us to a sensible solution, to sort of make
00:38:17 our solution, to frame our solution in such a way that it can further be useful in designing
00:38:24 and carrying out other experiments.
00:38:26 So we work really hand-in-hand with experiments, but if there isn't enough experimental data
00:38:30 there, I think one just can't attack the problem, and there are unfortunately too many problems
00:38:34 that have this, you know, today that we can't attack because we don't have the data.
00:38:41 Yeah, I guess it's, you know, it's a good ad for consulting, but I think a real problem
00:38:54 is overselling a technique, in a way, because our business is to be in the, to write papers
00:39:01 and to extend the techniques.
00:39:06 I guess another way of asking the question is, are there examples where it's been picked
00:39:14 up and used very, very successfully by non-developers?
00:39:19 Yeah, well, I, again, I, you know, the people have their, all have their own anecdotes,
00:39:28 but I think that, you know, I get feedback from people who, you know, have used molecular
00:39:37 dynamics and just run many trajectories, and of course they, in some cases they've actually
00:39:43 stumbled on new conformations, and in other cases they have, in other cases they have
00:39:47 not, you know, they haven't found anything.
00:39:50 So I think people, you know, I think that there are many examples of where people have
00:39:59 used the technique, you know, used the technique in a useful way, but usually, I'd say much
00:40:06 more often, they've been successful in cases where they have worked closely with another
00:40:14 chemist who has a sense of the technology.
00:40:19 I want to remind our viewers that you can call in.
00:40:23 Please don't be shy about doing that.
00:40:25 We have two experts right here on the set who are waiting to take your questions, so
00:40:29 pick up the phone and call, and as soon as you do, we'll try to get to you.
00:40:33 So why don't you continue, gentlemen?
00:40:36 Okay.
00:40:37 But you have to realize that if you don't call in, or if you, people don't come up and
00:40:41 talk and give your point of view, then you have to sit here and listen to our questions
00:40:46 and our points of view.
00:40:48 And actually, some of the questions that we're asking now, I think, could be held for the
00:40:53 more general discussion at the end, where...
00:40:55 Well, we do have a caller right now.
00:40:57 It's Bill on line six, so Bill, please go ahead with your question.
00:41:05 And where is Bill?
00:41:06 This always happens.
00:41:07 Bill's coming through.
00:41:08 Bill's coming through.
00:41:09 He must be calling from a faraway site.
00:41:12 From Canada.
00:41:13 Yes, from Canada.
00:41:14 No, I don't know that.
00:41:17 I'll let you know when Bill comes through.
00:41:20 Bill's not ready.
00:41:21 All right.
00:41:22 This always happens, by the way, with the very first caller, so as time goes by, we'll
00:41:26 get this worked out.
00:41:27 I can amplify, you know, the way I answered your last question, Art, and that is that
00:41:32 I think molecular modeling, the birth of molecular modeling as a widely applicable
00:41:38 tool or of interest to many organic chemists really happened with the development and refinement
00:41:45 of computer graphics.
00:41:46 In other words, when molecular modeling was mainly just empirical or numerical computation,
00:41:57 it was in the hands of the experts.
00:41:58 I think it has really been graphics of which, you know, graphics techniques development
00:42:06 where you could represent the results from calculations in a way that chemists started
00:42:11 to see them and get insight into them that has led to molecular modeling being a more
00:42:16 generally used and useful tool.
00:42:19 And I don't think there's any prescription for the sort of the overselling of something.
00:42:23 I mean, it's sort of like let the buyer beware of the techniques.
00:42:28 I mean, the techniques are out there.
00:42:30 They can be gotten at an ever decreasing cost, but it's like any other area of science.
00:42:38 You have to make a decision on whether you want to do this experiment, whether you want
00:42:43 to do this simulation, and ultimately the test of whether this is successful is whether
00:42:51 you gain some insight that will guide further experiments that will be predicted.
00:42:58 I always, I usually often mention in talks that the three roles of theoretical chemistry
00:43:07 as applied to molecules are first, you would like to be able to simulate the system in
00:43:12 reasonable agreement with known experiments.
00:43:15 Secondly, you want to use those results to gain insight into the system.
00:43:21 That is, in other words, get a qualitative picture that lets you manipulate it better,
00:43:26 that lets you explain it better, that lets you make predictions even in a qualitative
00:43:29 way.
00:43:30 And thirdly, you'd like to be able to make quantitative numerical predictions on the
00:43:35 system in advance of experiment.
00:43:37 That's when the theory is fully successful.
00:43:40 And again, coming back to the theme that Art and I mentioned, the fact is that today in
00:43:48 biological systems, we maybe only can have 10% of the systems where we can have partial
00:43:54 success by these three criteria.
00:43:57 Maybe it's going to be 20%.
00:43:58 I mean, we don't know what the future will hold.
00:44:00 Yes?
00:44:01 Okay, I believe that we have a call from Memphis.
00:44:07 So what do you have to say, Memphis?
00:44:12 Are we through now?
00:44:13 Oh, good.
00:44:14 You've put me on hold before.
00:44:15 This is Bill Purcell calling, and first of all, I'd like to say hello to Peter.
00:44:21 I think you did a super job.
00:44:23 Peter, do you have any idea of any examples where one has tried to correlate HPLC binding
00:44:33 with molecular modeling?
00:44:35 That is to say, if you look at the retention times of various molecules coming through
00:44:41 a column, for example, can that be correlated with the energy of binding that one might
00:44:47 approximate through molecular modeling?
00:44:50 I don't have an answer directly, Bill, for the HPLC correlation, but I know the methods
00:44:59 I mentioned on doing free energy calculations of transfer from the gas phase to water.
00:45:05 There have been applications, I think, both by Bill Jorgensen and a group at Oxford looking
00:45:10 at the free energies of transfer now between various small molecules between a water phase
00:45:17 and another nonpolar phase.
00:45:20 So I think that if we had a sense of what the phase the HPLC is involved in, if we could
00:45:26 represent that molecularly, or the two phases, I think current free energy methods could
00:45:32 be applied in a way to deal with this problem.
00:45:38 For instance, one of the students in my lab, Steve DeBolt, is very interested in the idea
00:45:44 of how amino acids are soluble in membranes.
00:45:48 And so one would have to create the right molecular environment to represent this free
00:45:53 energy of transfer.
00:45:55 But I think our potential functions are now at the stage where we can calculate such free
00:46:00 energies of transfer between various phases.
00:46:03 We just have to define the molecular mix.
00:46:06 We're getting ever better at being able to calculate that.
00:46:09 Great.
00:46:10 Thank you very much, Peter.
00:46:13 I appreciate that.
00:46:14 And maybe we should get together sometime and talk about it privately.
00:46:20 Thank you, Bill, for calling.
00:46:21 Peter, we have a question from our audience.
00:46:23 Hi.
00:46:24 I'm Joe Major from Hybritech Incorporated here in San Diego.
00:46:27 I have a question for Dr. Kollman.
00:46:29 You mentioned a number of different molecular modeling techniques.
00:46:32 I wonder if you would care to comment on the comparative advantages, disadvantages of,
00:46:36 say, Monte Carlo simulations versus molecular dynamics.
00:46:40 When one would use one, what are the parameters that one should keep in mind in considering
00:46:45 those two?
00:46:47 Thanks very much for your question.
00:46:49 I think it's a very good question because we have, really, the world's expert on Monte
00:46:54 Carlo calculations, the world's advocate of Monte Carlo calculations, Bill Jorgensen,
00:46:58 later on our program.
00:47:01 And I think that we have done more molecular dynamics calculations, but we have Bill Jorgensen's
00:47:09 BOSS program running in our lab, and there are particular applications.
00:47:12 For instance, the derivation of parameters for liquids, where we use the Monte Carlo
00:47:23 programs because they are more effective in our mind at deriving the parameters for liquids
00:47:28 because Bill has sort of set the framework for that.
00:47:31 The advantage of molecular dynamics, though, is on multiply connected polymers, that in
00:47:36 one time step, all of the atoms are moved at once, whereas in Monte Carlo methods, you
00:47:42 move essentially one molecule at a time.
00:47:46 And the disadvantage of applying Monte Carlo methods to polymers is that, for instance,
00:47:51 if you rotate one bond and look whether that's an allowed energy, the whole system will move
00:47:56 a lot and you'll get a lot of disallowed moves.
00:48:00 So it's, in principle, relatively inefficient for polymers, although Nobuhiro Go of Japan
00:48:07 has developed a Monte Carlo method where he looks at the normal modes of vibration of
00:48:11 the polymer and moves, does his Monte Carlo moves in normal mode space rather than in
00:48:17 Cartesian space.
00:48:19 And he claims that he gets good efficiency for movement of a peptide chain that way.
00:48:25 So I think that it's like anything else, there'll be particular problems where the
00:48:29 dynamics is more useful.
00:48:31 It's a more general theory in that you can get dynamical properties, but there will be
00:48:34 cases where Monte Carlo is more efficient and more effective.
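The one-at-a-time character of Monte Carlo moves mentioned above can be sketched with a bare Metropolis sweep. This is a minimal illustration under stated assumptions: the toy energy function (independent particles on springs), the step size, and all names are hypothetical, and this is not how the BOSS program is implemented. Each trial move displaces a single particle and is accepted or rejected from the energy change, whereas a molecular dynamics step would move every atom at once.

```python
import math
import random

def metropolis_sweep(positions, energy_fn, kT, max_step, rng):
    """Attempt one trial move per particle; return the number accepted."""
    accepted = 0
    for i in range(len(positions)):
        old = positions[i]
        e_old = energy_fn(positions)
        # Trial move: displace only particle i.
        positions[i] = old + rng.uniform(-max_step, max_step)
        delta = energy_fn(positions) - e_old
        # Metropolis criterion: always accept downhill moves, accept uphill
        # moves with Boltzmann probability exp(-delta / kT).
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            accepted += 1
        else:
            positions[i] = old  # reject: restore the old position
    return accepted

def spring_energy(x, k=1.0):
    """Toy energy (hypothetical): each particle tethered to the origin."""
    return 0.5 * k * sum(xi * xi for xi in x)

rng = random.Random(0)
x = [0.0] * 10
samples = []
for sweep in range(5000):
    metropolis_sweep(x, spring_energy, kT=1.0, max_step=1.0, rng=rng)
    if sweep >= 500:  # discard early sweeps as equilibration
        samples.extend(x)

# For a harmonic well the Boltzmann average of x^2 should approach kT / k.
mean_sq = sum(v * v for v in samples) / len(samples)
```

For a connected polymer the same loop becomes inefficient for the reason given in the answer: a single bond rotation drags much of the chain with it, so most trial moves produce large energy increases and are rejected.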
00:48:39 Thank you.
00:48:40 We have Robert from Columbia on line five.
00:48:43 Robert?
00:48:44 This is University of Missouri.
00:48:46 I'd like to ask whether it's possible to relate the three-dimensional structure of a peptide
00:48:53 to binding to, of course, the molecular model of a receptor or large molecular protein.
00:49:01 One case in point might be insulin binding its receptor, and if you can do that, then
00:49:06 can you project altered insulins or other small peptides with its receptor or enzyme?
00:49:15 Thank you.
00:49:18 Thanks very much.
00:49:19 That's a very good question, and I think it gives me the opportunity to make a point that
00:49:24 I didn't make in my talk, and that is I think the molecular modeling methods can be more
00:49:29 powerful if you actually have the structure of the receptor.
00:49:34 For instance, we have the structure of insulin, the three-dimensional x-ray structure.
00:49:37 We can do a lot of manipulations on it very accurately and effectively.
00:49:41 We don't have the structure of the insulin receptor.
00:49:46 Also, another problem with what you're suggesting is that it's often more difficult to determine
00:49:52 structures of small peptides than it is for larger proteins because these peptides are
00:49:56 floppy.
00:49:57 They don't actually have a well-defined conformation.
00:50:00 I'm answering all the aspects of your question.
00:50:03 It had many facets.
00:50:06 One of the features of my talk was this ensemble distance geometry and other techniques when
00:50:10 you don't have the structure of the receptor.
00:50:13 For instance, for insulin, one could look at the insulin molecule itself and try to
00:50:19 think about the structure of the receptor around insulin, but it's a difficult problem.
00:50:28 All of these are difficult problems.
00:50:30 They become more difficult if you don't have the three-dimensional structure of at least
00:50:35 one of the molecules or perhaps both of the molecules that you're trying to model.
00:50:39 All right.
00:50:40 We have Charles on Line 8 in Minneapolis.
00:50:43 Charles, go ahead, please.
00:50:44 Thank you.
00:50:45 I'd be interested in hearing your thoughts regarding what we can expect in terms of new
00:50:51 developments in methodology and new developments in hardware and how these are going to affect
00:50:59 the limitations on the kinds of biological systems that we can actually look at in the
00:51:04 future.
00:51:12 This is a dangerous prediction.
00:51:13 I mean, as I said at the end of my talk, it's dangerous to make these predictions.
00:51:17 But I think that if you asked most chemists, computational chemists, they would agree that
00:51:24 if we can get to the stage of harnessing the massive parallelism that we see in some computers
00:51:30 where each processor is very ... There are not only millions of processors, but each
00:51:37 is very intelligent, can communicate.
00:51:39 You can write high-level languages to use them.
00:51:42 We would see a major increase in the size of molecules where we could essentially solve
00:51:49 the local minimum problem.
00:51:51 And this would actually really stimulate, I think, more and bring out to the fore inaccuracies
00:51:58 in the energy function that we have.
00:52:02 That's a second area of development, not only the increased computer power, but improving
00:52:06 the energy functions.
00:52:07 And there are many groups in the world working on improving the parameters and the actual
00:52:12 functional form to use to represent molecular systems.
00:52:17 I see those two as being important technical developments that are going to happen in the
00:52:24 next years.
00:52:25 And they will ... I also think that we will, as we increase our database of protein structures
00:52:31 and protein structural motifs, and get more clever about using homology to build new protein
00:52:38 structures, we will get ... We will make progress, and I see over the next 10 years a terrific
00:52:44 progress in coming closer to not necessarily solving the protein folding problem, but having
00:52:50 a lot more success as the example that Pearl and Taylor showed, that I showed of Pearl
00:52:56 and Taylor having a lot more success in deriving three-dimensional structures of protein from
00:53:03 known homologous protein.
00:53:04 I have to jump in here.
00:53:06 It's time to move on.
00:53:07 Peter, thank you.
00:53:08 We will see Peter back in about an hour when he talks about his newest applications.
00:53:13 Thank you.
00:53:16 We go now to William Jorgensen's talk on Structure and Binding in Bio-Organic Host-Guest Chemistry.
00:53:22 Art?
00:53:23 Bill's work branches, or bridges, between organic chemistry and molecular biology.
00:53:29 His contributions to the understanding of solvent effects have had wide impact, and
00:53:33 his recent work on hydrogen bonding promises to address a number of biological questions.
00:53:39 Bill?
00:53:40 Thank you, Art.
00:53:42 The theme of my talk today will be Understanding Intermolecular Interactions in Solution.
00:53:49 This is critically important for both controlling molecular recognition and the design of selective
00:53:54 hosts.
00:53:55 In view of the time limitations, I have chosen to focus on hydrogen bonding.
00:53:59 The key issue will be the competition of inter-solute and solute-solvent interactions.
00:54:04 Three examples will be considered.
00:54:06 The first is a simple one, the dimerization of N-methylacetamide, NMA, in chloroform
00:54:12 and water.
00:54:13 We will then move on to a more elegant organic host, one of Rebek's molecular clefts, which
00:54:18 can bind nitrogen heterocycles.
00:54:21 And finally, we will consider interactions of nucleotide bases in chloroform and some
00:54:25 striking variations in binding for triply hydrogen-bonded systems.
00:54:31 The methodology that is involved in all of these studies features Monte Carlo statistical
00:54:37 mechanics simulations using the BOSS program.
00:54:41 In each case, dilute solutions are being simulated.
00:54:44 The systems consist of the solutes plus about 300 solvent molecules in a box with periodic
00:54:50 boundary conditions.
00:54:52 The calculations were run at constant temperature, 25 degrees, and pressure of one atmosphere.
00:54:58 The binding thermodynamics are probed with statistical perturbation theory as described
00:55:02 by Professor Kollman.
00:55:04 Both absolute and relative free energies of binding can be obtained.
00:55:08 In figure one, there is a representation of the NMA dimer in chloroform from the simulations.
00:55:15 The OPLS potential functions that are used here employ a united atom model for CHn groups.
00:55:21 So the methyl hydrogens in NMA and the hydrogen in chloroform are implicit.
00:55:26 All other atoms are explicitly represented and interact intermolecularly via Coulomb
00:55:32 plus Lennard-Jones terms.
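The Coulomb plus Lennard-Jones interaction between two sites can be written out as a short function. This is a sketch only: the function name and the parameter values in the example are placeholders chosen for illustration, not actual OPLS parameters, and the conversion constant assumes charges in units of the electron charge, distances in angstroms, and energies in kcal/mol.

```python
import math

# Converts q_i * q_j / r (electron charges, angstroms) to kcal/mol.
COULOMB_CONST = 332.06

def pair_energy(r, qi, qj, sigma, epsilon):
    """Coulomb plus 12-6 Lennard-Jones energy for one pair of sites."""
    coulomb = COULOMB_CONST * qi * qj / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones

# Placeholder parameters (hypothetical, not taken from OPLS).
sigma, epsilon = 3.0, 0.2
r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the Lennard-Jones minimum
e_at_sigma = pair_energy(sigma, 0.0, 0.0, sigma, epsilon)
e_at_min = pair_energy(r_min, 0.0, 0.0, sigma, epsilon)
e_charged = pair_energy(3.0, -0.5, 0.5, sigma, epsilon)
```

The total intermolecular energy of a configuration is then the sum of this term over all pairs of sites on different molecules.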
00:55:34 For this example, we have computed a free energy profile or potential of mean force
00:55:39 for separating the two NMA molecules as a function of a carbonyl O to amide N distance.
00:55:46 What we find in chloroform at short separations, about three angstroms, is that the amides
00:55:52 are indeed hydrogen-bonded as illustrated.
00:55:55 Of course, there is variation in the structure during the simulation owing to the thermal
00:56:00 configurational averaging that yields the proper Boltzmann weighted results.
00:56:05 However, in turning to water, we find that the amides do not hydrogen-bond at short separations.
00:56:13 What one gets is a preference for stacked arrangements that feature reasonable dipole-dipole
00:56:18 alignment for the amides.
00:56:20 The inter-solute interaction is weaker here at about minus three and a half kilocalories
00:56:25 per mole versus minus seven and a half kilocalories per mole for the hydrogen-bonded arrangement
00:56:31 in figure one.
00:56:33 However, the edges of the amides are now exposed for maximal hydrogen bonding with the solvent.
00:56:40 Both carbonyl oxygens are participating in about two hydrogen bonds with water, and the
00:56:45 NH groups act as a donor of another hydrogen bond each.
00:56:49 It is important to realize that in the second figure, just one snapshot of millions from
00:56:56 the simulations was shown, though it has been chosen as typical.
00:57:00 Nevertheless, these pictures provide no sense of whether the illustrated arrangements are
00:57:05 in fact stabilized relative to infinite separation of the amides and the solvents.
00:57:12 That information is clearly provided by the potentials of mean force, which are shown
00:57:16 in the next illustration.
00:57:18 In two series of simulations, the amides were gradually perturbed apart, and the free
00:57:23 energy changes were accumulated.
00:57:25 In chloroform, there is indeed a substantial hydrogen bonding well with a maximum depth
00:57:31 of minus three and a half kilocalories per mole at a separation of about 2.9 angstroms.
00:57:38 It might be noted that with these OPLS potentials, the optimal NMA-NMA interaction in the gas
00:57:45 phase is minus nine kilocalories per mole.
00:57:48 So there is substantial damping of that figure via the configurational averaging under ambient
00:57:54 conditions and in view of the solvent competition.
00:57:58 On the other hand, in water, we find that the amides exhibit no net attraction.
00:58:03 Their approach is innocuous, up to about 3.5 angstroms, and then it becomes repulsive at
00:58:09 shorter distances.
00:58:11 The 9 kcal per mole optimal interaction is entirely wiped out by the thermal and solvent
00:58:16 effects.
00:58:17 It is clearly very difficult to predict this a priori.
00:58:22 Appropriate experiments or computations are required.
00:58:25 But how do we know that the computed results have any validity?
00:58:29 A comparison between theory and experiment is desired.
00:58:33 It can be provided by computing an association constant, Ka, from the potentials of mean
00:58:38 force.
00:58:40 The equation shown in the figure gives the desired relationship that has been known since
00:58:46 the 1920s during the development of Debye-Hückel theory and Bjerrum's modifications to it.
00:58:52 Integration over the orientationally averaged potential of mean force is performed to a
00:58:57 cutoff distance C that defines the geometric limit to association.
00:59:02 The results are given as a function of C in the next illustration.
00:59:07 Fortunately, the dependence on C is not strong, so reasonable choices that encompass the free
00:59:13 energy well give a computed Ka of about 4 liters per mole for NMA in chloroform.
00:59:20 There is a corresponding value available from IR measurements.
00:59:24 It is about 3 liters per mole.
00:59:27 So theory and experiment are in good accord.
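The equation on the slide is not reproduced in the transcript. A commonly quoted form of the relationship, assumed here, is Ka = 4 pi N_A times the integral from 0 to c of r^2 exp(-w(r)/kT) dr, where w(r) is the orientationally averaged potential of mean force and c is the cutoff. The sketch below integrates that expression numerically; the toy potential of mean force is invented for illustration and is not the computed NMA profile.

```python
import math

AVOGADRO = 6.02214076e23      # molecules per mole
R_KCAL = 0.0019872041         # gas constant in kcal/(mol*K)
A3_TO_LITER = 1.0e-27         # cubic angstroms to liters

def association_constant(pmf, cutoff, temperature=298.15, n_steps=2000):
    """Ka in liters per mole from a PMF w(r) in kcal/mol, r in angstroms."""
    kT = R_KCAL * temperature
    h = cutoff / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = i * h
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid rule
        total += weight * r * r * math.exp(-pmf(r) / kT)
    return 4.0 * math.pi * total * h * A3_TO_LITER * AVOGADRO

def toy_pmf(r):
    """Invented profile: hard repulsion plus a well near 2.9 angstroms."""
    if r < 2.5:
        return 50.0
    return -3.5 * math.exp(-((r - 2.9) / 0.3) ** 2)

ka_flat = association_constant(lambda r: 0.0, cutoff=5.0)
ka_well = association_constant(toy_pmf, cutoff=5.0)
ka_well_wider = association_constant(toy_pmf, cutoff=6.0)
```

With a flat profile the integral reduces to the volume of a sphere of radius c, which gives a quick check on the unit conversion. Comparing `ka_well` with `ka_well_wider` shows the point made in the talk: once the cutoff encloses the free energy well, extending it adds comparatively little.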
00:59:30 In water, the classic experiments of Klotz and Franzen yielded a Ka for NMA of .005 liters
00:59:37 per mole.
00:59:39 There is much difficulty in measuring such small values.
00:59:42 However, the qualitative conclusion is clearly consistent with the present computed results.
00:59:49 Amides exhibit negligible attraction in water.
00:59:52 Quantitatively, the appropriate choice for the cutoff C is not obvious.
00:59:56 In this case, the values of 4 to 5 angstroms give Ka values of about a tenth of a liter per mole.
01:00:02 Thus, the overall comparisons provide confidence in the computed potentials of mean force.
01:00:08 This example also clearly illustrates the difficulties in just considering gas phase
01:00:13 interactions and then coming to any conclusions about binding in solution.
01:00:18 The effect of solvent needs to be evaluated in molecular detail.
01:00:24 Moving to the next system, we see an organic host that has been prepared by Rebek and co-workers.
01:00:31 The molecule features a cleft with the two acid groups and the lone pair on the acridine
01:00:36 nitrogen directed inward.
01:00:39 This is nicely reminiscent of the binding clefts that are usually present in the active
01:00:43 site region of enzymes.
01:00:45 Thus, Rebek's design has clear advantages for potential molecular recognition and catalysis
01:00:51 over some alternative bioorganic host molecules, such as cyclodextrins, where the functionalization
01:00:58 must occur on the rim of the cup.
01:01:01 Rebek has studied binding of various nitrogen heterocycles to this host, including pyridine,
01:01:07 pyrimidine, and pyrazine.
01:01:10 The cleft has been proposed to be ideal for binding pyrazine in the two-point fashion
01:01:16 that is illustrated,
01:01:18 with the guest nestled into the cleft and anchored by the two hydrogen bonds.
01:01:23 The question is, how can one tell if this is really happening?
01:01:28 What has been measured by Rebek is the binding data shown in the next illustration obtained
01:01:34 by NMR titrations.
01:01:36 In particular, we see that pyrazine is well bound with a Ka of 1400 liters per mole in
01:01:42 deuterochloroform and that pyridine, which could only participate in one hydrogen bond
01:01:47 with the host, has a smaller Ka of 120 liters per mole.
01:01:52 The factor of 12 enhancement translates to a free energy preference of 1.45 kcals per
01:01:59 mole for binding pyrazine.
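The conversion from a ratio of association constants to a free energy difference follows from delta-delta-G = -RT ln(Ka1/Ka2). The short check below reproduces the figure quoted in the talk, assuming a temperature of 25 degrees Celsius; the transcript does not state the temperature of the NMR titrations, so that value is an assumption, and the function name is illustrative.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def relative_binding_free_energy(ka_strong, ka_weak, temperature=298.15):
    """Free energy preference (kcal/mol) for the more strongly bound guest."""
    return -R_KCAL * temperature * math.log(ka_strong / ka_weak)

# Association constants quoted in the talk, in liters per mole.
ratio = 1400.0 / 120.0
ddg = relative_binding_free_energy(1400.0, 120.0)
```

The negative sign indicates that pyrazine is the more favorably bound guest; the magnitude matches the roughly 1.45 kcal/mol quoted in the talk.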
01:02:01 This is qualitatively reasonable; the more refined question is whether 1.45 kcals per mole is
01:02:08 quantitatively reasonable for the gain of the second hydrogen bond.
01:02:13 One should realize