Molecular Modeling for Biological Systems (Supercomputer Teleconference) Part 1

  • 1990-Jan-24

These captions and transcript were generated by a computer and may contain errors. If there are significant errors that should be corrected, please let us know by emailing digital@sciencehistory.org.

Transcript

00:00:00 This program was made possible by support from Digital Equipment Corporation.

00:00:07 Additional support was provided by Tripos Associates.

00:00:30 Good morning, ladies and gentlemen, and welcome to another American Chemical Society satellite

00:00:47 TV course.

00:00:48 I am Sylvia Ware, Director of the ACS Education Division.

00:00:52 We are pleased to co-sponsor today's course in Molecular Modeling with San Diego State

00:00:57 University and the San Diego Supercomputer Center.

00:01:01 One of the primary objectives of the society is to bring up-to-date information about chemistry

00:01:07 to our members and other interested scientists.

00:01:10 In over 100 years, ACS has accomplished this objective in many ways.

00:01:15 ACS publishes numerous books and journals, conducts many meetings, and produces a variety

00:01:20 of short courses, audio, and videotape programs.

00:01:24 The satellite television courses are our newest means of communicating timely educational

00:01:29 material directly to you.

00:01:32 The capability to receive satellite programs is still in its infancy in the chemical community.

00:01:38 Receiving such programs is common in many universities and colleges.

00:01:42 However, most chemical companies do not presently own dishes to receive programs at their research

00:01:48 labs.

00:01:49 We hope to see this situation change rapidly during the next few years, so that tuning

00:01:54 in to chemical education via television will become just as easy as turning on the evening

00:02:00 news at home, and it will become commonplace to hear late-breaking scientific information

00:02:06 directly from the scientists doing the work.

00:02:10 Thank you for joining us in this program today and helping us make satellite communications

00:02:15 a reality for the American Chemical Society.

00:02:19 San Diego Supercomputer Center, San Diego State University, and the American Chemical

00:02:26 Society present Molecular Modeling for Biological Systems.

00:02:33 Welcome to Molecular Modeling for Biological Systems.

00:02:37 I'm your moderator, Whitney Mandel.

00:02:40 San Diego State University and San Diego Supercomputer Center have worked together on previous supercomputer

00:02:46 video conferences.

00:02:48 We are pleased by the American Chemical Society's sponsorship of today's program.

00:02:54 Supercomputers are simply computers.

00:02:56 They are qualitatively the same as standard computers, with the difference of being quantitatively

00:03:02 faster, so they are defined as leading-edge technology.

00:03:07 What is interesting is that this faster speed allows for a qualitative change in the type

00:03:13 of problems that can be solved.

00:03:15 In today's program, we will be looking at applications that can tax this technology.

00:03:21 Any researcher can gain access to the National Science Foundation-sponsored supercomputer

00:03:27 centers across the U.S.

00:03:29 The San Diego Supercomputer Center welcomes proposals from all researchers.

00:03:35 Researchers and industry are invited to become sponsors and use the resources here in San

00:03:40 Diego.

00:03:41 And here is Sid Karin, director of the San Diego Supercomputer Center.

00:03:47 Hello, and welcome to the San Diego Supercomputer Center.

00:03:53 Biochemists and molecular biologists have made extensive use of our supercomputer facility.

00:03:58 They are now able to calculate and visualize the solution of problems that just a few short

00:04:02 years ago they only dreamed of solving.

00:04:06 As the technical advisors for this teleconference, the chemistry staff at SDSC has assembled

00:04:11 a group of leading computational experts in this field.

00:04:15 Together with you, today they will work to push ahead the frontiers of computational

00:04:20 biochemistry.

00:04:22 I know that you'll feel the same excitement that I feel after seeing the results that

00:04:26 are now achievable.

00:04:30 Thank you, Sid Karin.

00:04:31 I want to take this opportunity to welcome the participants at over 50 sites throughout

00:04:36 the United States and Canada.

00:04:38 And to introduce Dr. Arthur Olson. Art is a member of the Research Institute of Scripps

00:04:44 Clinic here in La Jolla and director of its Molecular Graphics Laboratory.

00:04:49 His work is in molecular graphics and biomolecular interactions.

00:04:54 Art will provide a user's perspective on today's talks and facilitate the discussions during

00:04:59 the question and answer periods.

00:05:01 Art?

00:05:02 Hello, Whitney.

00:05:03 Hello.

00:05:04 These are exciting times for the field of biomolecular modeling since two rapidly evolving

00:05:14 technologies are converging in this area.

00:05:18 On the one hand, we have the advances in modern molecular biology, which are producing materials

00:05:24 and raising questions that weren't even thought about 10 years ago.

00:05:28 On the other hand, we have the evolving power of computers, which are really bringing the

00:05:35 capability to answer questions of the complexity raised by molecular biology to a wide variety

00:05:42 of scientists.

00:05:43 And I'm sure that the excitement that I feel will be echoed by our speakers today.

00:05:47 Thanks, Art.

00:05:48 There are a couple of things we need to cover before we hear from our speakers.

00:05:53 First, I want to point out that each of you should have received a copy of the course

00:05:58 notes prepared by our speakers.

00:06:01 Complete transcripts of the speakers' presentations and copies of their diagrams will be made

00:06:06 available to your site coordinator soon.

00:06:09 We are providing these transcripts to the local site coordinators as part of your registration

00:06:14 in this course.

00:06:15 Also, each of you should have received an evaluation form.

00:06:20 Please complete the evaluation and return it to the site coordinator when you leave.

00:06:24 We need your feedback to design programs that fit your needs and interests.

00:06:29 In our program today, we will hear from four speakers.

00:06:33 Dr. Peter Kollman will begin with methods in molecular modeling, an overview of types

00:06:39 of problems being addressed.

00:06:41 Next, we'll hear from Dr. William Jorgensen on structure and binding in bio-organic host-guest

00:06:48 chemistry.

00:06:49 Dr. Jeffrey Blaney is our next speaker.

00:06:52 He'll be covering a distance geometry approach to ligand macromolecular docking.

00:06:59 Then Peter Kollman joins us again with a presentation of current simulations using molecular dynamics

00:07:05 and free energy perturbation applications to study large molecules.

00:07:10 And Dr. David Case will be our final speaker with a presentation on determining three-dimensional

00:07:16 solution structures of proteins from NMR data.

00:07:20 We will have question and answer periods after each of the presentations with a full panel

00:07:25 discussion at the end of the program.

00:07:27 I will give you the numbers to call now.

00:07:30 In California, the number is 1-800-942-1515.

00:07:36 In the U.S., the number is 1-800-972-1515.

00:07:41 And if you are calling from Canada, the number to call is 1-619-265-6429.

00:07:49 Our Canadian callers can call collect.

00:07:52 As we approach our question and answer periods, we will let you know that we're getting ready

00:07:57 for your calls.

00:07:58 Well, Art, we're ready to hear from Peter Kollman.

00:08:02 Even though Peter doesn't look old enough, he's certainly considered one of the founders

00:08:06 in the field of biomolecular modeling, and rightly so since the AMBER program comes out

00:08:13 of his lab and it's used now widely throughout the world.

00:08:17 In addition, Peter's application work in modeling DNA dynamics and ionophores is cited

00:08:23 throughout the literature.

00:08:26 I'm looking forward to Peter's comments.

00:08:32 Thank you, Art.

00:08:33 Perhaps the reason that I look young is because of all the gray hairs I've torn out while

00:08:39 we were developing the AMBER program.

00:08:43 My goal today is to give an overview of molecular modeling methods and a selected set of applications.

00:08:49 The focus of my remarks will be on organic and biological systems, but there are many

00:08:55 areas of modeling involving zeolites, metals, and polymers which I will not attempt to cover.

00:09:01 I will attempt to keep my remarks at a level that any well-educated chemist can understand

00:09:07 without boring the experts.

00:09:10 The molecular modeling methods that I will discuss in cursory detail because of time

00:09:14 limitation are listed in the first slide.

00:09:19 They include quantum mechanical calculations, molecular mechanics and dynamics, Monte Carlo

00:09:26 calculations, computer graphics, distance geometry, and pharmacophoric pattern matching

00:09:32 and QSAR.

00:09:34 The selected applications include some that, in my biased view, have been among

00:09:41 the most important and significant in the last decade.

00:09:46 These include calculating the free energies of activation of chemical reactions in solution

00:09:53 and in enzyme-active sites, calculating the free energies of non-covalent associations

00:09:59 of ligands to macromolecules, predictions of protein tertiary structures using homology

00:10:05 model building, the use of distance geometry in docking and drug design, and the use of

00:10:11 distance geometry or molecular dynamics in macromolecular structure determination.

00:10:18 We now proceed to describe the molecular modeling methods with the applications interspersed

00:10:24 in the discussion.

00:10:27 The goal of molecular modeling is to simulate real chemistry.

00:10:32 If the molecule to be studied is small and in the gas phase, one can apply quantum mechanical

00:10:37 methods.

00:10:39 The Schrödinger equation, shown in the third slide, is a many-particle differential equation

00:10:44 that cannot be solved analytically for more than one particle.

00:10:49 But within the non-relativistic Born-Oppenheimer approximation, in which one neglects relativistic

00:10:54 effects and assumes the electrons move in the field of fixed nuclei, one can approximate

00:11:00 the solution to the Schrödinger equation numerically.

00:11:04 There are three major classes of quantum mechanical calculations applied to molecules of chemical

00:11:09 and biological interest.

00:11:11 The first, the ab initio approach, makes no further approximations in solving the Schrödinger

00:11:16 equation.

00:11:17 In its implementation, it usually assumes that one can represent the wave function as

00:11:21 a linear combination of Slater determinants of one-electron functions, each of which is

00:11:26 a linear combination of atomic orbitals.

00:11:29 Using a single determinant for the wave function and solving for the coefficients in these

00:11:32 linear combinations is called the Hartree-Fock LCAO-MO (linear combination of atomic orbitals,

00:11:38 molecular orbital) approximation.

00:11:41 One can go beyond this single determinant approximation in various ways, including configuration

00:11:46 interaction and perturbation theory, for example, Møller-Plesset approaches.

00:11:51 Currently, such ab initio approaches can be applied to a wide variety of organic molecules.

00:11:57 However, for the non-expert, one should remember that one ab initio calculation can differ

00:12:02 dramatically from another in rigor and computer time.

00:12:06 For example, the results of a calculation in which one uses a minimal set of atomic orbitals

00:12:11 to represent the molecular orbitals in the LCAO-MO approximation can differ qualitatively

00:12:17 and quantitatively from those in which one uses a much larger set of atomic orbitals

00:12:23 and goes beyond the single determinant approximation.

00:12:27 Of course, the accuracy one needs to connect the calculations to real chemistry

00:12:33 depends on what one wishes to simulate, but for molecules of a few atoms in the first

00:12:38 rows of the periodic table, one can simulate most of their properties to experimental accuracy.

00:12:44 The size of the molecule for which this can be done is a very sensitive function of available

00:12:48 computer power.

00:12:51 The development of semi-empirical quantum mechanical methods in which one approximates

00:12:55 the solutions to the Schrödinger equation using semi-empirical adjustments to the Hamiltonian

00:13:00 has been catalyzed by the limitations in the size of the system to which ab initio methods

00:13:04 can apply.

00:13:06 Currently, the most highly developed of such approaches are those that originated from

00:13:10 the Dewar group in Texas (MINDO/3, MNDO, AM1, and PM3), whose focus is on molecules of organic

00:13:16 chemical interest, and ZINDO from Zerner in Florida, which can handle transition metal

00:13:21 systems.

00:13:23 Of course, there are other methods that are useful in some applications, such as X-alpha

00:13:26 approaches, which Case has helped develop for biological systems; extended

00:13:31 Hückel theory, applied in a useful qualitative way to organic and inorganic systems by Hoffmann

00:13:36 of Cornell; and an assortment of others that have a particular niche in chemistry.

00:13:40 Finally, there are valence bond approaches, and the EVB (empirical valence bond) approach

00:13:45 developed by the Warshel group has had many useful applications to chemical and biological

00:13:50 molecules.

00:13:52 Valence bond theory starts out with a different approach than molecular orbital theory in

00:13:55 the way the molecular wave function is constructed.

00:13:58 It has the virtue, when used in an empirical fashion as Warshel has done, of allowing a computer-efficient

00:14:03 approximate solution to the Schrödinger equation.

00:14:07 The above methods solve for the energy of a collection of nuclei and electrons as a function

00:14:11 of the nuclear coordinates, and so can, in principle, determine the complete potential

00:14:15 surface of the molecule as a function of nuclear coordinates.

00:14:20 Methods to directly calculate the first and second derivatives of the quantum mechanical

00:14:23 energy as a function of nuclear coordinates have considerably increased the power of such

00:14:27 methods.

00:14:28 However, when one wants to consider the properties of condensed phases and the properties of

00:14:33 macromolecules, it is not appropriate or possible to use quantum mechanical methods alone.

00:14:39 For example, one can simulate liquid water using an analytical function that has been

00:14:43 developed by carrying out a large number of calculations on water dimer in various configurations

00:14:49 in order to determine the water-water interaction potential.

00:14:53 Liquid properties must be derived by taking an average of a very large number of thermally

00:14:57 accessible configurations, the energy of which must be evaluated.

00:15:02 A typical liquid-water simulation uses 216 molecules and periodic boundary conditions

00:15:06 to represent the system.

00:15:08 One must then evaluate the energy of this 648-atom system and do so a million or so

00:15:13 times.

00:15:14 This clearly requires an energy function that is simple.
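
To put rough numbers on the cost just described, here is a small back-of-the-envelope sketch. The molecule count and the "million or so" evaluations come from the talk; counting every intermolecular atom pair with no distance cutoff is an assumption made only for illustration, and real simulations usually truncate the sum.

```python
# Rough cost estimate for the liquid-water simulation described above.
# Assumption (not from the talk): every intermolecular atom pair is
# evaluated in each energy calculation, with no distance cutoff.

N_MOLECULES = 216             # water molecules in the periodic box
ATOMS_PER_MOLECULE = 3        # O, H, H
N_CONFIGURATIONS = 1_000_000  # "a million or so" energy evaluations


def count_atoms(n_molecules: int, atoms_per_molecule: int) -> int:
    """Total number of atoms in the box."""
    return n_molecules * atoms_per_molecule


def count_intermolecular_pairs(n_molecules: int, atoms_per_molecule: int) -> int:
    """Atom-atom pairs between different molecules."""
    molecule_pairs = n_molecules * (n_molecules - 1) // 2
    return molecule_pairs * atoms_per_molecule ** 2


n_atoms = count_atoms(N_MOLECULES, ATOMS_PER_MOLECULE)
pairs_per_energy = count_intermolecular_pairs(N_MOLECULES, ATOMS_PER_MOLECULE)
total_pair_terms = pairs_per_energy * N_CONFIGURATIONS
print(n_atoms, pairs_per_energy, total_pair_terms)
```

Under these assumptions each energy evaluation involves roughly two hundred thousand pair terms, which is why each term has to be cheap.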

00:15:18 One can use either Monte Carlo or molecular dynamics methods to generate the configurations

00:15:22 of the system.

00:15:24 Monte Carlo methods typically move one molecule at a time and by comparing the new configuration

00:15:28 to the old either accept it or reject it based on the Boltzmann factor for the relative energies.
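
The accept-or-reject step can be sketched as the standard Metropolis test. This is a minimal illustration, not the code used in any of the studies mentioned: the one-dimensional harmonic "system", the reduced units, and the names `metropolis_accept` and `sample_harmonic` are all hypothetical.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Accept a trial move based on the Boltzmann factor of the energy change."""
    delta = e_new - e_old
    if delta <= 0.0:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-delta / kT)


def sample_harmonic(n_steps, kT=1.0, max_move=1.0, seed=0):
    """Average potential energy of one particle in E(x) = x^2 / 2 (toy system)."""
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * x * x
    total = 0.0
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-max_move, max_move)  # move one coordinate
        e_trial = 0.5 * x_trial * x_trial
        if metropolis_accept(energy, e_trial, kT, rng):
            x, energy = x_trial, e_trial
        total += energy  # a rejected move counts the old configuration again
    return total / n_steps


print(sample_harmonic(200_000))  # should be close to kT / 2 for this toy system
```

For a liquid, the same test would be applied after displacing or rotating one molecule, with the energy change computed from the intermolecular potential.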

00:15:34 Molecular dynamics methods evaluate the energy and its analytical derivatives and move all

00:15:39 the atoms at once using Newton's laws of motion.

00:15:42 Thus, each atom is characterized by both its position and velocity.

00:15:45 It is the kinetic energy of the atoms that allows them to sample various configurations

00:15:50 of the system.

00:15:52 As noted above, one can use quantum mechanical calculations to derive the analytical potentials

00:15:56 which must be used in Monte Carlo or molecular dynamics calculations on liquids.

00:16:01 However, these have a number of disadvantages.

00:16:03 (a) When they are based on two-body interactions, they leave out many-body interaction effects,

00:16:08 which are critical to the quantitative representation of the properties of polar liquids like water.

00:16:14 (b) They require large amounts of computer time to derive,

00:16:17 and this time goes up even more if many-body effects are derived in this fashion.

00:16:22 Thus, the currently most useful approaches to deriving energy functions for liquid simulations

00:16:27 have been empirical.

00:16:29 Both TIPS and SPC models for liquid water were based on simulations that varied the

00:16:33 parameters to force the calculated density and enthalpy of vaporization of the

00:16:37 liquid to agree with experiment.

00:16:39 These achieved this agreement by being effective two-body potentials with many-body effects

00:16:43 implicitly built in.

00:16:45 For example, the dipole moment for such water molecules is 2.3 to 2.4 Debye in contrast

00:16:50 to the gas phase value of 1.85 Debye.

00:16:54 In the same spirit of empiricism, one can derive functions that represent all the intra-

00:16:58 and intermolecular interactions of molecules.

00:17:01 And these are called molecular mechanical potential functions.

00:17:05 In the next slide, we describe the parameters in such a function.

00:17:10 They often come from a combination of quantum mechanical and empirical data.

00:17:15 Typically, they are derived from a test set of molecules and then assumed to be transferable

00:17:19 to a wide variety of others.
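
The slide is not reproduced in this transcript. As an assumption, a functional form of the kind commonly published for AMBER-style force fields is written out below; the symbols are illustrative and may not match the slide.

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} K_r \,(r - r_{eq})^2
  + \sum_{\text{angles}} K_\theta \,(\theta - \theta_{eq})^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left[ \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}
                      + \frac{q_i q_j}{\varepsilon R_{ij}} \right]
```

Here the force constants $K_r$, $K_\theta$, $V_n$, the equilibrium values $r_{eq}$, $\theta_{eq}$, the phase $\gamma$, the van der Waals coefficients $A_{ij}$, $B_{ij}$, and the partial charges $q_i$ are the parameters that would be fit to a test set of molecules and then transferred.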

00:17:21 The reason these methods work at all is that, for example, a C-C bond is about 1.5 angstroms

00:17:27 in most molecules, and the deviations from this are small and can be analyzed by strain

00:17:31 effects.

00:17:33 One can derive similar functions for peptides, proteins, nucleic acids, and other macromolecules

00:17:39 and use them in molecular mechanics and molecular dynamics calculations.

00:17:43 The most important sources of experimental data used to derive the

00:17:48 functions for such calculations include vibrational frequencies from IR and

00:17:52 Raman data; bond lengths, angles, and dihedral angles from microwave and X-ray structural

00:17:57 data; rotational barrier heights from microwave spectroscopy; and densities, structures, and enthalpies

00:18:02 of vaporization from liquids and crystals.

00:18:05 Arguably, the most important parts of these energy functions are the non-bonded interactions,

00:18:09 and these have been derived using crystal lattice simulations (Lifson, Hagler, and others)

00:18:14 or liquid simulations using Monte Carlo methods (the OPLS parameters derived by Jorgensen).

00:18:21 What is the difference between molecular mechanics and molecular dynamics?

00:18:25 Next two slides.

00:18:26 They both use an energy function E and its analytical gradient, where the force is the

00:18:31 negative gradient of the energy.

00:18:34 Molecular mechanics minimizes this function by moving down in potential energy to the

00:18:38 nearest local minimum.
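
The idea of moving downhill to the nearest local minimum can be sketched with plain steepest descent. This is only an illustration: the one-dimensional double-well energy, the step size, and the function names are hypothetical, and production programs use more efficient minimizers on many coordinates.

```python
# Minimal steepest-descent minimizer on a one-dimensional double-well energy.
# The energy function and all parameter values are hypothetical.


def energy(x):
    """Double well with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def gradient(x):
    """Analytical derivative of the energy; the force is its negative."""
    return 4.0 * x * (x * x - 1.0)


def minimize(x, step=0.05, tol=1e-8, max_iter=10_000):
    """Follow the force downhill until the gradient is essentially zero."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g  # move along the force, i.e. against the gradient
    return x


# Two starting points on opposite sides of the barrier at x = 0
# end up in different minima: the minimizer never crosses the barrier.
print(minimize(0.3), minimize(-0.3))
```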

00:18:41 Molecular dynamics, on the other hand, sets this force equal to the mass times the second derivative

00:18:46 of the position with respect to time (that's Newton's law) and numerically solves for a

00:18:52 trajectory of the system at a given temperature.

00:18:55 The temperature enters in because the velocities of the atoms are kept such that the average

00:18:59 kinetic energy can be related to the classical expression for temperature.

00:19:03 Thus, molecular dynamics does not always decrease the energy and can cover much more of phase

00:19:07 space by surmounting barriers that are of modest size.

00:19:11 However, to solve the equations of motion, one must use numerical methods with time steps

00:19:16 of the order of femtoseconds.

00:19:17 Thus, to represent a single nanosecond trajectory of the system requires a million numerical

00:19:22 integrations.
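
The step count quoted here, and the kind of numerical integration involved, can be sketched as follows. The velocity Verlet scheme is one common integrator and is used here as an assumption; the talk does not say which integrator is meant. The harmonic test force and reduced units are hypothetical.

```python
import math


def steps_needed(total_time_fs, dt_fs):
    """Number of integration steps to cover total_time_fs with step dt_fs."""
    return round(total_time_fs / dt_fs)


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m * d2x/dt2 = F(x) for one coordinate."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
    return x, v


# One nanosecond is 1,000,000 femtoseconds, so a 1 fs step needs a million steps.
print(steps_needed(1_000_000.0, 1.0))

# Toy check on a harmonic oscillator (k = m = 1): x(t) should follow cos(t).
x_end, v_end = velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 1000)
total_energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
print(x_end, math.cos(10.0), total_energy)
```

In a macromolecular simulation the expensive line is the force evaluation, which is repeated once per step.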

00:19:23 For a macromolecule, each step is quite time-consuming, with the rate-limiting step being the evaluation

00:19:29 of non-bonded interactions.

00:19:33 Given the difficulties in describing systems with so many degrees of freedom, how can one

00:19:36 study chemical reactions in solution?

00:19:39 One of the most important papers in this regard was by one of our speakers, Bill Jorgensen.

00:19:43 He studied the simple exchange reaction, methyl chloride plus chloride minus going to methyl

00:19:48 chloride plus chloride minus.

00:19:50 By using a high level of ab initio calculations, he was able to show that as the chloride minus

00:19:55 approaches the methyl chloride, the energy decreased as an ion-dipole complex was formed.

00:20:01 Then the energy rose until the formation of the transition state for the reaction, in

00:20:05 which both chlorines were equidistant from the carbon.

00:20:08 The gas phase barrier for the reaction was about 10 kcal per mole. The one-dimensional

00:20:13 energy surface for chloride approach along the three-fold axis of methyl chloride was

00:20:17 fit to an analytical function, and analytical potentials were derived for chloride minus

00:20:21 water and methyl chloride water interactions, based on quantum mechanical calculations on

00:20:26 these, with appropriate interpolations for the interactions with other species along

00:20:30 the reaction pathway.

00:20:34 Then Monte Carlo methods were used to move the system along the reaction pathway for

00:20:37 the reaction in a periodic box of water molecules.

00:20:41 Umbrella sampling was employed to force the complex to surmount the free energy barrier

00:20:46 and form products.

00:20:47 In this manner, a free energy of activation of 26 kcal per mole was calculated, in excellent

00:20:52 agreement with experiment.

00:20:54 The next slide shows a figure from the Jorgensen paper, with the free energy as a function

00:20:59 of reaction coordinate in the gas phase and solution.

00:21:02 Interestingly, in water, there was no ion-dipole complex formed at all.

00:21:07 In subsequent studies in other solvents, it was found that with less strongly interacting

00:21:12 solvents, an ion-dipole minimum, as found in the gas phase, was found.

00:21:17 The calculations also gave insight into the reason for the increased activation free energy

00:21:22 in solution.

00:21:23 The water molecules interact much more strongly with the localized charge on the chloride

00:21:26 than they do with the delocalized charge in the transition state.

00:21:31 Using a very different approach, Warshel was able to simulate the proton transfer and

00:21:34 acylation attack on substrates of trypsin and subtilisin using a combination of empirical

00:21:39 valence bond and molecular dynamics methods.

00:21:42 The focus was on rationalizing site-specific mutagenesis effects on the activation free

00:21:46 energies in these enzymes, and the calculations were quite successful in this regard.

00:21:51 The next slide shows a figure from Warshel's paper in which he has simulated the reaction

00:21:56 profile in solution versus in the enzyme trypsin.

00:21:59 The reason for the 5 kcal per mole higher activation free energy for the Gly-216, Gly-226

00:22:05 to Ala-216, Ala-226 mutant was suggested to be structural distortions in the oxyanion hole

00:22:10 due to the two methyl groups.

00:22:12 Similarly, in subtilisin, replacing Asn-155 with a variety of other side chains led to increases

00:22:18 in calculated activation free energies comparable to those found experimentally.

00:22:23 In 1989, Åqvist and Warshel published a paper in Biochemistry describing catalysis

00:22:29 by staphylococcal nuclease, which was very successful in reproducing not only the change

00:22:33 in activation free energy due to enzyme catalysis, but also the site-specific mutation effect

00:22:38 of an aspartic to glutamic acid residue.

00:22:42 One should not lose sight of the difficulties in the above studies and their limitations

00:22:45 to few-dimensional reaction coordinates and very well-defined reaction mechanisms.

00:22:49 But we see that one of the more useful applications of molecular modeling has been in the study

00:22:53 of complex reactions in solution.

00:22:56 In another set of studies, Bash et al. were able to calculate in good agreement with experiment

00:23:00 the absolute solvation free energy of a wide variety of functional groups relevant to protein

00:23:05 side chains using free energy perturbation theory with molecular dynamics.
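
The core of free energy perturbation is an exponential average of the energy difference between two states, taken over configurations of the reference state. The sketch below shows only that average. It is not the procedure of the study just mentioned: the two harmonic "states", the reduced units, the direct Gaussian sampling (standing in for molecular dynamics), and the names `fep_free_energy` and `demo` are all assumptions for illustration.

```python
import math
import random


def fep_free_energy(e0, e1, samples, kT):
    """Free energy change 0 -> 1: dA = -kT * ln < exp(-(E1 - E0) / kT) >_0."""
    boltz = [math.exp(-(e1(x) - e0(x)) / kT) for x in samples]
    return -kT * math.log(sum(boltz) / len(boltz))


def demo(n_samples=100_000, k0=1.0, k1=2.0, kT=1.0, seed=1):
    """Toy test: perturb a harmonic well of stiffness k0 into stiffness k1."""
    rng = random.Random(seed)
    # Configurations of state 0 follow a Gaussian of width sqrt(kT / k0);
    # in a real calculation they would come from a simulation of state 0.
    sigma = math.sqrt(kT / k0)
    samples = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
    estimate = fep_free_energy(
        lambda x: 0.5 * k0 * x * x,
        lambda x: 0.5 * k1 * x * x,
        samples,
        kT,
    )
    exact = 0.5 * kT * math.log(k1 / k0)  # analytical result for this toy case
    return estimate, exact


estimate, exact = demo()
print(estimate, exact)
```

The average converges well only when the two states overlap, which is why practical calculations break a large change into many small perturbation steps.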

00:23:10 This method and the closely related method, thermodynamic integration, allow the calculation

00:23:13 of relative free energies for a wide variety of processes, including the effect of changes in either

00:23:18 the ligand or the protein on the relative free energy of protein-ligand association, studies

00:23:23 of the relative stability of different DNA sequences and different structural forms,

00:23:27 and studies of the relative stability of a protein and its site-specific mutants.

00:23:31 We will discuss some of these examples in detail in our next lecture.

00:23:37 One of the most exciting developments in recent years has been the integration of molecular

00:23:41 dynamics with NMR or X-ray refinement methods, so that one optimizes a target function which

00:23:46 is a linear combination of agreement with experiment and the molecular mechanics energy function.

00:23:52 By varying the relative weights of these two parts of the function, one can improve

00:23:55 the agreement with the experimental data while retaining a low energy structure, something

00:24:00 that is very difficult to do with standard refinement methods.
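
One way to write such a target function is sketched below. The notation is an assumption for illustration: the weights $w_{MM}$ and $w_{exp}$ and the two example experimental terms are common choices and are not taken from the slides.

```latex
E_{\text{target}}(\mathbf{r}) = w_{MM}\, E_{MM}(\mathbf{r}) + w_{exp}\, E_{exp}(\mathbf{r})

% Example experimental terms (assumed forms):
% NMR distance restraints, with d_k^{0} the distance inferred from the data
E_{exp}^{NMR}(\mathbf{r}) = \sum_{k} \bigl( d_k(\mathbf{r}) - d_k^{0} \bigr)^2

% X-ray refinement, comparing observed and calculated structure factor amplitudes
E_{exp}^{X\text{-}ray}(\mathbf{r}) = \sum_{hkl} \bigl( |F_{obs}(hkl)| - k\,|F_{calc}(hkl;\mathbf{r})| \bigr)^2
```

Raising $w_{exp}$ relative to $w_{MM}$ pulls the structure toward the data, while lowering it favors a low molecular mechanics energy.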

00:24:04 These combined methods greatly increase the efficiency of refinement and reduce the number

00:24:08 of steps of manual model building required.

00:24:11 They have been of great use in a large number of refinement problems.

00:24:15 Some papers on MD refinement using NMR data include the first by Kaptein et al. in the

00:24:22 next slide, which is just the title of the paper, and a model calculation on crambin

00:24:29 by Brünger et al.

00:24:33 In the next slide, the radius of convergence of such calculations is illustrated, i.e.

00:24:37 their ability to start from an extended structure and end up with a native-like structure.

00:24:42 Chiche et al. have shown the usefulness of including explicit water in the simulations,

00:24:46 particularly when the NMR data is not very extensive.

00:24:50 In the next slide, one can see that the presence of water gives a structure that is in much

00:24:57 better agreement with the X-ray structure than any other, because the water causes the

00:25:01 burial of hydrophobic groups and exposure of hydrophilic side chains.

00:25:06 "RMD aqueous" refers to the deviation from the X-ray structure when water is explicitly

00:25:12 included.

00:25:14 In an equally exciting application, Brünger et al. have shown how one can use MD in refinement

00:25:19 of X-ray structures.

00:25:20 In the next slide, it's just the title of that paper.

00:25:24 Up to now, we have focused on methods which evaluate the energy of the system and use

00:25:28 this energy to move the atoms or evaluate their properties.

00:25:31 There are a number of non-energy-based methods that have become powerful tools in modeling

00:25:36 molecules.

00:25:37 First and foremost are computer graphics methods.

00:25:40 These use molecular models based on energy calculations or experimental data, such as

00:25:44 X-ray crystallography or nuclear magnetic resonance, to construct a three-dimensional

00:25:48 representation of the atoms in the molecule.

00:25:50 New molecules can be constructed using stereochemical principles derived from the databases of known

00:25:55 structures.

00:25:56 Polymers can be built from monomer fragments and manipulated in color and stereo with real-time

00:26:01 rotation and translation of the molecules.

00:26:04 One of our moderators, Olson, is an expert in this area.

00:26:08 These representations are essential not only to do theoretical science on complex systems,

00:26:13 but to present the results of numerical simulations to the rest of the scientific community.

00:26:18 A few examples of computer graphic applications include the development of electrostatic potential

00:26:23 molecular surfaces, the representation of electrostatic potential gradients, and the

00:26:28 applications to superoxide dismutase and the analysis of why Hoogsteen base pairing puts the

00:26:35 DNA oligomer in the conformation in which it exists.

00:26:40 The graphics make clear the result suggested by the crystallographers and confirmed by the

00:26:45 molecular mechanics calculations.

00:26:49 The next slide shows, the previous slide showed, the electrostatic potential representation

00:26:55 and the electrostatic potential and its gradient in the active site of superoxide dismutase.

00:27:02 In the next slide, we show two representations, one theoretical and one experimental, and

00:27:10 the one on the right shows why it exists, because one has a snug van der Waals interaction

00:27:16 between the DNA and its drug in that conformation and not in the one on the left.

00:27:23 The next slide will show the work by Blaney et al. back in 1982, where they used a hole

00:27:31 in the active site when a known ligand was bound to think about designing new molecules

00:27:38 that would interact more effectively with thyroxine analogs.

00:27:44 This slide illustrates the hole in the active site that Blaney et al. were able to fill with the

00:27:49 design of new analogs.

00:27:52 A second set of methods which do not use energy functions is distance geometry.

00:27:57 These begin with representing the system in terms of atom-atom or group-group distances

00:28:01 and then by using mathematical projection methods from many-dimensional space into three-dimensional

00:28:05 space, turning these distances into a set of three-dimensional structures.
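
The projection step can be sketched as follows: build a metric matrix from the distances, take its three largest eigenvalues and eigenvectors, and read off coordinates. This is a simplified illustration under stated assumptions, not any published program: it takes a complete set of exact distances rather than upper and lower bounds, uses a plain power iteration for the eigenvectors, and all function names are hypothetical.

```python
import math


def metric_matrix(dist):
    """Dot products of position vectors measured from the centroid."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Squared distance of each point from the centroid.
    d0 = [row_mean[i] - 0.5 * total_mean for i in range(n)]
    return [[0.5 * (d0[i] + d0[j] - sq[i][j]) for j in range(n)] for i in range(n)]


def top_eigenpairs(matrix, k, n_iter=500):
    """Largest k eigenpairs of a symmetric matrix by power iteration with deflation."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for idx in range(k):
        v = [1.0 + 0.1 * ((i * (idx + 2)) % 7) for i in range(n)]  # start vector
        for _ in range(n_iter):
            w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):  # remove the component just found
            for j in range(n):
                a[i][j] -= lam * v[i] * v[j]
    return pairs


def embed(dist, dim=3):
    """Turn a distance matrix into dim-dimensional coordinates."""
    coords = [[0.0] * dim for _ in range(len(dist))]
    for axis, (lam, vec) in enumerate(top_eigenpairs(metric_matrix(dist), dim)):
        scale = math.sqrt(max(lam, 0.0))
        for i, component in enumerate(vec):
            coords[i][axis] = scale * component
    return coords


def pairwise_distances(points):
    return [[math.dist(p, q) for q in points] for p in points]


# Toy input: six points whose distances are known exactly.
points = [(2, 0, 0), (-2, 0, 0), (0, 1.5, 0), (0, -1.5, 0), (0, 0, 1), (0, 0, -1)]
target = pairwise_distances(points)
rebuilt = pairwise_distances(embed(target))
max_error = max(abs(rebuilt[i][j] - target[i][j]) for i in range(6) for j in range(6))
print(max_error)
```

The recovered coordinates match the input only up to rotation and reflection, so the check compares distances rather than coordinates. With bounds instead of exact distances, repeating the procedure with different random choices inside the bounds gives the set of structures mentioned above.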

00:28:10 Although originally derived to simulate protein folding, such approaches have found use

00:28:15 in docking ligands to macromolecules, analyzing the way various small molecules could fit

00:28:19 into unknown receptors, deriving quantitative structure activity relationships for ligand

00:28:24 binding to an unknown receptor, and fitting nuclear magnetic resonance-derived distances

00:28:28 to a three-dimensional structure.

00:28:30 The strength of such methods is that they have a potential to be more unbiased and give

00:28:33 the chemist insight into what he knows and what he doesn't know about the structure in

00:28:37 question.

00:28:39 The weakness is the qualitative nature of the structures, which currently require further

00:28:42 refinement by energy-based methods to be fully realistic.

00:28:46 The next slide illustrates the use of a distance geometry method, DOCK, by DesJarlais and Kuntz

00:28:53 to dock some papain-binding ligands to the enzyme papain.

00:28:59 The idea is to examine a large base of ligands and to screen these on steric criteria, saving

00:29:05 the more subtle screen for electrostatic and hydrogen bond complementarity for later.

00:29:11 It is much more common in the pharmaceutical industry to have a number of biological activities

00:29:15 but no known structure of the relevant receptor.

00:29:18 Slides 19 and 20 are from a paper that applies ensemble distance geometry to nicotinic receptor

00:29:25 agonists with the idea of finding the best superposition of active analogs using distance

00:29:31 criteria.

00:29:34 A final set of non-energy-based methods are those in which one is attempting to fit some

00:29:38 biological activity or binding data of a number of small ligands to an unknown receptor.

00:29:44 One can use a variety of approaches here, including pharmacophoric pattern matching,

00:29:49 statistical methods, for example, QSAR, or a combination of these.

00:29:54 As mentioned above, Crippen's distance geometry QSAR falls in this category.

00:29:59 The next two slides are from a paper by Ghose and Crippen in which the structure-activity

00:30:05 relationships of inhibitors of dihydrofolate reductase are used to build a hypothetical

00:30:10 receptor shown in the next slide.

00:30:13 Illustrated near the molecule are site points, which have groups that can contribute positively

00:30:17 or negatively to binding.

00:30:19 Hansch and co-workers have used computer graphics visualization of known macromolecular structures

00:30:24 to rationalize QSAR equations.

00:30:27 A potentially powerful approach, CoMFA, combines features of pharmacophoric matching and QSAR.

00:30:32 The next two slides show the title of the paper, and the following slides show areas

00:30:39 near a steroid which can have positive or negative interactions with the three-dimensional

00:30:48 structure of the receptor.

00:30:51 What is the prognosis for one of the most difficult and challenging problems in molecular

00:30:56 modeling?

00:30:57 The prediction of three-dimensional structure of proteins from amino acid sequence.

00:31:02 As noted just now, there are many biological systems for which there is no three-dimensional

00:31:06 structure for the receptor.

00:31:08 With the advent of gene-cloning techniques, the sequences of many of these receptors are

00:31:11 becoming known.

00:31:14 Can we turn these into structures?

00:31:16 The prognosis for this is in general poor because of the fundamental difficulty which

00:31:19 exists in simulating all complex molecular systems: sufficient sampling and correctly

00:31:24 ranking the free energies of all the local minima of the system.

00:31:28 There are a large number of methods in the literature for conformational searching, but

00:31:31 these are mainly applicable to systems with tens rather than thousands of degrees of freedom.

00:31:37 These include Monte Carlo methods in Cartesian or internal coordinate space, methods based

00:31:41 on distance geometry, followed by molecular mechanics and dynamics, systematic search

00:31:46 techniques, high-temperature molecular dynamics, and methods that use cyclic boundary conditions

00:31:50 and Fourier analysis.

00:31:52 Whatever the method of generating conformations, it still faces the difficulty of evaluating

00:31:57 the free energies of these conformations, which is very difficult to do for many conformations

00:32:01 of polar or ionic molecules in solvent.

00:32:05 Predicting protein three-dimensional structures from amino acid sequences is probably most effectively

00:32:09 done with a pattern recognition approach and a docking approach to predict qualitative

00:32:13 secondary and tertiary structures, which can then be refined with energy-based methods.

00:32:18 But still, in a typical case, there are far too many possible solutions.

00:32:21 A more limited and feasible approach can be used if one is predicting the three-dimensional

00:32:25 structure for a protein when one knows the structure of a homologous protein.

00:32:29 Depending on the percent homology, one can use different techniques that impose similarity

00:32:33 in secondary and tertiary structure on the unknown protein.

00:32:37 A most exciting recent success in this area was the prediction of the structure of the

00:32:40 HIV protease by such methods, as shown in the next slides.

00:32:46 The next slide has the title of that paper, and the following slide is the predicted structure.

00:32:51 The subsequent X-ray structures by two groups have been amazingly consistent with the predicted

00:32:55 structure.

00:32:56 The next slide gives the title of that paper.

00:33:01 Despite the very small percent homology, the structure used for the template was the known

00:33:06 aspartyl protease of length of more than 200 amino acids, and the unknown AIDS protease

00:33:10 was a dimer of 99 amino acids.

00:33:13 This shows an example of some steroids bound to the HIV site using the DOCK technique to

00:33:19 find some other inhibitors.

00:33:24 But by using secondary structure near the active site and the conserved active site

00:33:27 residues, including the catalytic aspartic acids, the model structure was built.

00:33:31 Even with its qualitative correctness, it's not clear that such methods will be useful

00:33:35 for inhibitor drug design.

00:33:38 In summary, one can see that there are a wide variety of methods and exciting applications

00:33:42 of molecular modeling to organic and biological systems.

00:33:46 I have attempted to give a broad brush overview of what I consider to be among the most interesting

00:33:50 and significant areas of recent research.

00:33:54 What will the next decade bring?

00:33:55 It is dangerous to predict, but I see the increased computer power brought about by

00:33:59 massive parallelism, allowing much progress to be made on solving the global minimum problem

00:34:04 for larger and larger systems, and increasing dramatically the ability to calculate free

00:34:09 energies of activation, association, and stability.

00:34:12 New and more accurate energy functions will also increase the accuracy of modeling methods,

00:34:17 and better methodologies for integrating quantum mechanics and molecular dynamics will appear.

00:34:22 The ability of theoretical and computer-based approaches for macromolecules to impact experiments

00:34:26 may not reach the predictive capability of quantum mechanical calculations on small molecules

00:34:31 in the gas phase, but molecular dynamics has become a nearly indispensable ingredient in

00:34:35 protein structure refinement, and its usefulness in this area is likely to increase.

00:34:40 Nonetheless, the success of these predictions has been most impressive.

00:34:52 It is now time to take your questions.

00:35:02 Once you have called, please stay on until we have asked for your question on air.

00:35:07 You will be able to hear the program over the phone, so don't hang up.

00:35:11 If we are not able to use your call, we'll tell you, but we won't hang up on you, so

00:35:16 please be patient.

00:35:17 Art, while we wait for that first call, why don't you go ahead?

00:35:21 Well, Peter, I guess part of your reputation that I didn't mention at the beginning was

00:35:28 your answer machine message, and so I want to give you the opportunity now to answer

00:35:34 probably the question that's on most people's minds.

00:35:37 Where were you during the earthquake?

00:35:39 Well, Art, I was looking forward to the World Series game in the Chicago airport, and I

00:35:44 was very lucky that I was not at my desk because a bookcase with tons of J. Med. Chem.

00:35:53 fell on my desk, so it was lucky that I didn't have an epitaph so I couldn't have appeared

00:35:59 here that said, killed by a thousand J. Med. Chems.

00:36:04 But what a beautiful way to go, right?

00:36:05 Yes, that would have been a wonderful way to go.

00:36:08 But actually, getting back to the purpose of this teleconference, this is the time for

00:36:15 people to interact with you, both the people here in the audience and the people that are

00:36:21 viewing at the remote sites.

00:36:24 But while we wait for them to contact us or come forth to the microphone, you people out

00:36:30 there, I'd like to ask at least one question to start out with, and that is that in the

00:36:35 overview you mentioned a large number of computing techniques, modeling techniques,

00:36:41 and there are a lot of people out here that have problems that they want to address with

00:36:47 these techniques.

00:36:48 Now, you and I are both in the fortunate position of being able to pick our problems for, maybe

00:36:57 they're well-behaved or maybe you know a lot about them or there's a lot of experimental

00:37:01 data.

00:37:02 The situation for most people is, well, I've got this problem, what technique do I use?

00:37:09 How much effort do I invest?

00:37:12 How many people do I commit?

00:37:15 How do they answer that kind of question?

00:37:17 Well, I think that that's a good general question that I think is appropriate to really spend

00:37:27 a lot of time thinking about it and talking to an expert about whether the current theoretical

00:37:35 techniques can say anything useful.

00:37:36 Of course, the expert may not know either, but I think we're often so carried away with

00:37:42 the beauty of the color pictures and the technology is so exciting that we will just throw resources

00:37:48 at a problem which has no, we don't have enough experimental data to be able to attack usefully.

00:37:56 I mean, that's one of the differences I try to emphasize in my talk, that we can take

00:38:02 small molecules in the gas phase and do everything with Schrodinger's equation and relate beautifully

00:38:06 to experiment, but when we have complicated systems in solution, a reasonable amount of

00:38:12 experimental data is often critical to guide us to a sensible solution, to sort of make

00:38:17 our solution, to frame our solution in such a way that it can further be useful in designing

00:38:24 and carrying out other experiments.

00:38:26 So we work really hand-in-hand with experiments, but if there isn't enough experimental data

00:38:30 there, I think one just can't attack the problem, and there are unfortunately too many problems

00:38:34 that have this, you know, today that we can't attack because we don't have the data.

00:38:41 Yeah, I guess it's, you know, it's a good ad for consulting, but I think a real problem

00:38:54 is overselling a technique, in a way, because our business is to be in the, to write papers

00:39:01 and to extend the techniques.

00:39:06 I guess another way of asking the question is, are there examples where it's been picked

00:39:14 up and used very, very successfully by non-developers?

00:39:19 Yeah, well, I, again, I, you know, the people have their, all have their own anecdotes,

00:39:28 but I think that, you know, I get feedback from people who, you know, have used molecular

00:39:37 dynamics and just run many trajectories, and of course they, in some cases they've actually

00:39:43 stumbled on new conformations, and in other cases they have, in other cases they have

00:39:47 not, you know, they haven't found anything.

00:39:50 So I think people, you know, I think that there are many examples of where people have

00:39:59 used the technique, you know, used the technique in a useful way, but usually, I'd say much

00:40:06 more often, they've been successful in cases where they have worked closely with another

00:40:14 chemist who has a sense of the technology.

00:40:19 I want to remind our viewers that you can call in.

00:40:23 Please don't be shy about doing that.

00:40:25 We have two experts right here on the set who are waiting to take your questions, so

00:40:29 pick up the phone and call, and as soon as you do, we'll try to get to you.

00:40:33 So why don't you continue, gentlemen?

00:40:36 Okay.

00:40:37 But you have to realize that if you don't call in, or if you, people don't come up and

00:40:41 talk and give your point of view, then you have to sit here and listen to our questions

00:40:46 and our points of view.

00:40:48 And actually, some of the questions that we're asking now, I think, could be held for the

00:40:53 more general discussion at the end, where...

00:40:55 Well, we do have a caller right now.

00:40:57 It's Bill on line six, so Bill, please go ahead with your question.

00:41:05 And where is Bill?

00:41:06 This always happens.

00:41:07 Bill's coming through.

00:41:08 Bill's coming through.

00:41:09 He must be calling from a faraway site.

00:41:12 From Canada.

00:41:13 Yes, from Canada.

00:41:14 No, I don't know that.

00:41:17 I'll let you know when Bill comes through.

00:41:20 Bill's not ready.

00:41:21 All right.

00:41:22 This always happens, by the way, with the very first caller, so as time goes by, we'll

00:41:26 get this worked out.

00:41:27 I can amplify, you know, the way I answered your last question, Art, and that is that

00:41:32 I think molecular modeling, the birth of molecular modeling as a widely applicable

00:41:38 tool or of interest to many organic chemists really happened with the development and refinement

00:41:45 of computer graphics.

00:41:46 In other words, when molecular modeling was mainly just empirical or numerical computation,

00:41:57 it was in the hands of the experts.

00:41:58 I think it has really been graphics of which, you know, graphics techniques development

00:42:06 where you could represent the results from calculations in a way that chemists started

00:42:11 to see them and get insight into them that has led to molecular modeling being a more

00:42:16 generally used and useful tool.

00:42:19 And I don't think there's any prescription for the sort of the overselling of something.

00:42:23 I mean, it's sort of like let the buyer beware of the techniques.

00:42:28 I mean, the techniques are out there.

00:42:30 They can be gotten at an ever decreasing cost, but it's like any other area of science.

00:42:38 You have to make a decision on whether you want to do this experiment, whether you want

00:42:43 to do this simulation, and ultimately the test of whether this is successful is whether

00:42:51 you gain some insight that will guide further experiments that will be predicted.

00:42:58 I always, I usually often mention in talks that the three roles of theoretical chemistry

00:43:07 as applied to molecules are first, you would like to be able to simulate the system in

00:43:12 reasonable agreement with known experiments.

00:43:15 Secondly, you want to use those results to gain insight into the system.

00:43:21 That is, in other words, get a qualitative picture that lets you manipulate it better,

00:43:26 that lets you explain it better, that lets you make predictions even in a qualitative

00:43:29 way.

00:43:30 And thirdly, you'd like to be able to make quantitative numerical predictions on the

00:43:35 system in advance of experiment.

00:43:37 That's when the theory is fully successful.

00:43:40 And again, coming back to the theme that Art and I mentioned, the fact is that today in

00:43:48 biological systems, we maybe only can have 10% of the systems where we can have partial

00:43:54 success by these three criteria.

00:43:57 Maybe it's going to be 20%.

00:43:58 I mean, we don't know what the future will hold.

00:44:00 Yes?

00:44:01 Okay, I believe that we have a call from Memphis.

00:44:07 So what do you have to say, Memphis?

00:44:12 Are we through now?

00:44:13 Oh, good.

00:44:14 You've put me on hold before.

00:44:15 This is Bill Purcell calling, and first of all, I'd like to say hello to Peter.

00:44:21 I think you did a super job.

00:44:23 Peter, do you have any idea of any examples where one has tried to correlate HPLC binding

00:44:33 with molecular modeling?

00:44:35 That is to say, if you look at the retention times of various molecules coming through

00:44:41 a column, for example, can that be correlated with the energy of binding that one might

00:44:47 approximate through molecular modeling?

00:44:50 I don't have an answer directly, Bill, for the HPLC correlation, but I know the methods

00:44:59 I mentioned on doing free energy calculations of transfer from the gas phase to water.

00:45:05 There have been applications, I think, both by Bill Jorgensen and a group at Oxford looking

00:45:10 at the free energies of transfer now between various small molecules between a water phase

00:45:17 and another nonpolar phase.

00:45:20 So I think that if we had a sense of what the phase the HPLC is involved in, if we could

00:45:26 represent that molecularly, or the two phases, I think current free energy methods could

00:45:32 be applied in a way to deal with this problem.

00:45:38 For instance, one of the students in my lab, Steve DeBolt, is very interested in the idea

00:45:44 of how amino acids are soluble in membranes.

00:45:48 And so one would have to create the right molecular environment to represent this free

00:45:53 energy of transfer.

00:45:55 But I think our potential functions are now at the stage where we can calculate such free

00:46:00 energies of transfer between various phases.

00:46:03 We just have to define the molecular mix.

00:46:06 We're getting ever better at being able to calculate that.

00:46:09 Great.

00:46:10 Thank you very much, Peter.

00:46:13 I appreciate that.

00:46:14 And maybe we should get together sometime and talk about it privately.

00:46:20 Thank you, Bill, for calling.

00:46:21 Peter, we have a question from our audience.

00:46:23 Hi.

00:46:24 I'm Joe Major from Hybritech Incorporated here in San Diego.

00:46:27 I have a question for Dr. Kollman.

00:46:29 You mentioned a number of different molecular modeling techniques.

00:46:32 I wonder if you would care to comment on the comparative advantages, disadvantages of,

00:46:36 say, Monte Carlo simulations versus molecular dynamics.

00:46:40 When one would use one, what are the parameters that one should keep in mind in considering

00:46:45 those two?

00:46:47 Thanks very much for your question.

00:46:49 I think it's a very good question because we have, really, the world's expert on Monte

00:46:54 Carlo calculations, the world's advocate of Monte Carlo calculations, Bill Jorgensen,

00:46:58 later on our program.

00:47:01 And I think that we have done more molecular dynamics calculations, but we have Bill Jorgensen's

00:47:09 BOSS program running in our lab, and there are particular applications.

00:47:12 For instance, the derivation of parameters for liquids, where we use the Monte Carlo

00:47:23 programs because they are more effective in our mind at deriving the parameters for liquids

00:47:28 because Bill has sort of set the framework for that.

00:47:31 The advantage of molecular dynamics, though, is on multiply connected polymers, that in

00:47:36 one time step, all of the atoms are moved at once, whereas in Monte Carlo methods, you

00:47:42 move essentially one molecule at a time.

00:47:46 And the disadvantage of applying Monte Carlo methods to polymers is that, for instance,

00:47:51 if you rotate one bond and look whether that's an allowed energy, the whole system will move

00:47:56 a lot and you'll get a lot of disallowed moves.

00:48:00 So it's, in principle, relatively inefficient for polymers, although Nobuhiro Go of Japan

00:48:07 has developed a Monte Carlo method where he looks at the normal modes of vibration of

00:48:11 the polymer and moves, does his Monte Carlo moves in normal mode space rather than in

00:48:17 Cartesian space.

00:48:19 And he claims that he gets good efficiency for movement of a peptide chain that way.

00:48:25 So I think that it's like anything else, there'll be particular problems where the

00:48:29 dynamics is more useful.

00:48:31 It's a more general theory in that you can get dynamical properties, but there will be

00:48:34 cases where Monte Carlo is more efficient and more effective.
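
An illustrative sketch, not part of the broadcast: the point about disallowed moves can be seen in a toy Metropolis Monte Carlo run, where each trial move is accepted with probability min(1, exp(-ΔE/kT)). The one-dimensional harmonic well, its force constant, the step sizes, and the function names are all assumptions chosen for illustration; they stand in for a stiff polymer coordinate and do not come from any program mentioned in the talk.

```python
import math
import random

def stiff_well(x):
    """Hypothetical stiff harmonic energy (kcal/mol), force constant 100."""
    return 0.5 * 100.0 * x * x

def metropolis_step(x, energy, kT, max_step, rng):
    """Attempt one random displacement and apply the Metropolis test."""
    trial = x + rng.uniform(-max_step, max_step)
    delta = energy(trial) - energy(x)
    if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
        return trial, True   # move accepted
    return x, False          # move rejected, keep old configuration

def acceptance_rate(max_step, n_steps=20000, kT=0.6, seed=0):
    """Fraction of accepted moves for a given maximum displacement."""
    rng = random.Random(seed)
    x, accepted = 0.0, 0
    for _ in range(n_steps):
        x, ok = metropolis_step(x, stiff_well, kT, max_step, rng)
        accepted += ok
    return accepted / n_steps

small_moves = acceptance_rate(max_step=0.05)
large_moves = acceptance_rate(max_step=1.0)
```

In this toy model the large displacements are rejected far more often than the small ones, which is the same inefficiency described for bond rotations that swing a whole polymer chain.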

00:48:39 Thank you.

00:48:40 We have Robert from Columbia on line five.

00:48:43 Robert?

00:48:44 This is University of Missouri.

00:48:46 I'd like to ask whether it's possible to relate the three-dimensional structure of a peptide

00:48:53 to binding to, of course, the molecular model of a receptor or large molecular protein.

00:49:01 One case in point might be insulin binding its receptor, and if you can do that, then

00:49:06 can you project altered insulins or other small peptides with its receptor or enzyme?

00:49:15 Thank you.

00:49:18 Thanks very much.

00:49:19 That's a very good question, and I think it gives me the opportunity to make a point that

00:49:24 I didn't make in my talk, and that is I think the molecular modeling methods can be more

00:49:29 powerful if you actually have the structure of the receptor.

00:49:34 For instance, we have the structure of insulin, the three-dimensional x-ray structure.

00:49:37 We can do a lot of manipulations on it very accurately and effectively.

00:49:41 We don't have the structure of the insulin receptor.

00:49:46 Also, another problem with what you're suggesting is that it's often more difficult to determine

00:49:52 structures of small peptides than it is for larger proteins because these peptides are

00:49:56 floppy.

00:49:57 They don't actually have a well-defined conformation.

00:50:00 I'm answering all the aspects of your question.

00:50:03 It had many facets.

00:50:06 One of the features of my talk was this ensemble distance geometry and other techniques when

00:50:10 you don't have the structure of the receptor.

00:50:13 For instance, for insulin, one could look at the insulin molecule itself and try to

00:50:19 think about the structure of the receptor around insulin, but it's a difficult problem.

00:50:28 All of these are difficult problems.

00:50:30 They become more difficult if you don't have the three-dimensional structure of at least

00:50:35 one of the molecules or perhaps both of the molecules that you're trying to model.

00:50:39 All right.

00:50:40 We have Charles on Line 8 in Minneapolis.

00:50:43 Charles, go ahead, please.

00:50:44 Thank you.

00:50:45 I'd be interested in hearing your thoughts regarding what we can expect in terms of new

00:50:51 developments in methodology and new developments in hardware and how these are going to affect

00:50:59 the limitations on the kinds of biological systems that we can actually look at in the

00:51:04 future.

00:51:12 This is a dangerous prediction.

00:51:13 I mean, as I said at the end of my talk, it's dangerous to make these predictions.

00:51:17 But I think that if you asked most chemists, computational chemists, they would agree that

00:51:24 if we can get to the stage of harnessing the massive parallelism that we see in some computers

00:51:30 where each processor is very ... There are not only millions of processors, but each

00:51:37 is very intelligent, can communicate.

00:51:39 You can write high-level languages to use them.

00:51:42 We would see a major increase in the size of molecules where we could essentially solve

00:51:49 the local minimum problem.

00:51:51 And this would actually really stimulate, I think, more and bring out to the fore inaccuracies

00:51:58 in the energy function that we have.

00:52:02 That's a second area of development, not only the increased computer power, but improving

00:52:06 the energy functions.

00:52:07 And there are many groups in the world working on improving the parameters and the actual

00:52:12 functional form to use to represent molecular systems.

00:52:17 I see those two as being important technical developments that are going to happen in the

00:52:24 next years.

00:52:25 And they will ... I also think that we will, as we increase our database of protein structures

00:52:31 and protein structural motifs, and get more clever about using homology to build new protein

00:52:38 structures, we will get ... We will make progress, and I see over the next 10 years a terrific

00:52:44 progress in coming closer to not necessarily solving the protein folding problem, but having

00:52:50 a lot more success as the example that Pearl and Taylor showed, that I showed of Pearl

00:52:56 and Taylor having a lot more success in deriving three-dimensional structures of protein from

00:53:03 known homologous protein.

00:53:04 I have to jump in here.

00:53:06 It's time to move on.

00:53:07 Peter, thank you.

00:53:08 We will see Peter back in about an hour when he talks about his newest applications.

00:53:13 Thank you.

00:53:16 We go now to William Jorgensen's talk on Structure and Binding in Bio-Organic Host-Guest Chemistry.

00:53:22 Art?

00:53:23 Bill's work branches, or bridges, between organic chemistry and molecular biology.

00:53:29 His contributions to the understanding of solvent effects have had wide impact, and

00:53:33 his recent work on hydrogen bonding promises to address a number of biological questions.

00:53:39 Bill?

00:53:40 Thank you, Art.

00:53:42 The theme of my talk today will be Understanding Intermolecular Interactions in Solution.

00:53:49 This is critically important for both controlling molecular recognition and the design of selective

00:53:54 hosts.

00:53:55 In view of the time limitations, I have chosen to focus on hydrogen bonding.

00:53:59 The key issue will be the competition of inter-solute and solute-solvent interactions.

00:54:04 Three examples will be considered.

00:54:06 The first is a simple one, the dimerization of N-methylacetamide, NMA, in chloroform

00:54:12 and water.

00:54:13 We will then move on to a more elegant organic host, one of Rebek's molecular clefts, which

00:54:18 can bind nitrogen heterocycles.

00:54:21 And finally, we will consider interactions of nucleotide bases in chloroform and some

00:54:25 striking variations in binding for triply hydrogen-bonded systems.

00:54:31 The methodology that is involved in all of these studies features Monte Carlo statistical

00:54:37 mechanics simulations using the BOSS program.

00:54:41 In each case, dilute solutions are being simulated.

00:54:44 The systems consist of the solutes plus about 300 solvent molecules in a box with periodic

00:54:50 boundary conditions.

00:54:52 The calculations were run at constant temperature, 25 degrees, and pressure of one atmosphere.

00:54:58 The binding thermodynamics are probed with statistical perturbation theory as described

00:55:02 by Professor Kollman.

00:55:04 Both absolute and relative free energies of binding can be obtained.

00:55:08 In figure one, there is a representation of the NMA dimer in chloroform from the simulations.

00:55:15 The OPLS potential functions that are used here employ a united atom model for CHn groups.

00:55:21 So the methyl hydrogens in NMA and the hydrogen in chloroform are implicit.

00:55:26 All other atoms are explicitly represented and interact intermolecularly via Coulomb

00:55:32 plus Lennard-Jones terms.
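
An illustrative sketch, not part of the broadcast: a Coulomb plus Lennard-Jones interaction between two sites is commonly written as q_i q_j e²/r + 4ε[(σ/r)¹² − (σ/r)⁶]. The function name is invented, the conversion factor 332.06 kcal·Å/(mol·e²) is the value usually quoted for these units, and any charges, σ, and ε passed in are placeholders rather than actual OPLS parameters.

```python
import math

# Converts e^2/angstrom to kcal/mol; commonly quoted value, assumed here.
COULOMB_CONST = 332.06

def pair_energy(r, q_i, q_j, sigma, epsilon):
    """Coulomb plus Lennard-Jones energy (kcal/mol) for two sites.

    r       : separation in angstroms
    q_i,q_j : partial charges in units of e
    sigma   : Lennard-Jones diameter in angstroms
    epsilon : Lennard-Jones well depth in kcal/mol
    """
    coulomb = COULOMB_CONST * q_i * q_j / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones

# Placeholder parameters: two neutral sites at the Lennard-Jones minimum.
r_min = 2.0 ** (1.0 / 6.0) * 3.0
well = pair_energy(r_min, 0.0, 0.0, sigma=3.0, epsilon=0.2)
```

For neutral sites the energy at r = 2^(1/6)σ equals −ε, which is a quick sanity check on the functional form. The total intermolecular energy would be the sum of this term over all site pairs on the two molecules.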

00:55:34 For this example, we have computed a free energy profile or potential of mean force

00:55:39 for separating the two NMA molecules as a function of a carbonyl O to amide N distance.

00:55:46 What we find in chloroform at short separations, about three angstroms, is that the amides

00:55:52 are indeed hydrogen-bonded as illustrated.

00:55:55 Of course, there is variation in the structure during the simulation owing to the thermal

00:56:00 configurational averaging that yields the proper Boltzmann weighted results.

00:56:05 However, in turning to water, we find that the amides do not hydrogen-bond at short separations.

00:56:13 What one gets is a preference for stacked arrangements that feature reasonable dipole-dipole

00:56:18 alignment for the amides.

00:56:20 The inter-solute interaction is weaker here at about minus three and a half kilocalories

00:56:25 per mole versus minus seven and a half kilocalories per mole for the hydrogen-bonded arrangement

00:56:31 in figure one.

00:56:33 However, the edges of the amides are now exposed for maximal hydrogen bonding with the solvent.

00:56:40 Both carbonyl oxygens are participating in about two hydrogen bonds with water, and the

00:56:45 NH groups act as a donor of another hydrogen bond each.

00:56:49 It is important to realize that in the second figure, just one snapshot of millions from

00:56:56 the simulations was shown, though it has been chosen as typical.

00:57:00 Nevertheless, these pictures provide no sense of whether the illustrated arrangements are

00:57:05 in fact stabilized relative to infinite separation of the amides and the solvents.

00:57:12 That information is clearly provided by the potentials of mean force, which are shown

00:57:16 in the next illustration.

00:57:18 In two series of simulations, the amides were gradually perturbed apart, and the free

00:57:23 energy changes were accumulated.

00:57:25 In chloroform, there is indeed a substantial hydrogen bonding well with a maximum depth

00:57:31 of minus three and a half kilocalories per mole at a separation of about 2.9 angstroms.

00:57:38 It might be noted that with these OPLS potentials, the optimal NMA-NMA interaction in the gas

00:57:45 phase is minus nine kilocalories per mole.

00:57:48 So there is substantial damping of that figure via the configurational averaging under ambient

00:57:54 conditions and in view of the solvent competition.

00:57:58 On the other hand, in water, we find that the amides exhibit no net attraction.

00:58:03 Their approach is innocuous, up to about 3.5 angstroms, and then it becomes repulsive at

00:58:09 shorter distances.

00:58:11 The 9 kcal per mole optimal interaction is entirely wiped out by the thermal and solvent

00:58:16 effects.

00:58:17 It is clearly very difficult to predict this a priori.

00:58:22 Appropriate experiments or computations are required.

00:58:25 But how do we know that the computed results have any validity?

00:58:29 A comparison between theory and experiment is desired.

00:58:33 It can be provided by computing an association constant, Ka, from the potentials of mean

00:58:38 force.

00:58:40 The equation shown in the figure gives the desired relationship that has been known since

00:58:46 the 1920s during the development of Debye-Hückel theory and Bjerrum's modifications to it.

00:58:52 Integration over the orientationally averaged potential of mean force is performed to a

00:58:57 cutoff distance C that defines the geometric limit to association.
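
An illustrative sketch, not part of the broadcast: the figure is not reproduced in the transcript, so the relationship is assumed here to be the commonly used form Ka = 4π ∫₀ᶜ r² exp(−w(r)/kT) dr, where w(r) is the orientationally averaged potential of mean force. The model well below is invented, not the computed NMA profile, and the conversion from cubic angstroms per pair to liters per mole uses Avogadro's number.

```python
import math

AVOGADRO = 6.02214076e23
KT_25C = 0.5925          # kcal/mol at about 25 C
A3_TO_LITER = 1.0e-27    # one cubic angstrom in liters

def association_constant(pmf, cutoff, kT=KT_25C, n_points=2000):
    """Trapezoid estimate of Ka (L/mol) from a potential of mean force.

    pmf    : function w(r) in kcal/mol, r in angstroms
    cutoff : upper integration limit c in angstroms
    """
    dr = cutoff / n_points
    total = 0.0
    for i in range(n_points + 1):
        r = i * dr
        weight = 0.5 if i in (0, n_points) else 1.0
        total += weight * r * r * math.exp(-pmf(r) / kT)
    volume = 4.0 * math.pi * total * dr   # cubic angstroms per pair
    return volume * AVOGADRO * A3_TO_LITER

def flat_pmf(r):
    """No net attraction or repulsion at any distance."""
    return 0.0

def model_well(r):
    """Invented profile: repulsive core, 3 kcal/mol well, then flat."""
    if r < 2.6:
        return 50.0
    if r < 3.4:
        return -3.0
    return 0.0

ka_flat = association_constant(flat_pmf, cutoff=5.0)
ka_well = association_constant(model_well, cutoff=5.0)
```

With a flat profile the integral reduces to the volume of a sphere of radius c, so any attractive well raises Ka above that baseline and the choice of c matters less once the well is enclosed.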

00:59:02 The results are given as a function of C in the next illustration.

00:59:07 Fortunately, the dependence on C is not strong, so reasonable choices that encompass the free

00:59:13 energy well give a computed Ka of about 4 liters per mole for NMA in chloroform.

00:59:20 There is a corresponding value available from IR measurements.

00:59:24 It is about 3 liters per mole.

00:59:27 So theory and experiment are in good accord.

00:59:30 In water, the classic experiments of Klotz and Franzen yielded a Ka for NMA of 0.005 liters

00:59:37 per mole.

00:59:39 There is much difficulty in measuring such small values.

00:59:42 However, the qualitative conclusion is clearly consistent with the present computed results.

00:59:49 Amides exhibit negligible attraction in water.

00:59:52 Quantitatively, the appropriate choice for the cutoff C is not obvious.

00:59:56 In this case, the values of 4 to 5 angstroms give Ka values of about a tenth of a liter per mole.

01:00:02 Thus, the overall comparisons provide confidence in the computed potentials of mean force.

01:00:08 This example also clearly illustrates the difficulties in just considering gas phase

01:00:13 interactions and then coming to any conclusions about binding in solution.

01:00:18 The effect of solvent needs to be evaluated in molecular detail.

01:00:24 Moving to the next system, we see an organic host that has been prepared by Rebek and co-workers.

01:00:31 The molecule features a cleft with the two acid groups and the lone pair on the acridine

01:00:36 nitrogen directed inward.

01:00:39 This is nicely reminiscent of the binding clefts that are usually present in the active

01:00:43 site region of enzymes.

01:00:45 Thus, Rebek's design has clear advantages for potential molecular recognition and catalysis

01:00:51 over some alternative bioorganic host molecules, such as cyclodextrins, where the functionalization

01:00:58 must occur on the rim of the cup.

01:01:01 Rebek has studied binding of various nitrogen heterocycles to this host, including pyridine,

01:01:07 pyrimidine, and pyrazine.

01:01:10 The cleft has been proposed to be ideal for binding pyrazine in the two-point fashion

01:01:16 that is illustrated.

01:01:18 With the guest nuzzled into the cleft and anchored by the two hydrogen bonds.

01:01:23 The question is, how can one tell if this is really happening?

01:01:28 What has been measured by Rebek is the binding data shown in the next illustration obtained

01:01:34 by NMR titrations.

01:01:36 In particular, we see that pyrazine is well bound with a Ka of 1400 liters per mole in

01:01:42 deuterochloroform and that pyridine, which could only participate in one hydrogen bond

01:01:47 with the host, has a smaller Ka of 120 liters per mole.

01:01:52 The factor of 12 enhancement translates to a free energy preference of 1.45 kcals per

01:01:59 mole for binding pyrazine.
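
An illustrative sketch, not part of the broadcast: the conversion from a ratio of association constants to a free energy difference is ΔΔG = RT ln(Ka₁/Ka₂). The function name is invented, and the temperature of 298.15 K is an assumption, since the conditions of the NMR titrations are not stated in the talk.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_preference(ka_strong, ka_weak, temperature=298.15):
    """Free energy preference (kcal/mol) for the more strongly bound guest."""
    return R_KCAL * temperature * math.log(ka_strong / ka_weak)

# Association constants quoted in the talk (liters per mole).
preference = binding_preference(1400.0, 120.0)
```

With the quoted constants this comes out close to the 1.45 kcal per mole figure given in the talk; the small residual depends on the temperature and on rounding the ratio to 12.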

01:02:01 Though qualitatively reasonable, the more refined question is whether 1.45 kcals per mole is

01:02:08 quantitatively reasonable for the gain of the second hydrogen bond.

01:02:13 One should realize