Transcript: Molecular Modeling for Biological Systems (Supercomputer Teleconference) Part 2
1990-Jan-24
These captions and transcript were generated by a computer and may contain errors. If there are significant errors that should be corrected, please let us know by emailing digital@sciencehistory.org.
00:00:00 One should realize that when a complex is formed,
00:00:03 the price for the loss in translational entropy
00:00:06 is only paid once, so the entropic cost
00:00:09 of the second hydrogen bond is relatively small.
00:00:12 Also, in the gas phase with our potential functions,
00:00:15 acetic acid and pyrazine have an optimal interaction energy
00:00:19 of minus 6.9 kilocalories per mole,
00:00:22 while the interaction with the more basic pyridine
00:00:25 is somewhat stronger at minus 7.3.
00:00:28 So these hydrogen bonds have strengths
00:00:31 around 7 kilocalories per mole in the gas phase.
00:00:34 Has this been damped out by the solvent
00:00:37 so that only 1.45 kcals per mole in free energy
00:00:41 remains for the loss of a full hydrogen bond
00:00:44 in going from pyrazine to pyridine with Rebek's diacid?
00:00:49 It was our feeling that this damping seemed too great.
00:00:53 Consequently, we decided to explore
00:00:56 the structure and energetics with fluid simulations.
00:00:59 The system was set up using an x-ray structure
00:01:02 for the host obtained from Professor Rebek,
00:01:05 and we had to develop the requisite torsional potentials,
00:01:08 though the host is and was designed to be quite rigid,
00:01:12 including the buttressing methyl groups on the acridine.
00:01:15 The simulations were aimed at obtaining
00:01:18 the relative free energy of binding
00:01:20 for pyrazine versus pyridine.
00:01:22 Thus, the calculations involved mutating pyrazine to pyridine
00:01:25 both bound to the host and unbound
00:01:28 as represented in the illustrated thermodynamic cycle.
00:01:32 Roughly 300 chloroform molecules were included,
00:01:37 and the calculations for the bound system
00:01:39 were initiated with the guest in the cleft of the host.
00:01:43 In the thermodynamic cycle, we have on the top line
00:01:46 the host-binding pyrazine and on the bottom line
00:01:49 the binding of pyridine.
00:01:51 Rather than trying to compute the absolute free energies
00:01:54 of binding corresponding to the horizontal arrows,
00:01:58 the computationally more straightforward
00:02:00 vertical mutations were performed.
00:02:03 It was found that pyrazine is better solvated than pyridine
00:02:06 by 0.3 kilocalories per mole,
00:02:09 while the conversion of the bound complex with pyrazine
00:02:12 to the one with pyridine costs 1.5 kcals per mole.
00:02:17 The difference in these numbers, 1.2 kilocalories per mole,
00:02:21 is then the calculated difference
00:02:23 in free energy of binding favoring pyrazine.
00:02:27 The agreement with the experimental value of 1.45
00:02:30 is very good.
00:02:32 So we have reproduced the experimental thermodynamic data.
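The cycle arithmetic described above can be sketched in a few lines. This is only an illustration using the numbers quoted in the talk; the function name and the sign convention (positive means pyrazine binds more strongly) are assumptions made for this sketch.

```python
def relative_binding_free_energy(dg_mutation_bound, dg_mutation_unbound):
    """Relative binding free energy from the two vertical legs of the cycle.

    Both arguments are free energies (kcal/mol) for mutating pyrazine to
    pyridine: once in the host's cleft, once free in chloroform. Because
    free energy is a state function, the difference of the vertical legs
    equals the difference of the horizontal (binding) legs.
    """
    return dg_mutation_bound - dg_mutation_unbound


# Values quoted in the talk (kcal/mol).
DG_BOUND = 1.5      # pyrazine -> pyridine inside the complex
DG_UNBOUND = 0.3    # pyrazine -> pyridine free in chloroform

ddg = relative_binding_free_energy(DG_BOUND, DG_UNBOUND)
print(f"Calculated preference for pyrazine: {ddg:.1f} kcal/mol")
print("Experimental value quoted in the talk: 1.45 kcal/mol")
```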
00:02:37 The remaining issue is the structure.
00:02:40 The beauty of the simulations is that one obtains
00:02:43 total structural information as well as the thermodynamics.
00:02:48 The next picture shows the pyrazine complex,
00:02:52 just one snapshot from the simulation.
00:02:55 However, we have averaged the structure for the complex
00:02:59 over the entire run, and in fact,
00:03:01 the illustrated structure was chosen
00:03:04 because it provides an excellent depiction
00:03:06 of the average structure.
00:03:08 There's only one good hydrogen bond on average.
00:03:12 The cleft appears to be too small
00:03:14 to optimally accommodate pyrazine.
00:03:16 So there is the one good hydrogen bond
00:03:18 and then a weaker secondary electrostatic interaction
00:03:21 between the other acid group
00:03:23 and the more remote pyrazine-nitrogen.
00:03:26 However, the secondary interaction
00:03:28 can account for the 1.45 kilocalories per mole
00:03:31 preference for binding pyrazine.
00:03:34 During the simulation, of course, everything moves.
00:03:37 The pyrazine tumbles,
00:03:39 sometimes forming the hydrogen bond
00:03:41 on one side and then on the other.
00:03:43 Though it never sits smack
00:03:45 between the acid groups buried in the cleft.
00:03:48 It floats above the cleft
00:03:50 and occasionally sits more symmetrically
00:03:52 with two highly bent hydrogen bonds
00:03:55 to the acid groups.
00:03:57 Incidentally, the computations were performed
00:04:00 for the acid groups both anti, as was shown, and syn.
00:04:04 The binding results were the same in both cases.
00:04:08 With pyridine, the structure is, on average,
00:04:11 even more tipped up than is illustrated,
00:04:13 though there is always one good hydrogen bond
00:04:16 between the pyridine-nitrogen and one of the acid groups.
00:04:20 So we believe that the pyrazine complex
00:04:23 does not exhibit the two-point binding
00:04:25 that has been postulated,
00:04:27 but there is one strong hydrogen bond
00:04:29 and a weaker secondary electrostatic interaction.
00:04:32 Can this position be supported?
00:04:35 We have made an attempt
00:04:37 by trying to address the following question.
00:04:40 If we had the two-point binding for pyrazine,
00:04:43 what would be the binding preference over pyridine?
00:04:46 A bigger cleft is needed,
00:04:48 and we have considered the model system that is illustrated.
00:04:52 Gas phase calculations yielded an optimal separation
00:04:56 of 8.4 angstroms between oxygens
00:04:59 for the complex with two acetic acid molecules and pyrazine.
00:05:03 The average separation in Rebek's host
00:05:06 is under 8 angstroms according to the simulations.
00:05:09 Noting this, we then pinned two acetic acid molecules
00:05:13 in a box of chloroform
00:05:15 with the oxygens fixed at a separation of 8.4 angstroms.
00:05:19 Everything else was allowed to move,
00:05:21 and the mutation of pyrazine to pyridine was repeated.
00:05:25 This time, the pyrazine stays between the acids,
00:05:28 as shown in the next illustration.
00:05:31 It might be noted from this figure
00:05:33 that the torsional motion for the acid hydrogens
00:05:36 is witnessed by the non-planarity
00:05:39 of the acetic acid molecule on the right.
00:05:42 Also, the pyrazine wobbles in the cleft,
00:05:45 though it maintains the two hydrogen bonds.
00:05:48 The cost in mutating the complexed pyrazine to pyridine
00:05:51 is now 3.9 kilocalories per mole,
00:05:54 giving a net difference in binding free energy
00:05:57 of 3.6 kilocalories per mole.
00:05:59 This translates to a Ka ratio of 405
00:06:03 as compared with the factor of 12 for Rebek's host.
00:06:07 So we feel that a Ka ratio on the order of 400
00:06:11 would be diagnostic of the two-point binding,
00:06:14 and that the ratio of 12 is inadequate.
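The conversion from a free energy difference to a Ka ratio follows from ΔG = -RT ln Ka. A minimal sketch is given here, assuming a temperature of 298.15 K, which is not stated in the talk; the exact ratio depends on the temperature and on how the free energies were rounded.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def ka_ratio(ddg_kcal, temperature_k=298.15):
    """Ratio of association constants for a binding free energy difference.

    ddg_kcal is the difference in binding free energy (kcal/mol), taken as
    positive when the first guest binds more strongly. The temperature
    default is an assumption for this sketch.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


# Differences quoted in the talk (kcal/mol).
print(f"Model two-point binding, 3.6 kcal/mol: ratio ~ {ka_ratio(3.6):.0f}")
print(f"Experimental host, 1.45 kcal/mol: ratio ~ {ka_ratio(1.45):.0f}")
```

At this assumed temperature, 3.6 kcal/mol gives a ratio of a few hundred, in line with the "order of 400" quoted, and 1.45 kcal/mol gives roughly 12.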
00:06:17 Furthermore, this example nicely illustrates
00:06:20 how theory and experiment can go hand-in-hand
00:06:23 in fully characterizing the thermodynamic
00:06:26 and structural issues associated
00:06:28 with bioorganic host-guest chemistry.
00:06:31 The recent interest in DNA triple helices, ribozymes,
00:06:36 and now aggregates such as the guanine quartet
00:06:40 require renewed inquiry
00:06:42 into the association of nucleotide bases and analogs.
00:06:46 In this vein, we have been intrigued
00:06:49 by triply hydrogen-bonded systems
00:06:51 and their association in chloroform.
00:06:53 Nature has provided the beautiful example
00:06:56 of guanine cytosine,
00:06:58 which has been estimated by Lord and Rich
00:07:01 using IR spectroscopy
00:07:03 to have a Ka of 10 to the 4th to 10 to the 5th
00:07:06 in deuterochloroform, as illustrated.
00:07:09 Recently, Kelly has obtained a similar Ka,
00:07:13 1.7 times 10 to the 4th,
00:07:15 for a related complex
00:07:17 where the amino pyrimidinone ring of guanine
00:07:21 has been replaced by an amino pyridone.
00:07:24 The impact of the third hydrogen bond
00:07:27 appears clear in comparisons
00:07:29 with the Ka's of about 10 to the 2
00:07:32 for doubly hydrogen-bonded systems
00:07:34 such as adenine with uracil.
00:07:37 However, the data in the next figure
00:07:40 make the situation look more complicated.
00:07:43 These triply hydrogen-bonded cases
00:07:46 show Ka's in the doubly hydrogen-bonded range,
00:07:49 some 2 to 3 orders of magnitude weaker
00:07:52 than for guanine cytosine.
00:07:54 These systems involve binding of uracil or thymine
00:07:57 with a diaminopurine or a diaminopyridine.
00:08:01 The former was studied by Lord and Rich,
00:08:04 and the latter has been studied by Hamilton and others.
00:08:08 The basic types of hydrogen bonds
00:08:10 are the same in all cases
00:08:12 with two O···H-N and one N-H···N hydrogen bonds.
00:08:18 It should be noted that analogs of the nucleotide bases
00:08:21 continue to be important chemotherapeutics.
00:08:24 2,6-diaminopurine
00:08:26 was one of the earliest anti-leukemia drugs
00:08:29 developed by Gertrude Elion,
00:08:31 while AZT and DDI
00:08:33 are the current preferences for control of AIDS.
00:08:36 The binding differential provides an obvious challenge,
00:08:39 which we pursued through Monte Carlo simulations
00:08:42 using a thermodynamic cycle
00:08:44 to address the relative binding free energies.
00:08:47 Simulations were run for guanine cytosine, GC,
00:08:52 in chloroform,
00:08:54 being mutated to the complex between uracil
00:08:57 and the parent 2,6-diaminopyridine, DAP,
00:09:02 illustrated in the next figure,
00:09:05 and also GC being mutated to AU.
00:09:10 The thermodynamic results are summarized as shown.
00:09:15 The calculations have some added novelty
00:09:18 in that each cycle involved three separate mutations.
00:09:21 For example, the mutations G to DAP,
00:09:24 C to U, and GC to U-DAP
00:09:28 were performed to obtain the required Ka ratio.
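One way to write out how the three mutations combine is shown here. The symbols are introduced for this sketch only and do not come from the talk; all three mutation free energies are taken to be computed in chloroform.

```latex
% Binding legs of the cycle:
%   G + C     -> GC        \Delta G_{bind}(GC)
%   DAP + U   -> U-DAP     \Delta G_{bind}(U\text{-}DAP)
% Mutation legs:
%   G  -> DAP      \Delta G_{1}
%   C  -> U        \Delta G_{2}
%   GC -> U-DAP    \Delta G_{3}
\begin{align}
\Delta\Delta G_{bind}
  &= \Delta G_{bind}(U\text{-}DAP) - \Delta G_{bind}(GC) \\
  &= \Delta G_{3} - \Delta G_{1} - \Delta G_{2}, \\
\frac{K_a(GC)}{K_a(U\text{-}DAP)}
  &= \exp\!\left(\frac{\Delta\Delta G_{bind}}{RT}\right).
\end{align}
```

The two monomer mutations replace the single unbound leg of the earlier host-guest cycle, because here both partners change.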
00:09:32 Naturally, potential functions
00:09:34 for the nucleotide bases were needed.
00:09:36 These were obtained by fitting the OPLS parameters
00:09:39 in an all-atom representation
00:09:41 to the results of ab initio molecular orbital calculations
00:09:44 on complexes of the bases or fragments
00:09:47 and a water molecule in many orientations.
00:09:50 The key thermodynamic result from the simulations
00:09:53 was that GC is predicted to be better bound
00:09:57 by a Ka ratio of 10 to the 5th
00:10:00 than either U with DAP or A with U.
00:10:05 This implies that the predicted binding of U with DAP
00:10:09 and A with U is similar,
00:10:12 which is consistent with the Ka ratios
00:10:15 of about 100 that are observed for the Ka's of complexes of these types.
00:10:20 The much stronger binding for guanine cytosine
00:10:23 has also been reproduced,
00:10:25 though the predicted difference is even greater
00:10:28 than what has been estimated experimentally.
00:10:31 The simulations reveal that a key component
00:10:34 of the stronger binding for GC
00:10:36 is an intrinsically strong interaction for this complex.
00:10:40 This is evident from the results
00:10:42 of gas phase optimizations as illustrated.
00:10:46 The optimal interaction energy for GC
00:10:48 is minus 22 kilocalories per mole
00:10:52 with the present potential functions.
00:10:54 This value is within a kcal per mole
00:10:57 of other theoretical results
00:10:59 and an experimental value from mass spectroscopy.
00:11:02 In contrast, the optimal interaction energy
00:11:06 for the U-DAP complex
00:11:08 is only minus 11 kcals per mole.
00:11:12 The shock of this figure
00:11:15 is reinforced by the optimal interaction
00:11:18 of minus 10.6 kcals per mole
00:11:22 that is found for adenine with uracil
00:11:25 as shown in the next illustration.
00:11:28 That is, addition of the third hydrogen bond
00:11:31 in DAP with uracil
00:11:33 contributes essentially nothing
00:11:35 to the interaction energy.
00:11:38 Since the primary interactions
00:11:40 have the same types
00:11:42 in the triply hydrogen-bonded complexes,
00:11:44 a more subtle origin
00:11:46 must be sought for the discrepancies.
00:11:48 Though a multipole analysis could be given,
00:11:51 a simpler approach is provided
00:11:53 by considering the secondary interactions
00:11:55 shown for GC.
00:11:57 In multiply hydrogen-bonded systems,
00:12:00 not only are the directly hydrogen-bonded atoms
00:12:03 close together,
00:12:04 but also the atoms in adjacent hydrogen bonds
00:12:07 are close as well.
00:12:09 The significant partial charges and short distances
00:12:12 make these secondary electrostatic interactions important.
00:12:16 For GC, the illustrated secondary distances
00:12:20 are all under 3.6 angstroms.
00:12:23 Taking a closer look,
00:12:24 the nature of each secondary interaction
00:12:26 can be ascertained from the partial charges.
00:12:29 Starting at the top with GC,
00:12:31 the HH and NO interactions would be repulsive,
00:12:35 while the NH and HO interactions are attractive.
00:12:40 Thus, the secondary interactions cancel,
00:12:42 and there is no net secondary effect.
00:12:46 If the situation were the same for DAP with uracil,
00:12:50 we would obviously not have much to talk about.
00:12:54 However, as implied in the next figure,
00:12:57 all four secondary interactions
00:12:59 are in fact repulsive for DAP with uracil.
00:13:04 At this point, all we have to do is postulate
00:13:07 that the primary hydrogen-bonding interactions
00:13:09 are worth about 7.5 kilocalories per mole each
00:13:13 and that the secondary interactions are worth 2.5.
00:13:17 This leads to a net interaction
00:13:19 of 3 times minus 7.5 or minus 22.5 for GC,
00:13:25 and the minus 22.5 plus 4 times 2.5
00:13:30 for the four destabilizing interactions,
00:13:33 giving a net of minus 12.5 for DAP with uracil.
00:13:39 The results can also be generalized
00:13:41 for triply hydrogen-bonded systems
00:13:43 as summarized in the next illustration.
00:13:46 There are three possible arrangements.
00:13:48 The worst case has the partially positive and negative sites
00:13:51 alternating on each molecule.
00:13:54 This leads to the four repulsive secondary interactions,
00:13:57 as in DAP with uracil.
00:14:00 And for GC, on the other hand,
00:14:04 we have the intermediate situation
00:14:06 with a net index of zero secondary interactions.
00:14:10 The best situation would be to have one molecule
00:14:12 with all of the hydrogen-bond donor sites
00:14:15 and the other with all of the acceptor sites.
00:14:17 This leads to a net
00:14:19 of four constructive secondary interactions.
00:14:22 Based on the above results,
00:14:24 systems of the last type might exhibit optimal interactions
00:14:27 around 10 kcal per mole stronger than for GC.
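The additive model just described can be written as a short sketch. The two energy increments are the values postulated in the talk; the function name and the counts passed in for each arrangement are an illustrative encoding of the three cases, not part of the original work.

```python
PRIMARY_HBOND = -7.5   # kcal/mol per primary hydrogen bond (postulated in the talk)
SECONDARY = 2.5        # kcal/mol magnitude of each secondary interaction


def net_interaction(n_primary, n_repulsive, n_attractive):
    """Net interaction energy (kcal/mol) in the simple additive model.

    Each repulsive secondary contact adds +2.5 and each attractive
    secondary contact adds -2.5 on top of the primary hydrogen bonds.
    """
    return n_primary * PRIMARY_HBOND + (n_repulsive - n_attractive) * SECONDARY


# The three arrangements for triply hydrogen-bonded pairs.
arrangements = {
    "alternating sites (DAP with uracil)": (3, 4, 0),
    "mixed sites (GC)": (3, 2, 2),
    "all donors facing all acceptors": (3, 0, 4),
}

for label, counts in arrangements.items():
    print(f"{label}: {net_interaction(*counts):.1f} kcal/mol")
```

In this model the all-donor/all-acceptor arrangement comes out 10 kcal/mol more favorable than GC, which is the basis for the estimate given above.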
00:14:32 The key message for molecular recognition
00:14:35 is that very strong binding can be obtained
00:14:38 by getting all of the plus ducks on one side
00:14:41 and all of the minus ducks on the other.
00:14:44 This is the end of my short discourse on hydrogen bonding.
00:14:49 It is clear that fluid simulations
00:14:51 are going to play an increasingly important role
00:14:53 in molecular design.
00:14:55 Detailed consideration of solvent effects is critical,
00:14:58 and the inadequacies of conclusions
00:15:01 just based on gas phase data are extreme.
00:15:05 My co-workers on these projects are...
00:15:09 are shown on the slide,
00:15:11 including Stephane Boudon,
00:15:13 Toan Nguyen,
00:15:14 Scott Wierschke,
00:15:15 Jim Blake,
00:15:17 Julianto Pranata,
00:15:18 and Julian Tirado-Rives.
00:15:20 Support has been provided by the NIH and NSF.
00:15:39 Again, we're ready to hear from you, the viewers.
00:15:42 Go to the phones now
00:15:43 and call in your questions and comments.
00:15:45 This is the time that you have to interact
00:15:47 with today's speaker.
00:15:49 Art, while we are waiting for calls for Bill,
00:15:52 do you have any comments or follow-up questions?
00:15:54 Yeah, those ducks you were talking about,
00:15:57 if I'm not mistaken,
00:15:59 they were swimming around in chloroform.
00:16:01 All right.
00:16:02 What happens if you put them in water?
00:16:07 Okay.
00:16:08 Well, trying to think of something cute here.
00:16:14 The ducks float better in water,
00:16:17 and the problem with that
00:16:20 is that if we tried to study
00:16:22 the base pair association in water,
00:16:25 it's well known from experiments
00:16:27 that the bases don't hydrogen bond in water.
00:16:30 That is, the isolated bases, they stack.
00:16:33 And we were, in these studies,
00:16:36 interested in the hydrogen bonding aspects
00:16:40 of the base pair association,
00:16:42 so that's why we were studying chloroform.
00:16:44 It's why people who are interested
00:16:46 in hydrogen bonding are doing
00:16:48 their experimental work in chloroform as well.
00:16:51 The point is, water competes very effectively
00:16:54 for the hydrogen bonding,
00:16:56 and until you have the structure
00:16:58 imposed in a full-blown nucleic acid,
00:17:01 one doesn't see this hydrogen bonding in water.
00:17:05 So, as a follow-up,
00:17:07 what are the prospects
00:17:09 of actually doing that simulation?
00:17:11 And maybe in the context of an earlier question
00:17:14 about, say, Monte Carlo
00:17:16 versus dynamics techniques.
00:17:18 We have no technical problem
00:17:20 running the simulation in water.
00:17:22 The problem is that we can't
00:17:24 address the hydrogen bonding.
00:17:26 We'll see stacked structures.
00:17:28 So that, I mean, there is no
00:17:30 technical problem, and we may,
00:17:32 in fact, do some simulations in water
00:17:34 where we constrain the system
00:17:36 to be hydrogen bonded.
00:17:38 Your other question, going back
00:17:40 to the Monte Carlo versus
00:17:42 dynamics issue,
00:17:44 we... all of the results
00:17:46 I showed today
00:17:48 were from Monte Carlo simulations
00:17:50 with our BOSS program.
00:17:52 We do find that
00:17:54 for these smaller molecule problems,
00:17:56 that is, non-protein problems,
00:17:58 that the Monte Carlo procedures
00:18:01 are very effective.
00:18:03 We represent the...
00:18:05 everything in internal coordinates.
00:18:07 In molecular dynamics,
00:18:09 you're doing Cartesian dynamics.
00:18:11 That is, each particle
00:18:13 moves effectively independently.
00:18:15 Uh...but in Monte Carlo,
00:18:17 we do our moves using
00:18:19 the internal coordinates.
00:18:21 And that has some advantages,
00:18:23 particularly with the free energy calculations.
00:18:25 Now, we do protein dynamics ourselves
00:18:27 using, uh...Peter's AMBER program.
00:18:29 Uh...and we find for large systems,
00:18:31 coupled systems like that,
00:18:33 that molecular dynamics
00:18:35 is certainly, at this point in time,
00:18:37 a more efficient way to go.
00:18:39 But we like Monte Carlo
00:18:41 for the smaller systems
00:18:43 because we can use
00:18:45 the internal coordinate representation,
00:18:47 and frankly, it makes the setup
00:18:49 and execution of the free energy
00:18:51 calculations quite a bit easier.
00:18:53 Bill, your first caller
00:18:55 is Barry, who is on line 5.
00:18:57 Barry, please go ahead.
00:18:59 I had a question about your, uh...
00:19:01 work with the, uh...
00:19:03 I have a problem here.
00:19:05 I'm hearing you.
00:19:07 About the work with the pyrazine.
00:19:09 Pyrazine is known to be very different
00:19:11 than pyridine, uh...quantitatively,
00:19:13 because it's much weaker,
00:19:15 um...of lone pairs on the nitrogen.
00:19:17 Uh...hence, uh...
00:19:19 pKa of pyrazine
00:19:21 is three or more orders of magnitude
00:19:23 lower than pyridine, and one would
00:19:25 assume that the hydrogen bonding
00:19:27 to any, uh...carboxylic acid
00:19:29 would be much weaker.
00:19:30 How are these, uh...
00:19:32 Um...
00:19:34 I'm still having problems here.
00:19:36 Uh...technical problems.
00:19:38 Uh...how are these, uh...
00:19:40 these weaker, uh...
00:19:42 lone pairs on pyrazine,
00:19:44 uh...
00:19:46 are they explicitly included
00:19:48 in the calculations?
00:19:49 Do they naturally fall out
00:19:50 as a result of what you, uh...
00:19:52 of what you do, or, uh...
00:19:54 how are they included?
00:19:58 The question about pyrazine
00:20:00 versus pyridine, we, of course,
00:20:02 use different potential function
00:20:04 parameters for the two molecules,
00:20:06 and they have been optimized
00:20:08 by doing, for example,
00:20:10 fluid simulations of pure liquid pyrazine
00:20:12 and pure liquid pyridine.
00:20:14 We've also studied, uh...ab initio,
00:20:17 or compared our, uh...results
00:20:19 with ab initio results
00:20:21 on pyrazine-water and pyridine-water complexes.
00:20:24 As it turns out,
00:20:26 our potential functions do give the order
00:20:29 that's, uh...expected by your comments.
00:20:31 That is, uh...we find
00:20:33 that in the gas phase
00:20:35 with our potential functions
00:20:37 that a pyrazine-water hydrogen bond
00:20:39 is about 1 to 1 1⁄2 kcals per mole
00:20:42 weaker than a pyridine,
00:20:44 uh...water hydrogen bond.
00:20:46 So that effect,
00:20:48 the lower basicity of the pyrazine,
00:20:50 is built into the potential functions
00:20:52 and is incorporated, therefore,
00:20:54 into the calculations.
00:20:56 All right, we have Frank
00:20:58 on line 6, and I would like
00:21:00 to remind the callers not to worry
00:21:02 if you do hear something on the line.
00:21:04 We can hear you fine here,
00:21:06 so just keep talking.
00:21:08 Okay, Frank, what is your question?
00:21:10 Hi, Bill, this is Frank Brown.
00:21:12 I was asking a question
00:21:14 about the pyrazine example again.
00:21:16 You were saying that some of this
00:21:18 came from secondary interaction
00:21:20 with the other nitrogen.
00:21:22 I was wondering if you would
00:21:24 comment on the possible fact
00:21:26 of bumping the electrostatics
00:21:28 on the nitrogen up
00:21:30 being able to do this
00:21:32 with only one nitrogen in the ring.
00:21:34 In other words,
00:21:36 bump up the binding constant
00:21:38 by having donating groups
00:21:40 as you would in a drug molecule.
00:21:42 Uh...certainly, Frank.
00:21:44 In the case of pyridine,
00:21:46 pyridine does bind
00:21:48 to the, uh...Rebek's diacid
00:21:50 with a respectable Ka of 120
00:21:52 through the one hydrogen bond.
00:21:54 So there's no question
00:21:56 that one could see enhanced, uh...binding
00:21:58 to Rebek's diacid
00:22:00 with substituted pyridines,
00:22:02 and as you suggest,
00:22:04 putting donating groups
00:22:06 on the pyridine ring
00:22:08 should facilitate that.
00:22:10 So unquestionably,
00:22:12 one can play
00:22:14 structure reactivity games,
00:22:16 structure binding games
00:22:18 with this system
00:22:20 as well as with any host-guest system
00:22:22 including drug-like systems.
00:22:24 We have a call from Cleveland.
00:22:26 It is Ken on line 8.
00:22:28 Please go ahead.
00:22:30 Hello, Dr. Jorgensen?
00:22:32 I have a general question,
00:22:34 um...relating to what sort of
00:22:36 modifications were, uh...
00:22:38 were done
00:22:40 for the OPLS force field
00:22:42 in non-aqueous solvents.
00:22:44 The...
00:22:46 the force field that we use,
00:22:48 the OPLS potentials,
00:22:50 uh...remain the same
00:22:52 independent of the solvent.
00:22:55 They are considered to be transferable.
00:22:57 So the potential function parameters,
00:22:59 the charges and Lennard-Jones parameters,
00:23:01 we would use on a solute,
00:23:03 would remain the same
00:23:05 independent of the solvent that we choose.
00:23:08 And then, the solvent atoms
00:23:10 have their own, uh...charges
00:23:12 and Lennard-Jones parameters,
00:23:14 and the cross interactions
00:23:16 are then determined by, uh...standard mixing rules.
00:23:19 So there's no difference in the potential functions
00:23:22 dependent on the medium.
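A minimal sketch of a site-site energy of the kind described, Coulomb plus Lennard-Jones with combining rules for the cross terms, is given here. The talk says only "standard mixing rules"; the geometric-mean rule used below is an assumption about what that meant, and any parameter values passed in are hypothetical, not actual OPLS parameters.

```python
import math

COULOMB_CONST = 332.06  # converts e^2/Angstrom to kcal/mol


def pair_energy(q_i, sigma_i, eps_i, q_j, sigma_j, eps_j, r):
    """Coulomb plus Lennard-Jones energy (kcal/mol) between two sites.

    Charges are in units of e, sigma and r in Angstroms, epsilon in
    kcal/mol. Cross terms use geometric means of the per-site
    parameters (an assumed combining rule for this sketch). The same
    per-site parameters are used whatever the surrounding solvent is.
    """
    sigma_ij = math.sqrt(sigma_i * sigma_j)
    eps_ij = math.sqrt(eps_i * eps_j)
    sr6 = (sigma_ij / r) ** 6
    coulomb = COULOMB_CONST * q_i * q_j / r
    lennard_jones = 4.0 * eps_ij * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones


if __name__ == "__main__":
    # Hypothetical solute and solvent sites, for illustration only.
    energy = pair_energy(-0.5, 3.2, 0.17, 0.4, 3.4, 0.30, r=3.0)
    print(f"Illustrative site-site energy: {energy:.2f} kcal/mol")
```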
00:23:24 How do you, uh...
00:23:26 treat, uh...ions in...
00:23:28 in solution?
00:23:30 Uh...ions in solution
00:23:32 are, uh...
00:23:34 potentially more hazardous
00:23:36 in the sense that the
00:23:38 many-body effects become
00:23:40 more significant.
00:23:42 In our own case, we try to avoid
00:23:44 studying ions
00:23:46 where the individual ion-solvent
00:23:48 interactions are above
00:23:50 about 20 kilocalories per mole,
00:23:52 because then the fact
00:23:54 that we are using
00:23:56 two-body potential functions,
00:23:58 I believe, becomes more of a problem
00:24:00 above that, uh...energy range
00:24:02 because of the, uh...strong polarization
00:24:04 of the solvent molecules by the ions,
00:24:06 which we don't have properly represented.
00:24:08 Okay, we need to go to New Jersey
00:24:10 where Willis is on the line.
00:24:12 Willis.
00:24:14 Are you still there?
00:24:18 Okay, perhaps he'll call back.
00:24:20 Would you like to continue?
00:24:22 Getting back to the...
00:24:24 kind of the larger end of the computations,
00:24:26 Peter mentioned something
00:24:28 about parallel computation.
00:24:30 I was wondering what you thought
00:24:32 about it, especially considering that
00:24:34 you use a lot of Monte Carlo.
00:24:36 Yeah.
00:24:38 We're certainly also very excited
00:24:40 about the prospects of
00:24:42 highly parallel computers
00:24:44 and their impact on computational
00:24:46 chemistry.
00:24:48 The Monte Carlo programs
00:24:50 are, uh...sort of notorious
00:24:52 for not being that easy
00:24:54 to vectorize.
00:24:56 We do see significant enhancements
00:24:58 on our Cyber 205 at Purdue
00:25:00 when we have vectorized
00:25:02 our Monte Carlo codes.
00:25:04 But parallelism is very exciting,
00:25:06 particularly for the free-energy
00:25:08 calculations, where we have to run
00:25:10 these multiple simulations
00:25:12 for different values
00:25:14 of the reaction coordinate
00:25:16 or this lambda parameter
00:25:18 in the free-energy calculations.
00:25:20 And those we could run
00:25:22 if we had to do 20 such
00:25:24 calculations, just trivial
00:25:26 parallelism, put them on 20
00:25:28 different processors, and away
00:25:30 we go.
00:25:32 So that would be very exciting
00:25:34 for us, just to have systems
00:25:36 with many processors that we
00:25:38 could even use independently.
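The "trivial parallelism" described here can be illustrated with a toy free energy perturbation. Everything in this sketch is an assumption made for illustration: the system is a one-dimensional harmonic well whose force constant changes with lambda, configurations are drawn directly from the exact Boltzmann distribution rather than by Monte Carlo moves, and threads stand in for the separate processors mentioned.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

KT = 0.5925  # kcal/mol, roughly RT near 298 K (assumed)


def force_constant(lam, k_start=1.0, k_end=2.0):
    """Toy coupling: force constant interpolated linearly in lambda."""
    return (1.0 - lam) * k_start + lam * k_end


def window_free_energy(job):
    """Perturbation estimate of the free energy change for one window."""
    lam_i, lam_j, n_samples, seed = job
    rng = random.Random(seed)  # independent, reproducible stream per window
    k_i, k_j = force_constant(lam_i), force_constant(lam_j)
    width = math.sqrt(KT / k_i)  # Boltzmann width of the harmonic well
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, width)
        delta_u = 0.5 * (k_j - k_i) * x * x
        total += math.exp(-delta_u / KT)
    return -KT * math.log(total / n_samples)


def run_windows(n_windows=20, n_samples=20000):
    """Run every lambda window as an independent job and collect results."""
    lams = [i / n_windows for i in range(n_windows + 1)]
    jobs = [(lams[i], lams[i + 1], n_samples, i) for i in range(n_windows)]
    with ThreadPoolExecutor(max_workers=n_windows) as pool:
        return list(pool.map(window_free_energy, jobs))


if __name__ == "__main__":
    increments = run_windows()
    exact = 0.5 * KT * math.log(force_constant(1.0) / force_constant(0.0))
    print(f"Sum over windows: {sum(increments):.4f} kcal/mol")
    print(f"Analytic result:  {exact:.4f} kcal/mol")
```

No window needs data from any other until the final sum, which is why the jobs can be placed on separate processors with no communication between them.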
00:25:40 All right, we have Willis back
00:25:42 on the line again.
00:25:44 We'll have your question.
00:25:46 The question I wanted to ask
00:25:48 Dr. Jorgensen is, did he
00:25:50 consider using pyridazine where
00:25:52 the two nitrogens are beside each other
00:25:54 in the ring for some of your
00:25:56 host-guest binding, guest-host
00:25:58 binding studies?
00:26:00 Yes, Rebek has
00:26:02 studied the binding of
00:26:04 pyridazine as well with his
00:26:06 host, and the binding
00:26:08 of it is about the same as
00:26:10 pyridine, as I recall.
00:26:12 The problem with
00:26:14 pyridazine in the
00:26:16 context of this host is
00:26:18 you're not going to be able to have
00:26:20 two linear hydrogen bonds to the
00:26:22 pyridazine. So even though they are
00:26:24 closer, that is an advantage. The hydrogen
00:26:26 bonds are going to be more bent
00:26:28 so that the measured Ka
00:26:30 doesn't reflect greater binding
00:26:32 in that case.
00:26:34 All right, Walt, what is your question, please?
00:26:36 With regard to
00:26:38 the
00:26:40 three bonded models that you have,
00:26:42 three hydrogen bonded models that you
00:26:44 have... Okay, just a second.
00:26:48 We're trying to iron out
00:26:50 the bugs. Perhaps Walt
00:26:52 will call back in just a moment.
00:26:56 I wish I had heard the beginning of the
00:26:58 question.
00:27:00 He's starting to talk about the
00:27:02 triply hydrogen bonded systems. They have
00:27:04 been receiving quite a bit of
00:27:06 attention lately. A paper
00:27:08 that may interest some of the viewers
00:27:10 is by Steve Benner
00:27:12 in the second to last issue
00:27:14 of Nature where he has a very nice
00:27:16 discourse
00:27:18 and experimental results
00:27:20 on making nucleotide
00:27:22 base analogs.
00:27:24 Some of our
00:27:26 work
00:27:28 coincides very nicely
00:27:30 with his studies.
00:27:32 Just about
00:27:34 to take a cough drop here.
00:27:36 I was still curious
00:27:38 in pursuing that
00:27:40 avenue because
00:27:42 of the obvious relevance to
00:27:44 control of DNA
00:27:46 structure and so forth.
00:27:48 Sorry.
00:27:54 What do you think you would need
00:27:56 in terms of computing power
00:27:58 to actually
00:28:00 attempt those kinds of calculations
00:28:02 where given a sequence you
00:28:04 would try to order the
00:28:06 relative binding strengths?
00:28:12 I think we can do
00:28:15 reasonable computations
00:28:17 at this time on predicting
00:28:19 let's say the effects
00:28:21 on DNA melting temperature
00:28:23 for base pair
00:28:25 mismatches or the use
00:28:27 of some of these nucleotide
00:28:29 base analogs. I think that the
00:28:31 technology is in place
00:28:33 for that. Some of the
00:28:35 biggest problems with simulations in
00:28:37 nucleic acids are what to do
00:28:39 with the ion atmosphere. That is
00:28:41 where to place the counter ions
00:28:43 and the fact that you
00:28:45 are dealing with a highly charged system.
00:28:47 And perhaps during the panel
00:28:49 discussion we could ask Peter
00:28:51 Kollman a little bit about that since he
00:28:53 is more experienced in that area
00:28:55 than I am. Alright. We have a
00:28:57 caller now on line 8. It is Chang Long.
00:28:59 Would you go ahead please?
00:29:01 Yes. In your simulations
00:29:03 it will be very important
00:29:05 for the criteria of a
00:29:07 hydrogen bonding. Definition
00:29:09 of a hydrogen bonding
00:29:11 especially in the distance and angles
00:29:13 and I wonder whether your program
00:29:15 has the flexibility
00:29:17 that throughout the
00:29:19 variability of distances your energy
00:29:21 can be changed accordingly?
00:29:24 Hydrogen bonding
00:29:26 from my viewpoint is
00:29:28 predominantly an electrostatic
00:29:30 phenomenon. That is it is
00:29:32 controlled by
00:29:34 the partial charge, partial charge
00:29:36 interactions. Now we
00:29:38 do not tell our program
00:29:40 at any point, ah, this is a hydrogen
00:29:42 bond or this isn't.
00:29:44 The interactions are simply controlled
00:29:46 by Coulomb's Law
00:29:48 between the molecules.
00:29:50 The only time when we would get involved
00:29:52 in definitions of a hydrogen bond
00:29:54 would be after the fact
00:29:56 in some analysis program
00:29:58 where we would be trying to actually
00:30:00 analyze for the
00:30:02 hydrogen bonding. And then
00:30:04 we might have to use an energetic
00:30:06 definition that, you know, only
00:30:08 consider something to be hydrogen bonded
00:30:10 if the interaction energy is below
00:30:12 let's say 3 kilocalories
00:30:14 per mole. Or we might
00:30:16 use a geometric definition.
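The two kinds of after-the-fact definition mentioned here might look like the following in an analysis script. This is a sketch only: the energetic cutoff reads "below 3 kilocalories per mole" as an interaction at least as attractive as -3 kcal/mol, and the distance and angle thresholds in the geometric test are assumed values, not ones given in the talk.

```python
def is_hbond_energetic(pair_energy_kcal, cutoff_kcal=-3.0):
    """Energetic criterion: the pair interaction is at least as
    attractive as the cutoff (default -3 kcal/mol)."""
    return pair_energy_kcal <= cutoff_kcal


def is_hbond_geometric(donor_acceptor_dist, dha_angle_deg,
                       max_dist=3.5, min_angle_deg=120.0):
    """Geometric criterion: donor-acceptor distance (Angstroms) and
    donor-H-acceptor angle (degrees). Thresholds are assumptions."""
    return donor_acceptor_dist <= max_dist and dha_angle_deg >= min_angle_deg


if __name__ == "__main__":
    # Hypothetical saved configurations: (energy, distance, angle).
    frames = [(-6.8, 2.9, 172.0), (-1.2, 3.9, 140.0), (-3.5, 3.1, 105.0)]
    for energy, dist, angle in frames:
        print(energy, is_hbond_energetic(energy), is_hbond_geometric(dist, angle))
```

Neither test enters the simulation itself; both are applied to saved configurations, so the two definitions can disagree on borderline geometries without affecting the computed energies.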
00:30:18 But in the actual course of the simulations
00:30:20 we never say, ah, this is a hydrogen
00:30:22 bond. That is all worked out
00:30:24 by the
00:30:26 intermolecular interactions
00:30:28 as a whole. Alright.
00:30:30 Walt, you're on line 6. What is your question
00:30:32 please? Yeah, as I understand
00:30:34 your modeling with the nucleic
00:30:36 acid bases, and particularly
00:30:38 the three bonded bases,
00:30:40 the third base contributes little to the stability
00:30:42 of the complex.
00:30:44 Could you say something about
00:30:46 the implications that that has for
00:30:48 high
00:30:50 AT and high GC base pairs
00:30:52 in our understanding, or
00:30:54 understanding that high GC base
00:30:56 pairs are much more stable
00:30:58 and we
00:31:00 impute that to their three bonded
00:31:02 character, but that doesn't seem
00:31:04 to be true from your calculations.
00:31:06 Thank you. Yeah, I think that
00:31:08 you missed something
00:31:10 in the results there.
00:31:12 GC does have the strongest
00:31:14 interaction, in the gas
00:31:16 phase, of about 22 kilocalories per mole
00:31:18 and a binding constant in chloroform
00:31:20 of at least 10 to the 5th.
00:31:22 AT, or AU, has a
00:31:24 binding constant in chloroform of 10 to the
00:31:26 2, and in our calculations
00:31:28 the interaction is about 10.5
00:31:30 kilocalories per mole. The difference
00:31:32 is in comparing GC
00:31:34 with the uracil
00:31:36 diaminopyridine type systems.
00:31:38 There, in the uracil diaminopyridine
00:31:40 systems, you do have the third
00:31:42 hydrogen bond, but it really
00:31:44 doesn't give you stronger
00:31:46 net binding than we see
00:31:48 with the AT
00:31:50 type pairs.
00:31:52 The difference is really GC
00:31:54 versus the uracil diaminopyridine.
00:31:56 Again, in the Benner
00:31:58 paper, if you look that one up,
00:32:00 you'll see that his
00:32:02 alternates for triply hydrogen
00:32:04 bonded systems are
00:32:06 predominantly of the uracil
00:32:08 diaminopyridine type,
00:32:10 so one isn't going to see
00:32:12 the strong GC
00:32:14 interactions with those systems either.
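For scale, the binding constants quoted in this answer can be converted to free energies with the standard relation dG = -RT ln K. This is a minimal sketch: the temperature of 298.15 K and the function name are assumptions, not values given in the talk.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(k_assoc, temperature=298.15):
    """Standard free energy of association, dG = -RT ln K, in kcal/mol.

    The temperature is an assumed room-temperature value.
    """
    return -R_KCAL * temperature * math.log(k_assoc)

# Binding constants in chloroform as quoted in the answer above.
for label, k in [("GC", 1e5), ("AT/AU", 1e2)]:
    print(f"{label}: K = {k:.0e} -> dG = {binding_free_energy(k):.1f} kcal/mol")
```

Under these assumptions the chloroform free energies come out near -6.8 and -2.7 kcal/mol, far smaller in magnitude than the quoted gas-phase interaction energies of about 22 and 10.5 kcal/mol.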
00:32:16 Alright, our last call is from
00:32:18 Shock, and he is on line 5. Shock, please
00:32:20 go ahead. Hello?
00:32:22 Yes, we hear you.
00:32:24 What is your question, please?
00:32:26 Hello? We can hear you.
00:32:28 Can you hear us?
00:32:30 The name is
00:32:32 Jack, and the question is
00:32:34 as follows.
00:32:36 The rationale that Dr. Jorgensen has
00:32:38 provided for the
00:32:40 difference in energy of
00:32:42 the GC pair
00:32:44 and 2,6-DAP
00:32:46 U
00:32:48 is clear.
00:32:50 On the other hand, the fact that
00:32:52 2,6-DAP U and AU
00:32:54 are so similar is puzzling.
00:32:56 May the explanation not lie
00:32:58 in the fact that in 2,6-DAP
00:33:00 you have a 2-amino
00:33:02 to a
00:33:04 carbonyl hydrogen bond
00:33:06 whereas in AU you have
00:33:08 a 2-CH
00:33:10 to the
00:33:12 carbonyl hydrogen bond
00:33:14 and the 2-CH
00:33:16 carbonyl hydrogen bond
00:33:18 isn't all that much different in energy
00:33:20 from the 2-amino
00:33:22 carbonyl hydrogen bond.
00:33:26 Yes, your first
00:33:28 statement is correct that the
00:33:30 AU systems have about
00:33:32 the same gas phase interaction
00:33:34 as the uracil
00:33:36 diaminopyridine. Trying
00:33:38 to dissect that is a little bit harder
00:33:40 comparing a doubly hydrogen
00:33:42 bonded system with a triply hydrogen bonded system
00:33:44 but I think one point that you did raise
00:33:46 is absolutely true.
00:33:48 The hydrogen on C2
00:33:50 in adenine
00:33:52 because it's between the two
00:33:54 pyrimidine nitrogens is quite acidic
00:33:56 and the interaction
00:33:58 of that hydrogen with
00:34:00 the non-formally
00:34:02 hydrogen bonded carbonyl
00:34:04 on uracil or thymine
00:34:06 is significant
00:34:08 and thank you for raising that.
00:34:10 It's time to move on. Thank you so much
00:34:12 Bill and Art and we'll be back with you shortly.
00:34:14 We did not have time
00:34:16 to take your call. I apologize.
00:34:18 I do want to remind you that we'll
00:34:20 see all of our speakers at the end
00:34:22 of the program for a panel discussion.
00:34:24 At that time the phones will be
00:34:26 open for questions you might have.
00:34:28 Now we will hear from Dr.
00:34:30 Jeffrey Blaney with his presentation
00:34:32 on a distance geometry
00:34:34 approach to ligand macromolecular
00:34:36 docking. Art?
00:34:38 Jeff represents a user's
00:34:40 point of view in this discussion
00:34:42 today being a
00:34:44 practicing medicinal chemist.
00:34:46 Although he did see the light early as
00:34:48 Peter Kollman referred to in his talk
00:34:50 and began applying modeling techniques
00:34:52 to the study of structure activity
00:34:54 relationships in understanding
00:34:56 drug activity. Jeff?
00:35:02 Thanks Art.
00:35:04 I'll begin today by focusing on
00:35:06 a key point in molecular modeling
00:35:08 and the different approaches that we might take.
00:35:10 That is one of comparing analytical methods
00:35:12 with methods that are oriented towards design.
00:35:14 The vast majority of modeling tools that we
00:35:16 have available to us, graphical and
00:35:18 computational tools, are analytical.
00:35:20 We can ask a question about a given structure
00:35:22 and get back answers to varying degrees of
00:35:24 accuracy about it.
00:35:26 To be able to use any of these methods
00:35:28 we really need to have a molecule to start with.
00:35:30 Typically we use these methods
00:35:32 to combine information about a family of active
00:35:34 molecules and ask what is it that
00:35:36 makes these molecules active
00:35:38 and can we correlate some
00:35:40 structural, measurable, or calculable
00:35:42 property of these molecules and rationalize
00:35:44 the activity of them in a structure activity
00:35:46 relationship and come up with
00:35:48 a model for
00:35:50 what the common receptor for the set of
00:35:52 molecules might look like and what their respective
00:35:54 binding modes are. If we're extremely
00:35:56 fortunate we'll have this information in the form
00:35:58 of a high resolution X-ray crystal structure
00:36:00 of the receptor.
00:36:02 In either case, the next goal is to
00:36:04 extrapolate from the receptor model to the
00:36:06 design of new molecules and predict
00:36:08 their activity prior to synthesis.
00:36:10 You've already heard a little bit today about
00:36:12 free energy perturbation methods which are
00:36:14 very powerful and show lots of promise
00:36:16 but are currently limited to making predictions
00:36:18 about fairly small changes between related
00:36:20 molecules. We'd like to be able to ask
00:36:22 that question of molecules that are structurally
00:36:24 entirely different from each other.
00:36:26 We'd really like to be able to predict
00:36:28 absolute free energies of binding of small
00:36:30 molecules to receptor. But it's clear
00:36:32 that we're not quite yet at that point
00:36:34 and as a result it's still incredibly difficult
00:36:36 to predict the activity of structures prior
00:36:38 to synthesis. It turns out
00:36:40 it's also quite difficult to design new structures.
00:36:42 The software
00:36:44 tools that we have tend to be entirely analytical
00:36:46 although they can help give you some
00:36:48 insight into what kind of molecule you might
00:36:50 want to design. But it still really
00:36:52 requires having a very creative and
00:36:54 clever organic or medicinal
00:36:56 chemist working with modeling tools
00:36:58 and chemical intuition
00:37:00 to successfully design new compounds.
00:37:02 In fact, the only approaches
00:37:04 revealed to date that have been successful in designing
00:37:06 new potential drug molecules
00:37:08 not related to any other known active compounds
00:37:10 have come from an intuitive
00:37:12 empirical modeling approach.
00:37:14 These have come from the work of
00:37:16 Beddell's group at Wellcome Labs
00:37:18 on the design of anti-sickling compounds
00:37:20 based on the crystal structure of hemoglobin
00:37:22 and from Ripka's group
00:37:24 at DuPont on the design of phospholipase A2
00:37:26 inhibitors. Both
00:37:28 are included in the list of references.
00:37:30 In both cases, they used fairly
00:37:32 simple qualitative modeling methods.
00:37:34 In Beddell's work, I think
00:37:36 they actually may have used wire Kendrew models.
00:37:38 In Ripka's work at
00:37:40 DuPont, computer graphics methods were used.
00:37:42 But I think the common theme in both
00:37:44 of them is that there was a lot of what you might call
00:37:46 chemical intuition used
00:37:48 and a trial and error approach of building many small
00:37:50 molecules and fitting them into the site
00:37:52 and looking in a qualitative way to see what
00:37:54 could make good hydrogen bonding
00:37:56 interactions, steric interactions
00:37:58 and hydrophobic interactions.
00:38:00 There's been a steadily increasing amount
00:38:02 of work on developing more automated approaches
00:38:04 to designing molecules that have less
00:38:06 bias. A clear problem is that,
00:38:08 even if you have several
00:38:10 very experienced people working
00:38:12 on designing molecules, each of them
00:38:14 given the same information and same set of tools,
00:38:16 it's very likely they'll find different
00:38:18 answers. We'd like to come up
00:38:20 with a more rigorous and a less biased
00:38:22 automated way of designing structures and evaluating
00:38:24 them.
00:38:26 Among the most successful approaches to date
00:38:28 are those for searching three-dimensional
00:38:30 chemical databases.
00:38:32 We can describe a receptor model in terms of
00:38:34 distances, angles or planes
00:38:36 between functional groups and then search
00:38:38 a 3D database to find molecules
00:38:40 that satisfy those geometric constraints.
00:38:42 These molecules could then be candidates
00:38:44 to bind to the receptor. Despite
00:38:46 the limitations of a fixed conformation
00:38:48 for each of the
00:38:50 molecules in the 3D database,
00:38:52 these methods can successfully generate new
00:38:54 ideas for synthesis.
00:38:56 There are a variety of academic, industrial
00:38:58 and commercial efforts in 3D database
00:39:00 searching. I've listed a number of them
00:39:02 in the references.
00:39:04 Until recently, there's been very little practical
00:39:06 work done in the area of designing structures from scratch
00:39:08 as opposed to pulling them out
00:39:10 of existing databases.
00:39:12 This remains an enormously difficult problem.
00:39:14 We're seeing the beginnings of what
00:39:16 may become feasible approaches here,
00:39:18 but we're still a long ways from a general solution
00:39:20 to the de novo design of drugs.
00:39:22 I've also included a few references
00:39:24 to these approaches.
00:39:26 For the remainder of the talk,
00:39:28 I'll tell you briefly how distance geometry
00:39:30 works and a couple of applications
00:39:32 using it, including our own work
00:39:34 in using it for docking small molecules
00:39:36 into protein binding sites
00:39:38 and how it could be used in design.
00:39:40 Distances are a very
00:39:42 natural way to describe structures.
00:39:44 We tend to think in terms of hydrogen bond
00:39:46 lengths, van der Waals contacts,
00:39:48 etc.
00:39:50 Several different experimental methods
00:39:52 give us information back in terms of distances.
00:39:54 For example, the
00:39:56 NOE measurement from 2D NMR
00:39:58 can be related to a distance
00:40:00 measurement. In addition,
00:40:02 we don't need to have a starting conformation
00:40:04 for a molecule to use distance geometry.
00:40:06 We don't need to build a reasonable model
00:40:08 of it to begin with, which is a requirement
00:40:10 for molecular mechanics or dynamics,
00:40:12 where we have to start off with something that's
00:40:14 a reasonably good structure.
00:40:16 We don't need any force field parameters,
00:40:18 torsion, bond angle,
00:40:20 partial charges, etc.
00:40:22 That's because distance geometry
00:40:24 isn't a physical chemical method. It's not
00:40:26 based on any theory of molecular interactions
00:40:28 or energetics. It's a purely geometric
00:40:30 model builder. Flexible
00:40:32 rings are handled very naturally by distance
00:40:34 geometry without doing anything special,
00:40:36 without solving what's typically a fairly
00:40:38 difficult ring-closing problem if we're working
00:40:40 in Cartesian or internal coordinates.
00:40:42 Distance geometry is a random
00:40:44 method, and therefore can be used to determine
00:40:46 whether a given model even exists,
00:40:48 and if so, how unique is it?
00:40:50 Is there one way
00:40:52 of achieving a given set of intermolecular
00:40:54 or intramolecular interactions
00:40:56 or many? Since
00:40:58 it is random and unbiased, occasionally we get
00:41:00 surprises. And I think that's where the
00:41:02 real exciting and interesting work can come from
00:41:04 is when you get an unexpected result.
00:41:06 Computational methods
00:41:08 usually give us answers that quantify or help
00:41:10 confirm an idea we already had.
00:41:12 It's much more exciting when you actually get some new
00:41:14 ideas out.
00:41:16 Distance geometry has been used in several areas.
00:41:18 The major ones that come to mind
00:41:20 include conformational analysis of small molecules,
00:41:22 where distance geometry is
00:41:24 used to generate random conformers that are subsequently
00:41:26 energy minimized.
00:41:28 It's been used most extensively and is probably
00:41:30 best known for solving the solution
00:41:32 structures of small to medium-sized proteins
00:41:34 and nucleic acids from 2D NMR data,
00:41:36 which Dave Case will
00:41:38 talk about in his talk.
00:41:40 It's been used in a few cases in protein
00:41:42 homology model building. This is where
00:41:44 one tries to estimate the three-dimensional structure
00:41:46 of a protein based on a
00:41:48 sequence homology to another protein whose structure
00:41:50 is known. I'll describe in
00:41:52 some detail the last two applications
00:41:54 on this chart, the ensemble
00:41:56 method for modeling pharmacophores and
00:41:58 superimposing molecules, and
00:42:00 finally the docking work that's been done at DuPont
00:42:02 over the last few years.
00:42:04 How does distance geometry actually work?
00:42:06 We describe a molecular
00:42:08 structure not in terms of three-dimensional
00:42:10 Cartesian coordinates or internal coordinates.
00:42:12 Instead, we describe it as a set of
00:42:14 all interatomic distances.
00:42:16 We don't actually set specific distances
00:42:18 initially. We set distance ranges
00:42:20 or bounds.
00:42:22 By specifying the maximum attainable distance
00:42:24 between a pair of atoms and the minimum
00:42:26 attainable distance, it's clear that if
00:42:28 we do this for all pairs of atoms in the structure,
00:42:30 that all possible conformations must
00:42:32 fall in between, as shown in the distance
00:42:34 bound matrix on this slide.
00:42:36 So here we have a very compact way of
00:42:38 describing the entire conformation space of a
00:42:40 molecule.
00:42:42 The challenge of distance geometry is to extract
00:42:44 from this representation, in a random
00:42:46 and efficient way, what those conformations
00:42:48 are.
00:42:50 How do we set the initial upper and lower distance bounds?
00:42:52 For covalently bonded atoms,
00:42:54 we set the upper and lower bound equal
00:42:56 to the bond length. For atoms that
00:42:58 define a bond angle that are 1,3 to each
00:43:00 other, we also set their upper and lower
00:43:02 bounds equal to each other, based on the bond
00:43:04 angle. For atoms that are
00:43:06 1,4 to each other, that have a rotatable
00:43:08 bond between them, we set their lower
00:43:10 bound to the distance they'd have, typically
00:43:12 in a gauche conformation, and the upper
00:43:14 bound to the distance they would have in a trans
00:43:16 conformation.
00:43:18 For atoms farther apart than 1,4,
00:43:20 we set their lower bound to the sum of their
00:43:22 van der Waals radii, and their upper bound
00:43:24 to the distance they would have in a fully extended
00:43:26 chain. We randomly
00:43:28 select a discrete inter-atomic
00:43:30 distance between each upper and lower
00:43:32 bound, and then convert these distances
00:43:34 back into a set of three-dimensional
00:43:36 coordinates, and then refine the
00:43:38 coordinates against the upper and lower bounds
00:43:40 until they converge, and we have a structure that
00:43:42 satisfies our original distance constraints.
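The bound-setting rules and the random selection of trial distances just described can be sketched for a simple carbon chain. This is a minimal illustration, not the program discussed in the talk: the bond length, bond angle, and van der Waals contact distance are assumed textbook values, and all function names are hypothetical. The later steps (converting distances to coordinates and refining them against the bounds) are not shown.

```python
import math
import random

BOND = 1.54                   # C-C bond length in angstroms (assumed)
ANGLE = math.radians(109.5)   # C-C-C bond angle (assumed tetrahedral)
VDW_CONTACT = 3.4             # sum of two carbon van der Waals radii (assumed)

def dist_13():
    """Distance between atoms that are 1,3 to each other, fixed by the bond angle."""
    return 2.0 * BOND * math.sin(ANGLE / 2.0)

def dist_14(torsion_deg):
    """Distance between atoms that are 1,4 to each other at a given torsion angle."""
    phi = math.radians(torsion_deg)
    sq = (1.0 - 2.0 * math.cos(ANGLE)) ** 2 \
        + 2.0 * math.sin(ANGLE) ** 2 * (1.0 - math.cos(phi))
    return BOND * math.sqrt(sq)

def dist_extended(k):
    """Distance across k bonds in a fully extended (all-trans) zigzag chain."""
    along = k * BOND * math.sin(ANGLE / 2.0)
    across = BOND * math.cos(ANGLE / 2.0) if k % 2 else 0.0
    return math.hypot(along, across)

def chain_bounds(n_atoms):
    """Lower and upper distance bound matrices for an unbranched carbon chain."""
    lower = [[0.0] * n_atoms for _ in range(n_atoms)]
    upper = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            k = j - i                       # number of bonds separating the atoms
            if k == 1:                      # bonded: both bounds equal the bond length
                lo = hi = BOND
            elif k == 2:                    # 1,3: both bounds fixed by the bond angle
                lo = hi = dist_13()
            elif k == 3:                    # 1,4: gauche to trans
                lo, hi = dist_14(60.0), dist_14(180.0)
            else:                           # farther: vdW contact to extended chain
                lo, hi = VDW_CONTACT, dist_extended(k)
            lower[i][j] = lower[j][i] = lo
            upper[i][j] = upper[j][i] = hi
    return lower, upper

def sample_distances(lower, upper, rng):
    """Pick one trial distance at random between each pair of bounds."""
    n = len(lower)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = rng.uniform(lower[i][j], upper[i][j])
    return d

lower, upper = chain_bounds(6)
trial = sample_distances(lower, upper, random.Random(0))
```

Because each distance is drawn independently, a trial matrix of this kind is generally not exactly realizable in three dimensions, which is why the refinement against the bounds is needed afterwards.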
00:43:44 I've gone over this pretty quickly.
00:43:46 There is a lot of work involved in this
00:43:48 that's gone on behind it,
00:43:50 the vast majority of which is due to
00:43:52 the efforts of Gordon Crippen and Tim Havel, now
00:43:54 both at the University of Michigan.
00:43:56 Thanks largely to their work, the method's
00:43:58 now robust enough to handle chemical structure
00:44:00 problems for small to medium-sized
00:44:02 molecules, up to 1,000
00:44:04 and perhaps 1,200 to 1,500 atoms.
00:44:06 For more
00:44:08 information on how the method actually works,
00:44:10 you'll find excruciating
00:44:12 detail in the references.
00:44:14 Something that became apparent to us
00:44:16 as we started using distance geometry several years
00:44:18 ago at DuPont, and suddenly had the
00:44:20 ability to generate models very quickly,
00:44:22 was that our ability to generate them
00:44:24 far outstripped our ability to analyze them.
00:44:26 So we needed an automated
00:44:28 method for doing that.
00:44:30 We tried a number of approaches, and finally settled on
00:44:32 using cluster analysis for grouping
00:44:34 structures into conformationally related families.
00:44:36 This is a simple approach
00:44:38 that we found to work very well.
00:44:40 The idea is that for a series of
00:44:42 conformers, we calculate the
00:44:44 root mean square, least squares
00:44:46 fit error between all of them,
00:44:48 by superimposing each of the conformers onto each
00:44:50 other, so that if, say, we
00:44:52 generated ten conformers, we'd fill in a symmetric
00:44:54 matrix, a ten by ten matrix,
00:44:56 with all of those RMS least squares
00:44:58 fit values. From that matrix,
00:45:00 we can calculate the distances between
00:45:02 the conformers, using the RMS
00:45:04 matrix as a coordinate matrix
00:45:06 to calculate Euclidean distances between
00:45:08 each conformer.
00:45:10 The result is that we now have a distance matrix
00:45:12 which shows how far apart
00:45:14 each of the conformers are from each other,
00:45:16 and we can take that directly into
00:45:18 a standard cluster analysis program
00:45:20 that produces a tree chart, or a
00:45:22 dendrogram, like the one shown
00:45:24 here. On the left of the
00:45:26 chart are all the individual conformers,
00:45:28 and as we move to the right,
00:45:30 they merge together into progressively larger
00:45:32 and larger clusters.
00:45:34 So we can choose on this chart
00:45:36 what we deem an appropriate level of resolution.
00:45:38 For example, down at the bottom, we find
00:45:40 four conformers that are clustered together,
00:45:42 and for subsequent analysis, we might conclude
00:45:44 that it's sufficient to just take one of
00:45:46 the four as representative, since all four
00:45:48 must be very similar.
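The clustering procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the RMS values are made up, single linkage is an assumed choice (the talk only says a standard cluster analysis program was used), and the function names are hypothetical.

```python
import math

def row_distances(rms):
    """Treat each row of the RMS matrix as coordinates; return Euclidean distances."""
    n = len(rms)
    return [[math.dist(rms[i], rms[j]) for j in range(n)] for i in range(n)]

def single_linkage(dist):
    """Agglomerative clustering; returns the merges (height, left, right) in order."""
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, sorted(clusters[a]), sorted(clusters[b])))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

def cut(merges, n, cutoff):
    """Families obtained by reading the dendrogram at a chosen level of resolution."""
    families = [[i] for i in range(n)]
    for height, left, right in merges:
        if height > cutoff:
            break
        families = [f for f in families if f != left and f != right]
        families.append(sorted(left + right))
    return sorted(families)

# Hypothetical RMS fit errors (angstroms) between four conformers.
rms = [
    [0.0, 0.2, 1.5, 1.6],
    [0.2, 0.0, 1.4, 1.5],
    [1.5, 1.4, 0.0, 0.3],
    [1.6, 1.5, 0.3, 0.0],
]
merges = single_linkage(row_distances(rms))
print(cut(merges, len(rms), cutoff=1.0))
```

With these made-up values, conformers 0 and 1 form one family and conformers 2 and 3 another, so one representative from each family would be enough for later analysis.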
00:45:50 Now, I've told you how we can
00:45:52 generate structures with distance geometry
00:45:54 and a simple way for classifying the random structures
00:45:56 into unique families.
00:45:58 I'd like to tell you briefly about the ensemble
00:46:00 approach for pharmacophore modeling,
00:46:02 developed by Scott Dixon and Bob
00:46:04 Sheridan at Lederle. It's also listed
00:46:06 in the references. Pharmacophore
00:46:08 to a medicinal chemist
00:46:10 means the set of atoms or functional groups that are
00:46:12 required for biological activity at a receptor.
00:46:14 A simple
00:46:16 example of a pharmacophore might be a basic
00:46:19 nitrogen, a phenolic hydroxyl group,
00:46:21 and an aromatic ring
00:46:23 with specific geometric relationships between
00:46:25 them.
00:46:27 Once we've come up with an idea for what a pharmacophore
00:46:29 might be, we'd then like to be able to take a whole
00:46:31 series of active molecules and ask
00:46:33 how might they bind to the receptor?
00:46:35 To answer that question, we try to superimpose
00:46:37 the molecules such that their common
00:46:39 pharmacophoric groups would overlap.
00:46:41 Ordinarily, this would be a
00:46:43 very complicated problem because you have
00:46:45 six degrees of freedom
00:46:47 for each one of the molecules, three
00:46:49 rotational and three translational,
00:46:51 plus all their internal degrees of freedom due to their
00:46:53 bond rotations.
00:46:55 So the combinatorial possibilities of searching
00:46:57 all orientations and conformations are enormous.
00:46:59 And this is a complicated
00:47:01 problem with conventional methods.
00:47:03 But in distance geometry, it becomes quite simple.
00:47:05 Rather than putting one molecule
00:47:07 into the distance bounds matrix, we put several
00:47:09 into it at once. The chart shown
00:47:11 here shows a distance bound
00:47:13 matrix that now contains three molecules.
00:47:15 The cross-hatched areas show where
00:47:17 the intramolecular constraints are
00:47:19 and the clear areas show the
00:47:21 intermolecular distance bounds.
00:47:23 We set the lower intermolecular bounds
00:47:25 to zero, which allows the molecules to
00:47:27 pass through each other and superimpose.
00:47:29 And we set the upper bounds
00:47:31 to force the specific atoms that define
00:47:33 the pharmacophore common to the three molecules
00:47:35 to superimpose.
00:47:37 Then we randomly sample
00:47:39 from the discrete distance bounds
00:47:41 sorry, from the distance bounds to get
00:47:43 discrete distances and convert
00:47:45 them into three-dimensional coordinates.
00:47:47 In this way, we generate random
00:47:49 conformations of each of the molecules
00:47:51 subject to the constraint that their common
00:47:53 pharmacophoric atoms must superimpose.
00:47:55 And we can determine both whether the
00:47:57 proposed pharmacophore model
00:47:59 is even possible, and if so,
00:48:01 how unique is it?
00:48:03 Is there just one solution that we might have
00:48:05 some faith in and decide to use
00:48:07 to actually design some new molecules?
00:48:09 Or are there tens or possibly hundreds
00:48:11 meaning that we have a very underdetermined,
00:48:13 ill-defined problem?
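The combined distance bounds matrix for the ensemble approach can be sketched like this. It is a minimal illustration, not the published implementation: the superposition tolerance, the large default upper bound, and all names are assumptions, and each matched group is assumed to contain at most one atom per molecule.

```python
def ensemble_bounds(molecules, matches, tolerance=0.3, far=1000.0):
    """Build one bounds matrix holding several molecules at once.

    molecules: list of (lower, upper) intramolecular bound matrices
    matches:   list of groups of (molecule_index, atom_index) that must superimpose
    tolerance: assumed upper bound (angstroms) between matched atoms
    far:       assumed loose upper bound for all other intermolecular pairs
    """
    offsets, total = [], 0
    for lower, _ in molecules:
        offsets.append(total)
        total += len(lower)

    # Intermolecular default: lower bound zero, so molecules can pass through
    # each other, and a loose upper bound.
    lower_all = [[0.0] * total for _ in range(total)]
    upper_all = [[far] * total for _ in range(total)]
    for i in range(total):
        upper_all[i][i] = 0.0

    # Diagonal blocks: copy each molecule's own intramolecular bounds.
    for (lower, upper), off in zip(molecules, offsets):
        n = len(lower)
        for i in range(n):
            for j in range(n):
                lower_all[off + i][off + j] = lower[i][j]
                upper_all[off + i][off + j] = upper[i][j]

    # Matched pharmacophore atoms: small upper bound forces them to superimpose.
    for group in matches:
        idx = [offsets[m] + a for m, a in group]
        for p in idx:
            for q in idx:
                if p != q:
                    upper_all[p][q] = tolerance
    return lower_all, upper_all

# Three hypothetical two-atom "molecules"; atom 0 of each is the matched group.
diatomic = ([[0.0, 1.5], [1.5, 0.0]], [[0.0, 1.5], [1.5, 0.0]])
lower_all, upper_all = ensemble_bounds(
    [diatomic, diatomic, diatomic],
    matches=[[(0, 0), (1, 0), (2, 0)]],
)
```

Sampling and refining against this one matrix then yields conformations of all the molecules together, subject to the superposition constraint.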
00:48:15 The next chart shows a simple example
00:48:17 of this application that's taken from the
00:48:19 J. Med. Chem. reference of Sheridan and Dixon:
00:48:21 four ligands that bind to the nicotinic
00:48:23 receptor. They used a
00:48:25 three-point pharmacophore overlapping
00:48:27 basic nitrogens
00:48:29 and then a bond dipole. In three of the
00:48:31 molecules, it's a
00:48:33 carbonyl group. In the fourth, it's taken from a
00:48:35 dummy atom at the center of the pyridine ring
00:48:37 to the pyridine nitrogen. The dashed
00:48:39 lines on the chart show the constraints used
00:48:41 to hold the corresponding atoms together.
00:48:43 On the first
00:48:45 two of the next color slides,
00:48:47 we see the four structures
00:48:49 with the colored spheres now
00:48:51 highlighting the pharmacophoric groups.
00:48:53 And in the next view,
00:48:55 we see one possible superimposition
00:48:57 of them in
00:48:59 a potential solution to the 3D
00:49:01 pharmacophore problem.
00:49:03 The individual structures are shown in the four
00:49:05 corners, and then all four of them are
00:49:07 superimposed based on those pharmacophore constraints
00:49:09 in the center.
00:49:11 Finally, with all
00:49:13 this as background, I'll get to my actual
00:49:15 title, which is Using Distance Geometry to Dock
00:49:17 Small Molecules into Proteins.
00:49:19 This is a modified distance geometry
00:49:21 approach that's optimized
00:49:23 specifically for docking.
00:49:25 The idea is that we're going to generate random
00:49:27 fits of conformationally flexible ligands
00:49:29 into a rigid binding site.
00:49:31 Next, we'll rank those dockings
00:49:33 with a simple molecular mechanics interaction
00:49:35 energy. Our goal is to rapidly
00:49:37 search a large chemical database to find
00:49:39 the best structures to fit a given site model
00:49:41 or receptor. The difference
00:49:43 between this and other three-dimensional search
00:49:45 approaches currently used is that
00:49:47 we allow the ligand to be conformationally flexible
00:49:49 and we aren't restricted to a
00:49:51 fixed ligand geometry.
00:49:53 I'll show you how this works with an example of
00:49:55 docking methotrexate,
00:49:57 an antitumor drug, to
00:49:59 L. casei dihydrofolate reductase,
00:50:01 solved by Kraut and Matthews at UCSD
00:50:03 about ten years ago.
00:50:05 Methotrexate has many rotatable bonds
00:50:07 so it makes a challenging docking problem.
00:50:09 We start out by
00:50:11 finding the geometric center of the binding site
00:50:13 and defining a sphere of sufficient
00:50:15 volume, shown in this view in yellow,
00:50:17 to enclose all the molecular
00:50:19 surface of the binding site.
00:50:21 We'll constrain the ligand,
00:50:23 the methotrexate, shown in red
00:50:25 in this view, to lie inside of the sphere
00:50:27 and outside of the protein
00:50:29 and then generate random conformations
00:50:31 of the methotrexate in the sphere
00:50:33 such that it doesn't bump into the protein.
00:50:35 We've put a lot of effort into doing this as rapidly
00:50:37 as possible and I'll briefly show
00:50:39 you the approach in the next few slides.
00:50:41 In the next view, we see
00:50:43 the molecular surface of the
00:50:45 dihydrofolate reductase active site
00:50:47 calculated using Mike Connolly's program
00:50:49 which rolls a probe sphere over the
00:50:51 surface of the protein and lays down
00:50:53 a series of dots wherever the
00:50:55 probe sphere has a point of tangency with the protein.
00:50:57 The surface that we
00:50:59 actually use in our docking
00:51:01 calculations is shown in the next
00:51:03 view. It's what's called an
00:51:05 extra-radius surface
00:51:07 which we generate by adding one van der Waals
00:51:09 radius to every atom in the protein
00:51:11 and then calculating the surface at this
00:51:13 extra-radius. You can see that this surface
00:51:15 collapses down just onto the
00:51:17 volume shown by the simple stick model
00:51:19 of the methotrexate.
00:51:21 In the next step, we pack a set of
00:51:23 spheres into this extra-radius
00:51:25 surface using the algorithm described
00:51:27 by Kuntz in the 1982
00:51:29 JMB
00:51:31 paper listed in the references.
00:51:33 That's shown on the next slide where
00:51:35 the yellow region is the union of the
00:51:37 set of the 48 spheres that were used
00:51:39 to fill the extra-radius surface
00:51:41 that collectively define the shape of the binding site.
00:51:43 We use
00:51:45 these spheres as an additional constraint on the
00:51:47 distance geometry refinement to
00:51:49 accelerate convergence.
00:51:51 We actually constrain each
00:51:53 ligand atom to lie in one or more of these
00:51:55 spheres shown in the yellow region and
00:51:57 in doing so, we'll of course force the ligand to
00:51:59 lie in the binding site.
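The sphere constraint can be sketched as a simple penalty on ligand atoms that fall outside every site sphere. This is a minimal illustration: the squared-distance form of the penalty, the toy spheres, and the function names are assumptions, not details given in the talk.

```python
import math

def sphere_violation(atom, spheres):
    """Distance by which an atom lies outside the nearest site sphere (0 if inside any)."""
    return max(0.0, min(math.dist(atom, center) - radius for center, radius in spheres))

def site_penalty(ligand, spheres):
    """Sum of squared violations; zero when every atom is inside at least one sphere."""
    return sum(sphere_violation(atom, spheres) ** 2 for atom in ligand)

# Two hypothetical overlapping site spheres, given as (center, radius).
spheres = [((0.0, 0.0, 0.0), 2.0), ((3.0, 0.0, 0.0), 2.0)]

ligand_inside = [(0.5, 0.0, 0.0), (2.5, 0.5, 0.0)]
ligand_outside = [(0.5, 0.0, 0.0), (7.0, 0.0, 0.0)]

print(site_penalty(ligand_inside, spheres))   # no atom violates the constraint
print(site_penalty(ligand_outside, spheres))  # second atom is outside both spheres
```

A term like this, added to the error being minimized during refinement, pulls every ligand atom into the union of the spheres and therefore into the binding site.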
00:52:01 What kind of dockings do we generate and how long
00:52:03 does it take to get them?
00:52:05 In the next view, we see one of the best
00:52:07 dockings out of 100 random trials
00:52:09 ranked by molecular mechanics
00:52:11 interaction energy. You can see that
00:52:13 it's fairly close to the actual crystal structure of
00:52:15 methotrexate which is the structure
00:52:17 shown in red in this view.
00:52:19 We've got the location
00:52:21 of the pteridine ring correct. You can
00:52:23 see that the arrow is pointing to the superimposed
00:52:25 green and red pteridine rings here
00:52:27 but we haven't quite got the rotatable
00:52:29 bonds in the glutamate portion
00:52:31 of the molecule right.
00:52:33 Still, this is pretty encouraging since we aren't using
00:52:35 any energetic terms in the docking generation
00:52:37 refinement, only in the ranking.
00:52:39 In the next view,
00:52:41 we see an example
00:52:43 of how unbiased the docking actually is
00:52:45 in that we've now got approximately the same
00:52:47 binding mode but the pteridine ring
00:52:49 is flipped over by 180 degrees
00:52:51 which is in fact the way the natural substrate
00:52:53 folic acid binds. Methotrexate
00:52:55 binds with that ring upside down.
00:52:57 In the next view,
00:52:59 we see a completely alternate binding
00:53:01 mode which although it probably isn't a reasonable
00:53:03 one for methotrexate, it
00:53:05 shows another part of the site that could accept a group
00:53:07 of similar size to the pteridine ring system
00:53:09 and actually has groups that could hydrogen
00:53:11 bond to it which could suggest
00:53:13 ideas for new analogs.
00:53:15 How long does it all take?
00:53:17 Approximately
00:53:19 10-20% of the fits that we generate
00:53:21 have reasonable binding modes
00:53:23 which means that they're in tight contact
00:53:25 with the enzyme surface.
00:53:27 However, only about 1-5% of them are
00:53:29 close to the crystallographic result
00:53:31 so our overall throughput
00:53:33 of good dockings is rather low.
00:53:35 About 70% of these random trials
00:53:37 converge for an average of about 1.5
00:53:39 seconds per fit on a Cray.
00:53:41 While this is very fast,
00:53:43 it's clearly not fast
00:53:45 enough to search through tens or hundreds
00:53:47 of thousands of structures.
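A rough estimate shows why this rate is too slow for a large database. The sketch assumes 100 random trials per structure, as in the methotrexate example, and treats the quoted 1.5 seconds as the cost of each trial; both are assumptions, since the talk does not say how many trials a database search would use.

```python
SECONDS_PER_FIT = 1.5      # quoted average per fit on a Cray
TRIALS_PER_LIGAND = 100    # assumed, matching the methotrexate example

def search_days(n_structures, trials=TRIALS_PER_LIGAND, seconds_per_fit=SECONDS_PER_FIT):
    """Estimated compute time in days to dock every structure in a database."""
    return n_structures * trials * seconds_per_fit / 86400.0

for n in (10_000, 100_000):
    print(f"{n} structures: about {search_days(n):.0f} days")
```

Under these assumptions, ten thousand structures would take on the order of weeks and a hundred thousand on the order of months of Cray time.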
00:53:49 This is encouraging given that we're searching with
00:53:51 complete conformational freedom in the ligand
00:53:53 and that fast computing is rapidly becoming much cheaper.
00:53:55 We've tried alternate
00:53:57 approaches. The one that's
00:53:59 worked the best so far
00:54:01 tries to add a little chemical information to the docking
00:54:03 by assigning to each atom in the ligand
00:54:05 whether it's polar or nonpolar
00:54:07 or positively or negatively charged.
00:54:09 We don't try to quantify any more
00:54:11 than that, just polar, nonpolar
00:54:13 and whether it's positive or negative.
00:54:15 We do the same for each one of that
00:54:17 small set of 48 spheres
00:54:19 that we used to describe the site
00:54:21 and then we add an additional constraint
00:54:23 that requires each ligand atom
00:54:25 to lie inside a complementary sphere.
00:54:27 If we do that, we now get all of the
00:54:29 methotrexate and dihydrofolate reductase dockings
00:54:31 close to the crystallographic binding mode.
00:54:33 The number of randomly generated
00:54:35 structures that actually converge is decreased
00:54:37 but all of them have the correct binding mode.
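The complementarity constraint can be sketched as a check that each ligand atom sits inside a site sphere of matching type. This is a minimal illustration: the talk names the four labels but not the pairing rules, so the table below (opposite charges pair, polar pairs with polar, nonpolar with nonpolar) is an assumption, as are the function names and toy coordinates.

```python
import math

# Assumed pairing rules between ligand atom labels and site sphere labels.
COMPLEMENT = {
    "positive": "negative",
    "negative": "positive",
    "polar": "polar",
    "nonpolar": "nonpolar",
}

def in_complementary_sphere(atom_xyz, atom_type, spheres):
    """True if the atom lies inside at least one sphere of the complementary type."""
    wanted = COMPLEMENT[atom_type]
    return any(
        kind == wanted and math.dist(atom_xyz, center) <= radius
        for center, radius, kind in spheres
    )

def docking_allowed(ligand, spheres):
    """True only if every ligand atom satisfies the complementarity constraint."""
    return all(in_complementary_sphere(xyz, kind, spheres) for xyz, kind in ligand)

# Hypothetical site: a negative pocket at the origin, a positive one 4 angstroms away.
site = [((0.0, 0.0, 0.0), 2.0, "negative"), ((4.0, 0.0, 0.0), 2.0, "positive")]

matched = [((0.5, 0.0, 0.0), "positive"), ((4.5, 0.0, 0.0), "negative")]
flipped = [((4.5, 0.0, 0.0), "positive"), ((0.5, 0.0, 0.0), "negative")]

print(docking_allowed(matched, site))   # charges face complementary pockets
print(docking_allowed(flipped, site))   # same positions, wrong orientation
```

A filter of this kind rejects geometrically acceptable fits whose charges face the wrong pockets, which is consistent with fewer trials converging while the survivors share one binding mode.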
00:54:39 This is probably an artificially
00:54:41 good result because of the nature
00:54:43 of the structures of methotrexate and dihydrofolate
00:54:45 reductase. Methotrexate is a
00:54:47 strongly polar molecule with its
00:54:49 positively charged pteridine end
00:54:51 and its negatively charged glutamate end
00:54:53 and the enzyme site is exactly complementary to it.
00:54:55 So by forcing the ligand
00:54:57 atoms to lie in complementary site spheres
00:54:59 we can hardly miss in this case.
00:55:01 The approach hasn't worked quite as
00:55:03 well on less polar binding sites such as
00:55:05 phospholipase A2.
00:55:07 The eventual goal of this work
00:55:09 is to build new molecules from scratch.
00:55:11 This is a huge combinatorial problem
00:55:13 and it's still not clear what the best approach is.
00:55:15 You'll see some other ideas along
00:55:17 these lines in the papers and the references
00:55:19 from Dean's group.
00:55:21 Here the idea is rather than docking completely
00:55:23 pre-formed structures into the site,
00:55:25 we'll instead try to dock fragments.
00:55:27 These could be phenyl rings, cyclohexyl rings,
00:55:29 amide groups, nitro groups and so on
00:55:31 into the site and then search
00:55:33 for combinations of optimally
00:55:35 docked fragments that could be
00:55:37 assembled together into complete molecules.
00:55:39 From the large number of possible dockings
00:55:41 and the huge number of ways of docking and combining
00:55:43 them, it's clear that the combinatorics here
00:55:45 are enormous and a pruning strategy is going to be required
00:55:47 to keep the possibilities manageable.
00:55:49 In summary,
00:55:51 I hope I've shown you a little bit about
00:55:53 distance geometry and shown you that it's a powerful
00:55:55 model building tool with a wide
00:55:57 range of applications beyond
00:55:59 the usual ones for 2D NMR structure
00:56:01 determination that aren't usually
00:56:03 handled by conventional approaches.
00:56:05 Distance geometry is still just a geometric
00:56:07 model builder and doesn't have any energetic terms
00:56:09 and so it can occasionally generate
00:56:11 high energy conformations.
00:56:13 So distance geometry models should be refined
00:56:15 with molecular mechanics and/or dynamics.
00:56:17 I've been
00:56:19 encouraged by the number of recent different approaches
00:56:21 for developing
00:56:23 methods to design new structures.
00:56:25 I think this is where the real challenge of
00:56:27 molecular modeling for drug design lies.
00:56:29 And finally, I'd like to thank Bill Ripka
00:56:31 at DuPont for making it possible for me to do this work,
00:56:33 Gordon Crippen for teaching me
00:56:35 how to do distance geometry and contributing
00:56:37 several important ideas,
00:56:39 and Peter Coleman for being kind enough to invite me here
00:56:41 and let me get warm for a few days.
00:56:43 Thank you.
00:56:57 We are here again to answer your questions
00:56:59 and take your comments. While you are going
00:57:01 to the phone to call in your questions,
00:57:03 I want to alert you to the short break
00:57:05 that will follow the question and answer session.
00:57:07 We will stop for 20 minutes
00:57:09 so you can get a quick bite to eat, maybe
00:57:11 something to drink or just stretch your legs a bit.
00:57:13 So if you're wanting to do any of
00:57:15 those things, please wait for another
00:57:17 15 minutes so that you'll not miss any
00:57:19 of the program. Art, do you have any
00:57:21 comments about Jeff's talk?
00:57:23 Yes, I found it very interesting and
00:57:25 actually I'd like to follow up on
00:57:27 some of the latter material you talked
00:57:29 about. In modeling
00:57:31 the receptor, obviously, you have
00:57:33 to use some good three-dimensional
00:57:35 information. Typically, you model
00:57:37 from an x-ray structure.
00:57:43 There's a larger source
00:57:45 of x-ray structures and that's very good,
00:57:47 but it doesn't include all of the information,
00:57:49 especially the dynamics of
00:57:51 the receptor site.
00:57:53 Can you see any way of folding
00:57:55 in the dynamics of the site
00:57:57 into
00:57:59 the modeling and evaluation of
00:58:01 the energies?
00:58:03 During the docking,
00:58:05 running the
00:58:07 docking against a moving target, as you'd
00:58:09 have in dynamics,
00:58:11 is still extremely difficult. You could imagine
00:58:13 generating a dynamics trajectory,
00:58:15 saving it every few steps,
00:58:17 and then generating dockings into those.
00:58:19 A more sensible approach, perhaps,
00:58:21 would be to run a
00:58:23 dynamics trajectory, cluster
00:58:25 those structures that you produced during
00:58:27 the dynamics into families and
00:58:29 then use each one of those different ones
00:58:31 as a target to dock against.
00:58:33 We haven't tried any of this in our work.
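The clustering suggestion can be sketched as follows. This is a minimal illustration, not something from the speaker's work (which, as stated, has not tried it): the toy snapshots, the `cutoff` value, and the choice of simple leader clustering with a medoid representative are all assumptions, and the snapshots are taken to be already superimposed on a common frame.

```python
from math import sqrt

# Hypothetical, pre-aligned snapshots of a two-atom "site" from a trajectory.
SNAPSHOTS = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.1, 0.0, 0.0)],
]

def rmsd(a, b):
    """Root-mean-square deviation between two snapshots with matching atom order."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return sqrt(total / len(a))

def cluster_snapshots(snapshots, cutoff):
    """Leader clustering: a snapshot joins the first family whose first member
    lies within `cutoff` of it; otherwise it starts a new family."""
    families = []
    for i, snap in enumerate(snapshots):
        for family in families:
            if rmsd(snapshots[family[0]], snap) <= cutoff:
                family.append(i)
                break
        else:
            families.append([i])
    return families

def representative(snapshots, family):
    """Medoid: the member with the smallest summed RMSD to its own family."""
    return min(
        family,
        key=lambda i: sum(rmsd(snapshots[i], snapshots[j]) for j in family),
    )

families = cluster_snapshots(SNAPSHOTS, cutoff=0.5)
docking_targets = [representative(SNAPSHOTS, f) for f in families]
```

Each index in `docking_targets` would then serve as one rigid receptor structure to dock against, so the number of docking runs scales with the number of families rather than with the length of the trajectory.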
00:58:35 At this point, it should be clear
00:58:37 that we're aiming for a pretty low level of resolution
00:58:39 and we're really just trying to get
00:58:41 ideas for structures that could fit well
00:58:43 into a site, which we could
00:58:45 then refine using dynamics
00:58:47 to determine if, in fact, they are
00:58:49 likely to bind tightly.
00:58:51 Do you think that it's going to be a
00:58:53 strong effect?
00:58:55 In other words, that dynamics are going to influence
00:58:57 greatly the
00:58:59 relative order of
00:59:01 the pharmacophores.
00:59:03 I don't think that's clear
00:59:05 and we haven't done that much on it yet.
00:59:07 Okay, we have a caller on the line.
00:59:09 It's Alfred, line 5. Alfred, what is
00:59:11 your question? Yes, this is
00:59:13 Alfred Lowry for Jeff Blaney.
00:59:15 There's a rumor that
00:59:17 your version of the distance
00:59:19 geometry code will
00:59:21 be released by Quantum Chemistry
00:59:23 Program Exchange. Is there a
00:59:25 deadline or a date?
00:59:27 Yeah, as a matter of fact, there is.
00:59:29 We will have that
00:59:31 submitted to QCPE sometime within the next
00:59:33 couple weeks.
00:59:35 Which is longer than we'd expected,
00:59:37 but things often
00:59:39 take longer than you'd expect. So it will be
00:59:41 available.
00:59:43 That's a good question, Al.
00:59:45 These are the real questions that people
00:59:47 want to know. How can I get the code and
00:59:49 how do I run it?
00:59:51 I wanted to ask a follow-up on
00:59:53 the distance geometry approach in general
00:59:55 and that is that its strength is
00:59:57 also, in some sense, its weakness.
00:59:59 It's unbiased
01:00:01 in terms of the areas
01:00:03 of conformational space that it explores,
01:00:05 but that also
01:00:07 means that you're spending computation
01:00:09 in unproductive areas of
01:00:11 conformational space that you know, through other information,
01:00:13 might not be
01:00:15 reasonable.
01:00:17 Do you have a way around
01:00:19 that problem?
01:00:21 Well, I don't think you really do spend much time
01:00:23 in conformational space that's
01:00:25 unreasonable. It depends entirely on
01:00:27 how much information you know about
01:00:29 the problem in advance. Any
01:00:31 information you know, you can provide in the form of
01:00:33 distance constraints. It could be NOE distances or,
01:00:35 in the pharmacophore modeling,
01:00:37 constraints that would overlap atoms on top
01:00:39 of each other. In the latter case,
01:00:41 you clearly don't spend time generating conformers
01:00:43 that can't possibly overlap.
01:00:45 You generate solutions directly.
01:00:47 In straight conformational
01:00:49 analysis, where you might use distance geometry
01:00:51 to generate purely random conformations,
01:00:53 then, clearly,
01:00:55 we may be wasting time generating structures
01:00:57 that are high energy or
01:00:59 degenerate by
01:01:01 symmetry, for example, in the case
01:01:03 of highly symmetric ring structures.
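The point that known information enters as distance constraints can be made concrete with triangle-inequality bound smoothing, a standard step in distance geometry. The sketch below is illustrative only and is not the code discussed in the talk: the function name, the default bounds, and the three-atom example are assumptions. Each supplied constraint narrows the range of distances allowed for every other atom pair, so fewer trial structures fall in regions already known to be impossible.

```python
def smooth_bounds(n, constraints, default_upper=100.0, default_lower=0.0):
    """Tighten distance bounds for n atoms using the triangle inequality.

    constraints maps (i, j) to (lower, upper) for pairs with known limits;
    all other pairs start from the (assumed) default bounds.
    Returns the lower and upper bound matrices.
    """
    U = [[0.0 if i == j else default_upper for j in range(n)] for i in range(n)]
    L = [[0.0 if i == j else default_lower for j in range(n)] for i in range(n)]
    for (i, j), (lo, hi) in constraints.items():
        L[i][j] = L[j][i] = lo
        U[i][j] = U[j][i] = hi

    # Upper bounds: d(i, j) <= d(i, k) + d(k, j), tightened as shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if U[i][k] + U[k][j] < U[i][j]:
                    U[i][j] = U[i][k] + U[k][j]

    # Lower bounds: d(i, j) >= d(i, k) - d(k, j), using the smoothed uppers.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i != j:
                    cand = max(L[i][k] - U[k][j], L[k][j] - U[i][k])
                    if cand > L[i][j]:
                        L[i][j] = cand
    return L, U

# Hypothetical example: atoms 0 and 1 held 5.0 apart, atoms 1 and 2 held 1.5 apart.
L, U = smooth_bounds(3, {(0, 1): (5.0, 5.0), (1, 2): (1.5, 1.5)})
```

In this example the unconstrained pair (0, 2) ends up limited to distances between 3.5 and 6.5, even though nothing was specified for it directly. Trial distances for embedding would then be drawn from within these smoothed bounds.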
01:01:05 I want to interrupt
01:01:07 just a second. Callers, when
01:01:09 you call us, please do not hang up.
01:01:11 Stay on the line. We are not going to hang
01:01:13 up on you. We will tell you
01:01:15 if we can't take your question,
01:01:17 but please don't hang up. I'm sorry.
01:01:19 Please continue.
01:01:21 I guess as a practicing
01:01:23 medicinal chemist, I can ask you this question.
01:01:25 Which
01:01:27 methods do you use most
01:01:29 in terms of
01:01:31 trying to answer this scientific
01:01:33 question that relates to
01:01:35 a product or some
01:01:37 goal outside of your
01:01:39 fundamental research?
01:01:41 My interests are both in
01:01:43 lead generation, trying to come up with
01:01:45 brand new structures that we think would be
01:01:47 active and worth synthesizing,
01:01:49 and also in optimizing leads.
01:01:51 The two problems are somewhat
01:01:53 different. In the latter case of
01:01:55 optimizing a lead, I think
01:01:57 it's a less ambitious problem.
01:01:59 Therefore, it's quite a bit easier.