Transcript: Molecular Modeling for Biological Systems (Supercomputer Teleconference) Part 2

1990-Jan-24

These captions and transcript were generated by a computer and may contain errors. If there are significant errors that should be corrected, please let us know by emailing digital@sciencehistory.org.

00:00:00 One should realize that when a complex is formed,

00:00:03 the price for the loss in translational entropy

00:00:06 is only paid once, so the entropic cost

00:00:09 of the second hydrogen bond is relatively small.

00:00:12 Also, in the gas phase with our potential functions,

00:00:15 acetic acid and pyrazine have an optimal interaction energy

00:00:19 of minus 6.9 kilocalories per mole,

00:00:22 while the interaction with the more basic pyridine

00:00:25 is somewhat stronger at minus 7.3.

00:00:28 So these hydrogen bonds have strengths

00:00:31 around 7 kilocalories per mole in the gas phase.

00:00:34 Has this been damped out by the solvent

00:00:37 so that only 1.45 kcals per mole in free energy

00:00:41 remains for the loss of a full hydrogen bond

00:00:44 in going from pyrazine to pyridine with Rebek's diacid?

00:00:49 It was our feeling that this damping seemed too great.

00:00:53 Consequently, we decided to explore

00:00:56 the structure and energetics with fluid simulations.

00:00:59 The system was set up using an x-ray structure

00:01:02 for the host obtained from Professor Rebek,

00:01:05 and we had to develop the requisite torsional potentials,

00:01:08 though the host is and was designed to be quite rigid,

00:01:12 including the buttressing methyl groups on the acridine.

00:01:15 The simulations were aimed at obtaining

00:01:18 the relative free energy of binding

00:01:20 for pyrazine versus pyridine.

00:01:22 Thus, the calculations involved mutating pyrazine to pyridine

00:01:25 both bound to the host and unbound

00:01:28 as represented in the illustrated thermodynamic cycle.

00:01:32 Roughly 300 chloroform molecules were included,

00:01:37 and the calculations for the bound system

00:01:39 were initiated with the guest in the cleft of the host.

00:01:43 In the thermodynamic cycle, we have on the top line

00:01:46 the host-binding pyrazine and on the bottom line

00:01:49 the binding of pyridine.

00:01:51 Rather than trying to compute the absolute free energies

00:01:54 of binding corresponding to the horizontal arrows,

00:01:58 the computationally more straightforward

00:02:00 vertical mutations were performed.

00:02:03 It was found that pyrazine is better solvated than pyridine

00:02:06 by 0.3 kilocalories per mole,

00:02:09 while the conversion of the bound complex with pyrazine

00:02:12 to the one with pyridine costs 1.5 kcals per mole.

00:02:17 The difference in these numbers, 1.2 kilocalories per mole,

00:02:21 is then the calculated difference

00:02:23 in free energy of binding favoring pyrazine.

00:02:27 The agreement with the experimental value of 1.45

00:02:30 is very good.

00:02:32 So we have reproduced the experimental thermodynamic data.
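
The cycle arithmetic just described can be sketched in a few lines. This is only an illustration built on the numbers quoted in the talk; the function and variable names are invented for the example and do not come from the speaker's programs.

```python
# Relative binding free energy from a thermodynamic cycle (illustrative sketch).
# Free energy is a state function, so the difference between the two
# horizontal (binding) legs equals the difference between the two vertical
# (mutation) legs: ddG_bind = dG_mutate_bound - dG_mutate_unbound.

def relative_binding_free_energy(dg_mutate_bound, dg_mutate_unbound):
    """Return dG_bind(B) - dG_bind(A) in the same units as the inputs."""
    return dg_mutate_bound - dg_mutate_unbound

# Values quoted in the talk (kcal/mol) for pyrazine -> pyridine in chloroform.
dg_unbound = 0.3   # free guest: pyrazine is better solvated by 0.3
dg_bound = 1.5     # guest sitting in the cleft of the host

ddg = relative_binding_free_energy(dg_bound, dg_unbound)
print(f"Calculated preference for binding pyrazine: {ddg:.1f} kcal/mol")
```

A positive result means the first guest (here pyrazine) is the more strongly bound one.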

00:02:37 The remaining issue is the structure.

00:02:40 The beauty of the simulations is that one obtains

00:02:43 total structural information as well as the thermodynamics.

00:02:48 The next picture shows the pyrazine complex,

00:02:52 just one snapshot from the simulation.

00:02:55 However, we have averaged the structure for the complex

00:02:59 over the entire run, and in fact,

00:03:01 the illustrated structure was chosen

00:03:04 because it provides an excellent depiction

00:03:06 of the average structure.

00:03:08 There's only one good hydrogen bond on average.

00:03:12 The cleft appears to be too small

00:03:14 to optimally accommodate pyrazine.

00:03:16 So there is the one good hydrogen bond

00:03:18 and then a weaker secondary electrostatic interaction

00:03:21 between the other acid group

00:03:23 and the more remote pyrazine nitrogen.

00:03:26 However, the secondary interaction

00:03:28 can account for the 1.45 kilocalories per mole

00:03:31 preference for binding pyrazine.

00:03:34 During the simulation, of course, everything moves.

00:03:37 The pyrazine tumbles,

00:03:39 sometimes forming the hydrogen bond

00:03:41 on one side and then on the other.

00:03:43 Though it never sits smack

00:03:45 between the acid groups buried in the cleft.

00:03:48 It floats above the cleft

00:03:50 and occasionally sits more symmetrically

00:03:52 with two highly bent hydrogen bonds

00:03:55 to the acid groups.

00:03:57 Incidentally, the computations were performed

00:04:00 for the acid groups both anti, as was shown, and syn.

00:04:04 The binding results were the same in both cases.

00:04:08 With pyridine, the structure is, on average,

00:04:11 even more tipped up than is illustrated,

00:04:13 though there is always one good hydrogen bond

00:04:16 between the pyridine nitrogen and one of the acid groups.

00:04:20 So we believe that the pyrazine complex

00:04:23 does not exhibit the two-point binding

00:04:25 that has been postulated,

00:04:27 but there is one strong hydrogen bond

00:04:29 and a weaker secondary electrostatic interaction.

00:04:32 Can this position be supported?

00:04:35 We have made an attempt

00:04:37 by trying to address the following question.

00:04:40 If we had the two-point binding for pyrazine,

00:04:43 what would be the binding preference over pyridine?

00:04:46 A bigger cleft is needed,

00:04:48 and we have considered the model system that is illustrated.

00:04:52 Gas phase calculations yielded an optimal separation

00:04:56 of 8.4 angstroms between oxygens

00:04:59 for the complex with two acetic acid molecules and pyrazine.

00:05:03 The average separation in Rebek's host

00:05:06 is under 8 angstroms according to the simulations.

00:05:09 Noting this, we then pinned two acetic acid molecules

00:05:13 in a box of chloroform

00:05:15 with the oxygens fixed at a separation of 8.4 angstroms.

00:05:19 Everything else was allowed to move,

00:05:21 and the mutation of pyrazine to pyridine was repeated.

00:05:25 This time, the pyrazine stays between the acids,

00:05:28 as shown in the next illustration.

00:05:31 It might be noted from this figure

00:05:33 that the torsional motion for the acid hydrogens

00:05:36 is witnessed by the non-planarity

00:05:39 of the acetic acid molecule on the right.

00:05:42 Also, the pyrazine wobbles in the cleft,

00:05:45 though it maintains the two hydrogen bonds.

00:05:48 The cost in mutating the complex pyrazine to pyridine

00:05:51 is now 3.9 kilocalories per mole,

00:05:54 giving a net difference in binding free energy

00:05:57 of 3.6 kilocalories per mole.

00:05:59 This translates to a Ka ratio of 405

00:06:03 as compared with the factor of 12 for Rebek's host.

00:06:07 So we feel that a Ka ratio on the order of 400

00:06:11 would be diagnostic of the two-point binding,

00:06:14 and that the ratio of 12 is inadequate.
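
The link between a free energy difference and a Ka ratio is the relation ratio = exp(ddG / RT). A minimal sketch follows; the temperature of 25 C is an assumption, since the talk does not state one, and the function name is made up for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ka_ratio(ddg_kcal, temperature_k=298.15):
    """Ratio of association constants for a binding free energy difference.

    temperature_k defaults to 25 C, which is an assumption; the talk does
    not state the temperature.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))

# Free energy differences mentioned in the talk (kcal/mol).
for ddg in (1.45, 3.6):
    print(f"ddG = {ddg:4.2f} kcal/mol -> Ka ratio ~ {ka_ratio(ddg):.0f}")
```

At the assumed temperature, 1.45 kcal/mol corresponds to a ratio near 12 and 3.6 kcal/mol to a ratio of a few hundred, the same orders of magnitude as the figures quoted above.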

00:06:17 Furthermore, this example nicely illustrates

00:06:20 how theory and experiment can go hand-in-hand

00:06:23 in fully characterizing the thermodynamic

00:06:26 and structural issues associated

00:06:28 with bioorganic host-guest chemistry.

00:06:31 The recent interest in DNA triple helices, ribozymes,

00:06:36 and now aggregates such as the guanine quartet

00:06:40 require renewed inquiry

00:06:42 into the association of nucleotide bases and analogs.

00:06:46 In this vein, we have been intrigued

00:06:49 by triply hydrogen-bonded systems

00:06:51 and their association in chloroform.

00:06:53 Nature has provided the beautiful example

00:06:56 of guanine cytosine,

00:06:58 which has been estimated by Lord and Rich

00:07:01 using IR spectroscopy

00:07:03 to have a Ka of 10 to the 4th to 10 to the 5th

00:07:06 in deuterochloroform, as illustrated.

00:07:09 Recently, Kelly has obtained a similar Ka,

00:07:13 1.7 times 10 to the 4th,

00:07:15 for a related complex

00:07:17 where the amino pyrimidinone ring of guanine

00:07:21 has been replaced by an amino pyridone.

00:07:24 The impact of the third hydrogen bond

00:07:27 appears clear in comparisons

00:07:29 with the Ka's of about 10 to the 2

00:07:32 for doubly hydrogen-bonded systems

00:07:34 such as adenine with uracil.

00:07:37 However, the data in the next figure

00:07:40 make the situation look more complicated.

00:07:43 These triply hydrogen-bonded cases

00:07:46 show Ka's in the doubly hydrogen-bonded range,

00:07:49 some 2 to 3 orders of magnitude weaker

00:07:52 than for guanine cytosine.

00:07:54 These systems involve binding of uracil or thymine

00:07:57 with a diaminopurine or a diaminopyridine.

00:08:01 The former was studied by Lord and Rich,

00:08:04 and the latter has been studied by Hamilton and others.

00:08:08 The basic types of hydrogen bonds

00:08:10 are the same in all cases

00:08:12 with two O...H-N and one N-H...N hydrogen bonds.

00:08:18 It should be noted that analogs of the nucleotide bases

00:08:21 continue to be important chemotherapeutics.

00:08:24 2,6-diaminopurine

00:08:26 was one of the earliest anti-leukemia drugs

00:08:29 developed by Gertrude Elion,

00:08:31 while AZT and DDI

00:08:33 are the current preferences for control of AIDS.

00:08:36 The binding differential provides an obvious challenge,

00:08:39 which we pursued through Monte Carlo simulations

00:08:42 using a thermodynamic cycle

00:08:44 to address the relative binding free energies.

00:08:47 Simulations were run for guanine cytosine, GC,

00:08:52 in chloroform,

00:08:54 being mutated to the complex between uracil

00:08:57 and the parent 2,6-diaminopyridine, DAP,

00:09:02 illustrated in the next figure,

00:09:05 and also GC being mutated to AU.

00:09:10 The thermodynamic results are summarized as shown.

00:09:15 The calculations have some added novelty

00:09:18 in that each cycle involved three separate mutations.

00:09:21 For example, the mutations G to DAP,

00:09:24 C to U, and GC to U-DAP

00:09:28 were performed to obtain the required Ka ratio.

00:09:32 Naturally, potential functions

00:09:34 for the nucleotide bases were needed.

00:09:36 These were obtained by fitting the OPLS parameters

00:09:39 in an all-atom representation

00:09:41 to the results of ab initio molecular orbital calculations

00:09:44 on complexes of the bases or fragments

00:09:47 and a water molecule in many orientations.

00:09:50 The key thermodynamic result from the simulations

00:09:53 was that GC is predicted to be better bound

00:09:57 by a Ka ratio of 10 to the 5th

00:10:00 than either U with DAP or A with U.

00:10:05 This implies that the predicted binding of U with DAP

00:10:09 and A with U is similar,

00:10:12 which is consistent with the Ka ratios

00:10:15 of about 100 that are observed for complexes of these types.

00:10:20 The much stronger binding for guanine cytosine

00:10:23 has also been reproduced,

00:10:25 though the predicted difference is even greater

00:10:28 than what has been estimated experimentally.

00:10:31 The simulations reveal that a key component

00:10:34 of the stronger binding for GC

00:10:36 is an intrinsically strong interaction for this complex.

00:10:40 This is evident from the results

00:10:42 of gas phase optimizations as illustrated.

00:10:46 The optimal interaction energy for GC

00:10:48 is minus 22 kilocalories per mole

00:10:52 with the present potential functions.

00:10:54 This value is within a kcal per mole

00:10:57 of other theoretical results

00:10:59 and an experimental value from mass spectroscopy.

00:11:02 In contrast, the optimal interaction energy

00:11:06 for the U-DAP complex

00:11:08 is only minus 11 kcals per mole.

00:11:12 The shock of this figure

00:11:15 is reinforced by the optimal interaction

00:11:18 of minus 10.6 kcals per mole

00:11:22 that is found for adenine with uracil

00:11:25 as shown in the next illustration.

00:11:28 That is, addition of the third hydrogen bond

00:11:31 in DAP with uracil

00:11:33 contributes essentially nothing

00:11:35 to the interaction energy.

00:11:38 Since the primary interactions

00:11:40 have the same types

00:11:42 in the triply hydrogen-bonded complexes,

00:11:44 a more subtle origin

00:11:46 must be sought for the discrepancies.

00:11:48 Though a multipole analysis could be given,

00:11:51 a simpler approach is provided

00:11:53 by considering the secondary interactions

00:11:55 shown for GC.

00:11:57 In multiply hydrogen-bonded systems,

00:12:00 not only are the directly hydrogen-bonded atoms

00:12:03 close together,

00:12:04 but also the atoms in adjacent hydrogen bonds

00:12:07 are close as well.

00:12:09 The significant partial charges and short distances

00:12:12 make these secondary electrostatic interactions important.

00:12:16 For GC, the illustrated secondary distances

00:12:20 are all under 3.6 angstroms.

00:12:23 Taking a closer look,

00:12:24 the nature of each secondary interaction

00:12:26 can be ascertained from the partial charges.

00:12:29 Starting at the top with GC,

00:12:31 the HH and NO interactions would be repulsive,

00:12:35 while the NH and HO interactions are attractive.

00:12:40 Thus, the secondary interactions cancel,

00:12:42 and there is no net secondary effect.

00:12:46 If the situation were the same for DAP with uracil,

00:12:50 we would obviously not have much to talk about.

00:12:54 However, as implied in the next figure,

00:12:57 all four secondary interactions

00:12:59 are in fact repulsive for DAP with uracil.

00:13:04 At this point, all we have to do is postulate

00:13:07 that the primary hydrogen-bonding interactions

00:13:09 are worth about 7.5 kilocalories per mole each

00:13:13 and that the secondary interactions are worth 2.5.

00:13:17 This leads to a net interaction

00:13:19 of 3 times minus 7.5 or minus 22.5 for GC,

00:13:25 and the minus 22.5 plus 4 times 2.5

00:13:30 for the four destabilizing interactions,

00:13:33 giving a net of minus 12.5 for DAP with uracil.

00:13:39 The results can also be generalized

00:13:41 for triply hydrogen-bonded systems

00:13:43 as summarized in the next illustration.

00:13:46 There are three possible arrangements.

00:13:48 The worst case has the partially positive and negative sites

00:13:51 alternating on each molecule.

00:13:54 This leads to the four repulsive secondary interactions,

00:13:57 as in DAP with uracil.

00:14:00 And for GC, on the other hand,

00:14:04 we have the intermediate situation

00:14:06 with a net of zero secondary interactions.

00:14:10 The best situation would be to have one molecule

00:14:12 with all of the hydrogen-bond donor sites

00:14:15 and the other with all of the acceptor sites.

00:14:17 This leads to a net

00:14:19 of four constructive secondary interactions.

00:14:22 Based on the above results,

00:14:24 systems of the last type might exhibit optimal interactions

00:14:27 around 10 kcal per mole stronger than for GC.
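
The bookkeeping for the three arrangements can be written out as a small sketch. It uses only the round numbers postulated in the talk (7.5 per hydrogen bond, 2.5 per secondary contact); the donor/acceptor strings assigned to each complex and the function name are illustrative assumptions, not output from the simulations.

```python
# Toy bookkeeping for triply hydrogen-bonded pairs, using the round numbers
# postulated in the talk. 'D' = donor site (partially positive H),
# 'A' = acceptor site (partially negative N or O).

PRIMARY = -7.5    # kcal/mol per hydrogen bond (postulated in the talk)
SECONDARY = 2.5   # kcal/mol magnitude per secondary contact (postulated)

def estimate_interaction(sites_1, sites_2):
    """Estimate the interaction energy of two facing donor/acceptor patterns."""
    if len(sites_1) != len(sites_2):
        raise ValueError("patterns must have the same length")
    # Each facing pair must be one donor and one acceptor to form a hydrogen bond.
    if any(a == b for a, b in zip(sites_1, sites_2)):
        raise ValueError("facing sites must be complementary")
    energy = PRIMARY * len(sites_1)
    # Secondary contacts are the diagonal pairs between adjacent hydrogen bonds.
    for i in range(len(sites_1) - 1):
        for a, b in ((sites_1[i], sites_2[i + 1]), (sites_1[i + 1], sites_2[i])):
            # Like sites (D/D or A/A) repel; unlike sites attract.
            energy += SECONDARY if a == b else -SECONDARY
    return energy

for label, s1, s2 in [("DAP with uracil", "DAD", "ADA"),
                      ("guanine-cytosine", "ADD", "DAA"),
                      ("all donors facing all acceptors", "DDD", "AAA")]:
    print(f"{label:32s} {estimate_interaction(s1, s2):6.1f} kcal/mol")
```

The alternating pattern picks up four repulsive contacts, the GC pattern cancels to zero, and the all-donor/all-acceptor pattern gains four attractive contacts, which is where the estimate of roughly 10 kcal per mole beyond GC comes from.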

00:14:32 The key message for molecular recognition

00:14:35 is that very strong binding can be obtained

00:14:38 by getting all of the plus ducks on one side

00:14:41 and all of the minus ducks on the other.

00:14:44 This is the end of my short discourse on hydrogen bonding.

00:14:49 It is clear that fluid simulations

00:14:51 are going to play an increasingly important role

00:14:53 in molecular design.

00:14:55 Detailed consideration of solvent effects is critical,

00:14:58 and the inadequacies of conclusions

00:15:01 just based on gas phase data are extreme.

00:15:05 My co-workers on these projects are...

00:15:09 are shown on the slide,

00:15:11 including Stephane Boudon,

00:15:13 Toan Nguyen,

00:15:14 Scott Wierschke,

00:15:15 Jim Blake,

00:15:17 Julianto Pranata,

00:15:18 and Julian Tirado-Rives.

00:15:20 Support has been provided by the NIH and NSF.

00:15:39 Again, we're ready to hear from you, the viewers.

00:15:42 Go to the phones now

00:15:43 and call in your questions and comments.

00:15:45 This is the time that you have to interact

00:15:47 with today's speaker.

00:15:49 Art, while we are waiting for calls for Bill,

00:15:52 do you have any comments or follow-up questions?

00:15:54 Yeah, those ducks you were talking about,

00:15:57 if I'm not mistaken,

00:15:59 they were swimming around in chloroform.

00:16:01 All right.

00:16:02 What happens if you put them in water?

00:16:07 Okay.

00:16:08 Well, trying to think of something cute here.

00:16:14 The ducks float better in water,

00:16:17 and the problem with that

00:16:20 is that if we tried to study

00:16:22 the base pair association in water,

00:16:25 it's well known from experiments

00:16:27 that the bases don't hydrogen bond in water.

00:16:30 That is, the isolated bases, they stack.

00:16:33 And we were, in these studies,

00:16:36 interested in the hydrogen bonding aspects

00:16:40 of the base pair association,

00:16:42 so that's why we were studying chloroform.

00:16:44 It's why people who are interested

00:16:46 in hydrogen bonding are doing

00:16:48 their experimental work in chloroform as well.

00:16:51 The point is, water competes very effectively

00:16:54 for the hydrogen bonding,

00:16:56 and until you have the structure

00:16:58 imposed in a full-blown nucleic acid,

00:17:01 one doesn't see this hydrogen bonding in water.

00:17:05 So, as a follow-up,

00:17:07 what are the prospects

00:17:09 of actually doing that simulation?

00:17:11 And maybe in the context of an earlier question

00:17:14 about, say, Monte Carlo

00:17:16 versus dynamics techniques.

00:17:18 We have no technical problem

00:17:20 running the simulation in water.

00:17:22 The problem is that we can't

00:17:24 address the hydrogen bonding.

00:17:26 We'll see stacked structures.

00:17:28 So that, I mean, there is no

00:17:30 technical problem, and we may,

00:17:32 in fact, do some simulations in water

00:17:34 where we constrain the system

00:17:36 to be hydrogen bonded.

00:17:38 Your other question, going back

00:17:40 to the Monte Carlo versus

00:17:42 dynamics issue,

00:17:44 we... all of the results

00:17:46 I showed today

00:17:48 were from Monte Carlo simulations

00:17:50 with our BOSS program.

00:17:52 We do find that

00:17:54 for these smaller molecule problems,

00:17:56 that is, non-protein problems,

00:17:58 that the Monte Carlo procedures

00:18:01 are very effective.

00:18:03 We represent the...

00:18:05 everything in internal coordinates.

00:18:07 In molecular dynamics,

00:18:09 you're doing Cartesian dynamics.

00:18:11 That is, each particle

00:18:13 moves effectively independently.

00:18:15 Uh...but in Monte Carlo,

00:18:17 we do our moves using

00:18:19 the internal coordinates.

00:18:21 And that has some advantages,

00:18:23 particularly with the free energy calculations.

00:18:25 Now, we do protein dynamics ourselves

00:18:27 using, uh...Peter's AMBER program.

00:18:29 Uh...and we find for large systems,

00:18:31 coupled systems like that,

00:18:33 that molecular dynamics

00:18:35 is certainly, at this point in time,

00:18:37 a more efficient way to go.

00:18:39 But we like Monte Carlo

00:18:41 for the smaller systems

00:18:43 because we can use

00:18:45 the internal coordinate representation,

00:18:47 and frankly, it makes the setup

00:18:49 and execution of the free energy

00:18:51 calculations quite a bit easier.

00:18:53 Bill, your first caller

00:18:55 is Barry, who is on line 5.

00:18:57 Barry, please go ahead.

00:18:59 I had a question about your, uh...

00:19:01 work with the, uh...

00:19:03 I have a problem here.

00:19:05 I'm hearing you.

00:19:07 About the work with the pyrazine.

00:19:09 Pyrazine is known to be very different

00:19:11 than pyridine, uh...quantitatively,

00:19:13 because it's much weaker,

00:19:15 um...of lone pairs on the nitrogen.

00:19:17 Uh...hence, uh...

00:19:19 pKa of pyrazine

00:19:21 is three or more orders of magnitude

00:19:23 lower than pyridine, and one would

00:19:25 assume that the hydrogen bonding

00:19:27 to any, uh...carboxylic acid

00:19:29 would be much weaker.

00:19:30 How are these, uh...

00:19:32 Um...

00:19:34 I'm still having problems here.

00:19:36 Uh...technical problems.

00:19:38 Uh...how are these, uh...

00:19:40 these weaker, uh...

00:19:42 lone pairs on pyrazine,

00:19:44 uh...

00:19:46 are they explicitly included

00:19:48 in the calculations?

00:19:49 Do they naturally fall out

00:19:50 as a result of what you, uh...

00:19:52 of what you do, or, uh...

00:19:54 how are they included?

00:19:58 The question about pyrazine

00:20:00 versus pyridine, we, of course,

00:20:02 use different potential function

00:20:04 parameters for the two molecules,

00:20:06 and they have been optimized

00:20:08 by doing, for example,

00:20:10 fluid simulations of pure liquid pyrazine

00:20:12 and pure liquid pyridine.

00:20:14 We've also studied, uh...ab initio,

00:20:17 or compared our, uh...results

00:20:19 with ab initio results

00:20:21 on pyrazine-water and pyridine-water complexes.

00:20:24 As it turns out,

00:20:26 our potential functions do give the order

00:20:29 that's, uh...expected by your comments.

00:20:31 That is, uh...we find

00:20:33 that in the gas phase

00:20:35 with our potential functions

00:20:37 that a pyrazine-water hydrogen bond

00:20:39 is about 1 to 1 1/2 kcals per mole

00:20:42 weaker than a pyridine,

00:20:44 uh...water hydrogen bond.

00:20:46 So that effect,

00:20:48 the lower basicity of the pyrazine,

00:20:50 is built into the potential functions

00:20:52 and is incorporated, therefore,

00:20:54 into the calculations.

00:20:56 All right, we have Frank

00:20:58 on line 6, and I would like

00:21:00 to remind the callers not to worry

00:21:02 if you do hear something on the line.

00:21:04 We can hear you fine here,

00:21:06 so just keep talking.

00:21:08 Okay, Frank, what is your question?

00:21:10 Hi, Bill, this is Frank Brown.

00:21:12 I was asking a question

00:21:14 about the pyrazine example again.

00:21:16 You were saying that some of this

00:21:18 came from secondary interaction

00:21:20 with the other nitrogen.

00:21:22 I was wondering if you would

00:21:24 comment on the possible fact

00:21:26 of bumping the electrostatics

00:21:28 on the nitrogen up

00:21:30 being able to do this

00:21:32 with only one nitrogen in the ring.

00:21:34 In other words,

00:21:36 bump up the binding constant

00:21:38 by having donating groups

00:21:40 as you would in a drug molecule.

00:21:42 Uh...certainly, Frank.

00:21:44 In the case of pyridine,

00:21:46 pyridine does bind

00:21:48 to the, uh...Rebek's diacid

00:21:50 with a respectable Ka of 120

00:21:52 through the one hydrogen bond.

00:21:54 So there's no question

00:21:56 that one could see enhanced, uh...binding

00:21:58 to Rebek's diacid

00:22:00 with substituted pyridines,

00:22:02 and as you suggest,

00:22:04 putting donating groups

00:22:06 on the pyridine ring

00:22:08 should facilitate that.

00:22:10 So unquestionably,

00:22:12 one can play

00:22:14 structure reactivity games,

00:22:16 structure binding games

00:22:18 with this system

00:22:20 as well as with any host-guest system

00:22:22 including drug-like systems.

00:22:24 We have a call from Cleveland.

00:22:26 It is Ken on line 8.

00:22:28 Please go ahead.

00:22:30 Hello, Dr. Jorgensen?

00:22:32 I have a general question,

00:22:34 um...relating to what sort of

00:22:36 modifications were, uh...

00:22:38 were done

00:22:40 for the OPLS force field

00:22:42 in non-aqueous solvents.

00:22:44 The...

00:22:46 the force field that we use,

00:22:48 the OPLS potentials,

00:22:50 uh...remain the same

00:22:52 independent of the solvent.

00:22:55 They are considered to be transferable.

00:22:57 So the potential function parameters,

00:22:59 the charges and Lennard-Jones parameters,

00:23:01 we would use on a solute,

00:23:03 would remain the same

00:23:05 independent of the solvent that we choose.

00:23:08 And then, the solvent atoms

00:23:10 have their own, uh...charges

00:23:12 and Lennard-Jones parameters,

00:23:14 and the cross interactions

00:23:16 are then determined by, uh...standard mixing rules.

00:23:19 So there's no difference in the potential functions

00:23:22 dependent on the medium.
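
As a rough illustration of what this answer describes, a site-site energy can be written as a Coulomb term plus a Lennard-Jones term, with the cross parameters obtained from a combining rule. The geometric-mean rule below is an assumption (the speaker says only "standard mixing rules"), and the parameter values are hypothetical numbers chosen to exercise the function, not actual force field entries.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), converts q1*q2/r to kcal/mol

def mix(param_i, param_j):
    """Geometric-mean combining rule (assumed here for sigma and epsilon)."""
    return math.sqrt(param_i * param_j)

def site_site_energy(site_i, site_j, r):
    """Coulomb plus Lennard-Jones energy (kcal/mol) for two sites r Angstroms apart.

    Each site is a dict with charge 'q' (e), 'sigma' (Angstrom), 'epsilon' (kcal/mol).
    """
    sigma = mix(site_i["sigma"], site_j["sigma"])
    epsilon = mix(site_i["epsilon"], site_j["epsilon"])
    coulomb = COULOMB_K * site_i["q"] * site_j["q"] / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones

# Hypothetical parameters, chosen only to exercise the function.
solute_site = {"q": -0.50, "sigma": 3.25, "epsilon": 0.17}
solvent_site = {"q": 0.40, "sigma": 3.40, "epsilon": 0.30}

print(f"{site_site_energy(solute_site, solvent_site, 3.5):.2f} kcal/mol")
```

The point of the sketch is that the solute parameters never change with the medium: swapping the solvent only means passing a different solvent site to the same function.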

00:23:24 How do you, uh...

00:23:26 treat, uh...ions in...

00:23:28 in solution?

00:23:30 Uh...ions in solution

00:23:32 are, uh...

00:23:34 potentially more hazardous

00:23:36 in the sense that the

00:23:38 many-body effects become

00:23:40 more significant.

00:23:42 In our own case, we try to avoid

00:23:44 studying ions

00:23:46 where the individual ion-solvent

00:23:48 interactions are above

00:23:50 about 20 kilocalories per mole,

00:23:52 because then the fact

00:23:54 that we are using

00:23:56 two-body potential functions,

00:23:58 I believe, becomes more of a problem

00:24:00 above that, uh...energy range

00:24:02 because of the, uh...strong polarization

00:24:04 of the solvent molecules by the ions,

00:24:06 which we don't have properly represented.

00:24:08 Okay, we need to go to New Jersey

00:24:10 where Willis is on the line.

00:24:12 Willis.

00:24:14 Are you still there?

00:24:18 Okay, perhaps he'll call back.

00:24:20 Would you like to continue?

00:24:22 Getting back to the...

00:24:24 kind of the larger end of the computations,

00:24:26 Peter mentioned something

00:24:28 about parallel computation.

00:24:30 I was wondering what you thought

00:24:32 about it, especially considering that

00:24:34 you use a lot of Monte Carlo.

00:24:36 Yeah.

00:24:38 We're certainly also very excited

00:24:40 about the prospects of

00:24:42 highly parallel computers

00:24:44 and their impact on computational

00:24:46 chemistry.

00:24:48 The Monte Carlo programs

00:24:50 are, uh...sort of notorious

00:24:52 for not being that easy

00:24:54 to vectorize.

00:24:56 We do see significant enhancements

00:24:58 on our Cyber 205 at Purdue

00:25:00 when we have vectorized

00:25:02 our Monte Carlo codes.

00:25:04 But parallelism is very exciting,

00:25:06 particularly for the free-energy

00:25:08 calculations, where we have to run

00:25:10 these multiple simulations

00:25:12 for different values

00:25:14 of the reaction coordinate

00:25:16 or this lambda parameter

00:25:18 in the free-energy calculations.

00:25:20 And those we could run

00:25:22 if we had to do 20 such

00:25:24 calculations, just trivial

00:25:26 parallelism, put them on 20

00:25:28 different processors, and away

00:25:30 we go.

00:25:32 So that would be very exciting

00:25:34 for us, just to have systems

00:25:36 with many processors that we

00:25:38 could even use independently.
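
The "trivial parallelism" over lambda windows can be sketched with a toy system. Everything here is an assumption made for illustration: a one-dimensional harmonic well whose spring constant depends on lambda stands in for the real solute and solvent, configurations are drawn directly from a Gaussian instead of by Monte Carlo moves, and threads stand in for separate processors.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

KT = 0.5925  # kcal/mol near 25 C (assumed temperature)

def spring_constant(lam, k_start=1.0, k_end=4.0):
    """Toy coupling: the spring constant varies linearly with lambda."""
    return k_start + lam * (k_end - k_start)

def window_free_energy(args):
    """Free energy perturbation estimate for one lambda window.

    Samples x from the Boltzmann distribution of U = 0.5*k*x**2 at lam_i
    (drawn directly as a Gaussian rather than by Monte Carlo moves) and
    averages exp(-dU/kT) for the step to lam_j.
    """
    lam_i, lam_j, n_samples, seed = args
    rng = random.Random(seed)  # one generator per window, so windows are independent
    k_i, k_j = spring_constant(lam_i), spring_constant(lam_j)
    sigma = math.sqrt(KT / k_i)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        total += math.exp(-0.5 * (k_j - k_i) * x * x / KT)
    return -KT * math.log(total / n_samples)

def total_free_energy(n_windows=20, n_samples=20000, workers=4):
    """Run every window independently and sum the per-window results."""
    lams = [i / n_windows for i in range(n_windows + 1)]
    jobs = [(lams[i], lams[i + 1], n_samples, i) for i in range(n_windows)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(window_free_energy, jobs))

estimate = total_free_energy()
exact = 0.5 * KT * math.log(4.0)  # analytic result for this toy system
print(f"windowed estimate {estimate:.3f}, analytic {exact:.3f} kcal/mol")
```

Because no window needs results from any other, the jobs could just as well be sent to 20 separate processors or machines; the thread pool is used here only to keep the sketch self-contained.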

00:25:40 All right, we have Willis back

00:25:42 on the line again.

00:25:44 We'll have your question.

00:25:46 The question I wanted to ask

00:25:48 Dr. Jorgensen is, did he

00:25:50 consider using pyridazine where

00:25:52 the two nitrogens are beside each other

00:25:54 in the ring for some of your

00:25:56 host-guest binding, guest-host

00:25:58 binding studies?

00:26:00 Yes, Rebek has

00:26:02 studied the binding of

00:26:04 pyridazine as well with his

00:26:06 host, and the binding

00:26:08 of it is about the same as

00:26:10 pyridine, as I recall.

00:26:12 The problem with

00:26:14 pyridazine in the

00:26:16 context of this host is

00:26:18 you're not going to be able to have

00:26:20 two linear hydrogen bonds to the

00:26:22 pyridazine. So even though they are

00:26:24 closer, that is an advantage. The hydrogen

00:26:26 bonds are going to be more bent

00:26:28 so that the measured Ka

00:26:30 doesn't reflect greater binding

00:26:32 in that case.

00:26:34 All right, Walt, what is your question, please?

00:26:36 With regard to

00:26:38 the

00:26:40 three bonded models that you have,

00:26:42 three hydrogen bonded models that you

00:26:44 have... Okay, just a second.

00:26:48 We're trying to iron out

00:26:50 the bugs. Perhaps Walt

00:26:52 will call back in just a moment.

00:26:56 I wish I had heard the beginning of the

00:26:58 question.

00:27:00 He's starting to talk about the

00:27:02 triply hydrogen bonded systems. They have

00:27:04 been receiving quite a bit of

00:27:06 attention lately. A paper

00:27:08 that may interest some of the viewers

00:27:10 is by Steve Benner

00:27:12 in the second to last issue

00:27:14 of Nature where he has a very nice

00:27:16 discourse

00:27:18 and experimental results

00:27:20 on making nucleotide

00:27:22 base analogs.

00:27:24 Some of our

00:27:26 work

00:27:28 coincides very nicely

00:27:30 with his studies.

00:27:32 Just about

00:27:34 to take a cough drop here.

00:27:36 I was still curious

00:27:38 in pursuing that

00:27:40 avenue because

00:27:42 of the obvious relevance to

00:27:44 control of DNA

00:27:46 structure and so forth.

00:27:48 Sorry.

00:27:54 What do you think you would need

00:27:56 in terms of computing power

00:27:58 to actually

00:28:00 attempt those kinds of calculations

00:28:02 where given a sequence you

00:28:04 would try to order the

00:28:06 relative binding strengths?

00:28:12 I think we can do

00:28:15 reasonable computations

00:28:17 at this time on predicting

00:28:19 let's say the effects

00:28:21 on DNA melting temperature

00:28:23 for base pair

00:28:25 mismatches or the use

00:28:27 of some of these nucleotide

00:28:29 base analogs. I think that the

00:28:31 technology is in place

00:28:33 for that. Some of the

00:28:35 biggest problems with simulations in

00:28:37 nucleic acids are what to do

00:28:39 with the ion atmosphere. That is

00:28:41 where to place the counter ions

00:28:43 and the fact that you

00:28:45 are dealing with a highly charged system.

00:28:47 And perhaps during the panel

00:28:49 discussion we could ask Peter

00:28:51 Kollman a little bit about that since he

00:28:53 is more experienced in that area

00:28:55 than I am. Alright. We have a

00:28:57 caller now on line 8. It is Chang Long.

00:28:59 Would you go ahead please?

00:29:01 Yes. In your simulations

00:29:03 it will be very important

00:29:05 for the criteria of a

00:29:07 hydrogen bonding. Definition

00:29:09 of a hydrogen bonding

00:29:11 especially in the distance and angles

00:29:13 and I wonder whether your program

00:29:15 has the flexibility

00:29:17 that throughout the

00:29:19 variability of distances your energy

00:29:21 can be changed accordingly?

00:29:24 Hydrogen bonding

00:29:26 from my viewpoint is

00:29:28 predominantly an electrostatic

00:29:30 phenomenon. That is it is

00:29:32 controlled by

00:29:34 the partial charge, partial charge

00:29:36 interactions. Now we

00:29:38 do not tell our program

00:29:40 at any point, ah, this is a hydrogen

00:29:42 bond or this isn't.

00:29:44 The interactions are simply controlled

00:29:46 by Coulomb's Law

00:29:48 between the molecules.

00:29:50 The only time when we would get involved

00:29:52 in definitions of a hydrogen bond

00:29:54 would be after the fact

00:29:56 in some analysis program

00:29:58 where we would be trying to actually

00:30:00 analyze for the

00:30:02 hydrogen bonding. And then

00:30:04 we might have to use an energetic

00:30:06 definition that, you know, only

00:30:08 consider something to be hydrogen bonded

00:30:10 if the interaction energy is below

00:30:12 let's say 3 kilocalories

00:30:14 per mole. Or we might

00:30:16 use a geometric definition.

00:30:18 But in the actual course of the simulations

00:30:20 we never say, ah, this is a hydrogen

00:30:22 bond. That is all worked out

00:30:24 by the

00:30:26 intermolecular interactions

00:30:28 as a whole. Alright.

00:30:30 Walt, you're on line 6. What is your question

00:30:32 please? Yeah, as I understand

00:30:34 your modeling with the nucleic

00:30:36 acid bases, and particularly

00:30:38 the three bonded bases,

00:30:40 the third base contributes little to the stability

00:30:42 of the complex.

00:30:44 Could you say something about

00:30:46 the implications that that has for

00:30:48 high

00:30:50 AT and high GC base pairs

00:30:52 in our understanding, or

00:30:54 understanding that high GC base

00:30:56 pairs are much more stable

00:30:58 and we

00:31:00 impute that to their three bonded

00:31:02 character, but that doesn't seem

00:31:04 to be true from your calculations.

00:31:06 Thank you. Yeah, I think that

00:31:08 you missed something

00:31:10 in the results there.

00:31:12 GC does have the strongest

00:31:14 interaction about, in the gas

00:31:16 phase, of 22 kilocalories per mole

00:31:18 and a binding constant in chloroform

00:31:20 of at least 10 to the 5th.

00:31:22 AT, or AU, has a

00:31:24 binding constant in chloroform of 10 to the

00:31:26 2, and in our calculations

00:31:28 the interaction is about 10.5

00:31:30 kilocalories per mole. The difference

00:31:32 is in comparing GC

00:31:34 with the uracil

00:31:36 diaminopyridine type systems.

00:31:38 There, in the uracil diaminopyridine

00:31:40 systems, you do have the third

00:31:42 hydrogen bond, but it really

00:31:44 doesn't give you stronger

00:31:46 net binding than we see

00:31:48 with the AT

00:31:50 type pairs.

00:31:52 The difference is really GC

00:31:54 versus the uracil diaminopyridine.

00:31:56 Again, in the Benner

00:31:58 paper, if you look that one up,

00:32:00 you'll see that his

00:32:02 alternates for triply hydrogen

00:32:04 bonded systems are

00:32:06 predominantly of the uracil

00:32:08 diaminopyridine type,

00:32:10 so one isn't going to see

00:32:12 the strong GC

00:32:14 interactions with those systems either.

00:32:16 Alright, our last call is from

00:32:18 Shock, and he is on line 5. Shock, please

00:32:20 go ahead. Hello?

00:32:22 Yes, we hear you.

00:32:24 What is your question, please?

00:32:26 Hello? We can hear you.

00:32:28 Can you hear us?

00:32:30 The name is

00:32:32 Jack, and the question is

00:32:34 as follows.

00:32:36 The rationale that Dr. Jorgensen has

00:32:38 provided for the

00:32:40 difference in energy of

00:32:42 the GC pair

00:32:44 and 2,6-DAP

00:32:46 U

00:32:48 is clear.

00:32:50 On the other hand, the fact that

00:32:52 2,6-DAP U and AU

00:32:54 are so similar is puzzling.

00:32:56 May the explanation not lie

00:32:58 in the fact that in 2,6-DAP

00:33:00 you have a 2-amino

00:33:02 to a

00:33:04 carbonyl hydrogen bond

00:33:06 whereas in AU you have

00:33:08 a 2-CH

00:33:10 to the

00:33:12 carbonyl hydrogen bond

00:33:14 and the 2-CH

00:33:16 carbonyl hydrogen bond

00:33:18 isn't all that much different in energy

00:33:20 from the 2-amino

00:33:22 carbonyl hydrogen bond.

00:33:26 Yes, your first

00:33:28 statement is correct that the

00:33:30 AU systems have about

00:33:32 the same gas phase interaction

00:33:34 as the uracil diamino

00:33:36 pyridine. Trying

00:33:38 to dissect that is a little bit harder

00:33:40 comparing a doubly hydrogen

00:33:42 bonded system with a triply hydrogen bonded system

00:33:44 but I think one point that you did raise

00:33:46 is absolutely true.

00:33:48 The hydrogen on C2

00:33:50 in adenine

00:33:52 because it's between the two

00:33:54 pyrimidine nitrogens is quite acidic

00:33:56 and the interaction

00:33:58 of that hydrogen with

00:34:00 the non-formally

00:34:02 hydrogen bonded carbonyl

00:34:04 on uracil or thymine

00:34:06 is significant

00:34:08 and thank you for raising that.

00:34:10 It's time to move on. Thank you so much

00:34:12 Bill and Art and we'll be back with you shortly.

00:34:14 We did not have time

00:34:16 to take your call. I apologize.

00:34:18 I do want to remind you that we'll

00:34:20 see all of our speakers at the end

00:34:22 of the program for a panel discussion.

00:34:24 At that time the phones will be

00:34:26 open for questions you might have.

00:34:28 Now we will hear from Dr.

00:34:30 Jeffrey Blaney with his presentation

00:34:32 on a distance geometry

00:34:34 approach to ligand macromolecular

00:34:36 docking. Art?

00:34:38 Jeff represents a user's

00:34:40 point of view in this discussion

00:34:42 today being a

00:34:44 practicing medicinal chemist.

00:34:46 Although he did see the light early as

00:34:48 Peter Kollman referred to in his talk

00:34:50 and began applying modeling techniques

00:34:52 to the study of structure activity

00:34:54 relationships in understanding

00:34:56 drug activity. Jeff?

00:35:02 Thanks Art.

00:35:04 I'll begin today by focusing on

00:35:06 a key point in molecular modeling

00:35:08 and the different approaches that we might take.

00:35:10 That is one of comparing analytical methods

00:35:12 with methods that are oriented towards design.

00:35:14 The vast majority of modeling tools that we

00:35:16 have available to us, graphical and

00:35:18 computational tools, are analytical.

00:35:20 We can ask a question about a given structure

00:35:22 and get back answers to varying degrees of

00:35:24 accuracy about it.

00:35:26 To be able to use any of these methods

00:35:28 we really need to have a molecule to start with.

00:35:30 Typically we use these methods

00:35:32 to combine information about a family of active

00:35:34 molecules and ask what is it that

00:35:36 makes these molecules active

00:35:38 and can we correlate some

00:35:40 structural, measurable, or calculable

00:35:42 property of these molecules and rationalize

00:35:44 the activity of them in a structure activity

00:35:46 relationship and come up with

00:35:48 a model for

00:35:50 what the common receptor for the set of

00:35:52 molecules might look like and what their respective

00:35:54 binding modes are. If we're extremely

00:35:56 fortunate we'll have this information in the form

00:35:58 of a high resolution X-ray crystal structure

00:36:00 of the receptor.

00:36:02 In either case, the next goal is to

00:36:04 extrapolate from the receptor model to the

00:36:06 design of new molecules and predict

00:36:08 their activity prior to synthesis.

00:36:10 You've already heard a little bit today about

00:36:12 free energy perturbation methods which are

00:36:14 very powerful and show lots of promise

00:36:16 but are currently limited to making predictions

00:36:18 about fairly small changes between related

00:36:20 molecules. We'd like to be able to ask

00:36:22 that question of molecules that are structurally

00:36:24 entirely different from each other.

00:36:26 We'd really like to be able to predict

00:36:28 absolute free energies of binding of small

00:36:30 molecules to receptor. But it's clear

00:36:32 that we're not quite yet at that point

00:36:34 and as a result it's still incredibly difficult

00:36:36 to predict the activity of structures prior

00:36:38 to synthesis. It turns out

00:36:40 it's also quite difficult to design new structures.

00:36:42 The software

00:36:44 tools that we have tend to be entirely analytical

00:36:46 although they can help give you some

00:36:48 insight into what kind of molecule you might

00:36:50 want to design. But it still really

00:36:52 requires having a very creative and

00:36:54 clever organic or medicinal

00:36:56 chemist working with modeling tools

00:36:58 and chemical intuition

00:37:00 to successfully design new compounds.

00:37:02 In fact, the only approaches

00:37:04 revealed to date that have been successful in designing

00:37:06 new potential drug molecules

00:37:08 not related to any other known active compounds

00:37:10 have come from an intuitive

00:37:12 empirical modeling approach.

00:37:14 These have come from the work of

00:37:16 Beddell's group at Wellcome Labs

00:37:18 on the design of anti-sickling compounds

00:37:20 based on the crystal structure of hemoglobin

00:37:22 and from Ripka's group

00:37:24 at DuPont on the design of phospholipase A2

00:37:26 inhibitors. Both

00:37:28 are included in the list of references.

00:37:30 In both cases, they used fairly

00:37:32 simple qualitative modeling methods.

00:37:34 In Beddell's work, I think

00:37:36 they actually may have used wire Kendrew models.

00:37:38 In Ripka's work at

00:37:40 DuPont, computer graphics methods were used.

00:37:42 But I think the common theme in both

00:37:44 of them is that there was a lot of what you might call

00:37:46 chemical intuition used

00:37:48 and a trial and error approach of building many small

00:37:50 molecules and fitting them into the site

00:37:52 and looking in a qualitative way to see what

00:37:54 could make good hydrogen bonding

00:37:56 interactions, steric interactions

00:37:58 and hydrophobic interactions.

00:38:00 There's been a steadily increasing amount

00:38:02 of work on developing more automated approaches

00:38:04 to designing molecules that have less

00:38:06 bias. A clear problem is that,

00:38:08 even if you have several

00:38:10 very experienced people working

00:38:12 on designing molecules, each of them

00:38:14 given the same information and same set of tools,

00:38:16 it's very likely they'll find different

00:38:18 answers. We'd like to come up

00:38:20 with a more rigorous and a less biased

00:38:22 automated way of designing structures and evaluating

00:38:24 them.

00:38:26 Among the most successful approaches to date

00:38:28 are those for searching three-dimensional

00:38:30 chemical databases.

00:38:32 We can describe a receptor model in terms of

00:38:34 distances, angles or planes

00:38:36 between functional groups and then search

00:38:38 a 3D database to find molecules

00:38:40 that satisfy those geometric constraints.

00:38:42 These molecules could then be candidates

00:38:44 to bind to the receptor. Despite

00:38:46 the limitations of a fixed conformation

00:38:48 for each of the

00:38:50 molecules in the 3D database,

00:38:52 these methods can successfully generate new

00:38:54 ideas for synthesis.
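
[Editor's sketch, not part of the talk.] The 3D database search described above can be illustrated with a toy distance filter. The database entries, group labels, coordinates, and distance window below are all hypothetical, and real search systems also handle angles, planes, and substructure matching, which this sketch leaves out.

```python
import math

def satisfies(groups, constraints):
    """True if every (label_a, label_b, dmin, dmax) constraint is met.

    groups maps a functional-group label to its (x, y, z) position in the
    single stored conformation of one database molecule.
    """
    for a, b, dmin, dmax in constraints:
        if a not in groups or b not in groups:
            return False
        if not dmin <= math.dist(groups[a], groups[b]) <= dmax:
            return False
    return True

def search(database, constraints):
    """Return the names of all molecules that satisfy the receptor model."""
    return [name for name, groups in database.items()
            if satisfies(groups, constraints)]

# Hypothetical two-entry database and a one-distance receptor model.
database = {
    "mol_a": {"N": (0.0, 0.0, 0.0), "OH": (5.0, 0.0, 0.0)},
    "mol_b": {"N": (0.0, 0.0, 0.0), "OH": (9.0, 0.0, 0.0)},
}
constraints = [("N", "OH", 4.5, 5.5)]  # basic N to hydroxyl, in angstroms
hits = search(database, constraints)
```

Because each molecule is stored in one fixed conformation, a flexible molecule that could reach the required geometry in another conformation would be missed, which is the limitation the speaker mentions.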

00:38:56 There are a variety of academic, industrial

00:38:58 and commercial efforts in 3D database

00:39:00 searching. I've listed a number of them

00:39:02 in the references.

00:39:04 Until recently, there's been very little practical

00:39:06 work done in the area of designing structures from scratch

00:39:08 as opposed to pulling them out

00:39:10 of existing databases.

00:39:12 This remains an enormously difficult problem.

00:39:14 We're seeing the beginnings of what

00:39:16 may become feasible approaches here,

00:39:18 but we're still a long ways from a general solution

00:39:20 to the de novo design of drugs.

00:39:22 I've also included a few references

00:39:24 to these approaches.

00:39:26 For the remainder of the talk,

00:39:28 I'll tell you briefly how distance geometry

00:39:30 works and a couple of applications

00:39:32 using it, including our own work

00:39:34 in using it for docking small molecules

00:39:36 into protein binding sites

00:39:38 and how it could be used in design.

00:39:40 Distances are a very

00:39:42 natural way to describe structures.

00:39:44 We tend to think in terms of hydrogen bond

00:39:46 lengths, van der Waals contacts,

00:39:48 etc.

00:39:50 Several different experimental methods

00:39:52 give us information back in terms of distances.

00:39:54 For example, the

00:39:56 NOE measurement from 2D NMR

00:39:58 can be related to a distance

00:40:00 measurement. In addition,

00:40:02 we don't need to have a starting conformation

00:40:04 for a molecule to use distance geometry.

00:40:06 We don't need to build a reasonable model

00:40:08 of it to begin with, which is a requirement

00:40:10 for molecular mechanics or dynamics.

00:40:12 There, we have to start off with something that's

00:40:14 a reasonably good structure.

00:40:16 We don't need any force field parameters,

00:40:18 torsion, bond angle,

00:40:20 partial charges, etc.

00:40:22 That's because distance geometry

00:40:24 isn't a physical chemical method. It's not

00:40:26 based on any theory of molecular interactions

00:40:28 or energetics. It's a purely geometric

00:40:30 model builder. Flexible

00:40:32 rings are handled very naturally by distance

00:40:34 geometry without doing anything special,

00:40:36 without solving what's typically a fairly

00:40:38 difficult ring-closing problem if we're working

00:40:40 in Cartesian or internal coordinates.

00:40:42 Distance geometry is a random

00:40:44 method, and therefore can be used to determine

00:40:46 whether a given model even exists,

00:40:48 and if so, how unique is it?

00:40:50 Is there one way

00:40:52 of achieving a given set of intermolecular

00:40:54 or intramolecular interactions

00:40:56 or many? Since

00:40:58 it is random and unbiased, occasionally we get

00:41:00 surprises. And I think that's where the

00:41:02 real exciting and interesting work can come from

00:41:04 is when you get an unexpected result.

00:41:06 Computational methods

00:41:08 usually give us answers that quantify or help

00:41:10 confirm an idea we already had.

00:41:12 It's much more exciting when you actually get some new

00:41:14 ideas out.

00:41:16 Distance geometry has been used in several areas.

00:41:18 The major ones that come to mind

00:41:20 include conformational analysis of small molecules,

00:41:22 where distance geometry is

00:41:24 used to generate random conformers that are subsequently

00:41:26 energy minimized.

00:41:28 It's been used most extensively and is probably

00:41:30 best known for solving the solution

00:41:32 structures of small to medium-sized proteins

00:41:34 and nucleic acids from 2D NMR data,

00:41:36 which Dave Case will

00:41:38 talk about in his talk.

00:41:40 It's been used in a few cases in protein

00:41:42 homology model building. This is where

00:41:44 one tries to estimate the three-dimensional structure

00:41:46 of a protein based on a

00:41:48 sequence homology to another protein whose structure

00:41:50 is known. I'll describe in

00:41:52 some detail the last two applications

00:41:54 on this chart, the ensemble

00:41:56 method for modeling pharmacophores and

00:41:58 superimposing molecules, and

00:42:00 finally the docking work that's been done at DuPont

00:42:02 over the last few years.

00:42:04 How does distance geometry actually work?

00:42:06 We describe a molecular

00:42:08 structure not in terms of three-dimensional

00:42:10 Cartesian coordinates or internal coordinates.

00:42:12 Instead, we describe it as a set of

00:42:14 all interatomic distances.

00:42:16 We don't actually set specific distances

00:42:18 initially. We set distance ranges

00:42:20 or bounds.

00:42:22 By specifying the maximum attainable distance

00:42:24 between a pair of atoms and the minimum

00:42:26 attainable distance, it's clear that if

00:42:28 we do this for all pairs of atoms in the structure,

00:42:30 that all possible conformations must

00:42:32 fall in between, as shown in the distance

00:42:34 bound matrix on this slide.

00:42:36 So here we have a very compact way of

00:42:38 describing the entire conformation space of a

00:42:40 molecule.

00:42:42 The challenge of distance geometry is to extract

00:42:44 from this representation, in a random

00:42:46 and efficient way, what those conformations

00:42:48 are.

00:42:50 How do we set the initial upper and lower distance bounds?

00:42:52 For covalently bonded atoms,

00:42:54 we set the upper and lower bound equal

00:42:56 to the bond length. For atoms that

00:42:58 define a bond angle that are 1,3 to each

00:43:00 other, we also set their upper and lower

00:43:02 bounds equal to each other, based on the bond

00:43:04 angle. For atoms that are

00:43:06 1,4 to each other, that have a rotatable

00:43:08 bond between them, we set their lower

00:43:10 bound to the distance they'd have, typically

00:43:12 in a gauche conformation, and the upper

00:43:14 bound to the distance they would have in a trans

00:43:16 conformation.

00:43:18 For atoms farther apart than 1,4,

00:43:20 we set their lower bound to the sum of their

00:43:22 van der Waals radii, and their upper bound

00:43:24 to the distance they would have in a fully extended

00:43:26 chain. We randomly

00:43:28 select a discrete inter-atomic

00:43:30 distance between each upper and lower

00:43:32 bound, and then convert these distances

00:43:34 back into a set of three-dimensional

00:43:36 coordinates, and then refine the

00:43:38 coordinates against the upper and lower bounds

00:43:40 until they converge, and we have a structure that

00:43:42 satisfies our original distance constraints.
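
[Editor's sketch, not part of the talk.] The bound-setting rules just described can be made concrete for a four-atom carbon chain. The bond length (1.53 angstroms), bond angle (109.5 degrees), and the 60-degree gauche torsion are assumed textbook values, and the function names are illustrative. The sketch stops at sampling trial distances; converting them to coordinates and refining against the bounds is not shown.

```python
import math
import random

BOND = 1.53                    # assumed C-C bond length, angstroms
ANGLE = math.radians(109.5)    # assumed C-C-C bond angle

def d13(bond=BOND, angle=ANGLE):
    """1,3 distance fixed by two bonds and their angle (law of cosines)."""
    return math.sqrt(2 * bond ** 2 * (1 - math.cos(angle)))

def d14(torsion_deg, bond=BOND, angle=ANGLE):
    """1,4 distance for a given torsion (0 = cis, 180 = trans)."""
    phi = math.radians(torsion_deg)
    # Atoms 2 and 3 sit at the origin and at (bond, 0, 0).
    atom1 = (bond * math.cos(angle), bond * math.sin(angle), 0.0)
    atom4 = (bond - bond * math.cos(angle),
             bond * math.sin(angle) * math.cos(phi),
             bond * math.sin(angle) * math.sin(phi))
    return math.dist(atom1, atom4)

def chain_bounds():
    """Lower and upper bound matrices for a 4-atom chain."""
    lower = [[0.0] * 4 for _ in range(4)]
    upper = [[0.0] * 4 for _ in range(4)]

    def put(i, j, lo, hi):
        lower[i][j] = lower[j][i] = lo
        upper[i][j] = upper[j][i] = hi

    for i in range(3):                 # bonded pairs: bounds are equal
        put(i, i + 1, BOND, BOND)
    for i in range(2):                 # 1,3 pairs: fixed by the bond angle
        put(i, i + 2, d13(), d13())
    put(0, 3, d14(60), d14(180))       # 1,4 pair: gauche to trans
    return lower, upper

def sample_distances(lower, upper, rng):
    """Pick one trial distance uniformly between each pair of bounds."""
    n = len(lower)
    trial = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            trial[i][j] = trial[j][i] = rng.uniform(lower[i][j], upper[i][j])
    return trial

lower, upper = chain_bounds()
trial = sample_distances(lower, upper, random.Random(0))
```

With these assumed values the 1,4 bounds come out near 2.93 and 3.85 angstroms, so the only freedom in this small example is the single torsion.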

00:43:44 I've gone over this pretty quickly.

00:43:46 There is a lot of work involved in this,

00:43:48 and a lot that's gone on behind it,

00:43:50 the vast majority of which is due to

00:43:52 the efforts of Gordon Crippen and Tim Havel, now

00:43:54 both at the University of Michigan.

00:43:56 Thanks largely to their work, the method's

00:43:58 now robust enough to handle chemical structure

00:44:00 problems for small to medium-sized

00:44:02 molecules, up to 1,000

00:44:04 and perhaps 1,200 to 1,500 atoms.

00:44:06 For more

00:44:08 information on how the method actually works,

00:44:10 you'll find excruciating

00:44:12 detail in the references.

00:44:14 Something that became apparent to us

00:44:16 as we started using distance geometry several years

00:44:18 ago at DuPont, and suddenly had the

00:44:20 ability to generate models very quickly,

00:44:22 was that our ability to generate them

00:44:24 far outstripped our ability to analyze them.

00:44:26 So we needed an automated

00:44:28 method for doing that.

00:44:30 We tried a number of approaches, and finally settled on

00:44:32 using cluster analysis for grouping

00:44:34 structures into conformationally related families.

00:44:36 This is a simple approach

00:44:38 that we found to work very well.

00:44:40 The idea is that for a series of

00:44:42 conformers, we calculate the

00:44:44 root mean square, least squares

00:44:46 fit error between all of them,

00:44:48 by superimposing each of the conformers onto each

00:44:50 other, so that if, say, we

00:44:52 generated ten conformers, we'd fill in a symmetric

00:44:54 matrix, a ten by ten matrix,

00:44:56 with all of those RMS least squares

00:44:58 fit values. From that matrix,

00:45:00 we can calculate the distances between

00:45:02 the conformers, using the RMS

00:45:04 matrix as a coordinate matrix

00:45:06 to calculate Euclidean distances between

00:45:08 each conformer.

00:45:10 The result is that we now have a distance matrix

00:45:12 which shows how far apart

00:45:14 each of the conformers are from each other,

00:45:16 and we can take that directly into

00:45:18 a standard cluster analysis program

00:45:20 that produces a tree chart, or a

00:45:22 dendrogram, like the one shown

00:45:24 here. On the left of the

00:45:26 chart are all the individual conformers,

00:45:28 and as we move to the right,

00:45:30 they merge together into progressively larger

00:45:32 and larger clusters.

00:45:34 So we can choose on this chart

00:45:36 what we deem an appropriate level of resolution.

00:45:38 For example, down at the bottom, we find

00:45:40 four conformers that are clustered together,

00:45:42 and for subsequent analysis, we might conclude

00:45:44 that it's sufficient to just take one of

00:45:46 the four as representative, since all four

00:45:48 must be very similar.
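
[Editor's sketch, not part of the talk.] The clustering step can be illustrated as follows, assuming the pairwise RMS fit values have already been computed. The 4-by-4 RMS matrix is invented, and single-linkage merging with a fixed cutoff is one common choice; the talk does not say which clustering method or cutoff was used.

```python
import math

def row_distances(rms):
    """Treat each row of the RMS matrix as a coordinate vector and
    return Euclidean distances between all pairs of rows."""
    n = len(rms)
    return [[math.dist(rms[i], rms[j]) for j in range(n)] for i in range(n)]

def single_linkage(dist, cutoff):
    """Merge clusters whose closest members are within cutoff.

    Cutting a dendrogram at one height gives the same families.
    """
    clusters = [[i] for i in range(len(dist))]
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                gap = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if gap <= cutoff:
                    clusters[a] = sorted(clusters[a] + clusters[b])
                    del clusters[b]
                    merged = True
                    break
            if merged:
                break
    return clusters

# Hypothetical RMS fit values (angstroms) for four conformers.
rms = [
    [0.0, 0.2, 2.0, 2.1],
    [0.2, 0.0, 1.9, 2.0],
    [2.0, 1.9, 0.0, 0.3],
    [2.1, 2.0, 0.3, 0.0],
]
families = single_linkage(row_distances(rms), cutoff=1.0)
```

Here conformers 0 and 1 form one family and 2 and 3 another, so one representative from each would be carried forward.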

00:45:50 Now, I've told you how we can

00:45:52 generate structures with distance geometry

00:45:54 and a simple way for classifying the random structures

00:45:56 into unique families.

00:45:58 I'd like to tell you briefly about the ensemble

00:46:00 approach for pharmacophore modeling,

00:46:02 developed by Scott Dixon and Bob

00:46:04 Sheridan at Lederle. It's also listed

00:46:06 in the references. Pharmacophore,

00:46:08 to a medicinal chemist,

00:46:10 means the set of atoms or functional groups that are

00:46:12 required for biological activity at a receptor.

00:46:14 A simple

00:46:16 example of a pharmacophore might be a basic

00:46:19 nitrogen, a phenolic hydroxyl group,

00:46:21 and an aromatic ring

00:46:23 with specific geometric relationships between

00:46:25 them.

00:46:27 Once we've come up with an idea for what a pharmacophore

00:46:29 might be, we'd then like to be able to take a whole

00:46:31 series of active molecules and ask

00:46:33 how might they bind to the receptor?

00:46:35 To answer that question, we try to superimpose

00:46:37 the molecules such that their common

00:46:39 pharmacophoric groups would overlap.

00:46:41 Ordinarily, this would be a

00:46:43 very complicated problem because you have

00:46:45 six degrees of freedom

00:46:47 for each one of the molecules, three

00:46:49 rotational and three translational,

00:46:51 plus all their internal degrees of freedom due to their

00:46:53 bond rotations.

00:46:55 So the combinatorial possibilities of searching

00:46:57 all orientations and conformations are enormous.

00:46:59 And this is a complicated

00:47:01 problem with conventional methods.

00:47:03 But in distance geometry, it becomes quite simple.

00:47:05 Rather than putting one molecule

00:47:07 into the distance bounds matrix, we put several

00:47:09 into it at once. The chart shown

00:47:11 here shows a distance bound

00:47:13 matrix that now contains three molecules.

00:47:15 The cross-hatched areas show where

00:47:17 the intramolecular constraints are

00:47:19 and the clear areas show the

00:47:21 intermolecular distance bounds.

00:47:23 We set the lower intermolecular bounds

00:47:25 to zero, which allows the molecules to

00:47:27 pass through each other and superimpose.

00:47:29 And we set the upper bounds

00:47:31 to force the specific atoms that define

00:47:33 the pharmacophore common to the three molecules

00:47:35 to superimpose.

00:47:37 Then we randomly sample

00:47:39 from the discrete distance bounds

00:47:41 sorry, from the distance bounds to get

00:47:43 discrete distances and convert

00:47:45 them into three-dimensional coordinates.

00:47:47 In this way, we generate random

00:47:49 conformations of each of the molecules

00:47:51 subject to the constraint that their common

00:47:53 pharmacophoric atoms must superimpose.

00:47:55 And we can determine both whether the

00:47:57 proposed pharmacophore model

00:47:59 is even possible, and if so,

00:48:01 how unique is it?

00:48:03 Is there just one solution that we might have

00:48:05 some faith in and decide to use

00:48:07 to actually design some new molecules?

00:48:09 Or are there tens or possibly hundreds

00:48:11 meaning that we have a very underdetermined,

00:48:13 ill-defined problem?
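
[Editor's sketch, not part of the talk.] The combined bounds matrix for the ensemble approach might be assembled as below. The function name, the 0.3 angstrom superposition tolerance, and the 50 angstrom default intermolecular upper bound are assumptions made for illustration only.

```python
def ensemble_bounds(molecules, matches, tolerance=0.3, far=50.0):
    """Build one bounds matrix holding several molecules at once.

    molecules: list of (lower, upper) square matrices, one pair per molecule.
    matches:   list of tuples, each giving one atom index per molecule;
               the atoms in a tuple are required to superimpose.
    """
    offsets, n = [], 0
    for lower, _ in molecules:
        offsets.append(n)
        n += len(lower)

    # Intermolecular defaults: lower bound 0 lets molecules pass through
    # each other; upper bound "far" leaves them otherwise unconstrained.
    lo = [[0.0] * n for _ in range(n)]
    hi = [[far] * n for _ in range(n)]

    # Intramolecular blocks are copied from each molecule's own bounds.
    for (lower, upper), off in zip(molecules, offsets):
        for i in range(len(lower)):
            for j in range(len(lower)):
                lo[off + i][off + j] = lower[i][j]
                hi[off + i][off + j] = upper[i][j]

    # Pharmacophore atoms: a small upper bound forces superposition.
    for group in matches:
        idx = [off + atom for off, atom in zip(offsets, group)]
        for p in idx:
            for q in idx:
                if p != q:
                    hi[p][q] = tolerance
    return lo, hi

# Two hypothetical diatomic "molecules" with a fixed 1.5 angstrom bond.
bond = ([[0.0, 1.5], [1.5, 0.0]], [[0.0, 1.5], [1.5, 0.0]])
lo, hi = ensemble_bounds([bond, bond], matches=[(0, 1)])
```

Sampling distances from this matrix and embedding them, as for a single molecule, would then yield conformations of all molecules with the matched atoms overlaid.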

00:48:15 The next chart shows a simple example

00:48:17 of this application that's taken from the

00:48:19 J. Med. Chem. reference of Sheridan and Dixon,

00:48:21 with four ligands that bind to the nicotinic

00:48:23 receptor. They used a

00:48:25 three-point pharmacophore overlapping

00:48:27 basic nitrogens

00:48:29 and then a bond dipole. In three of the

00:48:31 molecules, it's a

00:48:33 carbonyl group. In the fourth, it's taken from a

00:48:35 dummy atom at the center of the pyridine ring

00:48:37 to the pyridine nitrogen. The dashed

00:48:39 lines on the chart show the constraints used

00:48:41 to hold the corresponding atoms together.

00:48:43 On the first

00:48:45 two of the next color slides,

00:48:47 we see the four structures

00:48:49 with the colored spheres now

00:48:51 highlighting the pharmacophoric groups.

00:48:53 And in the next view,

00:48:55 we see one possible superimposition

00:48:57 of them in

00:48:59 a potential solution to the 3D

00:49:01 pharmacophore problem.

00:49:03 The individual structures are shown in the four

00:49:05 corners, and then all four of them are

00:49:07 superimposed based on those pharmacophore constraints

00:49:09 in the center.

00:49:11 Finally, with all

00:49:13 this as background, I'll get to my actual

00:49:15 title, which is Using Distance Geometry to Dock

00:49:17 Small Molecules into Proteins.

00:49:19 This is a modified distance geometry

00:49:21 approach that's optimized

00:49:23 specifically for docking.

00:49:25 The idea is that we're going to generate random

00:49:27 fits of conformationally flexible ligands

00:49:29 into a rigid binding site.

00:49:31 Next, we'll rank those dockings

00:49:33 with a simple molecular mechanics interaction

00:49:35 energy. Our goal is to rapidly

00:49:37 search a large chemical database to find

00:49:39 the best structures to fit a given site model

00:49:41 or receptor. The difference

00:49:43 between this and other three-dimensional search

00:49:45 approaches currently used is that

00:49:47 we allow the ligand to be conformationally flexible

00:49:49 and we aren't restricted to a

00:49:51 fixed ligand geometry.

00:49:53 I'll show you how this works with an example of

00:49:55 docking methotrexate,

00:49:57 an antitumor drug, to

00:49:59 L. casei dihydrofolate reductase,

00:50:01 solved by Kraut and Matthews at UCSD

00:50:03 about ten years ago.

00:50:05 Methotrexate has many rotatable bonds

00:50:07 so it makes a challenging docking problem.

00:50:09 We start out by

00:50:11 finding the geometric center of the binding site

00:50:13 and defining a sphere of sufficient

00:50:15 volume, shown in this view in yellow,

00:50:17 to enclose all the molecular

00:50:19 surface of the binding site.

00:50:21 We'll constrain the ligand,

00:50:23 the methotrexate, shown in red

00:50:25 in this view, to lie inside of the sphere

00:50:27 and outside of the protein

00:50:29 and then generate random conformations

00:50:31 of the methotrexate in the sphere

00:50:33 such that it doesn't bump into the protein.

00:50:35 We've put a lot of effort into doing this as rapidly

00:50:37 as possible and I'll briefly show

00:50:39 you the approach in the next few slides.

00:50:41 In the next view, we see

00:50:43 the molecular surface of the

00:50:45 dihydrofolate reductase active site,

00:50:47 calculated using Mike Connolly's program

00:50:49 which rolls a probe sphere over the

00:50:51 surface of the protein and lays down

00:50:53 a series of dots wherever the

00:50:55 probe sphere has a point of tangency with the protein.

00:50:57 The surface that we

00:50:59 actually use in our docking

00:51:01 calculations is shown in the next

00:51:03 view. It's what's called an

00:51:05 extra-radius surface

00:51:07 which we generate by adding one van der Waals

00:51:09 radius to every atom in the protein

00:51:11 and then calculating the surface at this

00:51:13 extra-radius. You can see that this surface

00:51:15 collapses down just onto the

00:51:17 volume shown by the simple stick model

00:51:19 of the methotrexate.

00:51:21 In the next step, we pack a set of

00:51:23 spheres into this extra-radius

00:51:25 surface using the algorithm described

00:51:27 by Kuntz in the 1982

00:51:29 JMB

00:51:31 paper listed in the references.

00:51:33 That's shown on the next slide where

00:51:35 the yellow region is the union of the

00:51:37 set of the 48 spheres that were used

00:51:39 to fill the extra-radius surface

00:51:41 that collectively define the shape of the binding site.

00:51:43 We use

00:51:45 these spheres as an additional constraint on the

00:51:47 distance geometry refinement to

00:51:49 accelerate convergence.

00:51:51 We actually constrain each

00:51:53 ligand atom to lie in one or more of these

00:51:55 spheres shown in the yellow region and

00:51:57 in doing so, we'll of course force the ligand to

00:51:59 lie in the binding site.
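
[Editor's sketch, not part of the talk.] The sphere constraint can be written as a penalty that is zero when an atom lies inside at least one site sphere. The squared-distance form of the penalty and the two example spheres are assumptions; the talk does not give the actual error function.

```python
import math

def sphere_violation(atom, spheres):
    """How far an atom lies outside its nearest site sphere.

    spheres is a list of (center, radius). The result is 0.0 when the
    atom is inside one or more spheres.
    """
    return min(max(0.0, math.dist(atom, center) - radius)
               for center, radius in spheres)

def ligand_penalty(atoms, spheres):
    """Sum of squared violations over all ligand atoms."""
    return sum(sphere_violation(a, spheres) ** 2 for a in atoms)

# Two hypothetical site spheres and a two-atom ligand.
spheres = [((0.0, 0.0, 0.0), 1.0), ((2.0, 0.0, 0.0), 1.0)]
ligand = [(0.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
penalty = ligand_penalty(ligand, spheres)
```

A refinement step would add a term like this to the usual bound-violation error, so that minimizing the total pulls stray atoms back into the site.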

00:52:01 What kind of dockings do we generate and how long

00:52:03 does it take to get them?

00:52:05 In the next view, we see one of the best

00:52:07 dockings out of 100 random trials

00:52:09 ranked by molecular mechanics

00:52:11 interaction energy. You can see that

00:52:13 it's fairly close to the actual crystal structure of

00:52:15 methotrexate which is the structure

00:52:17 shown in red in this view.

00:52:19 We've got the location

00:52:21 of the pteridine ring correct. You can

00:52:23 see that the arrow is pointing to the superimposed

00:52:25 green and red pteridine rings here

00:52:27 but we haven't quite got the rotatable

00:52:29 bonds in the glutamate portion

00:52:31 of the molecule right.

00:52:33 Still, this is pretty encouraging since we aren't using

00:52:35 any energetic terms in the docking generation

00:52:37 or refinement, only in the ranking.

00:52:39 In the next view,

00:52:41 we see an example

00:52:43 of how unbiased the docking actually is

00:52:45 in that we've now got approximately the same

00:52:47 binding mode but the pteridine ring

00:52:49 is flipped over by 180 degrees

00:52:51 which is in fact the way the natural substrate

00:52:53 folic acid binds. Methotrexate

00:52:55 binds with that ring upside down.

00:52:57 In the next view,

00:52:59 we see a completely alternate binding

00:53:01 mode which although it probably isn't a reasonable

00:53:03 one for methotrexate, it

00:53:05 shows another part of the site that could accept a group

00:53:07 of similar size to the pteridine ring system

00:53:09 and actually has groups that could hydrogen

00:53:11 bond to it which could suggest

00:53:13 ideas for new analogs.

00:53:15 How long does it all take?

00:53:17 Approximately

00:53:19 10-20% of the fits that we generate

00:53:21 have reasonable binding modes

00:53:23 which means that they're in tight contact

00:53:25 with the enzyme surface.

00:53:27 However, only about 1-5% of them are

00:53:29 close to the crystallographic result

00:53:31 so our overall throughput

00:53:33 of good dockings is rather low.

00:53:35 About 70% of these random trials

00:53:37 converge for an average of about 1.5

00:53:39 seconds per fit on a Cray.

00:53:41 While this is very fast,

00:53:43 it's clearly not fast

00:53:45 enough to search through tens or hundreds

00:53:47 of thousands of structures.

00:53:49 Still, this is encouraging given that we're searching with

00:53:51 complete conformational freedom in the ligand

00:53:53 and that fast computing is rapidly becoming much cheaper.
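
[Editor's sketch, not part of the talk.] A rough estimate shows why 1.5 seconds per fit is too slow for large databases. The figure of 100 trials per structure is an assumption borrowed from the 100-trial methotrexate example; the database sizes match the "tens or hundreds of thousands" mentioned.

```python
SECONDS_PER_FIT = 1.5      # quoted in the talk
TRIALS_PER_LIGAND = 100    # assumption, taken from the 100-trial example

def search_days(n_structures, trials=TRIALS_PER_LIGAND,
                seconds=SECONDS_PER_FIT):
    """Wall-clock days to dock every structure with the given trial count."""
    return n_structures * trials * seconds / 86400.0

for n in (10_000, 100_000):
    print(f"{n:>7} structures: about {search_days(n):.0f} days")
```

Under these assumptions the search would take roughly 17 days for 10,000 structures and 174 days for 100,000, on a single processor.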

00:53:55 We've tried alternate

00:53:57 approaches. The one that's

00:53:59 worked the best so far

00:54:01 tries to add a little chemical information to the docking

00:54:03 by assigning to each atom in the ligand

00:54:05 whether it's polar or nonpolar

00:54:07 or positively or negatively charged.

00:54:09 We don't try to quantify any more

00:54:11 than that, just polar, nonpolar

00:54:13 and whether it's positive or negative.

00:54:15 We do the same for each one of the

00:54:17 small set of 48 spheres

00:54:19 that we used to describe the site

00:54:21 and then we add an additional constraint

00:54:23 that requires each ligand atom

00:54:25 to lie inside a complementary sphere.

00:54:27 If we do that, we now get all of the

00:54:29 methotrexate and dihydrofolate reductase dockings

00:54:31 close to the crystallographic binding mode.

00:54:33 The number of randomly generated

00:54:35 structures that actually converge is decreased

00:54:37 but all of them have the correct binding mode.
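
[Editor's sketch, not part of the talk.] The complementarity constraint can be sketched by labeling atoms and spheres and measuring violations only against spheres of the complementary type. The pairing rule below (nonpolar with nonpolar, polar with polar, positive with negative) is an assumption; the talk says only that each atom must lie in a "complementary" sphere.

```python
import math

# Assumed pairing between ligand atom type and site sphere type.
COMPLEMENT = {
    "nonpolar": "nonpolar",
    "polar": "polar",
    "positive": "negative",
    "negative": "positive",
}

def complementary_violation(atom_xyz, atom_type, spheres):
    """Distance outside the nearest sphere of the complementary type.

    spheres is a list of (center, radius, site_type). Returns infinity
    when the site has no sphere of the required type.
    """
    allowed = [(c, r) for c, r, t in spheres if t == COMPLEMENT[atom_type]]
    if not allowed:
        return math.inf
    return min(max(0.0, math.dist(atom_xyz, c) - r) for c, r in allowed)

# Hypothetical site with one negative and one nonpolar sphere.
site = [((0.0, 0.0, 0.0), 1.0, "negative"),
        ((3.0, 0.0, 0.0), 1.0, "nonpolar")]
```

Restricting each atom to fewer spheres makes the constraints harder to satisfy, which is consistent with the speaker's remark that fewer random structures converge.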

00:54:39 This is probably an artificially

00:54:41 good result because of the nature

00:54:43 of the structures of methotrexate and dihydrofolate

00:54:45 reductase. Methotrexate is a

00:54:47 strongly polar molecule with its

00:54:49 positively charged pteridine end

00:54:51 and its negatively charged glutamate end

00:54:53 and the enzyme site is exactly complementary to it.

00:54:55 So by forcing the ligand

00:54:57 atoms to lie in complementary site spheres

00:54:59 we can hardly miss in this case.

00:55:01 The approach hasn't worked quite as

00:55:03 well on less polar binding sites such as

00:55:05 phospholipase A2.

00:55:07 The eventual goal of this work

00:55:09 is to build new molecules from scratch.

00:55:11 This is a huge combinatorial problem

00:55:13 and it's still not clear what the best approach is.

00:55:15 You'll see some other ideas along

00:55:17 these lines in the papers and the references

00:55:19 from Dean's group.

00:55:21 Here the idea is rather than docking completely

00:55:23 pre-formed structures into the site,

00:55:25 we'll instead try to dock fragments.

00:55:27 These could be phenyl rings, cyclohexyl rings,

00:55:29 amide groups, nitro groups and so on

00:55:31 into the site and then search

00:55:33 for combinations of optimally

00:55:35 docked fragments that could be

00:55:37 assembled together into complete molecules.

00:55:39 From the large number of possible dockings

00:55:41 and the huge number of ways of docking and combining

00:55:43 them, it's clear that the combinatorics here

00:55:45 are enormous and a pruning strategy is going to be required

00:55:47 to keep the possibilities manageable.
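
The need for pruning can be illustrated with a small sketch. It enumerates one docked pose per fragment and abandons a partial combination as soon as two fragments sit too close together. Reducing each pose to a single point and using a steric clash as the pruning test are simplifying assumptions made here, not the method described in the talk; a real scheme would also need to test whether fragments can actually be bonded together.

```python
import math


def clashes(pose_a, pose_b, min_separation):
    """True if two fragment poses (reduced to points) are too close."""
    return math.dist(pose_a, pose_b) < min_separation


def combine(pose_lists, min_separation=3.0):
    """Enumerate one pose per fragment, pruning clashing partial combinations.

    pose_lists: one list of docked positions (x, y, z) per fragment.
    Returns a list of tuples, each holding one pose per fragment.
    """
    results = []

    def extend(partial, depth):
        if depth == len(pose_lists):
            results.append(tuple(partial))
            return
        for pose in pose_lists[depth]:
            # Prune: skip this whole branch if the new pose clashes
            # with any fragment already placed.
            if any(clashes(pose, placed, min_separation) for placed in partial):
                continue
            partial.append(pose)
            extend(partial, depth + 1)
            partial.pop()

    extend([], 0)
    return results


def unpruned_count(pose_lists):
    """Number of combinations if nothing were pruned."""
    return math.prod(len(poses) for poses in pose_lists)
```

The unpruned count is the product of the number of poses per fragment, so it grows multiplicatively with every fragment added; pruning early removes entire subtrees of that product rather than single combinations.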

00:55:49 In summary,

00:55:51 I hope I've shown you a little bit about

00:55:53 distance geometry and shown you that it's a powerful

00:55:55 model building tool with a wide

00:55:57 range of applications beyond

00:55:59 the usual ones for 2D NMR structure

00:56:01 determination that aren't usually

00:56:03 handled by conventional approaches.

00:56:05 Distance geometry is still just a geometric

00:56:07 model builder and doesn't have any energetic terms

00:56:09 and so it can occasionally generate

00:56:11 high energy conformations.

00:56:13 So distance geometry models should be refined

00:56:15 with molecular mechanics and/or dynamics.

00:56:17 I've been

00:56:19 encouraged by the number of recent different approaches

00:56:21 for developing

00:56:23 methods to design new structures.

00:56:25 I think this is where the real challenge of

00:56:27 molecular modeling for drug design lies.

00:56:29 And finally, I'd like to thank Bill Ripken at

00:56:31 DuPont for making it possible for me to do this work,

00:56:33 Gordon Crippen for teaching me

00:56:35 how to do distance geometry and contributing

00:56:37 several important ideas,

00:56:39 and Peter Coleman for being kind enough to invite me here

00:56:41 and let me get warm for a few days.

00:56:43 Thank you.

00:56:57 We are here again to answer your questions

00:56:59 and take your comments. While you are going

00:57:01 to the phone to call in your questions,

00:57:03 I want to alert you to the short break

00:57:05 that will follow the question and answer session.

00:57:07 We will stop for 20 minutes

00:57:09 so you can get a quick bite to eat, maybe

00:57:11 something to drink or just stretch your legs a bit.

00:57:13 So if you're wanting to do any of

00:57:15 those things, please wait for another

00:57:17 15 minutes so that you'll not miss any

00:57:19 of the program. Art, do you have any

00:57:21 comments about Jeff's talk?

00:57:23 Yes, I found it very interesting and

00:57:25 actually I'd like to follow up on

00:57:27 some of the latter material you talked

00:57:29 about. In modeling

00:57:31 the receptor, obviously, you have

00:57:33 to use some good three-dimensional

00:57:35 information. Typically, you model

00:57:37 from an x-ray structure.

00:57:43 There's a larger source

00:57:45 of x-ray structures and that's very good,

00:57:47 but it doesn't include all of the information,

00:57:49 especially the dynamics of

00:57:51 the receptor site.

00:57:53 Can you see any way of folding

00:57:55 in the dynamics of the site

00:57:57 into

00:57:59 the modeling and evaluation of

00:58:01 the energies?

00:58:03 Running the

00:58:05 docking against

00:58:07 a moving target, as you'd

00:58:09 have in dynamics,

00:58:11 is still extremely difficult. You could imagine

00:58:13 generating a dynamics trajectory,

00:58:15 saving it every few steps,

00:58:17 and then generating dockings into those.

00:58:19 A more sensible approach, perhaps,

00:58:21 would be to run a

00:58:23 dynamics trajectory, cluster

00:58:25 those structures that you produced during

00:58:27 the dynamics into families and

00:58:29 then use each one of those different ones

00:58:31 as a target to dock against.
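
The idea of clustering trajectory snapshots into families and docking against one representative per family can be sketched as follows. This is a minimal illustration, not something the speaker reports having done: it assumes the snapshots are already superimposed on a common frame, uses a simple "leader" clustering scheme chosen here for brevity, and all names and the cutoff value are hypothetical.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))


def cluster_snapshots(snapshots, cutoff):
    """Leader clustering of trajectory snapshots.

    Each snapshot joins the first family whose representative lies within
    `cutoff` RMSD; otherwise it starts a new family and becomes its
    representative. Returns a list of (representative, member_indices).
    """
    families = []
    for index, snap in enumerate(snapshots):
        for representative, members in families:
            if rmsd(snap, representative) <= cutoff:
                members.append(index)
                break
        else:
            # No existing family is close enough: start a new one.
            families.append((snap, [index]))
    return families
```

Each representative returned would then serve as a separate rigid docking target, so the cost of docking scales with the number of families rather than with the number of saved snapshots.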

00:58:33 We haven't tried any of this in our work.

00:58:35 At this point, it should be clear

00:58:37 that we're aiming for a pretty low level of resolution

00:58:39 and we're really just trying to get

00:58:41 ideas for structures that could fit well

00:58:43 into a site, which we could

00:58:45 then refine using dynamics

00:58:47 to determine if, in fact, they are

00:58:49 likely to bind tightly.

00:58:51 Do you think that it's going to be a

00:58:53 strong effect?

00:58:55 In other words, that dynamics are going to influence

00:58:57 greatly the

00:58:59 relative order of

00:59:01 the pharmacophores?

00:59:03 I don't think that's clear

00:59:05 and we haven't done that much on it yet.

00:59:07 Okay, we have a caller on the line.

00:59:09 It's Alfred, line 5. Alfred, what is

00:59:11 your question? Yes, this is

00:59:13 Alfred Lowry for Jeff Blaney.

00:59:15 There's a rumor that

00:59:17 your version of the distance

00:59:19 geometry code will

00:59:21 be released by Quantum Chemistry

00:59:23 Program Exchange. Is there a

00:59:25 deadline or a date?

00:59:27 Yeah, as a matter of fact, there is.

00:59:29 We will have that

00:59:31 submitted to QCPE sometime within the next

00:59:33 couple weeks.

00:59:35 Which is longer than we'd expected,

00:59:37 but things often

00:59:39 take longer than you'd expect. So it will be

00:59:41 available.

00:59:43 That's a good question, Al.

00:59:45 These are the real questions that people

00:59:47 want to know. How can I get the code and

00:59:49 how do I run it?

00:59:51 I wanted to ask a follow-up on

00:59:53 the distance geometry approach in general

00:59:55 and that is that its strength is

00:59:57 also, in some sense, its weakness.

00:59:59 It's unbiased

01:00:01 in terms of the areas

01:00:03 of conformational space that it explores,

01:00:05 but that also

01:00:07 means that you're spending computation

01:00:09 in unproductive areas of

01:00:11 conformational space that you know, through other information,

01:00:13 might not be

01:00:15 reasonable.

01:00:17 Do you have a way around

01:00:19 that problem?

01:00:21 Well, I don't think you really do spend much time

01:00:23 in conformational space that's

01:00:25 unreasonable. It depends entirely on

01:00:27 how much information you know about

01:00:29 the problem in advance. Any

01:00:31 information you know, you can provide in the form of

01:00:33 distance constraints. It could be NOE distances or,

01:00:35 in the pharmacophore modeling,

01:00:37 constraints that would overlap atoms on top

01:00:39 of each other. In the latter case,

01:00:41 you clearly don't spend time generating conformers

01:00:43 that can't possibly overlap.

01:00:45 You generate solutions directly.

01:00:47 In straight conformational

01:00:49 analysis, where you might use distance geometry

01:00:51 to generate purely random conformations,

01:00:53 then, clearly,

01:00:55 we may be wasting time generating structures

01:00:57 that are high energy or

01:00:59 degenerate by

01:01:01 symmetry, for example, in the case

01:01:03 of highly symmetric ring structures.
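
The point about supplying known information as distance constraints can be made concrete with a small sketch. In distance geometry, constraints are typically held as lower and upper bounds on each interatomic distance, and the upper bounds are tightened with the triangle inequality before any coordinates are generated. The code below shows that smoothing step for upper bounds only, plus a check of a candidate conformer against the bounds. It is a minimal illustration rather than the speaker's program, and the function names are hypothetical.

```python
import math


def smooth_upper_bounds(upper):
    """Tighten upper bounds using u[i][j] <= u[i][k] + u[k][j].

    upper: symmetric square matrix (list of lists); use float("inf")
    where nothing is known. Returns a new, smoothed matrix.
    """
    n = len(upper)
    u = [row[:] for row in upper]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]
    return u


def within_bounds(coords, lower, upper, tol=1e-9):
    """True if every pairwise distance in coords respects the bounds."""
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d < lower[i][j] - tol or d > upper[i][j] + tol:
                return False
    return True
```

The more constraints that are supplied, the tighter the smoothed bounds become, which is the sense in which added information keeps the search out of unreasonable regions of conformational space.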

01:01:05 I want to interrupt

01:01:07 just a second. Callers, when

01:01:09 you call us, please do not hang up.

01:01:11 Stay on the line. We are not going to hang

01:01:13 up on you. We will tell you

01:01:15 if we can't take your question,

01:01:17 but please don't hang up. I'm sorry.

01:01:19 Please continue.

01:01:21 I guess as a practicing

01:01:23 medicinal chemist, I can ask you this question.

01:01:25 Which

01:01:27 methods do you use most

01:01:29 in terms of

01:01:31 trying to answer this scientific

01:01:33 question that relates to

01:01:35 a product or some

01:01:37 goal outside of your

01:01:39 fundamental research?

01:01:41 My interests are both in

01:01:43 lead generation, trying to come up with

01:01:45 brand new structures that we think would be

01:01:47 active and worth synthesizing,

01:01:49 and also in optimizing leads.

01:01:51 The two problems are somewhat

01:01:53 different. In the latter case of

01:01:55 optimizing a lead, I think

01:01:57 it's a less ambitious problem.

01:01:59 Therefore, it's quite a bit easier.