Friday, April 29, 2005

What’s Wrong with the Force Field?

If people don’t understand it, they won’t trust it.

“I sense a great disturbance in the Force”, Darth Vader says as Obi-Wan Kenobi boards the Super Star Destroyer. I am a believer in the force field that Jedi knights manipulate with miraculous effects. But I have grown wary of the force fields that computational biologists use in their simulations.

To simulate the behavior of a molecular system on a computer, say, the binding of a drug to a protein, or the conformational changes in a signaling molecule, it is crucial to calculate the interactions between the atoms accurately. Although quantum mechanics can compute these interactions in principle, in practice empirical energy functions have to be used due to limited computing power. Many empirical energy functions have been developed and are affectionately referred to as force fields. In the beginning, there were only simple interaction parameters for inert gases, then carbon monoxide, then water. Then there was the need to simulate proteins, and complex force fields were developed. Many different force fields emerged independently, each christened with its acronym. There are AMBER, CHARMM, and GROMOS. There are OPLS-AA and MMFF. There are many more. Each is a cult with followers and they are constantly at war.
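For readers who have never looked inside one, the simplest kind of empirical energy function, the sort used for inert gases, can be sketched in a few lines. This is only an illustration: the 12-6 Lennard-Jones form is a standard textbook choice, and the argon-like numbers below are commonly quoted approximate values that I am assuming for the example, not parameters taken from any of the force fields named above.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy.

    r and sigma share one length unit; the result is in the unit of epsilon.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative argon-like parameters (assumed for this sketch).
EPSILON = 0.238  # well depth, kcal/mol
SIGMA = 3.40     # distance at which the energy crosses zero, angstrom

# The minimum of the 12-6 form sits at r = 2^(1/6) * sigma.
R_MIN = 2.0 ** (1.0 / 6.0) * SIGMA

if __name__ == "__main__":
    for r in (3.0, SIGMA, R_MIN, 5.0):
        energy = lennard_jones(r, EPSILON, SIGMA)
        print(f"r = {r:.2f} A   E = {energy:+.3f} kcal/mol")
```

Two numbers describe the whole interaction here, and each has a plain physical meaning: how deep the attraction is and how big the atom is.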

After decades of development, however, computer simulation is still greeted with skepticism by biologists. Compared to experimental technologies of similar age, this is a curious exception. True, almost everybody uses simulation now. But whenever there is a disagreement between the simulation and the experiment, the simulation result becomes the primary suspect. The Journal of Molecular Biology recently rejected a computational paper with the comment that it “will only accept computational works that AGREE with experimental results.” This reflects the general attitude toward computer simulation: it is only good for verifying experiments, but not good enough to generate independent hypotheses.

Sadly for the computational biologists, this dismissal of their work is largely justified. Currently, computer simulations are riddled with problems, and many a simulation has produced results far from well-established experimental observations. The force fields bear much blame. Everyone agrees that they are not good enough. (At the same time everyone contends that his or her force field is BETTER than others.) Everyone agrees that something has to be done about them. But what is wrong with the force field?

The common opinion in the computational community is that the force field needs to be more accurate. The definition of accuracy, of course, is tricky, and depends on the specific problems to which simulation is applied. So although everyone is talking about improving the force field, most people are doing just that: talking.

I think that the force field has a much more crippling problem than its inaccuracy. That is, it is too complicated! Way too complicated! In the quest for accuracy, an increasing number of parameters has been shoved into the energy functions, so that all present force fields are messy compilations of thousands of parameters. I doubt anyone, including the folks who parameterized the force fields, can correctly remember a tenth of the parameters, or even recollect why these parameters were chosen over others. For example, in most of the force fields, there are many subtypes for each atomic element. Take the OPLS-AA force field. It contains 329 subtypes for carbon atoms, 165 for hydrogen, 76 for nitrogen, and 47 for oxygen. Mendeleev would have turned in his grave.

The complexity of the force field is really the cause of its plight. Worse, it is the complexity without underlying order or reason. In the blind pursuit of accuracy, parameters are fitted ad hoc, without justification. The consequence of the Machiavellian philosophy – “the end justifies the means” – is that only the end remains meaningful. The only useful result from the force field calculation is the total energy. The components, such as van der Waals, hydrogen bond, or electrostatic energies, are only believed by the most faithful.

I was once in a group meeting where a graduate student talked about his calculations of the stability of a structural motif in proteins. He used his results to argue that hydrogen bonds were responsible for the stability. Yet the force field used had nothing to describe hydrogen bonds. I proposed that the stability could come from pure electrostatic interactions and asked him about the charges on the atoms as assigned by the force field. Unsurprisingly, he had no clue. He could be forgiven. How can someone be expected to know these numbers when there are so many?
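To make my proposal concrete, here is a rough sketch of how an attraction that looks like a hydrogen bond can come out of nothing but Coulomb terms between partial charges. Everything specific in it is an assumption for illustration: the charges, the straight-line N-H...O=C geometry, and the vacuum dielectric are made up by me and are not taken from the student's force field or from any published parameter set.

```python
import math

# Converts q1*q2/r (charges in units of e, r in angstrom) to kcal/mol.
COULOMB_K = 332.06


def coulomb_energy(group_a, group_b):
    """Sum the Coulomb energy over all atom pairs between two groups.

    Each group is a list of (partial_charge, (x, y, z)) tuples.
    """
    total = 0.0
    for charge_a, pos_a in group_a:
        for charge_b, pos_b in group_b:
            total += COULOMB_K * charge_a * charge_b / math.dist(pos_a, pos_b)
    return total


# Hypothetical partial charges and geometry, all atoms on the x axis.
donor = [
    (-0.40, (0.0, 0.0, 0.0)),  # N
    (+0.40, (1.0, 0.0, 0.0)),  # H
]
acceptor = [
    (-0.50, (2.9, 0.0, 0.0)),  # O
    (+0.50, (4.1, 0.0, 0.0)),  # C
]

if __name__ == "__main__":
    energy = coulomb_energy(donor, acceptor)
    print(f"N-H...O=C electrostatic energy: {energy:.2f} kcal/mol")
```

With these assumed numbers the sum comes out negative, that is, attractive, even though no term in the function is labeled "hydrogen bond". Whether that is what happened in the student's calculation is exactly the question one cannot answer without knowing the charges.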

Most people use the force field as a black box. Unfortunately, people tend to abuse black boxes. When the simulation works, people over-interpret the results. When it fails, people simply sweep it under the rug. Few try to understand the reason behind the success or failure. To do so one has to open the black box. But it is too messy inside.

Contrast that to experimental techniques. An NMR spectroscopist usually has a good knowledge of the chemical shifts of protons in different environments, and a biologist doing fluorescence experiments knows the absorption band and photon efficiency of the labeling dye. NMR spectroscopy and fluorescence labeling are respectable experimental techniques not because they always give correct results. It is because when they go wrong, the experimenters can understand why.

That is the problem with the force field. People do not know why their simulations succeed or fail. It is like a blind cat trying to catch a dead mouse. The cat may stumble upon the mouse, but the cat does not know why and how to do it again.

In science, the simple always supersedes the complicated. The clear is preferable to the confusing. Inaccuracy in the force field is tolerable as long as the applicability and limitations of the force field are understood and predictable. So instead of pushing for better accuracy by adding atom types and parameters, we should simplify the force fields first. Reduce the number of atom types, reduce the number of independent parameters, and minimize the set of standard data to which to fit the parameters. Make it simple, make it comprehensible, and it will be respectable.

May the force be with us.

Monday, April 18, 2005

Subterranean Magic

Saturday, Yang and I took the subway from Queens into Manhattan. The New York City subway was enjoying its weekend relief from workday congestion, and there were only a handful of passengers in our car. Our ride was roomy and long, so I took the opportunity to show off my card tricks.

It is a challenge to do the tricks for Yang, for she is more interested in catching my sleights than being entertained. This is true for most of my friends. It is the amateur magician’s adversity. When a professional magician performs his art, the audience assumes that his technique is beyond detection, and gives up the attempt to catch his maneuvers. Besides, no one buys the ticket just to shout “wait, he just palmed my card!” Amateur magicians do not have the luxury of a lenient audience. Amateurs beg to perform for their friends, and they delight in any “well done” from the audience. My friends do not watch my tricks for entertaining magic – they go to Vegas or watch television for that – they watch me to catch me. They take it as a puzzle that can be solved. They take me as the weakest and breakable link in the ring of magic.

It was thus no surprise that after I made a card vanish, Yang held my hand and tried to see if I had anything up my sleeves. I ignored her and continued my routine. Taking out the four aces and placing the rest of the deck into my pocket, I was getting ready for my favorite trick of twisting the aces. Then I heard a man’s voice next to me:

“Oh, that is the end of my fun.”

I turned my head and saw a man with dark curly hair and a strongly contoured face, looking to be in his thirties. Sitting next to him was an attractive young Asian woman, with a rather pale complexion and short hair reaching the back of her shoulders. I vaguely remembered this couple sitting across the aisle and a few seats away from us. They must have moved to the adjacent seats after I started my magic.

It was not often that I got a voluntary spectator who enjoyed my magic. I was flattered. I reassured the man: “There is more.”

I did my twisting the aces. When I finished, Yang tried to guess, incorrectly, how I had turned the cards. I looked at the man. He was quite amused by the trick. But he did not show any sign of the puzzlement normally expected from a spectator of magic.

I did a few more tricks. Failing to catch my sleights, Yang lost her interest. It was almost our stop. I was about to put the cards away, when the man asked me:

“Who taught you this?”

I did not hear him very well in the clanking noise of the train, and he had to repeat the question twice. Then I told him that I learned card manipulation from McBride’s DVDs. On hearing the name, the woman casually turned her head to the man and asked:

“McBride. Do you know him?”

The man said:

“Yes, I know him.”

I felt something was not right. I managed to say how great a card manipulator McBride was.

The man took out a card, his business card, and handed it to me.

“You give me a call.”

I did not have to look at the card to guess its content. Printed on the card in plain Arial font was

Richard D. Prestia

Saturday, April 09, 2005

My Operating System

I took a personality test to determine the most fitting OS for me. Not a surprising result.