People have applied machine learning to fitting many-body potentials in physics, with the goal of speeding up quantum molecular dynamics while keeping most of the accuracy. The idea is to write the total energy of the system as a sum of local energies, one per atom R, where the input to each is the local environment of atom R:

E = \sum_R e(env(R))

You build features env(R) that make e() invariant to translation and rotation by construction, with a radial cutoff beyond some distance, and then model e somehow. I think the most promising method is Gaussian Approximation Potentials, which use Gaussian processes to model e and (what they call) a bispectral decomposition to represent the local environment around the atom.

http://prl.aps.org/abstract/PRL/v104/i13/e136403 (free arxiv version: http://arxiv.org/abs/0910.1019)
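
A minimal sketch of the decomposition E = \sum_R e(env(R)) described above, to make the moving parts concrete. It is not the GAP method: the features here are simple Gaussian radial basis functions with a cosine cutoff (standing in for the bispectral features), and e is a linear model with made-up weights (standing in for the Gaussian process). CUTOFF, CENTERS, WIDTH and WEIGHTS are all assumed values chosen for illustration.

```python
import math

CUTOFF = 4.0                # assumed radial cutoff (arbitrary length units)
CENTERS = [1.0, 2.0, 3.0]   # assumed centres of the radial basis functions
WIDTH = 0.5                 # assumed width of each basis function
WEIGHTS = [-1.0, 0.3, 0.1]  # hypothetical "fitted" weights, not real data


def smooth_cutoff(r, rc=CUTOFF):
    """Cosine cutoff: 1 at r = 0, falling smoothly to 0 at r = rc."""
    if r >= rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)


def env(positions, i):
    """Radial features for atom i.

    Only distances to neighbours inside the cutoff enter, so the result
    does not change under translation or rotation of the whole system,
    or under relabelling of the neighbours.
    """
    feats = [0.0] * len(CENTERS)
    for j, pj in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], pj)
        fc = smooth_cutoff(r)
        if fc == 0.0:
            continue  # neighbour is outside the cutoff sphere
        for k, c in enumerate(CENTERS):
            feats[k] += math.exp(-((r - c) / WIDTH) ** 2) * fc
    return feats


def local_energy(feats):
    """Stand-in for e(): a linear model on the features."""
    return sum(w * f for w, f in zip(WEIGHTS, feats))


def total_energy(positions):
    """E = sum over atoms R of e(env(R))."""
    return sum(local_energy(env(positions, i)) for i in range(len(positions)))


atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
print(total_energy(atoms))
```

Because each atom only sees neighbours inside the cutoff, two atoms further apart than CUTOFF contribute nothing to each other's energy in this sketch.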

Without the above simplifications like modelling it as a sum of local potentials, and making env(R) a cutoff, you would indeed just be fitting a 3N-dimensional function in the case of N atoms. It'd be exact, but it'd also blow up pretty badly and be utterly nontransferable. Also, the energy surface isn't necessarily continuous and differentiable -- consider the energy when two atoms move to occupy the same position.

I suspect that the bispectral decomposition to give env(R) could be improved by using unsupervised feature learning to learn better features such as "we're 5 angstrom away from a surface". I've seen talks where people have hand-optimized feature sets to include things like "there's an aromatic ring pointing at us from 5 angstrom away" that a simple function + cutoff might miss.



I think I might have read a different paper on GAP before; this one seems to go into more detail on their philosophy. Thank you very much for the link.

You say that: "Without the above simplifications like modelling it as a sum of local potentials, and making env(R) a cutoff, you would indeed just be fitting a 3N-dimensional function in the case of N atoms. It'd be exact, but it'd also blow up pretty badly and be utterly nontransferable. Also, the energy surface isn't necessarily continuous and differentiable -- consider the energy when two atoms move to occupy the same position."

Those constraints and representations of the problem are the thing I would want the machine learning algorithm to discover. Is this beyond the scope of state-of-the-art machine learning algorithms? I understand that a good choice of representation should make the job of the learning algorithm easier, but finding the best representation is quite challenging.


I think it's possible; it'd just take serious computing power. That said, there are a number of physically guaranteed symmetries that it would be silly to make a machine learn:

a) Permutation symmetry of identical atoms
b) Rotational symmetry of system
c) Translational symmetry of system

I suppose it could learn them approximately, given enough examples, but why bother? I think it'd be kind of like not using a convolutional neural net for recognising digits in photos and just using a bazillion more weights and examples.
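
To illustrate what "building the symmetries in" can look like, here is a small sketch under my own assumptions: the descriptor is just the sorted list of all pair distances, which is unchanged by permutation, rotation and translation by construction. The helper names (pair_distances, translate, rotate_z, permute, same) are hypothetical, and this descriptor is only an illustration, since distinct configurations can share the same set of distances.

```python
import math
import random


def pair_distances(positions):
    """Sorted list of all interatomic distances (a toy invariant descriptor)."""
    n = len(positions)
    return sorted(
        math.dist(positions[i], positions[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def translate(positions, shift):
    """Rigidly shift every atom by the same vector."""
    return [tuple(p + s for p, s in zip(pos, shift)) for pos in positions]


def rotate_z(positions, angle):
    """Rigidly rotate every atom about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in positions]


def permute(positions, rng):
    """Relabel the atoms by shuffling their order."""
    shuffled = list(positions)
    rng.shuffle(shuffled)
    return shuffled


def same(a, b, tol=1e-9):
    """Element-wise comparison of two lists of floats within a tolerance."""
    return len(a) == len(b) and all(abs(x - y) < tol for x, y in zip(a, b))


rng = random.Random(0)
atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.3, 0.2), (0.7, 0.9, 1.5)]
moved = permute(rotate_z(translate(atoms, (2.0, -1.0, 0.5)), 0.7), rng)

# The raw coordinates differ, but the descriptor does not.
print(same(pair_distances(atoms), pair_distances(moved)))
```

A model fed the raw coordinates would need training examples of every shifted, rotated and relabelled copy to learn this; a model fed the descriptor gets it for free.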

I'd say you'd start with a completely general learning model that respects the above symmetries and then see where that takes you.

However, I don't know how you'd make a transferable N-body potential from a model trained only on a fixed number of atoms. Again, I guess it's kind of like training a CNN handwriting recogniser on 256x256 images and then applying it to arbitrarily sized images, which you can only do by assuming locality and translational symmetry of the features.
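
A rough sketch of why the locality assumption gives you transferability, in the same spirit as the CNN analogy. The local term below is a made-up smooth function of neighbour distances inside an assumed cutoff (it is not a fitted potential), and chain() is a hypothetical helper that builds a straight line of equally spaced atoms. The same function can be evaluated on a system of any size, and an atom in the bulk sees the same neighbourhood either way.

```python
import math

CUTOFF = 2.5  # assumed cutoff, in units of the chain spacing


def local_energy(positions, i):
    """Hypothetical local term: depends only on neighbours within CUTOFF."""
    e = 0.0
    for j, pj in enumerate(positions):
        r = math.dist(positions[i], pj)
        if j != i and r < CUTOFF:
            e += math.exp(-r) * (1.0 - r / CUTOFF) ** 2
    return e


def total_energy(positions):
    """Sum of local terms; works for any number of atoms."""
    return sum(local_energy(positions, i) for i in range(len(positions)))


def chain(n, spacing=1.0):
    """Straight chain of n atoms along the x axis."""
    return [(k * spacing, 0.0, 0.0) for k in range(n)]


small, large = chain(8), chain(800)

# An interior atom has the same neighbours within the cutoff in both
# systems, so its local energy is the same regardless of system size.
print(local_energy(small, 4), local_energy(large, 400))
```

Atoms near the ends of the chain have fewer neighbours and so a different local energy, which is the kind of environment the training set would need to cover for the model to transfer.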



