Does chemistry need machine learning?

The Question :

24 people think this question is useful

In many fields of science (e.g. biology, medicine, psychology, statistics, physics), machine learning and artificial intelligence techniques are becoming more and more popular to analyze data. Is it also so in chemistry?

The Question Comments :
  • On there is a nice series of articles on machine learning in chemistry.
  • The main challenge here is the paucity of large open datasets in Chemistry. When you can get the data, machine learning techniques often work.
  • @chrishmorris Welcome to Chemistry.SE! Please take the tour to get familiar with this site. I have converted your answer to a comment, because in this state it does not really fit the philosophy of this site.
  • Try out Chembrows in parallel to your usual literature search, and you may see how well (predictive / suggestive) AI works for you (or not).
  • For one thing, ML is already used in drug development and similar R&D situations.

The Answer 1

23 people think this answer is useful

The short answer is yes. Machine learning, data mining, AI and other techniques are highly useful in chemistry.

I completely agree with Fred’s answer that machine learning, expert systems and statistical analysis in chemistry go back a long time. This is particularly true in analytical chemistry, for example when matching a mass spectrum, NMR spectrum or IR spectrum against a library of known compounds.
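As a rough illustration of the library-matching idea, here is a minimal sketch that scores a measured spectrum against a small library using cosine similarity. The compound names, the intensity values and the shared five-bin m/z grid are all made up for the example, and real search software uses more elaborate scoring and preprocessing than this.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two intensity vectors on a shared m/z grid."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)

def best_library_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query."""
    scored = [(name, cosine_similarity(query, spectrum))
              for name, spectrum in library.items()]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical, made-up intensities on five shared m/z bins.
LIBRARY = {
    "compound_A": [100, 5, 0, 40, 0],
    "compound_B": [0, 80, 60, 0, 10],
    "compound_C": [20, 20, 20, 20, 20],
}

query = [95, 8, 2, 35, 1]  # noisy measurement resembling compound_A
name, score = best_library_match(query, LIBRARY)
print(name, round(score, 3))
```

The best hit here is the library entry whose peak pattern points in nearly the same direction as the query, regardless of overall intensity scale.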

Now I saw your tag “computational chemistry” and there are some newer applications here. Basically the question is whether you can predict properties (e.g., the heat of formation, dipole moment, etc.) that would normally require quantum mechanics.

There are a few groups attempting this, but one of the more successful attempts so far has come from Anatole von Lilienfeld.

The researchers found that with a landscape of more than 5000 molecules, the error for predicting atomisation energies of new molecules drops below 10 kcal/mol, approaching the 5 kcal/mol accuracy of hybrid DFT. ‘Calculating a molecule’s atomisation energy using hybrid DFT would take on average one hour on a single CPU,’ says von Lilienfeld. ‘With machine learning, it’s milliseconds.’
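To make the "train once on expensive results, then predict in milliseconds" idea concrete, here is a toy sketch using kernel ridge regression with a Gaussian kernel. The choice of model is an assumption for illustration (the quote above does not specify one), and the one-dimensional "descriptor" and the Morse-like `toy_energy` function are hypothetical stand-ins for real molecular descriptors and real quantum chemical energies.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two descriptor vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_krr(X, y, sigma, lam):
    """Return regression weights alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, y)

def predict_krr(X_train, alpha, x_new, sigma):
    """Predict the property of x_new as a weighted sum of kernel values."""
    return sum(a * gaussian_kernel(x, x_new, sigma)
               for a, x in zip(alpha, X_train))

def toy_energy(r):
    """Hypothetical stand-in for an expensive reference calculation."""
    return (1.0 - math.exp(-(r - 1.0))) ** 2

# "Expensive" training data: 15 bond lengths between 0.6 and 2.0.
X_train = [[0.6 + 0.1 * i] for i in range(15)]
y_train = [toy_energy(x[0]) for x in X_train]
alpha = fit_krr(X_train, y_train, sigma=0.3, lam=1e-6)

# Cheap prediction at a bond length that was not in the training set.
r_new = 1.25
pred = predict_krr(X_train, alpha, [r_new], sigma=0.3)
print("prediction error:", abs(pred - toy_energy(r_new)))
```

The cost is paid up front when generating the training data; each later prediction is only a short sum over kernel values.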

There are also more conventional approaches that use machine learning, evolutionary algorithms and the like to parameterize force fields, semiempirical quantum mechanical methods, etc.

Basically, if you have a lot of data, machine learning techniques can be effective ways to analyze the data and use it for other purposes.

The Answer 2

14 people think this answer is useful

Yes. It’s not a new thing at all in chemistry; you’ll find papers on chemical AI applications going back four decades or more.

We have massive amounts of data that must be processed quickly. For example, a diode array detector in HPLC can collect large numbers of spectra per minute at hundreds of wavelengths, and we need to use that data to distinguish and possibly identify two closely eluting compounds coming off the column. Intelligent, automated data analysis and pattern recognition is a must. A lot of “artificial nose” sensors have similar requirements; they can use neural nets to answer questions like “are these potatoes infected with dry rot?” We also use expert systems in various areas of analytical chemistry.

Here are some nice overviews by Hugh Cartwright:

Using Artificial Intelligence in Chemistry and Biology: A Practical Guide (CRC Press)

Applications of Artificial Intelligence in Chemistry (Oxford University Press)

Development and Uses of Artificial Intelligence in Chemistry (Reviews in Computational Chemistry, Volume 25, Wiley)

The Answer 3

5 people think this answer is useful

I know I’m a little late to this party, but in the last few years there have been some potentially very important developments in the application of machine learning to chemistry. Two of them apply to molecular dynamics.

First, one potentially obvious observation is that when performing a molecular dynamics simulation, each of the time steps is highly correlated with the previous time step, assuming the time step employed is small enough to calculate any meaningful property. This is a perfect situation for machine learning because it means that after using some of these time steps as a training phase, the subsequent time steps can be simulated with minimal loss of accuracy extremely quickly. This idea, and its application to ab initio molecular dynamics, is discussed in ref. [1]. This is huge because it means with a clever use of machine learning, ab initio MD may be possible on much larger time scales than have been done before.

As an interesting point, one of the keys is that MD simulations are often done because eventually some abnormal, but probably important configuration is sampled. This means that a so-called decision engine must be implemented so that the machine learning can switch off and start learning again from the unusual configuration. This is also described in ref. [1].
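The decision-engine idea can be sketched as follows. This is not the algorithm of ref. [1], only a hypothetical illustration: the surrogate is a deliberately crude nearest-neighbour lookup, the configurations are single numbers, `reference_force` stands in for an expensive ab initio call, and the distance threshold is an arbitrary choice.

```python
def reference_force(x):
    """Hypothetical stand-in for an expensive ab initio force evaluation."""
    return -2.0 * x

class AdaptiveSurrogate:
    """Uses learned data near known configurations, else falls back and learns."""

    def __init__(self, threshold):
        self.threshold = threshold   # how far from known data we still trust the model
        self.training = []           # list of (configuration, force) pairs
        self.reference_calls = 0

    def force(self, x):
        # Decision engine: is x close enough to something already learned?
        if self.training:
            x_near, f_near = min(self.training, key=lambda p: abs(p[0] - x))
            if abs(x_near - x) <= self.threshold:
                return f_near        # trusted region: cheap surrogate answer
        # Unusual configuration: do the expensive calculation and learn from it.
        f = reference_force(x)
        self.reference_calls += 1
        self.training.append((x, f))
        return f

# A made-up trajectory: small steps, then a jump to an unusual configuration.
trajectory = [0.00, 0.01, 0.02, 0.03, 0.50, 0.51, 0.02]
model = AdaptiveSurrogate(threshold=0.05)
forces = [model.force(x) for x in trajectory]
print(model.reference_calls, "reference calls for", len(trajectory), "steps")
```

Only the first step and the jump to 0.50 trigger the expensive calculation; the correlated steps in between are served from what has already been learned.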

The second important development is in the construction of molecular force fields, which are very commonly used in MD simulations. The purpose of these force fields is that they are much faster than solving the Schrödinger equation approximately. Force fields developed from machine learning potentially solve two problems which normal force fields face. First, normal force fields are only valid in the very specific context for which they were developed. Mainly, they are limited by the functional form upon which they were built. Machine learning also has the problem of needing a training set, but machine learning force fields are adaptive and thus able to become more robust upon visiting configurations not previously encountered. Second, machine learning force fields can be extended to new atoms and types of molecules without starting completely from scratch. After all, the same laws of physics apply to different elements, so the force field just needs to learn, approximately, how these same laws apply to the new elements. Basically, it’s faster than starting over from zero. Considerations such as these can be found in ref. [2].

I have also seen a paper at some point which discussed machine learning as a method for providing better guesses of the optimized geometry of a system, which would require fewer iterations using a QM method. Also, I think I have seen a paper about using machine learning for finding new minima on complicated potential energy surfaces such as those of large water clusters. This is related to the optimization point.

Basically, the possibilities are endless.

[1] Botu, V., & Ramprasad, R. (2015). Adaptive machine learning framework to accelerate ab initio molecular dynamics. International Journal of Quantum Chemistry, 115(16), 1074-1083.

[2] Botu, V., Batra, R., Chapman, J., & Ramprasad, R. (2016). Machine learning force fields: construction, validation, and outlook. The Journal of Physical Chemistry C, 121(1), 511-522.

The Answer 4

-5 people think this answer is useful

If we don’t use artificial intelligence in chemistry, then chemistry will be the only field among all not to have used it.
But the truth is that a lot of artificial intelligence has already been deployed in chemistry without our knowing that it is being used.

Artificial intelligence can solve almost all the problems in chemistry.

An interesting article on this can be found on Wikipedia.

Many chemists think that artificial intelligence is tough, but in fact it is very easy. A chemistry person with basic computer knowledge can learn it in 6-7 days and can deploy it in millions of ways in chemistry.

An application of artificial intelligence has been developed by Wiley. It is a software package called ChemPlanner. Its use is that it can plan a synthesis for any organic molecule that a human can think of, in a matter of seconds, using the literature of already existing reactions.
So, with the help of this artificial intelligence software by Wiley, you can plan the synthesis of any organic molecule.
