Vision

Determining molecular or reaction properties via quantum-mechanical calculations or wet-lab experiments can be expensive and time-consuming, which slows progress in developing new catalysts and designing new compounds with desired properties. Machine learning can help to accelerate such investigations. However, efficient and accurate modeling requires tailoring machine learning representations and architectures to the peculiarities of chemistry, such as chirality or long-range interactions. We therefore actively develop new machine learning representations and architectures, working at the interface of computer science, data science, and chemistry. Our vision is to spark and foster the emerging field of reaction machine learning by providing high-quality open-source software and benchmark data.

How to Tailor Deep Learning to Chemical Reactions

We develop and maintain machine learning software for chemical reaction properties, paying particular attention to representing the information in a reaction in a meaningful way and bringing together the insights of chemists and software engineers. We explore and develop different architectures such as graph-convolutional neural networks, graph transformers, and mixtures thereof.
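As a rough illustration of what representing the information in a reaction can mean, the sketch below merges the bond tables of reactants and products into a single graph whose edges carry the bond order before and after the reaction, in the spirit of a condensed graph of reaction. It is a simplified, pure-Python toy under stated assumptions: the function names `condensed_graph` and `changed_bonds`, the bond-table format, and the atom numbering are hypothetical and do not reflect how any particular package implements this.

```python
def condensed_graph(reactant_bonds, product_bonds):
    """Merge two bond tables into one graph with (before, after) bond orders.

    Each bond table maps a pair of atom-map numbers to a bond order.
    A bond order of 0 means the bond is absent on that side of the reaction.
    """
    def normalize(bonds):
        # Store every atom pair in sorted order so (3, 1) and (1, 3) match.
        return {tuple(sorted(pair)): order for pair, order in bonds.items()}

    before = normalize(reactant_bonds)
    after = normalize(product_bonds)
    return {
        pair: (before.get(pair, 0), after.get(pair, 0))
        for pair in sorted(set(before) | set(after))
    }


def changed_bonds(cgr):
    """Return only the edges whose bond order differs between the two sides."""
    return {pair: orders for pair, orders in cgr.items() if orders[0] != orders[1]}


# Toy example (heavy atoms only): CH3Br + OH- -> CH3OH + Br-
# Assumed atom-map numbers: 1 = C, 2 = Br, 3 = O
reactant_bonds = {(1, 2): 1}   # C-Br bond present in the reactants
product_bonds = {(3, 1): 1}    # C-O bond present in the products

cgr = condensed_graph(reactant_bonds, product_bonds)
print(cgr)  # {(1, 2): (1, 0), (1, 3): (0, 1)}
```

Each edge label states whether a bond is broken, formed, or unchanged, so a graph neural network could in principle consume the whole reaction as one graph rather than as separate molecules.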

Selected References:

  • Chemprop: A Machine Learning Package for Chemical Property Prediction. E. Heid, K. P. Greenman, Y. Chung, S.-C. Li, D. E. Graff, F. H. Vermeire, H. Wu, W. H. Green, C. J. McGill. J. Chem. Inf. Model. (2023), 64, 9-17. DOI:10.1021/acs.jcim.3c01250
  • Machine learning of reaction properties via learned representations of the condensed graph of reaction. E. Heid, W. H. Green. J. Chem. Inf. Model. (2022), 62, 2101-2110. DOI:10.1021/acs.jcim.1c00975

Machine Learning for Sustainable Chemistry

Building on our reaction machine learning software, we aim to apply our models to organocatalytic and biocatalytic reactions, with a focus on chemo-, regio-, and stereoselective reactions. Supported by strong experimental collaborations, this project strives to make a real-world impact and to bring deep learning to the lab in an easily accessible fashion.

Selected References:

  • EnzymeMap: Curation, validation and data-driven prediction of enzymatic reactions. E. Heid, D. Probst, W. H. Green, G. K. H. Madsen. Chem. Sci. (2023), 14, 14229-14242. DOI:10.1039/D3SC02048G

Uncertainty Quantification

Machine-learned predictions of chemical properties are only useful if we can estimate the reliability of the model on new test data, possibly even under distribution shifts. We therefore develop best practices as well as uncertainty quantification routines for chemical machine learning, which can necessitate (or enable) different approaches than in other fields of machine learning. For example, when predicting the energy of a molecule as a function of atomic coordinates, we also obtain the atomic forces as the negative gradient of the energy with respect to the atomic positions, which can be used to obtain more reliable uncertainty metrics.
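To make the relation between energies, forces, and uncertainty concrete, here is a minimal sketch using NumPy. It assumes a toy harmonic pair potential in place of a trained model, computes forces as the negative gradient of the energy by central finite differences, and uses the spread across a small ensemble of models (here simply different spring constants) as the uncertainty estimate. The function names, the toy potential, and the choice of ensemble disagreement are illustrative assumptions, not the routines from the publications listed here.

```python
import numpy as np


def energy(positions, k):
    """Toy energy: harmonic springs (rest length 1.0) between all atom pairs."""
    total = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = np.linalg.norm(positions[i] - positions[j])
            total += 0.5 * k * (r - 1.0) ** 2
    return total


def forces(positions, k, h=1e-5):
    """Forces as the negative gradient of the energy (central differences)."""
    result = np.zeros_like(positions)
    for idx in np.ndindex(positions.shape):
        plus = positions.copy()
        minus = positions.copy()
        plus[idx] += h
        minus[idx] -= h
        result[idx] = -(energy(plus, k) - energy(minus, k)) / (2.0 * h)
    return result


def ensemble_uncertainty(positions, ks):
    """Spread of energies (one scalar) and forces (one value per atom)."""
    energies = np.array([energy(positions, k) for k in ks])
    all_forces = np.array([forces(positions, k) for k in ks])
    energy_std = energies.std()
    # Standard deviation per force component, condensed to one norm per atom.
    force_std = np.linalg.norm(all_forces.std(axis=0), axis=1)
    return energy_std, force_std


# Two atoms stretched to a distance of 2.0 along x (rest length is 1.0).
positions = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
ks = [0.8, 1.0, 1.2]  # hypothetical ensemble members
energy_std, force_std = ensemble_uncertainty(positions, ks)
print(energy_std, force_std)
```

The energy yields a single uncertainty value for the whole structure, whereas the forces yield one value per atom, which is one reason force information can give a more detailed picture of where a model is unreliable.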

Selected References:

  • Spatially resolved uncertainties for machine learning potentials. E. Heid, J. Schörghuber, R. Wanzenböck, G. K. H. Madsen. J. Chem. Inf. Model. (2024), 64, 6377-6387. DOI:10.1021/acs.jcim.4c00904
  • Deep Ensembles vs. Committees for Uncertainty Estimation in Neural-Network Force Fields: Comparison and Application to Active Learning. J. Carrete, J. Montes-Campos, R. Wanzenböck, E. Heid, G. K. H. Madsen. J. Chem. Phys. (2023), 158, 204801. DOI:10.1063/5.0146905
  • Characterizing Uncertainty in Machine Learning for Chemistry. E. Heid, C. J. McGill, F. Vermeire, W. H. Green. J. Chem. Inf. Model. (2023), 63, 4012-4029. DOI:10.1021/acs.jcim.3c00373

Cheminformatics

To create and curate molecular or reaction property databases, we apply and develop an extensive set of cheminformatics tools, which use heuristic rules to manipulate molecular graphs or string representations thereof.
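As a small example of a heuristic rule acting on string representations, the sketch below splits a reaction SMILES into its components and strips atom-map numbers with a regular expression, a typical cleaning step when curating reaction data. It uses only the Python standard library; the helper names `strip_atom_maps` and `split_reaction` are hypothetical, and a production workflow would more likely rely on a dedicated cheminformatics toolkit that parses the molecular graph.

```python
import re

# Inside a bracket atom, ":<digits>" directly before "]" is the atom-map number.
_ATOM_MAP = re.compile(r":\d+(?=\])")


def strip_atom_maps(smiles):
    """Remove atom-map numbers, e.g. "[CH3:1]" becomes "[CH3]"."""
    return _ATOM_MAP.sub("", smiles)


def split_reaction(rxn_smiles):
    """Split "reactants>agents>products" into three lists of molecules."""
    reactants, agents, products = rxn_smiles.split(">")

    def molecules(side):
        # "." separates molecules; an empty side yields an empty list.
        return [mol for mol in side.split(".") if mol]

    return molecules(reactants), molecules(agents), molecules(products)


rxn = "[CH3:1][Br:2].[OH-:3]>>[CH3:1][OH:3].[Br-:2]"
reactants, agents, products = split_reaction(rxn)
print([strip_atom_maps(mol) for mol in reactants])  # ['[CH3][Br]', '[OH-]']
print([strip_atom_maps(mol) for mol in products])   # ['[CH3][OH]', '[Br-]']
```

Note that this rule only edits the string: the bracket atoms remain in bracket form, and no chemical validity check is performed.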

Selected References:

  • EnzymeMap: Curation, validation and data-driven prediction of enzymatic reactions. E. Heid, D. Probst, W. H. Green, G. K. H. Madsen. Chem. Sci. (2023), 14, 14229-14242. DOI:10.1039/D3SC02048G
  • On the Value of Using 3D Shape and Electrostatic Similarities in Deep Generative Methods. G. Bolcato, E. Heid, J. Boström. J. Chem. Inf. Model. (2022), 62, 1388-1398. DOI:10.1021/acs.jcim.1c01535
  • EHreact: Extended Hasse diagrams for the extraction and scoring of enzymatic reaction templates. E. Heid, S. Goldman, K. Sankaranarayanan, C. W. Coley, C. Flamm, W. H. Green. J. Chem. Inf. Model. (2021), 61, 4949-4961. DOI:10.1021/acs.jcim.1c00921

How to Contribute

Interested in joining our interdisciplinary team of chemists and programmers?

For open positions, please contact esther.heid@tuwien.ac.at.