Enhancing scientific discoveries in molecular biology with deep generative models


Generative models provide a well-established statistical framework for evaluating uncertainty and deriving conclusions from large data sets, especially in the presence of noise, sparsity, and bias. Initially developed for computer vision and natural language processing, these models have been shown to effectively summarize the complexity that underlies many types of data and to enable a range of applications, including supervised analysis (such as assigning labels to images), unsupervised tasks (such as dimensionality reduction), and extrapolation analysis (such as de novo generation of artificial images). Following this early success, generative models are now increasingly used in molecular biology, with applications ranging from designing new molecules with properties of interest, to identifying deleterious mutations in our genomes, to making sense of transcriptional variability between single cells. In this review, we provide a brief overview of the technical notions behind generative models and their implementation with deep learning techniques. We then describe several ways in which these models can be used in practice, drawing on recent applications in molecular biology as examples.

Journal article
Molecular Systems Biology