AI for Natural Product Drug Discovery

Abstract

This review calls for wider application of artificial intelligence in natural product research, including genome and metabolome mining, structural characterization, and the prediction of targets and biological activity. It also discusses the critical challenges of creating and maintaining large, high-quality datasets for training algorithms.


Introduction to Specialized Metabolites in Nature

Bacteria, fungi, plants, and animals produce a variety of specialized metabolites, including peptides, polyketides, sugars, terpenes, and alkaloids. These natural products play crucial roles in complex interorganism interactions, acting as signals, weapons, nutrient scavengers, and stress protectors. Although they have long been used as antibiotics, chemotherapeutics, immunosuppressants, and crop protection agents, natural products have fallen out of favor in industry in recent years with the rise of combinatorial chemistry and high-throughput screening.

Biosynthetic Gene Clusters: A Pathway to Drug Discovery

The genes for most metabolite biosynthetic pathways in bacteria and fungi (and some plants and animals) occur as clusters in the genomes of the producing organisms. More than 2,500 of these biosynthetic gene clusters (BGCs) and their products have now been experimentally characterized. This physical clustering has the potential to facilitate the identification of millions of putative new molecular biosynthetic pathways through computational genome analysis, providing a starting point for drug discovery.
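To illustrate why physical clustering helps computational genome analysis, here is a minimal sketch that groups neighboring biosynthetic genes into candidate clusters. Everything in it is an assumption made for illustration: the `Gene` record, the `biosynthetic` annotation flag, the gene names and coordinates, and the `max_gap` and `min_genes` thresholds are hypothetical. Dedicated genome mining tools rely on far richer rules and trained models than this distance heuristic.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int          # start coordinate in base pairs
    end: int            # end coordinate in base pairs
    biosynthetic: bool  # hypothetical flag: gene looks like a biosynthetic enzyme


def group_clusters(genes, max_gap=10_000, min_genes=3):
    """Group biosynthetic genes separated by at most max_gap base pairs."""
    hits = sorted((g for g in genes if g.biosynthetic), key=lambda g: g.start)
    clusters, current = [], []
    for gene in hits:
        # A large gap to the previous biosynthetic gene closes the current group.
        if current and gene.start - current[-1].end > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)
            current = []
        current.append(gene)
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters


# Invented toy genome: three adjacent biosynthetic genes and one isolated one.
genes = [
    Gene("pksA", 1_000, 7_000, True),
    Gene("hyp1", 7_500, 8_200, False),
    Gene("pksB", 8_500, 15_000, True),
    Gene("tailC", 15_400, 16_600, True),
    Gene("rpoB", 200_000, 204_000, False),
    Gene("nrpsX", 500_000, 512_000, True),
]

candidate_clusters = group_clusters(genes)
cluster_names = [[g.name for g in cluster] for cluster in candidate_clusters]
print(cluster_names)  # [['pksA', 'pksB', 'tailC']]
```

The isolated gene `nrpsX` is dropped because it has no biosynthetic neighbors within the gap threshold, which is the intuition behind using co-location as a signal.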

The Role of AI in Predicting Biosynthetic Pathways

AI is currently being used to predict the chemical structures of BGC products from DNA sequences, with key training data drawn from known biosynthetic pathways and their natural products. However, more efficient methods are urgently needed to filter and prioritize the large predicted biosynthetic diversity of natural products in order to identify drug leads.
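One simple way to think about prioritization is to rank predicted clusters by how different they are from clusters that are already characterized. The sketch below is a hedged illustration, not a method described in the review: it assumes each BGC is summarized as a set of protein domain labels, and it defines novelty as one minus the highest Jaccard similarity to any known cluster. The cluster names and domain sets are invented.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_novelty(candidates, known):
    """Return (name, novelty) pairs sorted from most to least novel."""
    scored = []
    for name, domains in candidates.items():
        closest = max((jaccard(domains, k) for k in known.values()), default=0.0)
        scored.append((name, round(1.0 - closest, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical domain content of characterized clusters.
known = {
    "known_pks": {"KS", "AT", "ACP", "KR"},
    "known_nrps": {"C", "A", "PCP", "TE"},
}

# Hypothetical predicted clusters to prioritize.
candidates = {
    "bgc_1": {"KS", "AT", "ACP", "KR"},     # identical to known_pks
    "bgc_2": {"C", "A", "PCP", "E", "MT"},  # partly overlaps known_nrps
    "bgc_3": {"LanB", "LanC", "ABC"},       # shares nothing with known clusters
}

ranking = rank_by_novelty(candidates, known)
print(ranking)  # [('bgc_3', 1.0), ('bgc_2', 0.5), ('bgc_1', 0.0)]
```

A learned model would replace the hand-written score in practice, but the ranking step itself stays the same: score every candidate, then sort.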

Figure 1: Applications of artificial intelligence in natural product and drug discovery

Figure 2: Examples of natural product molecules discovered using AI

Examples include the discovery of the new antibiotic halicin using the Chemprop algorithm; the prediction of the structures of rivulariapeptolides and symplocolide A from complex microbial extracts using a convolutional neural network; and the discovery of pristinin A3 by mining whole-genome information with a support vector machine (SVM).

Figure 3: Prediction of bioactivity and macromolecular targets based on genomic, metabolomic, and phenotypic data

Figure 4: Commonly used molecular representations of natural products, including pharmacophores, molecular fingerprints, SMILES, 3D dynamics, and intermolecular interactions
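To make the idea of a molecular fingerprint concrete, the following toy sketch hashes short substrings of a SMILES string into a fixed number of bit positions and compares molecules by Tanimoto similarity. It is deliberately simplified and not chemically rigorous: `toy_fingerprint`, the 256-bit size, and the substring length are assumptions chosen for illustration. Real workflows would typically use a cheminformatics toolkit that parses the molecular graph, for example circular fingerprints as implemented in RDKit.

```python
import zlib


def toy_fingerprint(smiles, n_bits=256, max_len=3):
    """Hash every substring of length 1..max_len to a set of 'on' bit positions."""
    bits = set()
    for size in range(1, max_len + 1):
        for i in range(len(smiles) - size + 1):
            fragment = smiles[i:i + size]
            # crc32 is deterministic across runs, unlike Python's built-in hash().
            bits.add(zlib.crc32(fragment.encode("ascii")) % n_bits)
    return bits


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


ethanol = toy_fingerprint("CCO")
propanol = toy_fingerprint("CCCO")
benzene = toy_fingerprint("c1ccccc1")

print("ethanol vs propanol:", tanimoto(ethanol, propanol))
print("ethanol vs benzene:", tanimoto(ethanol, benzene))
```

Because every substring of `CCO` also occurs in `CCCO`, the ethanol bits are a subset of the propanol bits, so the two alcohols score as similar. Hash collisions can inflate similarity between unrelated molecules, which is one reason production fingerprints use more bits and graph-aware fragments.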

Figure 5: Storing and sharing natural product data: infrastructure and incentives
