Integrating multiomics data to revolutionize the identification of bioactive compounds
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a rapidly expanding class of natural products with diverse biological activities and pharmaceutical potential.
Traditional RiPP discovery is labor-intensive, and conventional screening approaches often miss novel compounds. Integrating multiomics data offers an opportunity to accelerate the discovery process.
DeepRiPP addresses these challenges with a computational framework that combines genomics, transcriptomics, and metabolomics data with machine learning algorithms to predict and prioritize novel RiPPs for experimental validation.
DeepRiPP integrates multiple data types including genomic sequences, transcript expression profiles, and mass spectrometry data to create a comprehensive profile of potential RiPP biosynthetic gene clusters.
Advanced neural networks and ensemble methods are employed to identify patterns indicative of RiPP biosynthesis and predict novel compounds with high confidence.
DeepRiPP represents the first framework to systematically integrate multiomics data with deep learning for RiPP discovery, significantly outperforming previous computational methods.
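As an illustration of what building a per-cluster profile from several data types could look like, the following minimal sketch joins genomic, transcriptomic, and mass spectrometry records by a shared cluster identifier. It is not DeepRiPP's actual implementation: the `ClusterProfile` class, the `integrate` function, and all field names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClusterProfile:
    """Hypothetical container for one candidate RiPP biosynthetic gene cluster."""
    cluster_id: str
    precursor_sequence: Optional[str] = None   # from genomic data
    expression_level: Optional[float] = None   # from transcript profiles
    ms_peaks: List[float] = field(default_factory=list)  # from mass spectrometry


def integrate(
    genomic: Dict[str, str],
    transcriptomic: Dict[str, float],
    metabolomic: Dict[str, List[float]],
) -> Dict[str, ClusterProfile]:
    """Join the three data layers by cluster ID.

    Assumption for this sketch: only clusters present in the genomic data
    are kept; records from the other layers without a genomic match are ignored.
    """
    profiles = {
        cid: ClusterProfile(cid, precursor_sequence=seq)
        for cid, seq in genomic.items()
    }
    for cid, level in transcriptomic.items():
        if cid in profiles:
            profiles[cid].expression_level = level
    for cid, peaks in metabolomic.items():
        if cid in profiles:
            profiles[cid].ms_peaks = sorted(peaks)
    return profiles
```

Keeping the genomic layer as the anchor is a design choice made for this sketch; a real pipeline might instead retain unmatched spectra for later dereplication.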
Multiomics data is collected from various sources and standardized for analysis. This includes genome sequencing, RNA-seq data, and metabolomic profiles.
Key features are extracted from the integrated datasets, including sequence motifs, expression patterns, and spectral signatures.
Deep learning models are trained on known RiPPs to recognize patterns associated with biosynthesis and bioactivity.
The trained models predict novel RiPPs and rank them based on confidence scores and potential bioactivity.
Top candidates are selected for laboratory validation, significantly reducing the search space for experimentalists.
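The ranking and selection steps described above can be sketched as follows. This is a hedged illustration, not the framework's actual scoring scheme: the function name `rank_candidates`, the averaging of per-model scores as a stand-in for an ensemble, and the linear `bioactivity_weight` are all assumptions introduced for the example.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# (candidate name, per-model confidence scores in [0, 1], predicted bioactivity in [0, 1])
Candidate = Tuple[str, Sequence[float], float]


def rank_candidates(
    candidates: Sequence[Candidate],
    top_k: int = 2,
    bioactivity_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank candidates by a combined score and keep the top_k.

    Assumed scoring (hypothetical): the ensemble confidence is the mean of the
    individual model scores, and the combined score is a weighted sum of that
    confidence and the predicted bioactivity.
    """
    scored = []
    for name, model_scores, bioactivity in candidates:
        confidence = mean(model_scores)
        combined = (1 - bioactivity_weight) * confidence + bioactivity_weight * bioactivity
        scored.append((name, combined))
    # Highest combined score first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    demo = [
        ("candidate_A", [0.9, 0.8], 0.2),
        ("candidate_B", [0.6, 0.7], 0.9),
        ("candidate_C", [0.3, 0.4], 0.1),
    ]
    for name, score in rank_candidates(demo):
        print(f"{name}: {score:.3f}")
```

Only the `top_k` candidates are passed on for laboratory validation, which is how a ranking step of this kind narrows the search space.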
DeepRiPP has demonstrated a significant increase in the rate of novel RiPP discovery compared to traditional methods.
The automated workflow reduces the time from data collection to candidate identification by over 70%.
Discovery of 3 novel antimicrobial peptides from soil microbiome data.
Identification of 2 RiPPs with promising anticancer activity.
Prediction and validation of a novel protease inhibitor RiPP.
Accelerated identification of novel therapeutic compounds with potential applications in treating infectious diseases, cancer, and metabolic disorders.
Engineering of novel RiPPs for industrial applications including biocatalysis, biomaterials, and agricultural products.
Uncovering new biological mechanisms and expanding our understanding of natural product biosynthesis.
Future developments will focus on expanding the framework to include additional data types, improving prediction accuracy through advanced AI models, and developing user-friendly interfaces for broader adoption.