CONTRA-IL6 framework accurately predicts IL-6-inducing peptides using protein language models
Background
Interleukin-6 (IL-6) is a crucial immunomodulatory cytokine involved in diverse physiological and pathological conditions, including autoimmune diseases, cancers, and cytokine storms. Peptides capable of inducing IL-6 expression are key modulators of host immune responses, making them promising candidates for therapeutic design and epitope-based vaccine development. However, experimentally identifying these IL-6-inducing peptides is laborious and not suitable for large-scale screening. Existing computational methods often fail to capture both global contextual semantics and local motif-level features essential for accurate prediction of peptide immunogenicity.
Study Design
Researchers developed CONTRA-IL6, a novel deep learning framework designed to predict IL-6-inducing peptides. The framework integrates Transformer fusion and convolutional localization modules with stacked pretrained protein language model embeddings. They performed comprehensive benchmarking on an independent dataset, comparing CONTRA-IL6's predictive performance against six state-of-the-art predictors. The study also employed feature space visualizations (uniform manifold approximation and projection, kernel density estimation), 1D gradient-weighted class activation mapping++, and in silico mutagenesis to interpret model decisions and confirm functional importance. Ablation studies further assessed the contribution of global and local modules.
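The in silico mutagenesis step described above can be illustrated with a minimal sketch. It assumes a generic scoring function that maps a peptide sequence to a predicted probability of IL-6 induction. The names `mutagenesis_scan`, `positional_sensitivity`, and `toy_score` are hypothetical and are not taken from CONTRA-IL6. `toy_score` is a made-up stand-in for a trained model and encodes no real biology.

```python
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutagenesis_scan(
    peptide: str, score: Callable[[str], float]
) -> List[Tuple[int, str, str, float]]:
    """Score every single-residue substitution of `peptide`.

    Returns (position, wild_type, mutant, score_delta) tuples, where
    score_delta = score(variant) - score(original peptide).
    """
    baseline = score(peptide)
    effects = []
    for pos, wild_type in enumerate(peptide):
        for mutant in AMINO_ACIDS:
            if mutant == wild_type:
                continue  # skip the identity "mutation"
            variant = peptide[:pos] + mutant + peptide[pos + 1:]
            effects.append((pos, wild_type, mutant, score(variant) - baseline))
    return effects


def positional_sensitivity(
    effects: List[Tuple[int, str, str, float]], length: int
) -> List[float]:
    """Mean absolute score change per position across all substitutions."""
    totals = [0.0] * length
    counts = [0] * length
    for pos, _, _, delta in effects:
        totals[pos] += abs(delta)
        counts[pos] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def toy_score(peptide: str) -> float:
    """Hypothetical placeholder model (assumption, not CONTRA-IL6):
    the fraction of hydrophobic residues among the last three positions."""
    hydrophobic = set("AILMFVW")
    tail = peptide[-3:]
    return sum(1 for aa in tail if aa in hydrophobic) / 3.0


if __name__ == "__main__":
    peptide = "GSKDELLVF"  # arbitrary example sequence, not from the study
    effects = mutagenesis_scan(peptide, toy_score)
    for pos, value in enumerate(positional_sensitivity(effects, len(peptide))):
        print(pos, peptide[pos], round(value, 3))
```

With a real predictor substituted for `toy_score`, positions with high sensitivity are the ones where substitutions change the prediction most, which is the kind of evidence the study uses to check the regions highlighted by attribution maps.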
Results
CONTRA-IL6 demonstrated superior predictive performance compared to six existing state-of-the-art methods. Notably, it achieved the highest Matthews correlation coefficient (MCC) of 0.504 and an F1 score of 0.549. This represents an improvement of 3.2% in MCC and 4.3% in F1 over the best-performing existing method, indicating balanced and robust performance. Feature space visualizations showed clear class separation between IL-6-inducing and non-inducing peptides. 1D gradient-weighted class activation mapping++ highlighted strong attention to specific C-terminal regions of the peptides. Crucially, in silico mutagenesis causally confirmed the functional importance and physicochemical constraints of these identified regions, moving beyond mere attribution. Ablation studies further validated the synergistic contribution of both global and local modules to the model's overall performance.
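For readers unfamiliar with the two headline metrics, the sketch below computes MCC and F1 from binary confusion-matrix counts using their standard definitions. The counts in the example are invented for illustration and are not the study's data. The function names are hypothetical.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator
    is zero (a common convention when one row or column is empty)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical balanced example (not from the paper).
    print(f1_score(tp=40, fp=10, fn=10))       # 0.8
    print(mcc(tp=40, tn=40, fp=10, fn=10))     # 0.6

    # Hypothetical degenerate classifier that labels everything positive
    # on an imbalanced set: F1 stays above zero, while MCC drops to 0.0.
    print(f1_score(tp=10, fp=90, fn=0))
    print(mcc(tp=10, tn=0, fp=90, fn=0))       # 0.0
```

Unlike F1, MCC uses all four cells of the confusion matrix, including true negatives, which is why it is often reported alongside F1 on imbalanced datasets.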
Key Findings
- CONTRA-IL6 achieved a Matthews correlation coefficient (MCC) of 0.504.
- The framework demonstrated an F1 score of 0.549.
- CONTRA-IL6 improved over the best existing method by 3.2% in MCC and 4.3% in F1.
- Feature space visualizations showed clear class separation for IL-6-inducing peptides.
- In silico mutagenesis causally confirmed the functional importance of C-terminal regions.
Why It Matters
This interpretable deep learning framework could streamline the identification of novel IL-6-inducing peptides in immunoinformatics research. Because CONTRA-IL6 is robust and scalable, researchers can rapidly screen large peptide libraries, accelerating the discovery of candidates for therapeutic design against conditions such as autoimmune diseases and for epitope-based vaccine development. Its interpretability features, such as in silico mutagenesis, provide insight into the structural determinants of IL-6 induction and can guide rational peptide engineering. The tool could substantially reduce the time and cost of experimental validation, making the discovery process more efficient.
il-6
immunomodulation
deep-learning
peptide-prediction
computational-biology
vaccine-development