IDP-ELM: Accurate and Fast Prediction of Intrinsically Disordered Protein
IDP-ELM[1] predicts intrinsically disordered regions (IDRs) and their functions directly from the amino-acid sequence, with no structure or multiple-sequence alignment (MSA) required. By combining multiple protein language models with ensemble learning, it reaches state-of-the-art accuracy (0.8469 AUC on nonredundant CAID) at roughly 0.8 s per sequence. It predicts three per-residue tracks: disorder (IDR), disordered flexible linkers (DFL), and disordered protein binding (DP).
Run a Prediction
Paste one or more sequences in FASTA format or upload a FASTA file, then submit. Each run opens a dedicated, bookmarkable result page with a per-residue disorder profile.
Predictions currently run on a CPU server, so a job may take from a few seconds to a couple of minutes depending on sequence length. The result page shows live progress, can be shared, and is retained for one week.
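Before submitting, it can help to check the FASTA input locally. The following is a minimal sketch, not part of IDP-ELM: the function names `parse_fasta` and `nonstandard_residues` are hypothetical, and restricting the alphabet to the 20 standard amino acids is an assumption, since the set of symbols the server accepts is not stated here.

```python
from typing import Dict, Optional, Set

# Assumption: only the 20 standard amino acids are treated as valid here.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequence lines may be wrapped; append them in upper case.
            records[header] += line.upper()
    return records


def nonstandard_residues(sequence: str) -> Set[str]:
    """Return the symbols in a sequence that are not standard amino acids."""
    return set(sequence) - STANDARD_AA
```

For example, `parse_fasta(">p1\nMKT\nAYI\n")` yields one record whose sequence is the two wrapped lines joined together, and `nonstandard_residues` can then flag symbols such as `X` before the job is submitted.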
How IDP-ELM Works
For each protein language model (PLM), the per-residue representations feed a BiLSTM that predicts secondary structure. Its logits are concatenated with the representations and passed to a BiGRU that predicts disorder (IDR). The IDR logits are in turn fed, together with the representations, to a further BiGRU that predicts the IDR functions (DFL and DP). Finally, the outputs of the per-PLM predictors are averaged to form the ensemble prediction.
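The data flow described above can be sketched in plain Python. This is an illustration of the wiring only, not the published implementation: the trained BiLSTM and BiGRU networks are replaced by arbitrary callables, all names (`concat`, `cascade`, `ensemble_average`) are hypothetical, and an unweighted element-wise mean is assumed for the ensemble step, since the exact averaging scheme is not specified here.

```python
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]          # one row of features per residue
Model = Callable[[Matrix], Matrix]  # stand-in for a trained BiLSTM or BiGRU


def concat(a: Matrix, b: Matrix) -> Matrix:
    """Concatenate two per-residue matrices along the feature axis."""
    if len(a) != len(b):
        raise ValueError("matrices must cover the same residues")
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def cascade(reps: Matrix, ss_model: Model, idr_model: Model,
            func_model: Model) -> Dict[str, Matrix]:
    """Run the three-stage cascade for one PLM's representations."""
    ss_logits = ss_model(reps)                          # secondary structure
    idr_logits = idr_model(concat(reps, ss_logits))     # disorder (IDR)
    func_logits = func_model(concat(reps, idr_logits))  # functions (DFL, DP)
    return {"IDR": idr_logits, "FUNC": func_logits}


def ensemble_average(outputs: Sequence[Matrix]) -> Matrix:
    """Element-wise mean of same-shaped outputs, one per PLM (assumed unweighted)."""
    if not outputs:
        raise ValueError("need at least one predictor output")
    if any(len(o) != len(outputs[0]) for o in outputs):
        raise ValueError("all outputs must cover the same residues")
    n = len(outputs)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*outputs)]
```

In a real system each `Model` would be a trained recurrent network and `reps` would come from a PLM; here they are deliberately left abstract so that only the order of concatenation and prediction is shown.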
Performance
On the nonredundant CAID[2] benchmark, IDP-ELM outperforms existing disorder predictors on AUC, F1, and MCC while needing only the sequence as input. It skips MSA generation, which is the slow and sometimes impossible step for other methods.
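For readers less familiar with these metrics, the sketch below shows how F1 and MCC are computed from binary per-residue labels using their standard definitions. It is not the CAID evaluation code; the function name `f1_and_mcc` is hypothetical, and AUC is omitted because it is computed from ranked scores rather than from thresholded labels.

```python
import math
from typing import Sequence, Tuple


def f1_and_mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute F1 and Matthews correlation coefficient for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four confusion-matrix cells; defined as 0 if any margin is empty.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc
```

MCC is often reported alongside F1 for disorder prediction because disordered residues are usually the minority class, and MCC accounts for true negatives while F1 does not.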
Case Studies
Beyond aggregate metrics, IDP-ELM recovers disordered regions that other predictors miss — capturing the bulk of an IDR with few false positives, even for proteins that resist crystallisation.
Predicted Tracks
Each residue receives three probabilities between 0 and 1, one per track (IDR, DFL, and DP). A value above 0.5 means the residue is predicted to belong to that class.
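As a sketch of how such a track could be post-processed, the function below applies the 0.5 cutoff and merges consecutive positive residues into regions. The name `predicted_regions` and the 1-based inclusive coordinate convention are assumptions made for illustration; the result page may present regions differently.

```python
from typing import List, Sequence, Tuple


def predicted_regions(probs: Sequence[float],
                      threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans where prob > threshold."""
    regions: List[Tuple[int, int]] = []
    start = None
    for position, prob in enumerate(probs, start=1):
        if prob > threshold:
            if start is None:
                start = position          # a new region opens here
        elif start is not None:
            regions.append((start, position - 1))  # region closed by this residue
            start = None
    if start is not None:
        regions.append((start, len(probs)))  # region runs to the last residue
    return regions
```

Note that the comparison is strict, so a residue with a probability of exactly 0.5 is not counted as positive, matching the "above 0.5" rule.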
References
- [1] Xu, S.; Onoda, A. Accurate and Fast Prediction of Intrinsically Disordered Protein by Multiple Protein Language Models and Ensemble Learning. J. Chem. Inf. Model. 2024, 64 (7), 2901–2911. DOI: 10.1021/acs.jcim.3c01202
- [2] Necci, M.; Piovesan, D.; CAID Predictors; DisProt Curators; Tosatto, S. C. E. Critical Assessment of Protein Intrinsic Disorder Prediction. Nat. Methods 2021, 18 (5), 472–481. DOI: 10.1038/s41592-021-01117-3
- [3] Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596 (7873), 583–589. DOI: 10.1038/s41586-021-03819-2
- [4] The PyMOL Molecular Graphics System, Version 2.0; Schrödinger, LLC.
Please contact shijie.xu@ees.hokudai.ac.jp for any questions.
Changelog
- 2026-05-28: Rebuilt the web interface with a live result page and per-residue disorder profiles.
- 2024-05-28: Web server updated.
- 2023-09-12: First release.