RNAcentral
RNAcentral is a comprehensive database of non-coding RNA sequences
All data freely available at rnacentral.org with API access, embeddable widgets, and full database dumps. Huge thanks to all @RNAcentral Consortium members for their contributions!
As always, we welcome feedback! Please get in touch and help us make RNAcentral even better!
05.01.2026 01:28
Major structural change: RNAcentral now groups related transcripts into gene-level entries! Using ML + graph clustering, we built 103,814 human ncRNA genes from 600,225 transcripts. Better reflects biology and enables comparative analyses.
05.01.2026 01:28
LitSumm: large language models for literature summarization of noncoding RNAs
Abstract. Curation of literature in life sciences is a growing challenge. The continued increase in the rate of publication, coupled with the relatively fi
LitSumm uses GPT-4 to generate functional summaries from scientific literature. Currently covers ~4,600 human ncRNAs prioritized by community interest. See AI-powered summaries on any sequence page! Link to LitSumm paper: doi.org/10.1093/data...
05.01.2026 01:28
RNAcentral has grown to 45M ncRNA sequences and now includes 52 expert databases, with 10 new databases and major updates to existing sources.
05.01.2026 01:28
RNAcentral in 2026: genes and literature integration
Abstract. RNAcentral was founded in 2014 to serve as a comprehensive database of non-coding RNA sequences. It began by providing a single unified interface
New RNAcentral paper published in @narjournal.bsky.social! Discover automated literature integration, new expert databases, gene-level entries grouping related transcripts, and more: doi.org/10.1093/nar/...
05.01.2026 01:28
RNAcentral - Exploring non-coding RNAs
We've just updated our RNAcentral Online Tutorial!
www.ebi.ac.uk/training/onl...
This tutorial provides an overview of RNAcentral and covers different ways of accessing and using the data. It's aimed at anyone with an interest in non-coding RNAs.
As always, we welcome your feedback!
05.11.2025 18:07
RNAcentral
Send your feedback about RNAcentral using a webform or submit an issue on GitHub
As always, we welcome feedback! Please get in touch and help us make RNAcentral even better! rnacentral.org/contact
08.10.2025 10:09
RNAcentral Release 26
Official blog of RNAcentral, the non-coding RNA sequence database.
Read more in our blog post: blog.rnacentral.org/2025/10/rnac... or our recent preprint: doi.org/10.1101/2025...
08.10.2025 10:09
Gene identifiers will be stable across releases, even as new transcripts are added. Each gene gets metadata including RNA type and description from expert databases.
Find genes via text search, sequence pages, or download them in our GFF files.
08.10.2025 10:09
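To illustrate the GFF route mentioned in the post above, here is a minimal sketch that pulls gene features out of GFF-style text. It assumes the files follow the standard nine-column, tab-separated GFF3 layout. The sample lines, identifiers, source name and attribute keys are made up for illustration and are not taken from an actual RNAcentral file.

```python
from urllib.parse import unquote

# Hypothetical GFF3-style lines; real RNAcentral files may use different
# feature types, identifiers and attribute keys.
SAMPLE_GFF = """\
##gff-version 3
chr1\tExample\tgene\t100\t900\t.\t+\t.\tID=gene_0001;type=lncRNA
chr1\tExample\ttranscript\t100\t900\t.\t+\t.\tID=tx_0001;Parent=gene_0001
chr2\tExample\tgene\t50\t120\t.\t-\t.\tID=gene_0002;type=pre_miRNA
"""


def parse_attributes(field):
    """Turn the 'key=value;key=value' text of column 9 into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = unquote(value)  # GFF3 percent-encodes reserved characters
    return attrs


def iter_features(text, feature_type="gene"):
    """Yield one dict per feature line whose type (column 3) matches."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        yield {
            "seqid": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "attributes": parse_attributes(cols[8]),
        }


genes = list(iter_features(SAMPLE_GFF))
for gene in genes:
    print(gene["attributes"]["ID"], gene["seqid"], gene["start"], gene["end"], gene["strand"])
```

Passing a different `feature_type` (for example `"transcript"`) would select other rows in the same way; which feature types a given file actually contains should be checked against the file itself.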
We built 103,814 human ncRNA genes from 600,225 transcripts using machine learning + graph clustering. The pipeline was trained on Ensembl/GENCODE data and achieved 99.4% accuracy.
Most genes are lncRNAs (65,187), followed by antisense lncRNAs (16,790) and pre-miRNAs (8,560).
08.10.2025 10:09
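The post above does not describe the pipeline in detail. Purely as a toy illustration of the general idea behind graph clustering, the sketch below treats transcripts as nodes, links pairs that some upstream classifier has judged to belong together, and reports each connected component as one "gene". The transcript names, the links and the choice of connected components are assumptions made for this example, not RNAcentral's actual method.

```python
from collections import defaultdict


def cluster_transcripts(transcripts, links):
    """Group transcripts into connected components using union-find."""
    parent = {t: t for t in transcripts}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for t in transcripts:
        groups[find(t)].append(t)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical input: five transcripts and the pairs predicted to share a gene.
transcripts = ["tx1", "tx2", "tx3", "tx4", "tx5"]
links = [("tx1", "tx2"), ("tx2", "tx3"), ("tx4", "tx5")]

genes = cluster_transcripts(transcripts, links)
print(genes)  # [['tx1', 'tx2', 'tx3'], ['tx4', 'tx5']]
```

For scale, the figures in the post work out to roughly 5.8 transcripts per gene on average (600,225 / 103,814).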
Why genes? Until now, transcripts differing by a single nucleotide were separate entries. For rRNAs, this meant thousands of nearly identical sequences with no established relationship.
Genes bring biological context and make it easier to find all variants of the same RNA.
08.10.2025 10:09
RNAcentral Release 26 is here! This release introduces our biggest structural change yet: gene-level entries for ncRNAs across 204 organisms.
For the first time, you can explore RNA data at the gene level, not just individual sequences.
🧵
08.10.2025 10:09
13/12: Co-author update! @nanonancy.bsky.social was also instrumental in helping make sure the summaries were up to scratch! Thanks Nancy!
07.02.2025 14:59
12/12 Big thanks to our co-authors @afg781.bsky.social, @antonipetrov.bsky.social, @alexbateman1.bsky.social and others! Read the full paper here: doi.org/10.1093/data... #bioinformatics #LLM #AI
07.02.2025 14:38
11/12 But overall, this shows that with careful prompting and checking, LLMs can help address the curation bottleneck in bioinformatics! 🎯
07.02.2025 14:38
10/12 Some limitations: We can only use open-access papers (highlighting the importance of #OpenAccess!), and LLMs sometimes struggle with complex information synthesis.
07.02.2025 14:38
9/12 We've also made our entire dataset of contexts and summaries available:
huggingface.co/datasets/RNA...
07.02.2025 14:38
8/12 Want to try it yourself? Search for RNAs with summaries at:
rnacentral.org/search?q=has...
07.02.2025 14:38
7/12 All summaries are now available through @rnacentral.bsky.social - making it easier than ever to quickly understand what we know about specific RNAs
07.02.2025 14:38
6/12 The results? We generated >4,600 summaries covering ~28,700 RNA transcripts! Expert evaluation showed 94% were rated good or excellent quality.
07.02.2025 14:38
5/12 The key innovation is our multi-stage checking system:
Reference validation
Automated fact-checking
Self-consistency verification
This helps ensure accuracy and proper attribution.
07.02.2025 14:38
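The thread does not show how these checks are implemented. As one hypothetical illustration of the first stage, reference validation, the sketch below checks that every citation marker in a summary points to a paper that was actually supplied as context. The citation format (PMC-style identifiers in square brackets), the function name and the sample data are all assumptions made for this example.

```python
import re

# Assumed citation style for this sketch: identifiers such as [PMC0000001].
CITATION_PATTERN = re.compile(r"\[(PMC\d+)\]")


def validate_references(summary, context_ids):
    """Report which cited identifiers are missing from the supplied context."""
    cited = CITATION_PATTERN.findall(summary)
    unknown = sorted(set(cited) - set(context_ids))
    return {
        "cited": cited,
        "unknown": unknown,
        # A summary passes only if it cites something and every citation is known.
        "ok": bool(cited) and not unknown,
    }


# Placeholder identifiers, not real papers.
context_ids = {"PMC0000001", "PMC0000003"}
summary = (
    "The lncRNA is upregulated in tumours [PMC0000001] "
    "and sponges a miRNA [PMC0000002]."
)

report = validate_references(summary, context_ids)
print(report["ok"], report["unknown"])  # False ['PMC0000002']
```

A summary that fails such a check could be regenerated or flagged for a curator; which of those a real system does is not stated in the thread.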
4/12 Our solution: Use GPT-4 with carefully designed prompts to read scientific papers and generate accurate summaries, complete with proper citations! 🤖
07.02.2025 14:38
3/12 We focused on non-coding RNAs, where the curation gap is particularly acute. Most databases lack good summaries of what each RNA does, making it harder for researchers to quickly understand their function.
07.02.2025 14:38
2/12 Why did we build this? Curation of scientific literature is becoming increasingly challenging. There's a growing gap between publication rates and the number of available curators.
07.02.2025 14:38
1/12 Excited to share our new paper in DATABASE on LitSumm - our system that uses large language models to automatically generate high-quality literature summaries for non-coding RNAs! 🧬
07.02.2025 14:38
Structural biology & bioinformatics @IIMCB, policy-for-science & science-for-policy @PAN_akademia @EMBO @acad_euro
RNA Resources Project Leader at @ebi.embl.org RNAcentral, Rfam
Pediatric neuro-oncologist, Asst Prof U-Michigan | Cancer researcher | RNA & ribosome enthusiast | Broad Institute, DFCI alum | book & music & tea lover | Dad | Views are mine (he/him). https://prensnerlab.org
The Molecular Interactions Database at the EMBL-EBI. Open Source Framework for Molecular Interactions.
https://www.ebi.ac.uk/intact/home
Also follow ➡️ @complexportal.bsky.social
Fungal Biologist with love for pathogens and biotech. Non coding RNAs and chromatin
Trying to make sense of #RNA information.
She/Her. Project leader at @ebi. Developer. STEM ambassador. Creator of the Protein families game. Traveller. Photo. SBK dancer.
InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites.
ebi.ac.uk/interpro/
The Pfam database is a large collection of protein families, domains and repeats. https://www.ebi.ac.uk/interpro/entry/pfam/#table
The European Molecular Biology Laboratory drives visionary basic research and technology development in the life sciences. www.embl.org
Associate Professor at Charles University | Protein & RNA structural bioinformatics |Protein annotation (sequence & structure)
https://bioinformatics.cuni.cz/hoksza/
MA, ChartPR
Deputy Head of Comms at EMBL's European Bioinformatics Institute (EMBL-EBI)
Talk to me about books, F1, science and cheese. Views are my own.
Director of Modified Bases Research at @nanoporetech.com
EMBL-EBI alumni, keen cyclist and father of two.
Views are my own. Reposts are not endorsement.
The Genomics & Bioinformatics Core Facility at the University of Edinburgh
https://genomics.ed.ac.uk/
Bioinformatician at EMBL EBI. Working on making experimental macromolecular complex data more accessible to the scientific community. Also working with RNA structures.
https://orcid.org/0000-0001-6609-3535
We study RNA Biology using a variety of approaches.
medicine.yale.edu/lab/steitz/
The Ensembl project seeks to enable genomic science by providing high-quality, integrated annotation.
Vertebrates: www.ensembl.org
Non-vertebrates: www.ensemblgenomes.org
You can test the new Ensembl browser and share your feedback at beta.ensembl.org
Rfam is a database of ncRNA families. Follow us to hear about new ncRNAs, website updates, and more
Reactome is an open source curated pathway database that provides pathway and network analysis tools for life science researchers.
Developing macromolecular structure resources and tools for life science.
- Founding partner of the wwPDB.
- Manage the community-led PDBe-KB resource and the AlphaFold Protein Structure Database, a collaboration with Google DeepMind.
- Part of EMBL-EBI