jknafou/TransBERT-bio-fr · Hugging Face
Nice momentum for TransBERT: 4 → 54 downloads in a month!
It's encouraging to see that machine-translated corpora can support high-quality domain-specific models for low-resource languages. Looking forward to the next steps. 🗺️
huggingface.co/jknafou/TransBERT-bio-fr
03.12.2025 09:39 — 👍 0 🔁 0 💬 0 📌 0
📣 The call for tutorials & workshops at #ECCB2026 is now open! Share your tools, methods or expertise with the community.
🗓️ Deadline: 5 January 2026
👉 Submit: tinyurl.com/tw-eccb26
💡 ECCB will take place on 31 Aug–4 Sept in Geneva, gathering 1,000+ scientists from academia, industry, & healthcare.
03.11.2025 08:09 — 👍 19 🔁 15 💬 0 📌 4
Looking back on last month’s #SciDataCon2025 session, co-chaired by Julien Gobeill and Wolmar Nyberg Åkerström, on transforming supplementary data into FAIR datasets 🔍🔓🔗♻️
Such an important challenge for the community.
#FAIRdata #OpenScience
20.11.2025 06:11 — 👍 2 🔁 0 💬 0 📌 0
TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling
Julien Knafou, Luc Mottin, Anaïs Mottaz, Alexandre Flament, Patrick Ruch. Findings of the Association for Computational Linguistics: EMNLP 2025. 2025.
Two weeks ago, Julien Knafou presented his work at #EMNLP2025 in Suzhou.
TransBERT is a framework for pre-training language models on synthetically translated text, and TransCorpus is our toolkit for creating large-scale translated corpora.
Try it here:
huggingface.co/jknafou/Tran...
github.com/jknafou/Tran...
19.11.2025 10:41 — 👍 0 🔁 0 💬 0 📌 1
ELIXIR Platforms co-located event - autumn 2025 | ELIXIR
The SIB Text Mining group will join the ELIXIR Data Platform at the ELIXIR Platforms co-located event in Marseille (17–20 Nov 2025)! We look forward to discussions on curated data, FAIR practices, AI-ready datasets, and cross-platform collaboration in biomedical and biodiversity research.
elixir-europe.org/events/elixi...
17.11.2025 17:54 — 👍 0 🔁 0 💬 0 📌 0
Hello world! The SIB Text Mining Group just landed on Bluesky!
We process clinical, scientific and other types of documents to help biomedical / biodiversity experts make sense of complex information.
Expect smart search tools, data science, AI… and plenty of text wizardry 🧙‍♂️✨📝 Stay tuned!
17.11.2025 17:33 — 👍 3 🔁 0 💬 0 📌 0