Congratulations to Assistant Professors @abosselut.bsky.social (IC), @bunnech.bsky.social (IC & SV), and @mschrimpf.bsky.social (IC & SV) for being selected as #AI2050 Early Career Fellows by @schmidtsciences.bsky.social!
Full article: actu.epfl.ch/news/epfl-pr...
11.11.2025 16:48
Chenhao Tan's Homepage - recruiting
Recruiting PhDs & postdocs for:
- Agents "taking over" science (hypogenic.ai)
- Real scientists → AI (e.g., materials, chemistry, physics)
- Theory + incentives for human-AI collaboration & credit (e.g., formalizing tacit knowledge)
New adventures for me; reach out if you can!
chenhaot.com/recruiting.h...
03.11.2025 20:06
If you're interested in doing a postdoc at @icepfl.bsky.social, there's still time to apply for the @epfl-ai-center.bsky.social postdoctoral fellowships.
Apart from this, I'm also recruiting postdocs to develop novel training algorithms for reasoning models and agentic AI.
14.10.2025 17:56
Join us again at the #MELT workshop (room 520D) at #COLM2025 to hear from @ImanolSchlag about #Apertus, the largest multilingual LLM trained on over 1,000 languages.
10.10.2025 15:36
Kicking off the #MELT workshop at #COLM2025 with Monojit Choudhury talking about "Meta-Cultural Competence: What LLMs Should Know About Culture to Serve the Next Billion Users"!
10.10.2025 13:15
Come join us in 520D (all the way down the hall and around the corner) at #COLM2025 for the first workshop on multilingual and equitable language technologies!
10.10.2025 12:53
Very happy this paper got accepted to NeurIPS 2025 as a Spotlight!
Main takeaway: in mechanistic interpretability, we need assumptions about how DNNs encode concepts in their representations (e.g., the linear representation hypothesis). Without them, we can claim that any DNN implements any algorithm!
01.10.2025 15:00
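The linear representation hypothesis mentioned above can be made concrete with a toy linear probe: if a concept is encoded along a single direction in representation space, a plain least-squares classifier can read it out. Everything below (dimensions, cluster offsets, data) is a synthetic sketch, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "hidden states": two concept classes offset along a single
# direction, as the linear representation hypothesis assumes.
d = 16
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
pos = rng.normal(size=(100, d)) + 2.0 * direction
neg = rng.normal(size=(100, d)) - 2.0 * direction
X = np.vstack([pos, neg])
y = np.array([1.0] * 100 + [-1.0] * 100)

# Least-squares linear probe: w = argmin ||Xw - y||^2
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# If the concept really is linear, the probe separates the classes well.
acc = np.mean(np.sign(X @ w) == y)
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy on a concept like this is exactly the kind of evidence such assumptions license; without them, the same activations admit many incompatible readings.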
What's the right unit of analysis for understanding LLM internals? We explore this in our mechanistic interpretability survey (a major update of our 2024 manuscript).
We've added more recent work and more immediately actionable directions for future work. Now published in Computational Linguistics!
01.10.2025 14:03
I don't see why the answer would be no, but since you specifically say "October": what if we submitted to ARR in July and want to make an early submission to ACL 2026?
29.09.2025 20:03
1/ New preprint!
How do #LLMs' inner features change as they train? Using #crosscoders + a new causal metric, we map when features appear, strengthen, or fade across checkpoints, opening a new lens on training dynamics beyond loss curves & benchmarks.
#interpretability
25.09.2025 14:02
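A toy illustration of classifying feature trajectories across checkpoints; the activation values, thresholds, and trajectory labels here are invented for illustration, and a real analysis would use activations from a crosscoder trained jointly across checkpoints plus the paper's causal metric:

```python
import numpy as np

# Hypothetical per-checkpoint mean activation of each crosscoder feature
# (rows: features, columns: training checkpoints).
strength = np.array([
    [0.0, 0.1, 0.6, 0.9],  # feature that appears during training
    [0.8, 0.8, 0.8, 0.8],  # stable feature
    [0.9, 0.5, 0.2, 0.0],  # fading feature
])

def trajectory(s, eps=0.05):
    """Crude label for a feature's strength trajectory across checkpoints."""
    delta = s[-1] - s[0]
    if s[0] < eps and s[-1] > eps:
        return "appears"
    if delta < -eps:
        return "fades"
    if delta > eps:
        return "strengthens"
    return "stable"

labels = [trajectory(s) for s in strength]
print(labels)
```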
Can we optimize LLMs to be more creative?
Introducing Creative Preference Optimization (CrPO) and MuCE (Multi-task Creativity Evaluation dataset).
Result: more novel, diverse, surprising text, without losing quality!
Appearing at #EMNLP2025
22.09.2025 13:43
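One hedged sketch of how creativity-oriented preference pairs might be assembled: rank candidate generations by a crude novelty proxy (distinct bigrams) and treat the most novel one as "chosen". This is an illustration only; CrPO's actual reward signals and data construction may differ:

```python
def distinct_n(text, n=2):
    """Fraction of unique n-grams: a crude proxy for lexical novelty."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

# Hypothetical candidate generations for one prompt.
candidates = [
    "the cat sat on the mat the cat sat on the mat",
    "a moonlit cat rehearsed sonnets on a velvet mat",
]

# Build a preference pair: the more novel response becomes "chosen",
# the repetitive one "rejected"; pairs like this could then feed a
# DPO-style preference-optimization step.
ranked = sorted(candidates, key=distinct_n, reverse=True)
chosen, rejected = ranked[0], ranked[-1]
print(chosen)
```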
Special thanks to everyone who participated in this journey!
03.09.2025 09:26
swiss-ai (Swiss AI Initiative)
Org profile for the Swiss AI Initiative on Hugging Face.
(5) Transparency: We're fully open, pairing our weights with a full suite of reproduction artifacts.
Check out our artifacts and technical report here: huggingface.co/swiss-ai
03.09.2025 09:26
(4) Multilinguality: we pretrain the model on 15T tokens from 1,811 languages, and post-train with 3.8M examples from 149 languages.
03.09.2025 09:26
(3) Memorization prevention: adopting the Goldfish objective, we suppress verbatim recall and reduce memorization risks.
03.09.2025 09:26
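The core idea behind the Goldfish objective is to exclude a deterministic, context-hashed subset of tokens from the training loss, so the model never gets a fully supervised path through any exact passage. A minimal sketch; the hash scheme, the 4-token context window, and `k=4` are illustrative assumptions, not the exact recipe used for Apertus:

```python
import hashlib

def goldfish_mask(tokens, k=4):
    """Mark roughly 1/k of tokens as excluded from the loss, chosen by
    hashing the local context so the same passage always drops the same
    tokens (a sketch of the Goldfish-loss idea)."""
    mask = []
    for i in range(len(tokens)):
        ctx = " ".join(tokens[max(0, i - 3):i + 1])
        h = int(hashlib.sha256(ctx.encode()).hexdigest(), 16)
        mask.append(h % k != 0)  # False = token excluded from the loss
    return mask

tokens = "the quick brown fox jumps over the lazy dog".split()
mask = goldfish_mask(tokens)
print(sum(mask), "of", len(mask), "tokens contribute to the loss")
```

Because the dropped positions are derived from content rather than sampled fresh each epoch, repeated exposures to the same text keep the same supervision gaps, which is what suppresses verbatim recall.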
(2) Data compliance: we pretrained exclusively on openly available data, retroactively respecting robots.txt exclusions and filtering out copyrighted, non-permissive, toxic, and personally identifiable content.
03.09.2025 09:26
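Retroactively honoring robots.txt exclusions can be done with Python's standard-library parser; a minimal sketch with a hypothetical site and rules, not the actual Apertus pipeline:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for a site found in the crawl.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Retroactive filtering: keep only pages the site's rules allow.
pages = ["https://example.com/blog/post", "https://example.com/private/x"]
kept = [url for url in pages if rp.can_fetch("*", url)]
print(kept)
```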
What makes Apertus special?
(1) Scale: Apertus-70B is the first fully open model trained at the 70B-parameter scale on 15T tokens, requiring us to scale training out to 4096 GPUs at @cscsch.bsky.social.
03.09.2025 09:26
The next generation of open LLMs should be inclusive, compliant, and multilingual by design. That's why we (@icepfl.bsky.social, @ethz.ch, @cscsch.bsky.social) built Apertus.
03.09.2025 09:26
Apertus: a fully open, transparent, multilingual language model - EPFL AI Center
EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus today, Switzerland's first large-scale, open, multilingual language model, a milestone in generative AI for trans...
EPFL, @ethz.ch and @cscsch.bsky.social released Apertus today, Switzerland's first large-scale, open, multilingual language model, a milestone in generative AI for transparency and diversity.
Find out more here: ai.epfl.ch/apertus-a-fu...
@abosselut.bsky.social @icepfl.bsky.social
02.09.2025 09:46
EPFL, ETH Zurich & CSCS just released Apertus, Switzerland's first fully open-source large language model.
Trained on 15T tokens in 1,000+ languages, it's built for transparency, responsibility & the public good.
Read more: actu.epfl.ch/news/apertus...
02.09.2025 11:48
Very happy to see that Pleias's multilingual data-processing pipelines have contributed to the largest open pretraining project in Europe.
From their tech report: huggingface.co/swiss-ai/Ape...
02.09.2025 16:46
Apertus: a new language model for Switzerland
Switzerland is entering the race for large language models. Under the name #Apertus, @ethz.ch, @icepfl.bsky.social and @cscsch.bsky.social are releasing the country's first fully open, multilingual #LLM.
I briefly analyzed Apertus for MAZ:
www.maz.ch/news/apertus...
02.09.2025 08:33
Thank you for your incredible work!
02.09.2025 18:23
New preprint!
In multilingual models, the same meaning can take far more tokens in some languages, penalizing users of underrepresented languages with worse performance and higher API costs. Our parity-aware BPE algorithm is a step toward addressing this issue.
11.08.2025 12:28
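To see the disparity that parity-aware BPE targets, consider a toy tokenizer whose vocabulary covers English words but forces character-level fallback elsewhere; the "language", sentences, and vocabulary below are all made up, and real BPE merges subwords rather than whole words:

```python
def count_tokens(text, vocab):
    """Toy tokenizer: one token per known word, character fallback otherwise."""
    tokens = 0
    for word in text.split():
        tokens += 1 if word in vocab else len(word)
    return tokens

vocab = {"the", "cat", "sleeps"}  # vocabulary skewed toward English
parallel = {
    "en": "the cat sleeps",
    "xx": "qathu mirlo",  # hypothetical underrepresented language
}

counts = {lang: count_tokens(s, vocab) for lang, s in parallel.items()}
premium = counts["xx"] / counts["en"]  # extra tokens paid for the same meaning
print(counts, f"premium: {premium:.1f}x")
```

A parity-aware tokenizer would choose merges so that this premium shrinks across languages instead of being dictated by training-data imbalance.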
Kaiserslautern, Germany
Life update: thrilled to announce that I'll be starting as faculty at the Max Planck Institute for Software Systems this fall!
I'll be recruiting PhD students in the upcoming cycle, as well as research interns throughout the year: lasharavichander.github.io/contact.html
22.07.2025 04:12
A language model built for the public good - EPFL AI Center
ETH Zurich and EPFL will release a large language model (LLM) developed on public infrastructure. Trained on the βAlpsβ supercomputer at the Swiss National Supercomputing Centre (CSCS), the new LLM ma...
EPFL and ETH Zürich are jointly building a Swiss-made LLM from scratch.
Fully open and multilingual, the model is trained on CSCS's supercomputer "Alps" and supports sovereign, transparent, and responsible AI in Switzerland and beyond.
Read more here: ai.epfl.ch/a-language-m...
#ResponsibleAI
09.07.2025 07:26
Check out Silin's paper, done in collaboration with Apple, on reinforcing abstract thinking in reasoning traces!
23.06.2025 18:55
Workshop on multilingual & culturally aware AI.
Co-located with @colmweb.org 2025 in Montreal, Canada
https://melt-workshop.github.io/