Open position at IBM Research Zurich!
Passionate about AI for maths & curious about Quantum Computing?
Join our team & help to shape the future of computing!
We are offering internships & master's theses. If you are looking for a PhD, please apply via the same ad!
www.zurich.ibm.com/careers/2025...
19.09.2025 19:58
Paperscraper
Documentation for the paperscraper Python package
After several years of use by the open-source community, our paperscraper package finally has its own docs: jannisborn.github.io/paperscraper/
Use #paperscraper for publication keyword search, downloading PDFs, extracting citation statistics, and much more! (quick-start sketch below)
13.08.2025 19:58
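For context, a minimal keyword-search sketch along the lines of the package README. The AND-of-ORs query semantics (outer list = AND, inner lists = synonyms joined by OR) is how paperscraper documents it, but the topic lists and output filename here are illustrative; check the docs for exact signatures.

```python
from paperscraper.arxiv import get_and_dump_arxiv_papers

# Each inner list holds synonyms (OR); the outer list combines terms (AND).
language_models = ['language model', 'LLM']
numeracy = ['numerical reasoning', 'arithmetic']
query = [language_models, numeracy]

# Matches (language model OR LLM) AND (numerical reasoning OR arithmetic);
# results are dumped as one JSON record per line.
get_and_dump_arxiv_papers(query, output_filepath='llm_numeracy.jsonl')
```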
Check out our workflow for AI-driven molecular design. We've already validated it experimentally (papers coming soon)!
30.07.2025 22:00
GitHub - tum-ai/number-token-loss: A regression-alike loss to improve numerical reasoning in language models
Jonas Zausinger*, Lars Pennig*, Anamarija Kozina, Sean Sdahl, Julian Sikora, Adrian Dendorfer, Timofey Kuznetsov, Mohamad Hagog, Nina Wiedemann, Kacper Chlodny, Vincent Limbach, Anna Ketteler, Thorben Prein, Vishwa Mohan Singh & Michael Danziger.
GitHub code: ibm.biz/ntl-code
03.07.2025 21:20
Regress, Don't Guess: Number Token Loss
A regression-like loss on number tokens for language models.
5. Text-task friendly: Doesn't interfere with CE on purely textual tasks
6. Scalable: Tested up to 3B parameters, e.g., with #IBMGranite 3.2
7. Plug-and-play: It's "just a loss," so it's super easy to adopt (sketch below)
ICML paper: ibm.biz/ntl-paper
03.07.2025 21:20
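To make "just a loss" concrete, here is a minimal PyTorch sketch of the NTL-MSE idea as the paper describes it: form the expected digit value under the model's distribution and penalize its squared distance to the true digit. The function name, the `ntl_weight` default, and the digit-value table are illustrative assumptions, not the reference implementation (see ibm.biz/ntl-code for that).

```python
import torch
import torch.nn.functional as F

def ntl_mse(logits, targets, digit_token_ids, digit_values, ntl_weight=0.3):
    """Hedged sketch of NTL-MSE: CE plus a squared error between the expected
    digit value and the true digit value, applied only at positions whose
    target is a number token. Assumes no ignore-index padding in `targets`."""
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))

    probs = logits.softmax(dim=-1)              # (batch, seq, vocab)
    digit_probs = probs[..., digit_token_ids]   # probability mass on digits
    expected_value = digit_probs @ digit_values # E[digit] at every position

    # Map each target token to its numeric value; non-digit positions -> NaN.
    value_table = torch.full((logits.size(-1),), float('nan'))
    value_table[digit_token_ids] = digit_values
    target_values = value_table[targets]        # (batch, seq)
    is_digit = ~torch.isnan(target_values)

    if is_digit.any():
        ntl = ((expected_value[is_digit] - target_values[is_digit]) ** 2).mean()
    else:
        ntl = logits.new_zeros(())              # pure-text batch: CE only
    return ce + ntl_weight * ntl
```

Note how a purely textual batch leaves only the CE term, which matches point 5 in the list above.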
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
While language models have exceptional capabilities at text generation, they lack a natural inductive bias for emitting numbers and thus struggle in tasks involving quantitative reasoning, especially ...
1. Better math performance: NTL consistently boosts accuracy on math benchmarks (e.g., GSM8K)
2. Lightning-fast: 100× faster to compute than CE, so there's no training overhead
3. Model-agnostic: Works with Transformers, Mamba, etc.
(continued in thread)
Hugging Face Spaces demo: ibm.biz/ntl-demo
03.07.2025 21:20
In our upcoming #ICML2025 paper, we introduce the #NumberTokenLoss (NTL) to address this; see the demo above! NTL is a regression-style loss computed at the token level, with no extra regression head needed. We propose adding NTL on top of CE during LLM pretraining. Our experiments show (see thread):
03.07.2025 21:20
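In symbols, the proposed objective is plain cross-entropy plus a weighted NTL term. This is a hedged sketch of the squared-error variant; λ and the symbol names are my notation, and the paper also studies other NTL variants:

```latex
\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda\,\mathcal{L}_{\mathrm{NTL}},
\qquad
\mathcal{L}_{\mathrm{NTL}} =
  \Bigl(\sum_{t \in \mathcal{V}_{\mathrm{num}}} p_\theta(t)\,\mathrm{val}(t) - y\Bigr)^{2}
```

Here V_num is the set of number tokens, p_θ(t) the model's probability for token t, val(t) its numeric value, and y the ground-truth number; positions whose target is not a number token contribute only the CE term.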
#ICML Why are LLMs so powerful but still suck at math? A key problem is the cross-entropy (CE) loss: it is nominal-scale, so tokens are unordered. That makes sense for words, but not for numbers. For a "5" label, predicting "6" or "9" gives the same loss. Yes, it's crazy! No, nobody has fixed this yet!
03.07.2025 21:20
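A toy numerical illustration of that order-blindness (three-token vocabulary and made-up probabilities, purely for exposition): CE only sees the probability assigned to the true token, so it cannot tell a near miss from a wild miss, while a value-aware error can.

```python
import math

# Toy vocabulary of three number tokens: "5", "6", "9". The true label is "5".
# Both models give it probability 0.2, but put the rest of their mass on
# "6" (a near miss) vs. "9" (a wild miss).
p_near = {'5': 0.2, '6': 0.7, '9': 0.1}
p_far  = {'5': 0.2, '6': 0.1, '9': 0.7}

# CE = -log p(true token): identical for both models.
print(-math.log(p_near['5']))   # 1.609...
print(-math.log(p_far['5']))    # 1.609...

# A value-aware error (expected value vs. truth) does distinguish them.
val = {'5': 5, '6': 6, '9': 9}
for p in (p_near, p_far):
    exp_val = sum(p[t] * val[t] for t in p)
    print((exp_val - 5) ** 2)   # 1.21 for near, 8.41 for far
```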
Towards generalizable single-cell perturbation modeling via the Conditional Monge Gap
Learning the response of single-cells to various treatments offers great potential to enable targeted therapies. In this context, neural optimal transport (OT) has emerged as a principled methodologic...
Our new paper: conditional optimal transport generalizes well to unseen drugs. A big step forward, thanks to the conditional Monge gap! Even better: conditional models often beat local, non-conditional ones. arxiv.org/abs/2504.08328. Code is public! Thanks to all co-authors
@marianna-raps.bsky.social
21.04.2025 15:43
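For readers new to the idea, a schematic of the Monge gap regularizer the paper builds on (notation mine, after Uscidda & Cuturi's original formulation; see the paper for the exact conditional extension): for a candidate map T, a cost c, and a source measure μ, the gap measures how far T is from being an optimal (Monge) map,

```latex
\mathcal{M}^{c}_{\mu}(T)
  = \int c\bigl(x, T(x)\bigr)\,\mathrm{d}\mu(x)
  - W_{c}\bigl(\mu,\, T_{\sharp}\mu\bigr) \;\geq\; 0,
```

with equality exactly when T transports μ optimally onto its image. Roughly, the conditional variant additionally feeds a perturbation (e.g., drug) embedding into T, so a single network can generalize across treatments, including unseen ones.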
Great to hear! Let me know if there are any questions.
16.01.2025 20:58
Our next journal club meeting will discuss "A Computational Investigation of Inventive Spelling and the 'Lesen durch Schreiben' Method" by @jannisblrn.bsky.social et al. on 23 Jan 2025, 11:00-12:00 (GMT+1). Join us by emailing gewonn.contact.us@gmail.com, and stay tuned for more news!
16.01.2025 16:20
If you're at @neuripsconf.bsky.social and into #OptimalTransport & bio, don't miss Alice Driessen's spotlight talk on the #ConditionalMongeGap for modeling CAR response. Today at the #AIDrugX workshop!
Positive results on OOD perturbations -> accurate gene expression prediction. Paper: ibm.biz/carot-pre
15.12.2024 21:29
Full poster
14.12.2024 22:48
Number token loss
A new loss improves math capabilities in language models! The loss is model-agnostic and only requires knowing which tokens represent numbers.
No computational overhead, but better performance (see the tokenizer sketch below).
Poster today at the @NeurIPS MathAI Workshop! Thanks to collaborators from TUM AI!
Paper: arxiv.org/abs/2411.02083
14.12.2024 22:31
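Since the only requirement is knowing which tokens are numbers, here is a hedged sketch of how one might extract that mapping from a Hugging Face tokenizer. `AutoTokenizer` and `get_vocab()` are standard transformers API; restricting to single-digit tokens and the 't5-small' checkpoint are my simplifying assumptions.

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('t5-small')  # any checkpoint works

digit_token_ids, digit_values = [], []
for token, token_id in tokenizer.get_vocab().items():
    text = token.lstrip('▁Ġ ')             # strip common subword prefixes
    if len(text) == 1 and text.isdigit():  # single digits only, by assumption
        digit_token_ids.append(token_id)
        digit_values.append(float(text))

digit_token_ids = torch.tensor(digit_token_ids)
digit_values = torch.tensor(digit_values)
# These two tensors are all that the ntl_mse() sketch above needs
# besides the model's logits and targets.
```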
Can we iteratively design small molecules with desired target properties, simply by sending messages on Slack? YES!
Super excited to give a live demo of dZiner during the spotlight talk at #AI4Mat #NeurIPS2024!
Preprint: lnkd.in/e-24AEHC
Code: lnkd.in/egF4hGCg
06.12.2024 22:33