What do certified guarantees look like in the age of large language models and long reasoning chains? Look for us at EMNLP to find out!
04.11.2025 23:05
Sum-of-Parts Models: Faithful Attributions for Groups of Features
Overcoming fundamental barriers in feature attribution methods with grouped attributions
If you're at ICML, in about 15 minutes, Weiqiu & I will be at our poster on sum-of-parts models for faithful attributions and cosmology discovery. Stop by to say hi!
East Exhibition Hall A-B #E-1208
Thu 17 Jul 11 a.m. - 1:30 p.m. PDT
debugml.github.io/sum-of-parts/
#ICML @youweiqiu.bsky.social
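Can't make it to the poster? A minimal sketch of the core idea (illustrative names, not the released code): the model's output is computed as an explicit sum of per-group scores, so the grouped attributions are faithful by construction rather than post hoc.

```python
import torch
import torch.nn as nn

class SumOfParts(nn.Module):
    """Sketch of a sum-of-parts style predictor (illustrative, not the
    released implementation). `backbone` scores one masked input;
    `group_masks` is a (num_groups, num_features) 0/1 tensor."""

    def __init__(self, backbone: nn.Module, group_masks: torch.Tensor):
        super().__init__()
        self.backbone = backbone
        self.register_buffer("group_masks", group_masks)

    def forward(self, x: torch.Tensor):
        # Score each feature group in isolation by masking out the rest.
        contribs = torch.stack([self.backbone(x * m) for m in self.group_masks])
        # The prediction *is* the sum of the group scores, so `contribs`
        # doubles as a faithful grouped attribution of the output.
        return contribs.sum(dim=0), contribs
```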
17.07.2025 17:45
LLM ignoring instructions? Make it listen with InstABoost.
✅ Simple: Steer your model in 5 lines of code (see the sketch below)
✅ Effective: Outperforms latent steering & prompt-only methods
✅ Grounded: Based on our mechanistic theory of rule-following (LogicBreaks)
Blog: debugml.github.io/instaboost
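A minimal sketch of the idea behind that 5-line claim (the function name, hook point, and boost factor are illustrative, not the released API): take a layer's post-softmax attention weights, upweight the mass every query places on the instruction tokens, and renormalize.

```python
import torch

def instaboost_attention(attn: torch.Tensor, instr_idx: torch.Tensor,
                         boost: float = 5.0) -> torch.Tensor:
    """Upweight attention to instruction tokens, then renormalize.
    Illustrative sketch meant to run inside an attention hook; `attn`
    holds post-softmax weights of shape (..., queries, keys)."""
    attn = attn.clone()                            # keep the original intact
    attn[..., instr_idx] *= boost                  # boost instruction columns
    return attn / attn.sum(dim=-1, keepdim=True)   # rows sum to 1 again
```

Applied at each layer, this biases generation toward the instruction without editing the prompt or the weights.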
10.07.2025 18:46
🧠 Foundation models are reshaping reasoning. Do we still need specialized neuro-symbolic (NeSy) training, or can clever prompting now suffice?
Our new position paper argues the road to generalizable NeSy should be paved with foundation models.
arxiv.org/abs/2505.24874
(🧵1/9)
13.06.2025 20:30
Brachio Lab: FIX
We've been doing a bunch of interpretability work with scientists (e.g., our recent FIX benchmark, brachiolab.github.io/fix/)!
21.11.2024 18:08
Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
Research in NLP (mostly LM interpretability & explainability).
Assistant prof at UMD CS + CLIP.
Previously @ai2.bsky.social @uwnlp.bsky.social
Views my own.
sarahwie.github.io
#NLP Postdoc at Mila - Quebec AI Institute & McGill University
mariusmosbach.com
19th International Conference on Neurosymbolic Learning and Reasoning
UC Santa Cruz, Santa Cruz, California
8 to 10 September 2025
https://nesy-ai.org/
https://2025.nesyconf.org
San Diego, Dec 2-7, 2025, and Mexico City, Nov 30-Dec 5, 2025. Comments to this account are not monitored. Please send feedback to townhall@neurips.cc.
UPS Foundation Professor @upenn.bsky.social, Associate Dean of Research @PennEngineers, former department chair @ESEatPenn, former director @GRASPlab
Security and Privacy of Machine Learning at UofT, Vector Institute, and Google 🇨🇦🇫🇷🇪🇺 Co-Director of Canadian AI Safety Institute (CAISI) Research Program at CIFAR. Opinions mine
Assistant Prof. @ JHU 🇦🇷🇺🇸 Mathematics of Data & Biomedical Data Science
jsulam.github.io
Assistant Professor at Stanford
Machine learning, algorithm design, econ-CS
https://vitercik.github.io/
https://www.vita-group.space/ 👨‍🏫 UT Austin ML Professor (on leave)
https://www.xtxmarkets.com/ XTX Markets Research Director (NYC AI Lab)
Superpower is trying everything 💪
Newest focus: training next-generation superintelligence - Preview above
Faculty at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems. Leading the AI Safety and Alignment group. PhD from EPFL supported by Google & OpenPhil PhD fellowships.
More details: https://www.andriushchenko.me/
cs phd @upenn advised by Michael Kearns, Aaron Roth, and Duncan Watts | previously @stanford | she/her
https://psamathe50.github.io/sikatasengupta/
CS PhD student at UPenn studying strategic human-AI interaction. On the job market! Nataliecollina.com
Post-training Alignment at IBM Research AI | Prev: Penn CS + Wharton