BlackboxNLP blackboxnlp - Bluesky Statics

Nicolò & Mingyang: Can we understand which circuits emerge in small models and reasoning-tuned systems, and how do they compare with default systems? Are there methods that generalize better across all tasks?

09.11.2025 07:23 — 👍 0 🔁 0 💬 0 📌 0

Q: What's next for interpretability benchmarks? Michal: People sitting together and planning how to extend tests to multimodal, diverse contexts. @michaelwhanna.bsky.social: For circuit finding, integrating sparse feature circuits could help us better understand our models.

09.11.2025 07:21 — 👍 0 🔁 0 💬 1 📌 0

Nicolò & Mingyang: Starting to explore notebooks and public libraries can be very helpful in gaining early intuitions about what's promising.

09.11.2025 07:16 — 👍 0 🔁 0 💬 1 📌 0

@michaelwhanna.bsky.social: Don't try to read everything. Find Qs you really care about, and go a level deeper to answer meaningful questions.

09.11.2025 07:15 — 👍 0 🔁 0 💬 1 📌 0

Q: How would one go about approaching interpretability research these days? Michal: "When things don't work out of the box, it's a sign to double down and find out why. Negative results are important!"

09.11.2025 07:15 — 👍 1 🔁 0 💬 1 📌 0

@danaarad.bsky.social: As deep learning research converges on similar architectures for different modalities, it will be interesting to determine which interpretability methods will remain useful across various models and tasks.

09.11.2025 07:15 — 👍 1 🔁 0 💬 1 📌 0

@michaelwhanna.bsky.social, Nicolò & Mingyang: Counterfactuals in minimal settings can be helpful, but they do not capture the whole story. Extending current methods to long contexts and finding practical applications in safety-related areas are exciting challenges ahead.

09.11.2025 07:07 — 👍 1 🔁 0 💬 1 📌 0

Michal: Mechanistic interpretability has heavily focused on toy tasks and text-only models. The next step is scaling to more complex tasks that involve real-world reasoning.

09.11.2025 07:07 — 👍 1 🔁 0 💬 1 📌 0

Our panel moderated by @danaarad.bsky.social
"Evaluating Interpretability Methods: Challenges and Future Directions" just started! 🎉 Come to learn more about the MIB benchmark and hear the takes of @michaelwhanna.bsky.social, Michal Golovanevsky, Nicolò Brunello and Mingyang Wang!

09.11.2025 06:54 — 👍 9 🔁 1 💬 1 📌 1

Next up: Kentaro Ozeki presenting "Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives" aclanthology.org/2025.blackbo...

09.11.2025 06:32 — 👍 1 🔁 0 💬 0 📌 0

After a productive poster session, BlackboxNLP returns with the second keynote "Memorization: Myth or Mystery?" by @vernadankers.bsky.social!

09.11.2025 05:48 — 👍 7 🔁 0 💬 0 📌 0

Nadav Shani is giving the first oral presentation of the day: Language Dominance in Multilingual Large Language Models. Find the paper here: aclanthology.org/2025.blackbo...

09.11.2025 02:19 — 👍 3 🔁 0 💬 0 📌 0

Next up: Circuit-Tracer: A New Library for Finding Feature Circuits presented by @michaelwhanna.bsky.social! Paper: aclanthology.org/2025.blackbo...

09.11.2025 02:17 — 👍 3 🔁 0 💬 0 📌 0

I'll be presenting this work at @blackboxnlp.bsky.social in Suzhou, happy to chat there or here if you are interested!

22.10.2025 08:16 — 👍 1 🔁 1 💬 1 📌 0

Nov 9, @blackboxnlp.bsky.social, 11:00-12:00 @ Hall C – Interpreting Language Models Through Concept Descriptions: A Survey (Feldhus & Kopf) @lkopf.bsky.social

🗞️ aclanthology.org/2025.blackbo...

bsky.app/profile/nfel...

06.11.2025 07:00 — 👍 4 🔁 2 💬 1 📌 1

Quanshi Zhang is giving the first keynote of the day: Can Neural Network Interpretability Be the Key to Breaking Through Scaling Law Limitations in Deep Learning?

09.11.2025 01:38 — 👍 0 🔁 0 💬 0 📌 0

BlackboxNLP is up and running! Here are the topics covered by this year's edition at a glance. Excited to see so many interesting topics, and the growing interest in reasoning!

09.11.2025 01:38 — 👍 2 🔁 0 💬 0 📌 1

📢 Call for Papers! 📢
#BlackboxNLP 2025 invites the submission of archival and non-archival papers on interpreting and explaining NLP models.

📅 Deadlines: Aug 15 (direct submissions), Sept 5 (ARR commitment)
🔗 More details: blackboxnlp.github.io/2025/call/

12.08.2025 19:10 — 👍 9 🔁 1 💬 0 📌 3

Writing your technical report for the MIB shared task?
Take a look at the task page for guidelines and tips!

06.08.2025 09:51 — 👍 2 🔁 0 💬 0 📌 0

The report deadline was also extended to August 10th!
Note that this is a final extension. We look forward to reading your reports! ✍️

06.08.2025 09:49 — 👍 2 🔁 1 💬 0 📌 0

Just 5 days left to submit your method to the MIB Shared Task at #BlackboxNLP!

Have last-minute questions or need help finalizing your submission?
Join the Discord server: discord.gg/n5uwjQcxPR

03.08.2025 06:40 — 👍 1 🔁 1 💬 0 📌 0

BlackboxNLP 2025 The Eighth Workshop on Analyzing and Interpreting Neural Networks for NLP

Results + technical report deadline: August 8, 2025
Full task details: blackboxnlp.github.io/2025/task/

30.07.2025 05:57 — 👍 0 🔁 0 💬 0 📌 0

With the new extended deadline, there's still plenty of time to submit your method to the MIB Shared Task!

We welcome submissions of existing methods, experimental POCs, or any approach addressing circuit discovery or causal variable localization 💡

30.07.2025 05:57 — 👍 2 🔁 1 💬 1 📌 0

Results deadline extended by one week!
Following requests from participants, we’re extending the MIB Shared Task submission deadline by one week.

🗓️ New deadline: August 8, 2025
Submit your method via the MIB leaderboard!

29.07.2025 09:35 — 👍 3 🔁 1 💬 0 📌 2

📝 Technical report guidelines are out!

If you're submitting to the MIB Shared Task at #BlackboxNLP, feel free to take a look to help you prepare your report: blackboxnlp.github.io/2025/task/

28.07.2025 12:34 — 👍 3 🔁 1 💬 0 📌 1

Just 10 days to go until the results submission deadline for the MIB Shared Task at #BlackboxNLP!

If you're working on:
🧠 Circuit discovery
🔍 Feature attribution
🧪 Causal variable localization
now’s the time to polish and submit!

Join us on Discord: discord.gg/n5uwjQcxPR

23.07.2025 07:42 — 👍 3 🔁 1 💬 0 📌 1

Are you attending ICML? 👀

I'm sadly not, but if you are, you should check out the MIB 🕶️ poster at 11AM: icml.cc/virtual/2025...

The benchmark is used as the shared task at this year's @blackboxnlp.bsky.social (blackboxnlp.github.io/2025/task/) - there's still time to participate 🏆

17.07.2025 15:56 — 👍 4 🔁 1 💬 0 📌 0

⏳ Three weeks left! Submit your work to the MIB Shared Task at #BlackboxNLP, co-located with @emnlpmeeting.bsky.social

Whether you're working on circuit discovery or causal variable localization, this is your chance to benchmark your method in a rigorous setup!

13.07.2025 05:56 — 👍 4 🔁 2 💬 0 📌 2

Have you started working on your submission for the MIB shared task yet? Tell us what you’re exploring!

New featurization methods?
Circuit pruning?
Better feature attribution?

We'd love to hear about it 👇

09.07.2025 07:15 — 👍 2 🔁 1 💬 0 📌 1

BlackboxNLP 2025 The Eighth Workshop on Analyzing and Interpreting Neural Networks for NLP

🗓️ Deadline: August 1
📜 Full task details: blackboxnlp.github.io/2025/task/
💬 Join the discussion: discord.gg/n5uwjQcxPR

08.07.2025 09:35 — 👍 2 🔁 0 💬 0 📌 0

Posts by BlackboxNLP (@blackboxnlp.bsky.social)