Sebastian Bordt

@sbordt.bsky.social

Language models and interpretable machine learning. Postdoc @ Uni Tübingen. https://sbordt.github.io/

480 Followers · 251 Following · 68 Posts · Joined Oct 2023
3 months ago

Our spotlight paper is happening today at the #NeurIPS poster session! Drop by if you want to chat about the nitty-gritty details of large-scale transformer training!

3 months ago
On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling Scaling limits, such as infinite-width limits, serve as promising theoretical tools to study large-scale models. However, it is widely believed that existing infinite-width theory does not faithfully ...

📄 Paper: arxiv.org/abs/2505.22491

Catch our Spotlight at #NeurIPS2025 Today!

📅 Wed Dec 3 🕟 4:30 - 7:30 PM 📍 Exhibit Hall C,D,E — Poster #3903
Huge thanks to my amazing collaborators: @mohaas.bsky.social @sbordt.bsky.social @ulrikeluxburg.bsky.social
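If you want a quick feel for the question behind the title, here is a toy probe (my own sketch, not the paper's experiments) of how the largest stable learning rate behaves as the width grows under standard parameterization:

```python
# Toy probe: largest non-diverging learning rate of an MLP as width grows.
# nn.Linear uses standard (fan-in) initialization by default.
import torch
import torch.nn as nn

def max_stable_lr(width, steps=200):
    best = None
    for lr in [2.0 ** -k for k in range(14, -1, -1)]:  # grid 2^-14, ..., 1.0
        torch.manual_seed(0)
        model = nn.Sequential(nn.Linear(32, width), nn.ReLU(), nn.Linear(width, 1))
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        X, y = torch.randn(256, 32), torch.randn(256, 1)
        diverged = False
        for _ in range(steps):
            loss = ((model(X) - y) ** 2).mean()
            if not torch.isfinite(loss):
                diverged = True
                break
            opt.zero_grad()
            loss.backward()
            opt.step()
        if not diverged:
            best = lr  # largest lr on the grid that trained without diverging
    return best

for width in [64, 256, 1024]:
    print(width, max_stable_lr(width))
```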

3 months ago
Why Can We Train Large Models with Large Learning Rates? Infinite-width theory may explain the training dynamics of finite-width neural networks after all.

Ever wondered about the rationale behind transformer training details like qk-norm, the learning rate, and z-loss? Read this blog post to find out more!
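For the curious, here is what two of these details look like in code. A minimal PyTorch sketch (my own illustration, not the code from the paper or the blog post; production implementations typically use RMSNorm with learned scales for qk-norm):

```python
# Illustrative sketch of qk-norm and z-loss, two of the training details
# discussed in the blog post.
import torch
import torch.nn.functional as F

def qk_norm_attention(q, k, v, eps=1e-6):
    # qk-norm: normalize queries and keys before the dot product, which
    # bounds the attention logits (plain L2 normalization here; real
    # implementations usually apply RMSNorm/LayerNorm per head).
    q = q / (q.norm(dim=-1, keepdim=True) + eps)
    k = k / (k.norm(dim=-1, keepdim=True) + eps)
    scores = q @ k.transpose(-2, -1)  # logits now lie in [-1, 1]
    return F.softmax(scores, dim=-1) @ v

def z_loss(logits, coeff=1e-4):
    # z-loss: penalize the squared log-partition function so that the
    # output logits do not drift to large magnitudes during training.
    log_z = torch.logsumexp(logits, dim=-1)
    return coeff * (log_z ** 2).mean()

# Usage: add the auxiliary z-loss to the usual cross-entropy loss.
logits = torch.randn(8, 32000, requires_grad=True)  # (batch, vocab)
targets = torch.randint(0, 32000, (8,))
loss = F.cross_entropy(logits, targets) + z_loss(logits)
loss.backward()
```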

4 months ago

Here is a formal impossibility result for XAI: Informative Post-Hoc Explanations Only Exist for Simple Functions. I'll give an online presentation about this work next Tuesday in @timvanerven.nl's Theory of Interpretable AI Seminar:

arxiv.org/abs/2508.11441

tverven.github.io/tiai-seminar/

5 months ago
Theory of XAI Workshop, Dec 2, 2025 Explainable AI (XAI) is now deployed across a wide range of settings, including high-stakes domains in which misleading explanations can cause real harm. For example, explanations are required by law ...

🚨 Workshop on the Theory of Explainable Machine Learning

Call for extended abstract submissions (≤2 pages) is now open; deadline October 15!

📍 ELLIS UnConference in Copenhagen
📅 Dec. 2
🔗 More info: sites.google.com/view/theory-...

@gunnark.bsky.social @ulrikeluxburg.bsky.social @emmanuelesposito.bsky.social

5 months ago

I am hiring PhD students and/or postdocs to work on the theory of explainable machine learning. Please apply through ELLIS or IMPRS; deadlines are end of October/mid-November. In particular: Women, where are you? Our community needs you!!!

imprs.is.mpg.de/application
ellis.eu/news/ellis-p...

6 months ago

We need new rules for publishing AI-generated research. The teams developing automated AI scientists have customarily submitted their papers to standard refereed venues (journals and conferences) and to arXiv. Often, acceptance has been treated as the dependent variable. 1/

6 months ago
Center for the Alignment of AI Alignment Centers: We align the aligners

This new center strikes the right tone in approaching the AI alignment problem. alignmentalignment.ai

6 months ago
Pascual Restrepo - Research: Official website. Economist, MIT.

I don't know if it's a good place to start, but you might want to take a look at the work of Daron Acemoglu and Pascual Restrepo: pascual.scripts.mit.edu/research/

6 months ago

A new recording of our FridayTalks@Tübingen series is online!

How much can we forget about Data Contamination?
by
@sbordt.bsky.social

Watch here: youtu.be/T9Y5-rngOLg

7 months ago
What Formal Languages Can Transformers Express? A Survey Abstract. As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as forma...

Thanks for your comments. I don't think that neural networks are just a form of memory (though they can store a large number of memories). For example, transformers with unbounded steps are Turing-complete: direct.mit.edu/tacl/article...

7 months ago

I see the point of the original post, but I think it's also important to keep in mind this other aspect.

7 months ago

www.inference.vc/we-may-be-su...

7 months ago

The stochastic parrot is now an IMO gold medalist parrot

8 months ago

Wednesday: Position: Rethinking Explainable Machine Learning as Applied Statistics icml.cc/virtual/2025...

8 months ago

I'm at #ICML in Vancouver this week, hit me up if you want to chat about pre-training experiments or explainable machine learning.

You can find me at these posters:

Tuesday: How Much Can We Forget about Data Contamination? icml.cc/virtual/2025...

8 months ago

Great to hear that you like it, and thank you for the feedback! I agree that stakeholders are important, although you are not going to find much about them in this paper. We might argue, though, that similar stakeholder considerations arise in data science with large datasets, hence the analogy :)

8 months ago

Our #ICML position paper: #XAI is similar to applied statistics: it uses summary statistics in an attempt to answer real-world questions. But authors need to state concretely (!) how their XAI statistics contribute to answering which concrete (!) question!
arxiv.org/abs/2402.02870

8 months ago

There are many more interesting aspects to this, so take a look at our paper!

arxiv.org/abs/2402.02870

We would also be happy to hear questions and comments on why we got it completely wrong.😊

If you are at ICML, I will present this paper on Wed 16 Jul 4:30 in the East Exhibition Hall A-B #E-501.📍

8 months ago

We think the literature on explainable machine learning can learn a lot from looking at these papers!📚

8 months ago

As I learned from our helpful ICML reviewers, there is a lot of existing research at the intersection of machine learning and statistics that takes the matter of interpretation quite seriously.

8 months ago

In this framework, another way to formulate the initial problems is: For many popular explanation algorithms, it is not clear whether they have an interpretation.

8 months ago

Having an interpretation means that the explanation formalizes an intuitive human concept, which is a fancy philosophical way of saying that it is clear what aspect of the function the explanation describes.🧠

8 months ago

In addition, the way to develop explanations that are useful "in the world" is to develop explanations that have an interpretation.

8 months ago

This has several important implications. Most importantly, explainable machine learning has often been trying to reinvent the wheel when we already have a robust framework for discussing complex objects in the light of pressing real-world questions.

8 months ago

It took us a while to recognize it, but once you see it, you can't unsee it: Explainable machine learning is applied statistics for learned functions.✨

8 months ago

Concretely, researchers in applied statistics study complex datasets by mapping their most important properties into low-dimensional structures. Now think:

Machine learning model ~ Large dataset
Explanation algorithm ~ Summary statistics, visualization
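To make the analogy concrete, here is a toy sketch (my own illustration, not code from the paper) that summarizes a learned function with a classical low-dimensional statistic, a partial dependence curve:

```python
# Summarize a trained model the way a statistician summarizes a dataset:
# with a low-dimensional summary statistic (here, partial dependence).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.normal(size=1000)
model = GradientBoostingRegressor().fit(X, y)  # the complex learned object

# Average model prediction as feature 0 varies, marginalizing over the rest.
grid = np.linspace(-2, 2, 21)
pd_curve = []
for value in grid:
    X_mod = X.copy()
    X_mod[:, 0] = value
    pd_curve.append(model.predict(X_mod).mean())

# These 21 numbers summarize the high-dimensional model in the same way a
# histogram or a mean summarizes a large dataset.
print(np.round(pd_curve, 2))
```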

8 months ago

Here comes our key realization: This question has occurred in other disciplines before, specifically in applied statistics research.

1 0 1 0
8 months ago

So, how can we seriously discuss whether an explanation algorithm can be used to answer relevant questions about our trained model or the world?🌍

8 months ago

I have actually encountered this point in my own research before, where we did a detailed mathematical analysis of SHAP, but all the math could not reveal the right way to use the explanations in practice (arxiv.org/abs/2209.040...).
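As an aside: computing SHAP values is the easy part. A minimal sketch with the shap library (my own toy illustration on synthetic data, not the setup from our paper):

```python
# SHAP values are easy to compute; the open question in the post is what
# concrete question these numbers answer.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.normal(size=500)
model = GradientBoostingRegressor().fit(X, y)

explainer = shap.TreeExplainer(model)    # exact Shapley values for tree models
shap_values = explainer.shap_values(X)   # array of shape (500, 5)

# Each row plus the expected value sums to the model's prediction, but this
# identity alone does not say how to act on the attributions in practice.
print(shap_values[0], explainer.expected_value)
```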

0 0 1 0