Martin Tutek

@mtutek.bsky.social

Postdoc @ TakeLab, UniZG | previously: Technion; TU Darmstadt | PhD @ TakeLab, UniZG
Faithful explainability, controllability & safety of LLMs.
🔎 On the academic job market 🔎
https://mttk.github.io/

312 Followers  |  378 Following  |  74 Posts  |  Joined: 24.11.2024

Latest posts by mtutek.bsky.social on Bluesky

Can you solve this algebra puzzle? 🧩

cb=c, ac=b, ab=?

A small transformer can learn to solve problems like this!

And since the letters don't have inherent meaning, this lets us study how context alone imparts meaning. Here's what we found: 🧵⬇️

22.01.2026 16:09 - 👍 47    🔁 10    💬 1    📌 2

โณ Deadline approaching! Weโ€™re hiring 2 fully funded postdocs in #NLP.

Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)

๐Ÿ”— Details + how to apply: milanlproc.github.io/open_positio...

โฐ Deadline: Jan 31, 2026

19.01.2026 17:24 - 👍 11    🔁 10    💬 0    📌 1

I'd like a link as well!

15.01.2026 09:20 - 👍 1    🔁 0    💬 0    📌 0

Nathan Stringham, Fateme Hashemi Chaleshtori, Xinyuan Yan, Zhichao Xu, Bei Wang, Ana Marasović
Teaching People LLM's Errors and Getting it Right
https://arxiv.org/abs/2512.21422

29.12.2025 07:45 - 👍 4    🔁 1    💬 0    📌 0

🚀 We’re opening 2 fully funded postdoc positions in #NLP!

Join the MilaNLP team and contribute to our upcoming research projects.

🔗 More details: milanlproc.github.io/open_positio...

⏰ Deadline: Jan 31, 2026

18.12.2025 15:29 - 👍 19    🔁 13    💬 0    📌 2
Llama enjoying a mug of hot cocoa in an office with Tuesday, March 31 circled on a calendar behind them

COLM 2026 is just around the corner! Mark your calendars for:

💡 Abstract deadline: Thursday, March 26, 2026
📄 Full paper submission deadline: Tuesday, March 31, 2026

Call for papers (website coming soon):
docs.google.com/document/d/1...

16.12.2025 15:31 - 👍 9    🔁 4    💬 1    📌 1
The Doge of Venice visits a Murano glassworks in the 17th century. I will talk about why glassmaking in this era has some similarities to AI research today.

At the #Neurips2025 mechanistic interpretability workshop I gave a brief talk about Venetian glassmaking, since I think we face a similar moment in AI research today.

Here is a blog post summarizing the talk:

davidbau.com/archives/202...

11.12.2025 15:02 - 👍 14    🔁 3    💬 2    📌 2

I’m recruiting a postdoc to work on algorithms for cancer genome reconstruction. We have access to a rich set of tumour samples sequenced across multiple technologies. If interested, feel free to DM. Please share.

11.12.2025 03:04 - 👍 12    🔁 12    💬 0    📌 1

🧑‍🔬 I’m recruiting PhD students in Natural Language Processing @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!

Topics include, but aren’t limited to:

🔎 Linguistic Interpretability
🌍 Multilingual Evaluation
📖 Computational Typology

Please share!

#NLProc #NLP

11.12.2025 13:36 - 👍 41    🔁 25    💬 1    📌 3

I will be at @euripsconf.bsky.social this week to present our paper as non-archival at the PAIG workshop (Beyond Regulation: Private Governance & Oversight Mechanisms for AI). Very much looking forward to the discussions!

If you are at #EurIPS and want to chat about LLMs' training data, reach out!

02.12.2025 21:47 - 👍 9    🔁 4    💬 0    📌 0

📢 Postdoc position 📢

I’m recruiting a postdoc for my lab at NYU! Topics include LM reasoning, creativity, limitations of scaling, AI for science, & more! Apply by Feb 1.

(Different from NYU Faculty Fellows, which are also great but less connected to my lab.)

Link in 🧵

02.12.2025 16:04 - 👍 21    🔁 12    💬 2    📌 1

wait this is not the routine?

27.11.2025 17:42 - 👍 0    🔁 0    💬 1    📌 0
Comic. Panels up to the 10-year point are grayed out. New panels since the Ten Years comic, which chronicles the first ten years of PERSON 1's journey with cancer: (1) [two people in bed] PERSON 1 (woman): One more chapter? PERSON 2 (man): Don’t we both have to get up early? PERSON 1: Nnnnnggggh PERSON 2: Sure, good point. (2) [many people wearing masks, walking while looking at graphs on their phones] (3) [birds landing on people] PERSON 2 in beanie and scarf: Hah! They like *my* seeds best. PERSON 1 in scarf holding phone with a bird sitting on it: Wait, how do I take a picture of this one? (4) [two people rowing boats with tree landscape] (5) [Person 1 carries overflowing stack of things to Person 2 in bed] PERSON 1: I brought you honey lemon tea, more pillows, a cinnamon roll, Tylenol, another blanket, a– PERSON 2: It was just Appendicitis, I’m really– PERSON 1: *It is my turn to take care of you and I am going to do it right!* (6) [Two people in car] (7) [still in car] PERSON 1: Oh my god. PERSON 2: Oh my god. (8) [car driving] PERSON 1: Pull over! PERSON 2: I am! (9) [both people get out of car] (10) [Large colored panel of aurora borealis over water with both people looking on] (11) [Person 1 sits against tree while Person 2 lies on the ground] PERSON 1: Fifteen years. No sign of the cancer. (12) I *am* having some weird symptoms. Joint pain. Fatigue. I think I’m losing my close-up vision. PERSON 2: Yeah. Me too. (13) PERSON 2: I think we’re getting old. (14) PERSON 1: I guess that’s okay. PERSON 2: It’s all I wanted.

Fifteen Years

xkcd.com/3172/

26.11.2025 22:32 - 👍 11743    🔁 2451    💬 289    📌 242

There's a reviewer at ICLR who apparently always writes *exactly* 40 weaknesses and comments no matter what paper he's reviewing.

Exhibit A: openreview.net/forum?id=8qk...
Exhibit B: openreview.net/forum?id=GlX...
Exhibit C: openreview.net/forum?id=kDh...

15.11.2025 14:42 - 👍 9    🔁 2    💬 1    📌 2

within the next 3-4 days, so sadly that doesn't work

11.11.2025 11:03 - 👍 0    🔁 0    💬 0    📌 0

*Urgently* looking for emergency reviewers for the ARR October Interpretability track 🙏🙏

ReSkies much appreciated

11.11.2025 10:29 - 👍 1    🔁 9    💬 1    📌 0

Full house at BlackboxNLP at #EMNLP2025!! Getting ready for my 1.45PM keynote 😎 Join us in A102 to learn about "Memorization: myth or mystery?"

09.11.2025 03:04 - 👍 12    🔁 1    💬 0    📌 0

We're hiring new faculty members!

KSoC: utah.peopleadmin.com/postings/190... (AI broadly)

Education + AI:
- utah.peopleadmin.com/postings/189...
- utah.peopleadmin.com/postings/190...

Computer Vision:
- utah.peopleadmin.com/postings/183...

07.11.2025 23:35 - 👍 16    🔁 10    💬 1    📌 0
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, Yonatan Belinkov. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.

Outstanding paper (5/7):

"Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps"
by Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, and Yonatan Belinkov
aclanthology.org/2025.emnlp-m...

6/n

07.11.2025 22:32 - 👍 11    🔁 3    💬 1    📌 0

Thank you, Josipa 🎉🥳

07.11.2025 10:58 - 👍 1    🔁 0    💬 0    📌 0

Thank you Gabriele :)

07.11.2025 10:52 - 👍 1    🔁 0    💬 0    📌 0

Very honored to be one of the seven outstanding papers at this year's EMNLP :)

Huge thanks to my amazing collaborators @fatemehc.bsky.social @anamarasovic.bsky.social @boknilev.bsky.social, this would not have been possible without them!

07.11.2025 08:58 - 👍 22    🔁 6    💬 2    📌 2
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.

Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗

Paper: aclanthology.org/2025.emnlp-m...
Slides/video/poster: underline.io/events/502/s...

06.11.2025 01:19 - 👍 5    🔁 2    💬 0    📌 0

Here’s a custom feed for #EMNLP2025. Click the pin to save it to your home screen!

02.11.2025 15:15 - 👍 11    🔁 4    💬 0    📌 0
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. Despite much work o...

Flying out to @emnlpmeeting soon 🇨🇳
I'll present our parametric CoT faithfulness work (arxiv.org/abs/2502.14829) on Wednesday at the second Interpretability session, 16:30-18:00 local time, in A104-105

If you're in Suzhou, reach out to talk all things reasoning :)

31.10.2025 13:30 - 👍 11    🔁 2    💬 0    📌 1

literally any book by sally rooney

(jk I know you don't like her)

24.10.2025 14:20 - 👍 1    🔁 0    💬 0    📌 0

โฐ One week left to apply for the two PhD Fellowships in Trustworthy NLP and Explainable NLU! The two positions have a starting date in spring 2026. Check the original post for more details๐Ÿ‘‡

24.10.2025 08:30 - 👍 4    🔁 1    💬 0    📌 0

The only benefit of them being humanoid is training data I guess?

Companies have a bunch of videos of e.g. factory workers doing repetitive tasks, so you have more signal on intermediate steps of some actions to train the robot's behavior

23.10.2025 14:17 - 👍 0    🔁 0    💬 0    📌 0

📣 Tomorrow at #COLM2025:

1️⃣ Purbid's poster at SoLaR (๐Ÿ๐Ÿ:๐Ÿ๐Ÿ“๐š๐ฆ-๐Ÿ:๐ŸŽ๐ŸŽ๐ฉ๐ฆ) on catching redundant preference pairs & how pruning them hurts accuracy; www.anamarasovic.com/publications...

2️⃣ My talk at XLLM-Reason-Plan (๐Ÿ๐Ÿ๐ฉ๐ฆ) on measuring CoT faithfulness by looking at internals, not just behaviorally

1/3

09.10.2025 16:54 - 👍 14    🔁 3    💬 1    📌 1

If you're at COLM, check out various works by Ana and her group!

09.10.2025 16:58 - 👍 3    🔁 0    💬 0    📌 0
