Ryota Takatsuki @rtakatsky - Bluesky Profile

Ryota Takatsuki

@rtakatsky.bsky.social

PhD student at Sussex Centre for Consciousness Science. Research fellow at AI Alignment Network. Dreaming of reverse-engineering consciousness someday.

25 Followers | 36 Following | 8 Posts | Joined: 01.12.2024 | 1.7286

Latest posts by rtakatsky.bsky.social on Bluesky

I’m really excited about Diffusion Steering Lens, an intuitive and elegant new “logit lens” technique for decoding the attention and MLP blocks of vision transformers!

Vision is much more expressive than language, so some new mech interp rules apply:

25.04.2025 13:36 — 👍 11 🔁 3 💬 0 📌 0

Decoding Vision Transformers: the Diffusion Steering Lens
Logit Lens is a widely adopted method for mechanistic interpretability of transformer-based language models, enabling the analysis of how internal representations evolve across layers by projecting th...

This work was done as my internship project at Araya. Huge thanks to my supervisors, Ippei Fujisawa & Ryota Kanai, and my external mentor @soniajoseph.bsky.social for making this happen! 🙏

Link to the paper: arxiv.org/abs/2504.13763
(7/7)

25.04.2025 09:37 — 👍 2 🔁 0 💬 0 📌 0

We also validated DSL’s reliability through two interventional studies (head importance correlation & overlay removal). Check out our paper for details!
(6/7)

25.04.2025 09:37 — 👍 0 🔁 0 💬 1 📌 0

Below are the top-10 head DSL visualizations by similarity to the input, consistent with residual-stream visualizations from Diffusion Lens.
(5/7)

25.04.2025 09:37 — 👍 0 🔁 0 💬 1 📌 0
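
To make the "top-10 by similarity to the input" ranking concrete, here is a minimal sketch. The post does not say which similarity measure is used, so cosine similarity over embedding vectors is an assumption here, and the names `cosine`, `top_k_by_similarity`, `input_embedding`, and `head_embeddings` are hypothetical, with toy values rather than real model outputs.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_by_similarity(reference, candidates, k=10):
    """Return the k candidate names whose vectors are most similar to `reference`."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(reference, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Toy data (assumption): an embedding of the input image and one embedding
# per head visualization, keyed by a made-up "layer/head" label.
input_embedding = [1.0, 0.0]
head_embeddings = {
    "L0H0": [1.0, 0.0],
    "L0H1": [0.0, 1.0],
    "L1H0": [1.0, 1.0],
}
top_heads = top_k_by_similarity(input_embedding, head_embeddings, k=2)
print(top_heads)
```

With real data, the candidate vectors would come from whatever image encoder is used to compare each head's visualization against the input.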

To fix this, we propose Diffusion Steering Lens (DSL), a training-free method that steers a specific submodule’s output, patches its subsequent indirect contributions, and then decodes it with the diffusion model.
(4/7)

25.04.2025 09:37 — 👍 0 🔁 0 💬 1 📌 0

We first adapted Diffusion Lens (Toker et al., 2024) to decode residual streams in the Kandinsky 2.2 image encoder (CLIP ViT-bigG/14) via the diffusion model.
We can visualize how the predictions evolve through layers, but individual head contributions stay largely hidden.
(3/7)

25.04.2025 09:37 — 👍 0 🔁 0 💬 1 📌 0

Classic Logit Lens projects residual streams to the output space. It works surprisingly well on ViTs, but visual representations are far richer than class labels.
www.lesswrong.com/posts/kobJym...
(2/7)

25.04.2025 09:37 — 👍 0 🔁 0 💬 1 📌 0
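
As a rough illustration of what "projecting residual streams to the output space" means, here is a minimal pure-Python sketch of a logit-lens readout. It assumes a layer norm without learned parameters and a plain linear classifier head; the function names and the toy `residuals` and `unembed` values are hypothetical and do not come from any real ViT.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def logit_lens(residuals, unembed):
    """Project each layer's residual vector through the output head.

    residuals: one d-dimensional vector per layer (assumed cached activations).
    unembed:   d x n_classes matrix given as a list of rows (assumed classifier head).
    Returns the index of the highest-scoring class at each layer.
    """
    n_classes = len(unembed[0])
    predictions = []
    for h in residuals:
        h = layer_norm(h)
        logits = [
            sum(h[i] * unembed[i][c] for i in range(len(h)))
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=logits.__getitem__))
    return predictions

# Toy example: three layers, hidden size 4, three classes.
residuals = [
    [1.0, 0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 0.0, 1.0, -1.0],
]
unembed = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
predictions = logit_lens(residuals, unembed)
print(predictions)
```

The readout is a single class index per layer, which is the limitation the post points to: a class label is a much coarser description than the visual content a representation may carry.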

🔍 Logit Lens tracks what transformer LMs “believe” at each layer. How can we effectively adapt this approach to Vision Transformers?

Happy to share that our paper “Decoding Vision Transformers: the Diffusion Steering Lens” was accepted at the CVPR 2025 Workshop on Mechanistic Interpretability for Vision!
(1/7)

25.04.2025 09:37 — 👍 5 🔁 0 💬 1 📌 1

hello world

24.04.2025 07:01 — 👍 2 🔁 0 💬 0 📌 0

@rtakatsky is following 20 prominent accounts

29th ASSC meeting. Santiago de Chile
@assc2026

29th Annual Meeting of the Association for the Scientific Study of Consciousness. Santiago de Chile June 30-July 3 #ASSC29

Robert Chis-Ciure
@robertchisciure

ERC Research Fellow at the Sussex Centre for Consciousness Science (Anil Seth’s lab)

Mike Levin
@drmichaellevin

I'm a scientist at Tufts University; my lab studies anatomical and behavioral decision-making at multiple scales of biological, artificial, and hybrid systems. www.drmichaellevin.org

Ishan Singhal
@singhal

Postdoctoral Research Fellow at SCCS, University of Sussex | Cognitive science, time and consciousness

Editors of Neuroscience of Consciousness
@nconsc

Account of the Editors of Neuroscience of Consciousness https://academic.oup.com/nc Athena Demertzi (co-EiC) @ademertzi Thomas Andrillon (co-EiC) @thomasandrillon Ben Kozuch (DepEd) @benjikozuch

Gualtiero Piccinini
@gualtiero

Florence G. Kline Prof & Curators’ Distinguished Prof @ MU. Barwise Prize winner. Author, Neurocognitive Mechanisms (OUP 2020). Resist fascism, address climate change, support Ukraine!

Ryota Kanai
@kanair

CEO of Araya

Yuzuki Arai
@yudukikun5120.me

Dirk Gütlin
@gutlin

I post mainly about Neuroscience, Machine Learning, Complex Systems, or Stats papers. Working on neural learning w/ @auksz.bsky.social CCNB/BCCN/Free University Berlin. I also play bass in a pop punk band: https://linktr.ee/goodviewsbadnews

Johannes Fahrenfort
@fahrenfort

Assistant Prof at VU Amsterdam. Neuroscience of consciousness, decision making. Computational modeling. Pet method: EEG. Critical of subjective measures. Co-PI in the http://consciousbrainlab.com with @svangaal.bsky.social and @timostein.bsky.social.

CLaE
@claeneuro

Neuroscience, insular cortex. I do not reply to direct messages. Twitter: @leafs_s, Mastodon: @leafs_s@mstdn.science

Stella Biderman
@stellaathena

I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.

Ethan Mollick
@emollick

Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence. Book: https://a.co/d/bC2kSj1 Substack: https://www.oneusefulthing.org/ Web: https://mgmt.wharton.upenn.edu/profile/emollick

Naomi Saphra
@nsaphra

Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP.

Sheila Macrine, PhD
@macrinephd

Professor and cognitive psychologist at UMass Dartmouth. Focuses on embodied cognition and learning. Free access: https://direct.mit.edu/books/oa-edited-volume/5306/Movement-MattersHow-Embodied-Cognition-Informs https://embodiedcognitionandlearning.com/

Marius Mosbach
@mariusmosbach

#NLP Postdoc at Mila - Quebec AI Institute & McGill University mariusmosbach.com

Blake Richards
@tyrellturing

Researcher at Google and CIFAR Fellow, working on the intersection of machine learning and neuroscience in Montréal (academic affiliations: @mcgill.ca and @mila-quebec.bsky.social).

Nick Fisher
@hydroxide.dev

Mostly 3D+AI+voice. Building https://mixreel.ai Flutter/Dart/C++/PyTorch/Blender https://github.com/nmfisher/thermion https://nick-fisher.com https://playmixworld.com https://bit.ly/3RkzFfH

Marco Zocca
@ocramz

ML, λ • language and the machines that understand it • https://ocramz.github.io

axel cleeremans
@axc

Brain, minds, & consciousness. Also interested in design, visual arts, biology, space, typography, and anticipation. Also known as « Axel from Belgium »