Dilyara Bareeva @dilya - Bluesky Profile

Latest posts by dilya.bsky.social on Bluesky

Manipulating Feature Visualizations with Gradient Slingshots
Feature Visualization (FV) is a widely used technique for interpreting concepts learned by Deep Neural Networks (DNNs), which synthesizes input patterns that maximally activate a given feature....

Paper: openreview.net/forum?id=Tgc...
Code: github.com/dilyabareeva...

29.11.2025 16:38 — 👍 3 🔁 0 💬 0 📌 0

Huge thanks to my fantastic co-authors Marina MC Höhne, Alexander Warnecke, @lpirch.bsky.social, Klaus-Robert Müller, @rieck.mlsec.org, @slapuschkin.bsky.social, @kirillbykov.bsky.social, and to the UMI Lab, @aifraunhoferhhi.bsky.social, @xai-berlin.bsky.social and @bifold.berlin for the support!

29.11.2025 16:38 — 👍 3 🔁 1 💬 1 📌 0

Our lightweight adversarial fine-tuning attack lets you bend a feature to visualize any arbitrary concept. Off-manifold, we impose a hyperbolic activation landscape with its optimum at the target, while preserving on-distribution activations through a weighted two-term loss. 🕵️‍♀️

29.11.2025 16:38 — 👍 1 🔁 1 💬 1 📌 0
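
The weighted two-term loss described in the post above can be illustrated with a small sketch. The post does not give the functional form, so this example assumes a simple concave bowl peaking at the target as a stand-in for the imposed off-manifold landscape, and assumes that a single weight `alpha` trades off the two terms. All names (`target_landscape`, `two_term_loss`, `ascend`, `alpha`, `gamma`) are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def target_landscape(x, x_target, gamma=1.0, peak=0.0):
    """Assumed off-manifold landscape: a concave bowl whose maximum is at x_target."""
    diff = x - x_target
    return peak - 0.5 * gamma * np.sum(diff * diff, axis=-1)

def two_term_loss(feat_new, feat_orig, x_off, x_on, x_target, alpha=0.5, gamma=1.0):
    """Weighted sum of a preservation term and a manipulation term (illustrative).

    feat_new / feat_orig map a batch of inputs (n, d) to activations (n,).
    x_off are off-distribution samples, x_on are on-distribution samples.
    """
    # Manipulation: pull off-manifold activations toward the assumed landscape.
    manip = np.mean((feat_new(x_off) - target_landscape(x_off, x_target, gamma)) ** 2)
    # Preservation: keep on-distribution activations close to the original feature.
    preserve = np.mean((feat_new(x_on) - feat_orig(x_on)) ** 2)
    return alpha * preserve + (1.0 - alpha) * manip

def ascend(x0, x_target, gamma=1.0, lr=0.1, steps=200):
    """Gradient ascent on the assumed landscape, mimicking activation maximization."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        grad = -gamma * (x - x_target)  # gradient of target_landscape w.r.t. x
        x = x + lr * grad
    return x
```

Under these assumptions, a feature whose off-manifold activations match the bowl makes any gradient-ascent visualization started off-manifold drift to `x_target`, while the preservation term leaves its behaviour on real data unchanged.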

✈️🇲🇽 Next Wednesday (Dec 3), 1–4 p.m. CST, I’ll be presenting Manipulating Feature Visualizations with Gradient Slingshots at NeurIPS 2025 in Mexico City!

Feature Visualization has long been a staple interpretability tool. Our work shows it’s far from reliable! 🚨

29.11.2025 16:38 — 👍 9 🔁 4 💬 1 📌 0

GitHub - dilyabareeva/quanda: A toolkit for quantitative evaluation of data attribution methods.

Sadly, I wasn’t able to make it to NeurIPS this year. For anyone attending, check out our quanda poster at the ATTRIB workshop tomorrow (Saturday) from 3 to 4:30 pm, presented by Galip Ümit Yolcu and Anna Hedström!

GitHub: github.com/dilyabareeva...
Paper: arxiv.org/abs/2410.07158

13.12.2024 08:01 — 👍 6 🔁 0 💬 0 📌 0

@dilya is following 20 prominent accounts

Lukas Pirch
@lpirch

PhD at BIFOLD, TU-Berlin • Vulnerability Discovery & Graph-based Machine Learning • 🎹🎸

Can
@canrager

@trishachetani

Explainable AI Berlin
@xai-berlin

Explainable AI research from the machine learning group of Prof. Klaus-Robert Müller at @tuberlin.bsky.social & @bifold.berlin

Lenka Tětková
@lenkatetkova

Postdoc at Technical University of Denmark working on ExplainableAI.

Alex Vasileiou
@alevator

PhD student explainable AI @ ML Group TU Berlin, BIFOLD

Philip Naumann
@pnaumann

PhD student at TU Berlin & @bifold.berlin. I am interested in Machine Learning, Explainable AI (XAI), and Optimal Transport.

Tom Neuhäuser
@tomneuhaeuser

PhD Student @ ML Group TU Berlin, BIFOLD

@golimblevskaia

Dan Butler
@danieljbutler

interests: software, neuroscience, causality, philosophy | ex: salk institute, u of washington, MIT | djbutler.github.io

Sebastian Lapuschkin
@slapuschkin

Head of XAI research at Fraunhofer HHI. Google Scholar: https://scholar.google.de/citations?user=wpLQuroAAAAJ

@arinabelova

Pau Rodriguez
@paurodriguez

Research Scientist at Apple Machine Learning Research. Previously ServiceNow and Element AI in Montréal.

Kabir Kumar, aiplans.org
@kabirkumar

I run AI Plans, an AI Safety lab focused on solving AI Alignment before 2029. For several weeks I used a stone for a pillow. I once spent a quarter of my paycheck on cheese. Ping me! DMs not working atm due to totalitarian UK law :( SurpassAI

Bruno Puri
@brunibrun

PhD student @ Fraunhofer HHI. I work on interpretability in NLP

Jerry Lin 🇺🇦
@jlin404.com

Postdoc at Boston University. Posts do not represent the views of my employer—only my own.

Grace
@geminisgoats

Amateur Writer & Ethical AI Advocate | Professional Marketing & Event Coordinator | Avid Baker, Homesteader & Forager 🌼

@aranguri

Angus Nicolson
@angusjnic

PostDoc at the Medical University of Innsbruck in the Digital Cardiology Lab. Prev Uni of Oxford DPhil.

Donatella Genovese
@donatellag

PhD Student | Works on Explainable AI | https://donatellagenovese.github.io/