
Anubrata Das @ NAACL 2025

@anubrata.bsky.social

Just Finished PhD @ UT Austin; Human-Centered NLP. Language Models https://anubrata.github.io

2,538 Followers  |  431 Following  |  45 Posts  |  Joined: 19.07.2023

Posts by Anubrata Das @ NAACL 2025 (@anubrata.bsky.social)

As AI increasingly supports shopping and ads, it's worth remembering that retrieval often shapes who gets exposure in the final generated output. In a recent paper, @teknology.bsky.social uses methods from fair ranking to assess and address exposure bias in downstream generation.

841.io/doc/fairrag....

31.12.2025 14:00 — 👍 9    🔁 3    💬 0    📌 1
The human factor in explainable artificial intelligence: clinician variability in trust, reliance, and performance — npj Digital Medicine

Explainable AI is often assumed to build trust. A study of sonographers estimating gestational age found AI predictions improved accuracy, but explanations did not. In fact, explanations made some clinicians perform worse, highlighting user variability.

#MedSky #MLSky

14.11.2025 17:10 — 👍 5    🔁 2    💬 0    📌 0
Rising Stars in Data Science

Workshop: datascience.stanford.edu/programs/ris...

@utaustin.bsky.social

07.11.2025 18:32 — 👍 0    🔁 0    💬 0    📌 0

Thrilled to be selected for the 🎓 Rising Stars in Data Science Workshop! Grateful to @stanforddata.bsky.social, @HCID UC San Diego, and @dsi-uchicago.bsky.social for this opportunity.
Excited to share my work on trustworthy and collaborative AI and connect with amazing peers and mentors.
🔗 👇

07.11.2025 18:31 — 👍 2    🔁 0    💬 2    📌 0

Yes, more so with code for running quick experiments! I definitely want my code to NOT fail gracefully. (And save myself hours of debugging time because there is a default parameter somewhere I did not notice!)

24.10.2025 21:53 — 👍 4    🔁 1    💬 0    📌 0
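The fail-fast style described above can be sketched in Python. This is a minimal illustration of the idea, not code from any actual project; `run_experiment` and its parameters are hypothetical:

```python
def run_experiment(dataset_path: str, learning_rate: float, seed: int) -> dict:
    """Fail fast: every parameter is required and validated up front,
    so a forgotten or out-of-range value raises immediately instead of
    silently falling back to a default."""
    if not (0.0 < learning_rate < 1.0):
        raise ValueError(f"suspicious learning_rate: {learning_rate}")
    # ... the actual experiment would run here ...
    return {"dataset": dataset_path, "lr": learning_rate, "seed": seed}

# Forgetting `seed` raises a TypeError at the call site -- no silent default:
# run_experiment("data.csv", learning_rate=0.01)
config = run_experiment("data.csv", learning_rate=0.01, seed=42)
```

Making every knob an explicit required argument trades a little typing for never discovering, hours later, that a hidden default quietly changed the experiment.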

Ah, that makes sense! Thanks, yeah I am on that Slack, hhh!

27.08.2025 14:06 — 👍 1    🔁 0    💬 0    📌 0

How can I get an invite for the XAI discord?

27.08.2025 13:12 — 👍 1    🔁 0    💬 1    📌 0

Thank you for making the list, could you please add me?

29.07.2025 13:31 — 👍 1    🔁 0    💬 1    📌 0
Post image

In a stunning moment of self-delusion, the Wall Street Journal headline writers admitted that they don't know how LLM chatbots work.

21.07.2025 01:48 — 👍 2957    🔁 471    💬 43    📌 89

What if you could understand and control an LLM by studying its *smaller* sibling?

Our new paper introduces the Linear Representation Transferability Hypothesis. We find that the internal representations of different-sized models can be translated into one another using a simple linear (affine) map.

10.07.2025 17:26 — 👍 25    🔁 10    💬 1    📌 1
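The core idea — fitting an affine map between two models' hidden states — can be sketched with synthetic data. This is my own illustration of the general technique, not the paper's code; the shapes, names, and random "hidden states" are all stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for hidden states of a small and a large model
# on the same 200 inputs (dims 16 and 32 are arbitrary).
H_small = rng.normal(size=(200, 16))
A_true = rng.normal(size=(16, 32))           # hypothetical ground-truth map
b_true = rng.normal(size=32)
H_large = H_small @ A_true + b_true          # pretend large-model states

# Fit the affine map H_large ~ H_small @ A + b via least squares,
# folding the bias into one solve by appending a column of ones.
X = np.hstack([H_small, np.ones((200, 1))])
W, *_ = np.linalg.lstsq(X, H_large, rcond=None)
A, b = W[:-1], W[-1]

relative_error = np.linalg.norm(X @ W - H_large) / np.linalg.norm(H_large)
```

On real models one would collect `H_small` and `H_large` by running both models on the same inputs and extracting activations from a chosen layer; the interesting empirical question is how small the residual stays when the map is evaluated on held-out inputs.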
Preview
To Spot Toxic Speech Online, Try AI — McCombs News and Magazine: A new tool helps balance accuracy with fairness toward all groups in social media

McCombs article: news.mccombs.utexas.edu/research/to-...
Paper url: doi.org/10.47989/ir3...

@utaustin.bsky.social
@texasscience.bsky.social
@engagingnews.bsky.social
@utischool.bsky.social

#TexasAI
#YearofAI

06.06.2025 15:10 — 👍 0    🔁 0    💬 0    📌 0

Can content moderation models balance accuracy & fairness?
UT McCombs news featured our iConference paper by Soumyajit Gupta on optimizing the fairness-accuracy tradeoff in toxicity detection. In collaboration with Venelin Kovatchev @mariadearteaga.bsky.social @mattlease.bsky.social

06.06.2025 15:06 — 👍 0    🔁 0    💬 1    📌 1

How good are LLMs at 🔭 scientific computing and visualization 🔭?

AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results.

SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵

02.06.2025 15:41 — 👍 10    🔁 2    💬 1    📌 4

#NAACL2025

03.05.2025 15:27 — 👍 1    🔁 0    💬 0    📌 0

Please join us for the TrustNLP workshop (215 San Miguel) @naaclmeeting.bsky.social #trustNLP2025

03.05.2025 15:25 — 👍 1    🔁 0    💬 0    📌 1

Session detail:

Poster Session 5 - IAM: Interpretability and Analysis of Models for NLP, Hall 3

01.05.2025 05:27 — 👍 0    🔁 0    💬 0    📌 0

This is a collaborative work with Manoj Kumar, Ninareh Mehrabi, Anil Ramakrishna, Anna Rumshisky, Kai-Wei Chang, Aram Galstyan, Morteza Ziyadi, Rahul Gupta

01.05.2025 05:26 — 👍 0    🔁 0    💬 1    📌 0

Causal-tracing-informed edits provide a better detoxification-degeneration trade-off.

01.05.2025 05:25 — 👍 0    🔁 0    💬 1    📌 0

Model editing helps reduce toxicity. High detoxification can be achieved by simply editing random MLP layers. However, this leads to degeneration and increased perplexity.

01.05.2025 05:25 — 👍 0    🔁 0    💬 1    📌 0

We find evidence of toxic memory in the early layers of GPT-2 XL for innocuous-looking adversarial prompts.

01.05.2025 05:25 — 👍 0    🔁 0    💬 1    📌 0

Paper: On Localizing and Deleting Toxic Memories in Large Language Models
Anthology URL: aclanthology.org/2025.finding...

01.05.2025 05:24 — 👍 0    🔁 0    💬 1    📌 0

Excited to present my internship work at Amazon AGI at @naaclmeeting.bsky.social tomorrow at 2:00 pm local time. Please come say hi if you are around.

01.05.2025 05:21 — 👍 3    🔁 1    💬 1    📌 0

thinking of calling this "The Illusion Illusion"

(more examples below)

01.12.2024 14:33 — 👍 1582    🔁 386    💬 60    📌 91

Created a small starter pack including folks whose work I believe contributes to more rigorous and grounded AI research -- I'll grow this slowly and likely move it to a list at some point :) go.bsky.app/P86UbQw

30.11.2024 19:58 — 👍 12    🔁 5    💬 1    📌 0

NeurIPS Test of Time Awards:

Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le

27.11.2024 17:32 — 👍 311    🔁 28    💬 6    📌 4

Right, sorry for being unclear. I saw your comment sharing the Qualtrics integration tutorial with a video. bsky.app/profile/dggo...

25.11.2024 21:33 — 👍 1    🔁 0    💬 1    📌 0

Nvm, found it!

25.11.2024 17:05 — 👍 0    🔁 0    💬 2    📌 0
Home — Obsidian Publish. Request: If you use our template (.QSF) to set up your research, we would appreciate it if you cite our paper when describing your method: Durably reducing conspiracy beliefs through dialogues with AI…

@tomcostello.bsky.social 's Qualtrics materials and tutorial video for integrating LLMs into Qualtrics can be accessed at publish.obsidian.md/qualtrics-do...

25.11.2024 15:40 — 👍 15    🔁 5    💬 2    📌 0

Will there be a video for this talk?

25.11.2024 17:04 — 👍 0    🔁 0    💬 1    📌 0

πŸ™‹πŸ½

24.11.2024 04:25 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0