David Mortensen

@davidrmortensen.bsky.social

I make colorless green GPUs sleep brrriously. Computational phonology, morphology, language change models, speech/language technologies (especially for people with disabilities).

767 Followers  |  1,214 Following  |  75 Posts  |  Joined: 19.11.2024

Latest posts by davidrmortensen.bsky.social on Bluesky

🚨 New Paper: LLM developers aim to align models with values like helpfulness or harmlessness. But when these conflict, which values do models choose to support? We introduce ConflictScope, a fully-automated evaluation pipeline that reveals how models rank values under conflict.
(📷 xkcd)

02.10.2025 16:04 - 👍 13    🔁 4    💬 1    📌 3

🔈 When LLMs solve tasks with a mid-to-low resource input or target language, their output quality is poor. We know that. But can we put our finger on what breaks inside the LLM? We introduce the 💥 translation barrier hypothesis 💥 for failed multilingual generation with LLMs. arxiv.org/abs/2506.22724

04.07.2025 17:04 - 👍 26    🔁 7    💬 2    📌 1

Thrilled to share that this is out in @pnas.org today! 🎉

We show that linguistic generalization in language models can be due to underlying analogical mechanisms.

Shoutout to my amazing co-authors @weissweiler.bsky.social, @davidrmortensen.bsky.social, Hinrich Schütze, and Janet Pierrehumbert!

09.05.2025 18:29 - 👍 37    🔁 6    💬 1    📌 2

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:

🧵 1/9

09.06.2025 13:47 - 👍 70    🔁 21    💬 2    📌 2

RL boosts LLM reasoning, but why stop at math & code? 🤔
Meet Nemotron-CrossThink, a method to scale RL-based self-learning across law, physics, social science & more.

🔥 Resulting in a model that reasons broadly, adapts dynamically, & uses 28% fewer tokens for correct answers!
🧵↓

01.05.2025 17:41 - 👍 5    🔁 3    💬 1    📌 0

On my way to #NAACL2025 where I'll give a keynote at the noisy text workshop (WNUT), presenting some of the challenges & methods for dialect NLP + also discussing dialect speakers' perspectives!

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🗓️ Saturday, May 3, 9:30–10:30

29.04.2025 09:17 - 👍 27    🔁 7    💬 1    📌 1

Excited to announce our #NAACL2025 Oral paper! 🎉✨

We carried out the largest systematic study so far to map the links between upstream choices, intrinsic bias, and downstream zero-shot performance across 131 CLIP Vision-language encoders, 26 datasets, and 55 architectures!

29.04.2025 19:11 - 👍 21    🔁 6    💬 1    📌 0

Can self-supervised models 🤖 understand allophony 🗣? Excited to share my new #NAACL2025 paper: Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment arxiv.org/abs/2502.07029 (1/n)

29.04.2025 17:00 - 👍 15    🔁 10    💬 2    📌 0

🚀 Excited to share a new interp+agents paper: 🐭🐱 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools appearing at #NAACL2025

This was work done @msftresearch.bsky.social last summer with Jason Eisner, Justin Svegliato, Ben Van Durme, Yu Su, and Sam Thomson

1/🧵

29.04.2025 13:41 - 👍 12    🔁 8    💬 1    📌 2

When interacting with ChatGPT, have you wondered if it would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/

28.04.2025 20:36 - 👍 25    🔁 9    💬 1    📌 3

1/🚨 New paper alert 🚨
RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style?

We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵

17.04.2025 19:55 - 👍 9    🔁 5    💬 1    📌 2

THIS IS HUGE! Researchers at McMaster University have discovered a NEW peptide antibiotic that targets a broad range of disease-causing bacteria INCLUDING those RESISTANT to existing antibiotics. This discovery marks the first potential new class of antibiotics in NEARLY 30 YEARS. 🧪🧵⬇️

31.03.2025 16:00 - 👍 9353    🔁 2780    💬 227    📌 284
CDS building which looks like a jenga tower

Life update: I'm starting as faculty at Boston University
@bucds.bsky.social in 2026! BU has SCHEMES for LM interpretability & analysis, I couldn't be more pumped to join a burgeoning supergroup w/ @najoung.bsky.social @amuuueller.bsky.social. Looking for my first students, so apply and reach out!

27.03.2025 02:24 - 👍 245    🔁 13    💬 35    📌 6

You should read Article 1 of the United States Constitution. It's a trip.

19.03.2025 04:49 - 👍 1    🔁 0    💬 0    📌 0

There can be only one DB joke. And that is DB.

19.03.2025 04:29 - 👍 1    🔁 0    💬 0    📌 0
Advancing the Database of Cross-Linguistic Colexifications with New Workflows and Data
Lexical resources are crucial for cross-linguistic analysis and can provide new insights into computational models for natural language learning. Here, we present an advanced database for comparative ...

New preprint by @annikatjuka.bsky.social, Robert Forkel, Christoph Rzymski, and myself available, presenting a new version of the Database of Cross-Linguistic Colexifications (CLICS).

"Advancing the Database of Cross-Linguistic Colexifications with New Workflows and Data"

arxiv.org/abs/2503.11377

17.03.2025 10:25 - 👍 7    🔁 3    💬 1    📌 0

Finally found a way to shorten faculty meetings.

16.03.2025 16:30 - 👍 262    🔁 59    💬 18    📌 3

No student anywhere in America has said something as antisemitic as this

12.03.2025 18:12 - 👍 127    🔁 22    💬 1    📌 1
Midwest Speech and Language Days 2025

The meeting will feature keynote addresses by
@mohitbansal.bsky.social, @davidrmortensen.bsky.social, Karen Livescu, and Heng Ji. Plus all of your great talks and posters! nlp.nd.edu/msld25

08.03.2025 18:35 - 👍 4    🔁 1    💬 0    📌 0

I’ve been thinking about this reading from Isaiah 58 since I heard it at the Ash Wednesday service today.

“Is not this the fast that I choose:
to loose the bonds of injustice,
to undo the thongs of the yoke,
to let the oppressed go free,
and to break every yoke?

06.03.2025 00:16 - 👍 193    🔁 33    💬 8    📌 3
Trump Decried Millions Spent 'Making Mice Transgender.' It Was Cancer and Asthma Research
President Trump falsely claimed that Biden spent $8 million on 'making mice transgender,' but the real research was for human health.

“Again, the mice used for clinical purposes did not undergo gender transition.”

www.rollingstone.com/politics/pol...

06.03.2025 00:36 - 👍 6013    🔁 1179    💬 531    📌 148
Congressman Al Green on X:

Today, the House GOP censured me for speaking out for the American people against @POTUS’s plan to cut Medicaid. I accept the consequences of my actions, but I refuse to stay silent in the face of injustice. #WeShallOvercome x.com/repalgreen/s...

06.03.2025 21:23 - 👍 107183    🔁 19450    💬 10432    📌 1837
Screenshot of Arxiv paper title, "Rejected Dialects: Biases Against African American Language in Reward Models," and author list: Joel Mire, Zubin Trivadi Aysola, Daniel Chechelnitsky, Nicholas Deas, Chrysoula Zerva, and Maarten Sap.

Reward models for LMs are meant to align outputs with human preferences, but do they accidentally encode dialect biases? 🤔

Excited to share our paper on biases against African American Language in reward models, accepted to #NAACL2025 Findings! 🎉

Paper: arxiv.org/abs/2502.12858 (1/10)

06.03.2025 19:49 - 👍 37    🔁 11    💬 1    📌 2

I read a paper about search, but I can't quite remember what it's called.

05.03.2025 15:30 - 👍 8    🔁 1    💬 1    📌 0
Tip of the Tongue Query Elicitation for Simulated Evaluation
Tip-of-the-tongue (TOT) search occurs when a user struggles to recall a specific identifier, such as a document title. While common, existing search systems often fail to effectively support TOT scena...

🚨 New Breakthrough in Tip-of-the-Tongue (TOT) Retrieval Research!

We address data limitations and offer a fresh evaluation method for these complex queries.

Curious how TREC TOT track test queries are created? Check out this thread 🧵 and our paper 📄: arxiv.org/abs/2502.17776

05.03.2025 01:32 - 👍 17    🔁 7    💬 2    📌 1

everything is so shitty, read this story about a genuinely good man who saw he had an opportunity to save millions of lives and threw himself into doing so. the world is full of heroes like him.

04.03.2025 11:16 - 👍 8459    🔁 1830    💬 61    📌 19

I humbly put this forward as a possible campaign for the new electric VW

04.03.2025 01:06 - 👍 648    🔁 122    💬 10    📌 5

@davidrmortensen is following 20 prominent accounts