
Sriram Padmanabhan

@sriramp05.bsky.social

Undergraduate at UT Austin majoring in CS & Math

7 Followers  |  4 Following  |  9 Posts  |  Joined: 28.01.2025

Latest posts by sriramp05.bsky.social on Bluesky

Link preview: "On Language Models' Sensitivity to Suspicious Coincidences" (arxiv.org/abs/2504.09387)

For more results (e.g., experiments on models' parametric knowledge about various hypotheses, as well as results on the city game), check out the paper here: arxiv.org/abs/2504.09387

21.04.2025 20:51 | 👍 0    🔁 0    💬 0    📌 0

LMs' zero-shot behavior shows little to no sensitivity to suspicious coincidences. But the results change when knowledge of the hypothesis space is activated either implicitly (Chain-of-Thought) or explicitly (Knowledge), sometimes even qualitatively consistent with human behavior.

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0

We test sensitivity in three environments: zero-shot, Chain-of-Thought, and a "Knowledge" prompt that provides the model with explicit access to the possible hypotheses the input and target could be sampled from.

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
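A rough Python sketch of what the three conditions might look like; the wording and the hypothesis list below are illustrative assumptions, not the paper's exact prompts:

```python
# Illustrative prompt templates for the three conditions (the exact
# wording and hypothesis list are assumptions, not the paper's prompts).
BASE = "Input: 16, 8, 2, 64. Is 32 compatible with the input? Answer yes or no."

ZERO_SHOT = BASE

CHAIN_OF_THOUGHT = BASE + " Let's think step by step."

KNOWLEDGE = (
    "The input may have been sampled from one of these hypotheses: "
    "powers of 2, even numbers, numbers between 1 and 100. " + BASE
)

for name, prompt in [("zero-shot", ZERO_SHOT),
                     ("CoT", CHAIN_OF_THOUGHT),
                     ("Knowledge", KNOWLEDGE)]:
    print(f"[{name}] {prompt}")
```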

We focus on two domains: the number game from Tenenbaum (1999), with human judgments collected by Eric Bigelow and @spiantado.bsky.social, and a world-cities domain (without human judgments).

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
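As a purely illustrative guess (the paper's actual city-game setup may differ), the world-cities domain can be pictured as nested hypotheses of different sizes, analogous to the number game:

```python
# Illustrative sketch (not from the paper): nested city hypotheses of
# different sizes, by analogy with the number game.
city_hypotheses = {
    "cities in France": {"Paris", "Lyon", "Marseille", "Toulouse"},
    "cities in Europe": {"Paris", "Lyon", "Marseille", "Toulouse",
                         "Berlin", "Madrid", "Rome", "Vienna"},
}

observed = {"Paris", "Lyon", "Marseille"}
for name, members in city_hypotheses.items():
    if observed <= members:
        print(f"{name} fits (size {len(members)})")
# Seeing only French cities makes the smaller hypothesis suspiciously likely.
```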

Given the LM's yes/no responses, we compute an F1 score over the members of each hypothesis that fits both the input and the target, and check whether the model favors the smallest such hypothesis.

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
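A minimal sketch of this evaluation with hypothetical helper names (the paper's exact procedure may differ):

```python
# Sketch: score the model's "yes" set against each consistent hypothesis
# with F1, then check whether the smallest hypothesis scores best.
def f1(predicted_yes: set[int], hypothesis: set[int]) -> float:
    tp = len(predicted_yes & hypothesis)  # true positives
    if tp == 0:
        return 0.0
    precision = tp / len(predicted_yes)
    recall = tp / len(hypothesis)
    return 2 * precision * recall / (precision + recall)

universe = set(range(1, 101))
consistent = {  # hypotheses containing both the input and the target
    "powers of 2": {2, 4, 8, 16, 32, 64},
    "even numbers": {n for n in universe if n % 2 == 0},
}
model_yes = {2, 4, 8, 16, 32, 64}  # toy "yes" responses from the model

scores = {name: f1(model_yes, h) for name, h in consistent.items()}
favored = max(scores, key=scores.get)
smallest = min(consistent, key=lambda name: len(consistent[name]))
print(scores)               # {'powers of 2': 1.0, 'even numbers': ~0.21}
print(favored == smallest)  # True: the smallest hypothesis is favored
```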

To test model sensitivity to suspicious coincidences, we provide the model with an input that could be sampled from multiple hypotheses (e.g., "16, 8, 2, 64") and ask whether a given target value (e.g., "32") is compatible with the input.

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
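In code, the probe might be constructed along these lines (a hypothetical reconstruction; the paper's exact wording differs):

```python
# Hypothetical reconstruction of the yes/no probe (wording is illustrative).
def make_probe(input_numbers: list[int], target: int) -> str:
    nums = ", ".join(str(n) for n in input_numbers)
    return (f"Here are some numbers sampled from a hidden concept: {nums}. "
            f"Does {target} belong to the same concept? Answer yes or no.")

print(make_probe([16, 8, 2, 64], 32))
# Sweeping the target over many values (e.g., 6, 7, 32) reveals which
# hypothesis the model generalized to: "even numbers" accepts 6, while
# "powers of 2" rejects it.
```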

This is known as the suspicious coincidence effect: if the intended concept were "odd" numbers, it would be highly suspicious that every number you chose happens to end in 3. Humans show this sensitivity across a wide range of contexts: smaller hypotheses are favored over more general ones.

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
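The classic Bayesian account of this effect is Tenenbaum's size principle: under uniform sampling from the true hypothesis, each example multiplies the likelihood by 1/|h|, so smaller consistent hypotheses win quickly. A worked illustration (the 1..100 universe is an assumption):

```python
# Size principle: if n examples are drawn uniformly (with replacement)
# from hypothesis h, then P(data | h) = (1 / |h|) ** n.
def likelihood(hypothesis_size: int, n_examples: int) -> float:
    return (1.0 / hypothesis_size) ** n_examples

n = 4                          # four observations, e.g. "93, 43, 83, 53"
ends_in_3 = likelihood(10, n)  # "ends in 3" has 10 members in 1..100
odd = likelihood(50, n)        # "odd numbers" has 50 members in 1..100
print(ends_in_3 / odd)         # 625.0 = (50/10)**4: the smaller hypothesis wins
```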

Humans readily show sensitivity to the way data is generated when reasoning inductively. E.g., if some program generated "93, 43, 83, 53", it's likely producing numbers ending in 3, even though that's not the only applicable hypothesis (e.g., they're all odd numbers).

21.04.2025 20:51 | 👍 0    🔁 0    💬 1    📌 0
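A small sketch of that intuition, assuming a 1..100 universe (an illustrative choice):

```python
# Both hypotheses below are consistent with the observations, but
# "ends in 3" is far more specific (10 members vs. 50 in 1..100).
UNIVERSE = range(1, 101)
hypotheses = {
    "ends in 3": {n for n in UNIVERSE if n % 10 == 3},
    "odd numbers": {n for n in UNIVERSE if n % 2 == 1},
}
observed = {93, 43, 83, 53}

for name, members in hypotheses.items():
    print(f"{name}: size={len(members)}, fits={observed <= members}")
# ends in 3: size=10, fits=True
# odd numbers: size=50, fits=True
```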

Are LMs sensitive to suspicious coincidences? Our paper finds that, when given access to knowledge of the hypothesis space, LMs can show sensitivity to such coincidences, displaying parallels with human inductive reasoning. w/@kanishka.bsky.social, @kmahowald.bsky.social, @eunsol.bsky.social

21.04.2025 20:51 | 👍 5    🔁 0    💬 1    📌 3
