Wait, what’s the connection there? 🤷‍♂️
I'm lecturing about the "History of NLP" this week. What should I include? Any favorite anecdotes, images, people, methods? Slides, books, papers, or talks for inspiration or grounding?
I've been maintaining a small collection here: www.are.na/maria-antoni...
Threading some stuff about oil & oil markets, just basic but hope it helps:
1/ oil markets are what you call “finely balanced”. Supply is usually very very close to demand/consumption. Demand is hard to shift *quickly* in response to supply hiccups.
So even small supply changes = big price effects
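A back-of-the-envelope sketch of that last point, assuming a constant short-run price elasticity of demand. The elasticity values and the 2% supply loss are hypothetical numbers picked for illustration, not figures from this thread, and `implied_price_change` is a made-up helper name:

```python
def implied_price_change(supply_change_pct: float, demand_elasticity: float) -> float:
    """Approximate % price change needed for demand to absorb a supply change.

    Uses the constant-elasticity approximation %dQ = elasticity * %dP,
    rearranged to %dP = %dQ / elasticity.
    """
    if demand_elasticity >= 0:
        raise ValueError("demand elasticity is expected to be negative")
    return supply_change_pct / demand_elasticity


# Hypothetical inputs: a 2% supply loss under two assumed elasticities.
inelastic_case = implied_price_change(-2.0, -0.1)  # demand barely reacts to price
elastic_case = implied_price_change(-2.0, -1.0)    # demand reacts one-for-one

print(f"inelastic demand: price moves about {inelastic_case:.0f}%")
print(f"elastic demand:   price moves about {elastic_case:.0f}%")
```

The less demand can adjust in the short run (elasticity closer to zero), the larger the price move needed to clear the same supply gap.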
A lifetime of collecting: the 70,000-volume home library of Bruno Schröder, a mining engineer. A wonderland of books 🤩 www.rarebookhub.com/articles/3355
Published paper proving that #ChatGPT will always make things up.
Not sometimes. Not until the next update. Always. They proved it with math.
Even with perfect data and unlimited computing power, AI models will still confidently tell you things that are completely false.
arxiv.org/abs/2509.04664
🏷️ @aleximas.bsky.social
NY is proposing to "Impose liability for damages caused by a chatbot impersonating certain licensed professionals." nysenate.gov/legislation/... How does a chatbot trick you into thinking it's a doctor? Senators: If you forgot you were conversing with AI, you need a doctor.
Ok I'm in a rabbit hole. If you search "how many decisions do we make in a day" the reported number is almost always 35,000, often attributed to "multiple sources". Yet I can't actually find a single source that backs up that number. Anyone know where this number comes from?
Spicy take of the day: there are _always_ unmeasured confounders. We just make value judgements over how much they matter with respect to Y
#statsky #rstats
[ #GenAI-post warning] Almost every researcher I know is using Claude Code, and talking about the huge productivity gains. Are we actually producing more scientific papers yet? Since its release in May 2025, arXiv submissions are indeed *12%* above what we'd expect. Details in thread:
Looks very promising! Thanks for sharing it here
Our paper “Inferring fine-grained migration patterns across the United States” is now out in @natcomms.nature.com! We released a new, highly granular migration dataset. 1/9
Love the NLP: thoughtful application @economist.com
'Taxi' and 'cab' essentially mean the same thing, but 'cab' doesn't come from 'taxicab.'
It comes from ‘cabriolet,’ which was a type of light carriage.
‘Taxi’ comes from ‘taximeter,’ which is the device that calculates the amount of a fare based on the distance traveled.
I've released a new version of strgroup, a Stata command that does fuzzy string matching. No new functionality, but the underlying C code has been optimized: it now uses much less memory and runs about 5 times faster
github.com/reifjulian/s...
Here in Europe, I often hear British English–style pronunciations like DAH-ta, STAH-ta, and LAH-tech—quite consistent?
Classic question! (LaTeX fans know the struggle.) According to this discussion jasonkerwin.com/nonparibus/2... I’m guessing they leave it up in the air?
Open Studio with Pop-Up Store:
Friday December 12, 16-19h
Saturday December 13, 11-18h
Join us at:
Studio Christoph Niemann
Schröderstrasse 2
10115 Berlin
shop.christophniemann.com
After becoming a congressional leader, a politician’s stock portfolio beats out those of peers by 47 (!!!) percentage points a year through trades timed around bills and firms that later get government contracts
www.nber.org/papers/w34524
via @florianederer.bsky.social
Interesting paper highlighting that binning can be misspecified in panel settings, which drives misinterpretation of extreme temperature shocks. #linkoftheday
www.dropbox.com/scl/fi/1ya6z...
In light of record submission rates and a large volume of AI-generated slop, SocArXiv recently implemented a policy requiring ORCIDs linked in the OSF profile of submitting authors, and narrowing our focus to social science subjects. Today we are taking two more steps:
/1
How to Reproduce this Book Exactly with LaTeX - great resource for writing LaTeX #linkoftheday
github.com/BenjaminGor/...
This is terrifying.
"[AI agents] can... infer a researcher's latent hypotheses and produce data that artificially confirms them."
...
"We can no longer trust that survey responses are coming from real people" -@seanjwestwood.bsky.social
NBER grants to fund research into economic measurement - I find this agenda very compelling and important #linkoftheday
www.nber.org/news/nber-la...
Oof, this is super slick! Go check it out 👇🏻
This paper’s been popping up as “evidence” that you can’t do real #causalinference w/ obs data. To me it shows you need a rigorous pre-specified design (in addition to the willingness to fold when your question can't be answered with the data at hand). #EpiSky #CausalSky #AcademicSky
🚨Next time you hire, don’t take it easy! In a new working paper, @elliottash.bsky.social, Jason Sockin, and I show that the difficulty of the interview signals to workers whether the job is a good fit. 🚨
Paper link: papers.ssrn.com/sol3/papers....
Do any other languages (Dutch, German, Spanish) share this quirk, or is English alone in having a verb whose past tense is an exact anagram of its base form?
Are there any other English verbs whose past tense is formed by simply rearranging the same letters (with no additions or deletions) as their present tense — like eat → ate? I can’t think of another example #NLP #linguistics
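For anyone who wants to hunt for candidates, a minimal sketch of the check in Python. `SAMPLE_VERBS` is a tiny hand-picked list for illustration, not a real verb lexicon, so it cannot settle the question; swap in a proper list of irregular verbs to search seriously:

```python
from collections import Counter


def is_anagram_pair(base: str, past: str) -> bool:
    """True if `past` uses exactly the letters of `base` in a different order."""
    base, past = base.lower(), past.lower()
    # Identical forms (e.g. read/read) are excluded: nothing is rearranged.
    return base != past and Counter(base) == Counter(past)


# Hypothetical sample of (base form, past tense) pairs.
SAMPLE_VERBS = [
    ("eat", "ate"),
    ("sing", "sang"),
    ("run", "ran"),
    ("go", "went"),
    ("read", "read"),
    ("take", "took"),
]

matches = [(b, p) for b, p in SAMPLE_VERBS if is_anagram_pair(b, p)]
print(matches)
```

Comparing letter counts rather than sorted strings is just a style choice here; both give the same answer.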
This study uses computational methods, including #AI, to analyze textbooks from public, religious private, & home schools, focusing on how they portray people, topics, & values over time.
Read: papers.ssrn.com/sol3/papers....
Subscribe: www.ssrn.com/index.cfm/en...
#AICommunity