Stephan Hollander

@stephanhollander.bsky.social

Professor @TilburgU School of Economics and Management. Computational linguistics, text-as-data, and Python (@ThePSF) enthusiast. ZEPH 3 17

3,671 Followers 113 Following 111 Posts Joined Sep 2023
1 day ago

Wait, what’s the connection there? 🤷‍♂️

1 0 1 0
3 days ago
🗄 history of NLP and the ACL | Are.na

I'm lecturing about the "History of NLP" this week. What should I include? Any favorite anecdotes, images, people, methods? Slides, books, papers, or talks for inspiration or grounding?

I've been maintaining a small collection here: www.are.na/maria-antoni...

66 12 22 1
4 days ago

Threading some stuff about oil & oil markets, just basic but hope it helps:
1/ oil markets are what you call “finely balanced”. Supply is usually very very close to demand/consumption. Demand is hard to shift *quickly* in response to supply hiccups.
So even small supply changes = big price effects
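A rough way to see why "small supply changes = big price effects": if demand barely responds to price in the short run, the price has to move a lot before consumption falls enough to match the lost supply. The sketch below uses a simple constant-elasticity approximation; the function name and the elasticity value of -0.1 are illustrative assumptions, not figures from the thread.

```python
def price_change_pct(supply_change_pct: float, demand_elasticity: float) -> float:
    """Approximate % price change needed for demand to absorb a supply change.

    Uses the linear approximation %dQ = elasticity * %dP, so %dP = %dQ / elasticity.
    """
    if demand_elasticity == 0:
        raise ValueError("perfectly inelastic demand: no finite price clears the market")
    return supply_change_pct / demand_elasticity


# Hypothetical short-run demand elasticity (assumed for illustration only).
ASSUMED_ELASTICITY = -0.1

# A 2% drop in supply under that assumption:
implied_move = price_change_pct(-2.0, ASSUMED_ELASTICITY)
print(f"Implied price change: {implied_move:+.0f}%")
```

The less elastic demand is assumed to be, the larger the implied price move for the same supply shock.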

206 87 3 10
5 days ago

A lifetime of collecting: the 70,000-volume home library of Bruno Schröder, a mining engineer. A wonderland of books 🤩 www.rarebookhub.com/articles/3355

5 0 0 0
6 days ago

Published paper proving that #ChatGPT will always make things up.

Not sometimes. Not until the next update. Always. They proved it with math.

Even with perfect data and unlimited computing power, AI models will still confidently tell you things that are completely false.

arxiv.org/abs/2509.04664

90 36 4 2
1 week ago

🏷️ @aleximas.bsky.social

3 1 0 0
1 week ago
NY State Senate Bill 2025-S7263 Imposes liability for damages caused by a chatbot impersonating certain licensed professionals.

NY is proposing to "Impose liability for damages caused by a chatbot impersonating certain licensed professionals." nysenate.gov/legislation/... How does a chatbot trick you into thinking it's a doctor? Senators: If you forgot you were conversing with AI, you need a doctor.

3 2 0 0
1 week ago

Ok I'm in a rabbit hole. If you search "how many decisions do we make in a day" the reported number is almost always 35,000, often reported that this is according to "multiple sources". Yet I can't actually find a single source that backs up that number. Anyone know where this number comes from?

22 8 6 0
5 months ago

Spicy take of the day: there are _always_ unmeasured confounders. We just make value judgements over how much they matter with respect to Y
#statsky #rstats

21 2 2 2
2 weeks ago

[ #GenAI-post warning] Almost every researcher I know is using Claude Code, and talking about the huge productivity gains. Are we actually producing more scientific papers yet? Since its release in May 2025, arXiv submissions are indeed *12%* above what we'd expect. Details in thread:

28 3 4 2
1 month ago

Looks very promising! Thanks for sharing it here

3 1 0 0
1 month ago

Our paper “Inferring fine-grained migration patterns across the United States” is now out in @natcomms.nature.com! We released a new, highly granular migration dataset. 1/9

71 27 2 5
1 month ago

Love the NLP: thoughtful application @economist.com

8 2 0 0
1 month ago

'Taxi' and 'cab' essentially mean the same thing, but 'cab' doesn't come from 'taxicab.'

It comes from ‘cabriolet,’ which was a type of light carriage.

‘Taxi’ comes from ‘taximeter,’ which is the device that calculates the amount of a fare based on the distance traveled.

733 131 13 16
1 month ago
GitHub - reifjulian/strgroup: Match strings based on their Levenshtein edit distance.

I've released a new version of strgroup, a Stata command that does fuzzy string matching. No new functionality, but the underlying C code has been optimized: it now uses much less memory and runs about 5 times faster
github.com/reifjulian/s...
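
For readers unfamiliar with the metric: Levenshtein edit distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal Python sketch of the textbook dynamic-programming algorithm. It is not the C code behind strgroup, and the function name is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between strings a and b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete ch_a
                    current[j - 1] + 1,                   # insert ch_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the table is one common way to reduce memory use, since each row depends only on the one before it.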

28 9 1 1
2 months ago

Here in Europe, I often hear British English–style pronunciations like DAH-ta, STAH-ta, and LAH-tech—quite consistent?

0 0 1 0
2 months ago
How to pronounce "Stata" - Jason Kerwin From the Statalist FAQ (emphasis mine): 4.1 What is the correct way to pronounce ‘Stata’? Stata is an invented word. Some pronounce it with a long a as in day (Stay-ta); some pronounce it with a short...

Classic question! (LaTeX fans know the struggle.) According to this discussion jasonkerwin.com/nonparibus/2... I’m guessing they leave it up in the air?

1 0 1 0
3 months ago

Open Studio with Pop-Up Store:
Friday December 12, 16-19h
Saturday December 13, 11-18h
Join us at:
Studio Christoph Niemann
Schröderstrasse 2
10115 Berlin
shop.christophniemann.com

29 3 1 1
3 months ago
"Captain Gains" on Capitol Hill
Shang-Jin Wei & Yifan Zhou
WORKING PAPER 34524
DOI 10.3386/w34524
ISSUE DATE November 2025
Using transaction-level data on US congressional stock trades, we find that lawmakers who later ascend to leadership positions perform similarly to matched peers beforehand but outperform them by 47 percentage points annually after ascension. Leaders' superior performance arises through two mechanisms. The political influence channel is reflected in higher returns when their party controls the chamber, sales of stocks preceding regulatory actions, and purchase of stocks whose firms receiving more government contracts and favorable party support on bills. The corporate access channel is reflected in stock trades that predict subsequent corporate news and greater returns on donor-owned or home-state firms.
Figure 2 (x-axis: Year): Estimated dynamic quasi-difference-in-differences coefficient, di, of equation (3), with vertical dashed lines representing 90 percent confidence intervals. The point estimate of the year in which the lawmaker became a congressional leader (Year 0) is normalized to zero. BHAR over the 250 days following each trade is the dependent variable and calculated using the Fama-French five-factor plus momentum as the benchmark model.

After becoming a congressional leader, a politician’s stock portfolio beats out those of peers by 47 (!!!) percentage points a year through trades timed around bills and firms that later get government contracts

www.nber.org/papers/w34524

via @florianederer.bsky.social

1,434 629 32 83
3 months ago

Interesting paper highlighting that binning can be misspecified in panel settings, which drives misinterpretation of extreme temperature shocks. #linkoftheday

www.dropbox.com/scl/fi/1ya6z...

71 18 5 4
3 months ago

In light of record submission rates and a large volume of AI-generated slop, SocArXiv recently implemented a policy requiring ORCIDs linked in the OSF profile of submitting authors, and narrowing our focus to social science subjects. Today we are taking two more steps:
/1

286 143 4 23
3 months ago
GitHub - BenjaminGor/Latex_Notes_Tutorial: Latex Book/Note Writing Tutorial

How to Reproduce this Book Exactly with LaTeX - great resource for writing LaTeX #linkoftheday
github.com/BenjaminGor/...

23 3 0 0
3 months ago

This is terrifying.

"[AI agents] can... infer a researcher's latent hypotheses and produce data that artificially confirms them."

...

"We can no longer trust that survey responses are coming from real people" -@seanjwestwood.bsky.social

312 121 7 17
4 months ago

NBER grants to fund research into economic measurement - I find this agenda very compelling and important #linkoftheday
www.nber.org/news/nber-la...

34 17 0 2
4 months ago

Oof, this is super slick! Go check it out 👇🏻

2 0 0 0
4 months ago

This paper’s been popping up as “evidence” that you can’t do real #causalinference w/ obs data. To me it shows you need a rigorous pre-specified design (in addition to the willingness to fold when your hypothesis can't be answered with the data at hand). #EpiSky, #CausalSky, #AcademicSky

9 4 2 1
4 months ago

🚨Next time you hire, don’t take it easy! In a new working paper, @elliottash.bsky.social, Jason Sockin, and I show that the difficulty of the interview signals to workers whether the job is a good fit. 🚨

Paper link: papers.ssrn.com/sol3/papers....

31 9 1 0
4 months ago

Do any other languages (Dutch, German, Spanish) share this quirk, or is English alone in having a verb whose past tense is an exact anagram of its base form?

1 0 0 0
4 months ago

Are there any other English verbs whose past tense is formed by simply rearranging the same letters (with no additions or deletions) as their present tense — like eat → ate? I can’t think of another example #NLP #linguistics
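
One way to hunt for more examples is to run a simple anagram check over a list of verb and past-tense pairs. The sketch below is a minimal Python version; the function name and the small hand-typed sample list are assumptions for illustration, and a real search would need a full lexicon of irregular verbs.

```python
def is_anagram_pair(base: str, past: str) -> bool:
    """True if past uses exactly the letters of base, in a different order."""
    base, past = base.lower(), past.lower()
    # Identical forms (e.g. put -> put) are not counted as rearrangements.
    return base != past and sorted(base) == sorted(past)


# Hand-typed sample of (present, past) pairs, for illustration only.
sample_pairs = [("eat", "ate"), ("run", "ran"), ("put", "put"), ("sing", "sang")]

matches = [pair for pair in sample_pairs if is_anagram_pair(*pair)]
print(matches)
```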

5 1 2 1
4 months ago

This study uses computational methods, including #AI, to analyze textbooks from public, religious private, & home schools, focusing on how they portray people, topics, & values over time.

Read: papers.ssrn.com/sol3/papers....
Subscribe: www.ssrn.com/index.cfm/en...

#AICommunity

2 5 1 0