
Gabriel Agostini

@gsagostini.bsky.social

PhD student at Cornell Tech | he/him | cities + equity + spatial everything | fan of cats and Taylor Swift | gsagostini.github.io

170 Followers  |  249 Following  |  32 Posts  |  Joined: 12.02.2025

Posts by Gabriel Agostini (@gsagostini.bsky.social)


New paper! The Linear Representation Hypothesis is a powerful intuition for how language models work, but lacks formalization. We give a mathematical framework in which we can ask and answer a basic question: how many features can be stored under the hypothesis? 🧵 arxiv.org/abs/2602.11246

17.02.2026 16:37 — 👍 43    🔁 14    💬 1    📌 2
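Not from the paper itself, but a toy numpy sketch of the intuition behind the capacity question: random directions in d dimensions are nearly orthogonal, so many more than d features can coexist with small interference. The dimension and feature counts below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 128, 1024  # toy embedding dimension and number of "features"

# Sample n random unit vectors in R^d.
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Pairwise cosine similarities; off-diagonal entries concentrate near 0,
# so 1024 features share 128 dimensions with only mild interference.
C = V @ V.T
off_diag = C[~np.eye(n, dtype=bool)]
print(f"max |cos| between distinct features: {np.abs(off_diag).max():.3f}")
```

How interference degrades as n grows relative to d is exactly the kind of question a formal framework can answer precisely.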

I don't think the methods would be hard to replicate — there is code on my GitHub repo! But I'd imagine we'd need some domain expertise about good sources of (noisy, biased) fine-grained migration records in these countries, and about the frequency of census data. Happy to chat more!

06.02.2026 20:31 — 👍 1    🔁 0    💬 0    📌 0

New research is offering new insight on how Americans move — all the way to the neighborhood level.

A new dataset, MIGRATE, maps annual moves with 4,600 times more detail than standard public data, revealing patterns hidden in county-level reporting:

https://bit.ly/49XSD6w

05.02.2026 19:55 — 👍 5    🔁 2    💬 0    📌 0
Inferring fine-grained migration patterns across the United States - Nature Communications: This study releases a very high-resolution migration dataset that reveals trends that shape daily life: rising moves into high-income neighborhoods, racial gaps in upward mobility, and wildfire-driven...

We are excited to see what you do with this data, and hope to build on this work in the future. You can read our full paper at nature.com/articles/s41467-025-68019-2

This is joint work with Rachel Young, Maria Fitzpatrick, @nkgarg.bsky.social, and @emmapierson.bsky.social! 9/9

05.02.2026 17:30 — 👍 12    🔁 4    💬 0    📌 1
MIGRATE

There are over 100 academic, non-profit, governmental, and journalistic research teams from all over the world already using MIGRATE to study topics across the health, environmental, and social sciences! If you are a researcher interested in data access, visit migrate.tech.cornell.edu. 8/9

05.02.2026 17:30 — 👍 5    🔁 0    💬 1    📌 0

MIGRATE is also useful for studying local migration trends. For example, it reveals dramatic rates of out-migration after wildfires in California that are invisible in previous Census datasets. 7/9

05.02.2026 17:30 — 👍 4    🔁 1    💬 1    📌 1

We found, for example, racial disparities in upward mobility — that is, the rate at which people move to higher-income areas varies with the racial composition of their current area of residence, even after controlling for income levels. 6/9

05.02.2026 17:30 — 👍 5    🔁 2    💬 1    📌 0

A great advantage of MIGRATE is that CBG-level data allows us to more accurately describe socioeconomic and demographic trends. We documented national migration patterns, such as how far people move and the characteristics of the areas they move to. 5/9

05.02.2026 17:30 — 👍 2    🔁 0    💬 1    📌 0

MIGRATE is over 4,000x more spatially granular than publicly available migration datasets, highly correlated with Census data, and less biased than proprietary address data. We're making it available to researchers asking all those important migration questions!! 4/9

05.02.2026 17:30 — 👍 1    🔁 0    💬 1    📌 0

We created and released MIGRATE: annual flows between all 47 billion pairs of US Census Block Groups. To do it, we developed an iterative-proportional-fitting based algorithm that harmonizes granular but biased proprietary data with coarser but more reliable Census data. 3/9

05.02.2026 17:30 — 👍 1    🔁 0    💬 1    📌 0
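The released pipeline lives in the paper's repo; purely as an illustration of the core idea, here is a minimal iterative proportional fitting (IPF) sketch on a toy matrix: alternately rescale the rows and columns of a "granular but biased" seed until its marginals match the reliable totals. All numbers below are invented.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, n_iter=100, eps=1e-12):
    """Iterative proportional fitting: rescale a positive seed matrix so its
    row and column sums match target marginals (targets must share the same
    grand total for the procedure to converge)."""
    X = seed.astype(float).copy()
    for _ in range(n_iter):
        X *= (row_targets / (X.sum(axis=1) + eps))[:, None]  # fit row sums
        X *= (col_targets / (X.sum(axis=0) + eps))[None, :]  # fit column sums
    return X

# Toy example: a noisy 3x3 "flow" seed plus reliable marginal totals.
seed = np.array([[5.0, 1.0, 2.0],
                 [2.0, 6.0, 1.0],
                 [1.0, 2.0, 7.0]])
rows = np.array([10.0, 8.0, 9.0])  # total out-moves per origin
cols = np.array([9.0, 9.0, 9.0])   # total in-moves per destination
fitted = ipf(seed, rows, cols)
```

The fitted matrix keeps the seed's relative structure (its cross-product ratios) while agreeing exactly with the trusted totals.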

Migration data is key for studying responses to environmental disasters, policy impacts, etc. But public, county-level migration data is too coarse to capture many important dynamics. Proprietary, individual-level address histories are highly granular, but potentially biased. 2/9

05.02.2026 17:30 — 👍 2    🔁 0    💬 1    📌 0

Our paper "Inferring fine-grained migration patterns across the United States" is now out in @natcomms.nature.com! We released a new, highly granular migration dataset. 1/9

05.02.2026 17:30 — 👍 70    🔁 27    💬 2    📌 5

November is over, but we still have some #30DayMapChallenge entries to share! And for our transport-themed day 26 map, MBTA data analyst Joe Hilleary takes us on a ride back in time: he shows current bus routes in Greater Boston by the earliest known year in which a direct precursor route ran a bus.

09.12.2025 17:01 — 👍 7    🔁 1    💬 1    📌 0

We have a new paper in Science Advances proposing a simple test for bias:

Is the same person treated differently when their race is perceived differently?

Specifically, we study: is the same driver likelier to be searched by police when they are perceived as Hispanic rather than white?

1/

24.11.2025 18:14 — 👍 44    🔁 16    💬 2    📌 1

My best workflow improvement since starting to work with spatial libraries in Python was to always include a `crs` dictionary in a variables file, listing CRS codes for lat-long projections, equidistant projections, and "maybe not satisfying any desiderata but the prettiest out there" projections.

24.11.2025 19:10 — 👍 2    🔁 0    💬 0    📌 0
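A minimal sketch of what such a variables file could look like, assuming the string authority codes that geopandas/pyproj accept; the EPSG choices below are illustrative examples, not recommendations for any particular project.

```python
# variables.py-style module: one place to keep every CRS a project uses.
CRS = {
    "latlong": "EPSG:4326",       # WGS84 lat-long, for raw coordinates and web maps
    "equidistant": "EPSG:4087",   # World Equidistant Cylindrical, for distance work
    "pretty": "EPSG:5070",        # CONUS Albers equal-area, a common choice for US maps
}
```

Then `gdf.to_crs(CRS["pretty"])` everywhere keeps projection choices consistent and easy to change in one spot.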

#30DayMapChallenge day 20: water

This map of Arsenic and Cadmium levels in Mexico's water shows non-trace concentrations of Total and Soluble Arsenic and Cadmium. Points are colored by the presence of high amounts of contaminants, and sized by their relative concentration.

tinyurl.com/map20wtr

24.11.2025 18:53 — 👍 8    🔁 2    💬 1    📌 0

#30DayMapChallenge 15: Fire

@sylviaimani.bsky.social visualized how Uganda's transition toward electric cooking aligns with the reach of the national grid. Regions with denser grid networks show a strong correlation with higher household adoption of electric cooking technologies.

21.11.2025 18:35 — 👍 12    🔁 2    💬 1    📌 0

For #30DayMapChallenge Day 18: Out of this world, use the `fill_z_offset` param in mapgl to "elevate" your data.

Just be careful - if you choose a value too high, you might lose your data in the sky!

#rstats

18.11.2025 19:03 — 👍 7    🔁 2    💬 0    📌 0

It took me too long to accept that "Amsterdam is just 10th Ave"

17.11.2025 14:47 — 👍 1    🔁 0    💬 0    📌 0

#30DayMapChallenge day 10: Air

@jessiefin.bsky.social + Francisco Marmolejo-CossΓ­o visualize the presence of ladrilleras, or brick kilns, which emit pollution across the state. Data cleaned by Jacqueline CalderΓ³n and Lizet Jarquin at UASLP.

Full interactive map: tinyurl.com/map10-air

15.11.2025 17:19 — 👍 8    🔁 2    💬 1    📌 0

#30DayMapChallenge day 9 asked us to get off our screens. @annaloganmc.bsky.social's "analog" map is a hand-painted postcard! 📫

"I chose to paint a postcard of a map of Ann Arbor where I currently live showing the Huron River!" she says

14.11.2025 20:42 — 👍 13    🔁 5    💬 0    📌 0

tho it's very possible that the dataset misnamed a Subway or two. I filtered for the DOHMH restaurants doing business as "Subway": data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j

14.11.2025 12:23 — 👍 0    🔁 0    💬 0    📌 0
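For anyone replicating the filter, a hedged pandas sketch, assuming the portal's CSV export names the doing-business-as column `DBA` (normalizing case and whitespace avoids missing e.g. `"SUBWAY "` while still excluding names like `"SUBWAY CAFE"`):

```python
import pandas as pd

def subway_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep inspection rows whose 'doing business as' name is exactly Subway."""
    dba = df["DBA"].fillna("").str.strip().str.upper()
    return df[dba == "SUBWAY"]
```

Pointing `pd.read_csv` at the dataset's CSV export and passing the result through `subway_rows` would reproduce the filter described above.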

The LIC Subway between the Queens Plaza and Queensboro Plaza stations is there, but so close to both stations (and CBs are small there) that we only see a slit of purple! That's also the only Subway in LIC per Google Maps but I wonder if that one is also not up to date.

14.11.2025 12:23 — 👍 1    🔁 0    💬 2    📌 0

Of course! Glad you liked it

14.11.2025 11:58 — 👍 1    🔁 0    💬 1    📌 0

Proposing the Subway-subway (🥪-🚇) index

13.11.2025 16:37 — 👍 6    🔁 0    💬 1    📌 0

Great map(s) by @jennahgosciak.bsky.social (can we count that as 6 days of mapping?) that show both the permanence and the vulnerability of ecological concepts in our urban landscapes!
#30DayMapChallenge

11.11.2025 19:03 — 👍 4    🔁 0    💬 0    📌 0

We are slowly catching up to the #30DayMapChallenge!

In our day 3: polygons submission, @zhixuanqi.bsky.social questioned the boundaries and fuzziness of polygons with an animated map that invites us to think about the (not-so-well-defined) idea of neighborhoods.

07.11.2025 17:19 — 👍 17    🔁 4    💬 0    📌 1

Another dog map, this one with 1 dog = 1 dot. And hopefully 1 day = 1 map for the next 30 days on our working group page 🗺️

06.11.2025 19:59 — 👍 1    🔁 0    💬 0    📌 0
N-gram novelty is widely used to evaluate language models' ability to generate text outside of their training data. More recently, it has also been adopted as a metric for measuring textual creativity. However, theoretical work on creativity suggests that this approach may be inadequate, as it does not account for creativity's dual nature: novelty (how original the text is) and appropriateness (how sensical and pragmatic it is). We investigate the relationship between this notion of creativity and n-gram novelty through 7542 expert writer annotations (n=26) of novelty, pragmaticality, and sensicality via close reading of human and AI-generated text. We find that while n-gram novelty is positively associated with expert writer-judged creativity, ~91% of top-quartile expressions by n-gram novelty are not judged as creative, cautioning against relying on n-gram novelty alone. Furthermore, unlike human-written text, higher n-gram novelty in open-source LLMs correlates with lower pragmaticality. In an exploratory study with frontier closed-source models, we additionally confirm that they are less likely to produce creative expressions than humans. Using our dataset, we test whether zero-shot, few-shot, and finetuned models are able to identify creative expressions (a positive aspect of writing) and non-pragmatic ones (a negative aspect). Overall, frontier LLMs exhibit performance much higher than random but leave room for improvement, especially struggling to identify non-pragmatic expressions. We further find that LLM-as-a-Judge novelty scores from the best-performing model were predictive of expert writer preferences.


N-gram novelty is widely used as a measure of creativity and generalization. But if LLMs produce highly n-gram novel expressions that don't make sense or sound awkward, should they still be called creative? In a new paper, we investigate how n-gram novelty relates to creativity.

04.11.2025 15:08 — 👍 41    🔁 10    💬 1    📌 2
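A minimal set-based sketch of the metric under discussion — not the paper's pipeline, and real implementations differ in tokenization and in how duplicates are counted:

```python
def ngram_novelty(generated, corpus, n=3):
    """Fraction of the generated text's n-grams never seen in the corpus."""
    gen = [tuple(generated[i:i + n]) for i in range(len(generated) - n + 1)]
    seen = {tuple(corpus[i:i + n]) for i in range(len(corpus) - n + 1)}
    if not gen:
        return 0.0
    return sum(g not in seen for g in gen) / len(gen)

corpus = "the cat sat on the mat".split()
text = "the cat sat under a violet moon".split()
print(ngram_novelty(text, corpus, n=3))  # 4 of 5 trigrams are unseen -> 0.8
```

The paper's point in these terms: a high score says the trigrams are new, but nothing about whether "under a violet moon" is sensical or pragmatic in context.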

New #NeurIPS2025 paper: how should we evaluate machine learning models without a large, labeled dataset? We introduce Semi-Supervised Model Evaluation (SSME), which uses labeled and unlabeled data to estimate performance! We find SSME is far more accurate than standard methods.

17.10.2025 16:29 — 👍 21    🔁 7    💬 1    📌 4
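SSME itself is defined in the paper; as a simpler illustration of why unlabeled data can estimate performance at all, here is the classic calibration-based baseline: for a well-calibrated classifier, the mean top-class probability over unlabeled examples estimates accuracy. All data below is simulated, and this is explicitly not the SSME estimator.

```python
import numpy as np

def confidence_accuracy_estimate(probs):
    """Estimate accuracy as the mean top-class probability on unlabeled data.

    Valid when predicted probabilities are calibrated, since then
    E[max_k p_k] equals the expected accuracy of the argmax prediction."""
    return float(np.max(probs, axis=1).mean())

# Simulate a calibrated binary classifier on "unlabeled" data.
rng = np.random.default_rng(0)
p1 = rng.uniform(size=100_000)             # predicted P(class 1) per example
labels = rng.uniform(size=p1.size) < p1    # outcomes drawn from those probabilities
probs = np.column_stack([1 - p1, p1])

est = confidence_accuracy_estimate(probs)          # uses no labels at all
true_acc = ((p1 >= 0.5) == labels).mean()          # labels only used to check
```

When calibration fails, this baseline breaks down, which is one motivation for estimators that combine a small labeled set with the unlabeled pool.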