
Gabriel Agostini

@gsagostini.bsky.social

PhD student at Cornell Tech | he/him | cities + equity + spatial everything | fan of cats and Taylor Swift | gsagostini.github.io

170 Followers  |  249 Following  |  32 Posts  |  Joined: 12.02.2025

Posts by Gabriel Agostini (@gsagostini.bsky.social)


New paper! The Linear Representation Hypothesis is a powerful intuition for how language models work, but lacks formalization. We give a mathematical framework in which we can ask and answer a basic question: how many features can be stored under the hypothesis? 🧵 arxiv.org/abs/2602.11246

17.02.2026 16:37 — 👍 43    🔁 14    💬 1    📌 2
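Not from the paper itself, but a toy numpy sketch of the intuition behind the capacity question: random directions in d dimensions are nearly orthogonal, so many more than d features can coexist with small interference. The dimension and feature counts below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 128, 1024  # toy embedding dimension and number of "features"

# Sample n random unit vectors in R^d.
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Pairwise cosine similarities; off-diagonal entries concentrate near 0,
# so 1024 features share 128 dimensions with only mild interference.
C = V @ V.T
off_diag = C[~np.eye(n, dtype=bool)]
print(f"max |cos| between distinct features: {np.abs(off_diag).max():.3f}")
```

How interference degrades as n grows relative to d is exactly the kind of question a formal framework can answer precisely.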

I don't think the methods would be hard to replicate — there is code on my GitHub repo! But I'd imagine we'd need some domain expertise about good sources of (noisy, biased) fine-grained migration records in these countries, and about the frequency of census data. Happy to chat more!

06.02.2026 20:31 — 👍 1    🔁 0    💬 0    📌 0

New research is offering new insight on how Americans move — all the way to the neighborhood level.

A new dataset, MIGRATE, maps annual moves with 4,600 times more detail than standard public data, revealing patterns hidden in county-level reporting:

https://bit.ly/49XSD6w

05.02.2026 19:55 — 👍 5    🔁 2    💬 0    📌 0
Inferring fine-grained migration patterns across the United States - Nature Communications: This study releases a very high-resolution migration dataset that reveals trends that shape daily life: rising moves into high-income neighborhoods, racial gaps in upward mobility, and wildfire-driven...

We are excited to see what you do with this data, and hope to build on this work in the future. You can read our full paper at nature.com/articles/s41467-025-68019-2

This is joint work with Rachel Young, Maria Fitzpatrick, @nkgarg.bsky.social, and @emmapierson.bsky.social! 9/9

05.02.2026 17:30 — 👍 12    🔁 4    💬 0    📌 1
MIGRATE

There are over 100 academic, non-profit, governmental, and journalistic research teams from all over the world already using MIGRATE to study topics across the health, environmental, and social sciences! If you are a researcher interested in data access, visit migrate.tech.cornell.edu. 8/9

05.02.2026 17:30 — 👍 5    🔁 0    💬 1    📌 0

MIGRATE is also useful for studying local migration trends. For example, it reveals dramatic rates of out-migration after wildfires in California that are invisible in previous Census datasets. 7/9

05.02.2026 17:30 — 👍 4    🔁 1    💬 1    📌 1

We found, for example, racial disparities in upward mobility — that is, the rate at which people move to higher-income areas varies with the racial composition of their current area of residence, even after controlling for income levels. 6/9

05.02.2026 17:30 — 👍 5    🔁 2    💬 1    📌 0

A great advantage of MIGRATE is that CBG-level data allows us to more accurately describe socioeconomic and demographic trends. We documented national migration patterns, such as how far people move and the characteristics of the areas they move to. 5/9

05.02.2026 17:30 — 👍 2    🔁 0    💬 1    📌 0

MIGRATE is over 4,000x more spatially granular than publicly available migration datasets, highly correlated with Census data, and less biased than proprietary address data. We're making it available to researchers asking all those important migration questions!! 4/9

05.02.2026 17:30 — 👍 1    🔁 0    💬 1    📌 0

We created and released MIGRATE: annual flows between all 47 billion pairs of US Census Block Groups. To do it, we developed an iterative-proportional-fitting based algorithm that harmonizes granular but biased proprietary data with coarser but more reliable Census data. 3/9

05.02.2026 17:30 — 👍 1    🔁 0    💬 1    📌 0
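The released pipeline lives in the paper's repo; purely as an illustration of the core idea, here is a minimal iterative proportional fitting (IPF) sketch on a toy matrix: alternately rescale the rows and columns of a "granular but biased" seed until its marginals match the reliable totals. All numbers below are invented.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, n_iter=100, eps=1e-12):
    """Iterative proportional fitting: rescale a positive seed matrix so its
    row and column sums match target marginals (targets must share the same
    grand total for the procedure to converge)."""
    X = seed.astype(float).copy()
    for _ in range(n_iter):
        X *= (row_targets / (X.sum(axis=1) + eps))[:, None]  # fit row sums
        X *= (col_targets / (X.sum(axis=0) + eps))[None, :]  # fit column sums
    return X

# Toy example: a noisy 3x3 "flow" seed plus reliable marginal totals.
seed = np.array([[5.0, 1.0, 2.0],
                 [2.0, 6.0, 1.0],
                 [1.0, 2.0, 7.0]])
rows = np.array([10.0, 8.0, 9.0])  # total out-moves per origin
cols = np.array([9.0, 9.0, 9.0])   # total in-moves per destination
fitted = ipf(seed, rows, cols)
```

The fitted matrix keeps the seed's relative structure (its cross-product ratios) while agreeing exactly with the trusted totals.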

Migration data is key for studying responses to environmental disasters, policy impacts, etc. But public, county-level migration data is too coarse to capture many important dynamics. Proprietary, individual-level address histories are highly granular, but potentially biased. 2/9

05.02.2026 17:30 — 👍 2    🔁 0    💬 1    📌 0

Our paper "Inferring fine-grained migration patterns across the United States" is now out in @natcomms.nature.com! We released a new, highly granular migration dataset. 1/9

05.02.2026 17:30 — 👍 70    🔁 27    💬 2    📌 5

November is over, but we still have some #30DayMapChallenge entries to share! And for our transport-themed day 26 map, MBTA data analyst Joe Hilleary takes us on a ride back in time: he shows current bus routes in Greater Boston by the earliest known year in which a direct precursor route ran a bus.

09.12.2025 17:01 — 👍 7    🔁 1    💬 1    📌 0

We have a new paper in Science Advances proposing a simple test for bias:

Is the same person treated differently when their race is perceived differently?

Specifically, we study: is the same driver likelier to be searched by police when they are perceived as Hispanic rather than white?

1/

24.11.2025 18:14 — 👍 44    🔁 16    💬 2    📌 1

My best workflow improvement since starting to work with spatial libraries in Python was to always include a `crs` dictionary in a variables file, listing CRS codes for lat-long projections, equidistant projections, and "maybe not satisfying any desiderata but the prettiest out there" projections.

24.11.2025 19:10 — 👍 2    🔁 0    💬 0    📌 0
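A minimal sketch of what such a variables file could look like, assuming the string authority codes that geopandas/pyproj accept; the EPSG choices below are illustrative examples, not recommendations for any particular project.

```python
# variables.py-style module: one place to keep every CRS a project uses.
CRS = {
    "latlong": "EPSG:4326",       # WGS84 lat-long, for raw coordinates and web maps
    "equidistant": "EPSG:4087",   # World Equidistant Cylindrical, for distance work
    "pretty": "EPSG:5070",        # CONUS Albers equal-area, a common choice for US maps
}
```

Then `gdf.to_crs(CRS["pretty"])` everywhere keeps projection choices consistent and easy to change in one spot.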

#30DayMapChallenge day 20: water

This map of Arsenic and Cadmium levels in Mexico's water shows non-trace concentrations of Total and Soluble Arsenic and Cadmium. Points are colored by the presence of high amounts of contaminants, and sized by their relative concentration.

tinyurl.com/map20wtr

24.11.2025 18:53 — 👍 8    🔁 2    💬 1    📌 0

#30DayMapChallenge 15: Fire

@sylviaimani.bsky.social visualized how Uganda's transition toward electric cooking aligns with the reach of the national grid. Regions with denser grid networks show a strong correlation with higher household adoption of electric cooking technologies.

21.11.2025 18:35 — 👍 12    🔁 2    💬 1    📌 0

For #30DayMapChallenge Day 18: Out of this world, use the `fill_z_offset` param in mapgl to "elevate" your data.

Just be careful - if you choose a value too high, you might lose your data in the sky!

#rstats

18.11.2025 19:03 — 👍 7    🔁 2    💬 0    📌 0

It took me too long to accept that "Amsterdam is just 10th Ave"

17.11.2025 14:47 — 👍 1    🔁 0    💬 0    📌 0

#30DayMapChallenge day 10: Air

@jessiefin.bsky.social + Francisco Marmolejo-CossΓ­o visualize the presence of ladrilleras, or brick kilns, which emit pollution across the state. Data cleaned by Jacqueline CalderΓ³n and Lizet Jarquin at UASLP.

Full interactive map: tinyurl.com/map10-air

15.11.2025 17:19 — 👍 8    🔁 2    💬 1    📌 0

#30DayMapChallenge day 9 asked us to get off our screens. @annaloganmc.bsky.social's "analog" map is a hand-painted postcard! 📫

"I chose to paint a postcard of a map of Ann Arbor where I currently live showing the Huron River!" she says

14.11.2025 20:42 — 👍 13    🔁 5    💬 0    📌 0

tho it's very possible that the dataset misnamed a Subway or two. I filtered for the DOHMH restaurants doing business as "Subway": data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j

14.11.2025 12:23 — 👍 0    🔁 0    💬 0    📌 0
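For anyone replicating the filter, a hedged pandas sketch, assuming the portal's CSV export names the doing-business-as column `DBA` (normalizing case and whitespace avoids missing e.g. `"SUBWAY "` while still excluding names like `"SUBWAY CAFE"`):

```python
import pandas as pd

def subway_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep inspection rows whose 'doing business as' name is exactly Subway."""
    dba = df["DBA"].fillna("").str.strip().str.upper()
    return df[dba == "SUBWAY"]
```

Pointing `pd.read_csv` at the dataset's CSV export and passing the result through `subway_rows` would reproduce the filter described above.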

The LIC Subway between the Queens Plaza and Queensboro Plaza stations is there, but so close to both stations (and CBs are small there) that we only see a slit of purple! That's also the only Subway in LIC per Google Maps but I wonder if that one is also not up to date.

14.11.2025 12:23 — 👍 1    🔁 0    💬 2    📌 0

Of course! Glad you liked it

14.11.2025 11:58 — 👍 1    🔁 0    💬 1    📌 0

Proposing the Subway-subway (🥪-🚇) index

13.11.2025 16:37 — 👍 6    🔁 0    💬 1    📌 0

Great map(s) by @jennahgosciak.bsky.social (can we count that as 6 days of mapping?) that show both the permanence and the vulnerability of ecological concepts in our urban landscapes!
#30DayMapChallenge

11.11.2025 19:03 — 👍 4    🔁 0    💬 0    📌 0

We are slowly catching up to the #30DayMapChallenge!

In our day 3: polygons submission, @zhixuanqi.bsky.social questioned the boundaries and fuzziness of polygons with an animated map that invites us to think about the (not-so-well-defined) idea of neighborhoods.

07.11.2025 17:19 — 👍 17    🔁 4    💬 0    📌 1

Another dog map, this one with 1 dog = 1 dot. And hopefully 1 day = 1 map for the next 30 days on our working group page 🗺️

06.11.2025 19:59 — 👍 1    🔁 0    💬 0    📌 0
N-gram novelty is widely used to evaluate language models' ability to generate text outside of their training data. More recently, it has also been adopted as a metric for measuring textual creativity. However, theoretical work on creativity suggests that this approach may be inadequate, as it does not account for creativity's dual nature: novelty (how original the text is) and appropriateness (how sensical and pragmatic it is). We investigate the relationship between this notion of creativity and n-gram novelty through 7542 expert writer annotations (n=26) of novelty, pragmaticality, and sensicality via close reading of human and AI-generated text. We find that while n-gram novelty is positively associated with expert writer-judged creativity, ~91% of top-quartile expressions by n-gram novelty are not judged as creative, cautioning against relying on n-gram novelty alone. Furthermore, unlike human-written text, higher n-gram novelty in open-source LLMs correlates with lower pragmaticality. In an exploratory study with frontier closed-source models, we additionally confirm that they are less likely to produce creative expressions than humans. Using our dataset, we test whether zero-shot, few-shot, and finetuned models are able to identify creative expressions (a positive aspect of writing) and non-pragmatic ones (a negative aspect). Overall, frontier LLMs exhibit performance much higher than random but leave room for improvement, especially struggling to identify non-pragmatic expressions. We further find that LLM-as-a-Judge novelty scores from the best-performing model were predictive of expert writer preferences.


N-gram novelty is widely used as a measure of creativity and generalization. But if LLMs produce highly n-gram novel expressions that don't make sense or sound awkward, should they still be called creative? In a new paper, we investigate how n-gram novelty relates to creativity.

04.11.2025 15:08 — 👍 41    🔁 10    💬 1    📌 2
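A minimal set-based sketch of the metric under discussion — not the paper's pipeline, and real implementations differ in tokenization and in how duplicates are counted:

```python
def ngram_novelty(generated, corpus, n=3):
    """Fraction of the generated text's n-grams never seen in the corpus."""
    gen = [tuple(generated[i:i + n]) for i in range(len(generated) - n + 1)]
    seen = {tuple(corpus[i:i + n]) for i in range(len(corpus) - n + 1)}
    if not gen:
        return 0.0
    return sum(g not in seen for g in gen) / len(gen)

corpus = "the cat sat on the mat".split()
text = "the cat sat under a violet moon".split()
print(ngram_novelty(text, corpus, n=3))  # 4 of 5 trigrams are unseen -> 0.8
```

The paper's point in these terms: a high score says the trigrams are new, but nothing about whether "under a violet moon" is sensical or pragmatic in context.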

New #NeurIPS2025 paper: how should we evaluate machine learning models without a large, labeled dataset? We introduce Semi-Supervised Model Evaluation (SSME), which uses labeled and unlabeled data to estimate performance! We find SSME is far more accurate than standard methods.

17.10.2025 16:29 — 👍 21    🔁 7    💬 1    📌 4
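SSME itself is defined in the paper; as a simpler illustration of why unlabeled data can estimate performance at all, here is the classic calibration-based baseline: for a well-calibrated classifier, the mean top-class probability over unlabeled examples estimates accuracy. All data below is simulated, and this is explicitly not the SSME estimator.

```python
import numpy as np

def confidence_accuracy_estimate(probs):
    """Estimate accuracy as the mean top-class probability on unlabeled data.

    Valid when predicted probabilities are calibrated, since then
    E[max_k p_k] equals the expected accuracy of the argmax prediction."""
    return float(np.max(probs, axis=1).mean())

# Simulate a calibrated binary classifier on "unlabeled" data.
rng = np.random.default_rng(0)
p1 = rng.uniform(size=100_000)             # predicted P(class 1) per example
labels = rng.uniform(size=p1.size) < p1    # outcomes drawn from those probabilities
probs = np.column_stack([1 - p1, p1])

est = confidence_accuracy_estimate(probs)          # uses no labels at all
true_acc = ((p1 >= 0.5) == labels).mean()          # labels only used to check
```

When calibration fails, this baseline breaks down, which is one motivation for estimators that combine a small labeled set with the unlabeled pool.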