Disclaimers: All opinions are my own. The results shown above were approved for release under Disclosure Review Board (DRB) approval number CBDRB-FY25-0280.
14.07.2025 12:38
@jlrothbaum.bsky.social
Economist, U.S. Census Bureau, Returned Peace Corps Volunteer, Ecuador (All opinions are mine).
14.07.2025 12:38
Thanks so much to my coauthors Adam Bee, John Creamer, Josh Mitchell, Nikolas Mittag, Elizabeth Pelletier, Carl Sanders, Lawrence Schmidt, and Matt Unrath. See how NEWS affects estimates for different groups at jrothbaum.github.io/news.html and the official release at www.census.gov/data/experim...
14.07.2025 12:38
As noted above, the bias can vary a lot across groups and over time. Underreporting of UI benefits can cause bias in child poverty (their parents are likely to work and therefore collect UI in a downturn) but won't impact elderly poverty much.
14.07.2025 12:38
Likewise, we see more missing income when income shifts from well-reported sources (like wage and salary earnings) to ones with greater underreporting (like unemployment insurance, or UI) in 2020 and 2021.
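To make the composition effect concrete, here is a minimal sketch in Python. The reporting rates and dollar amounts are made up for illustration and are not NEWS estimates, and `missing_share` is a hypothetical helper name. It only shows that when income shifts toward a poorly reported source, the share of income missing from the survey rises even if each source's reporting rate stays fixed.

```python
def missing_share(income_by_source, reporting_rate):
    """Share of true income that the survey misses.

    income_by_source: true income by source (hypothetical amounts).
    reporting_rate: fraction of each source captured by the survey (hypothetical).
    """
    total = sum(income_by_source.values())
    missing = sum(
        amount * (1.0 - reporting_rate[source])
        for source, amount in income_by_source.items()
    )
    return missing / total


# Assumed reporting rates: wages are well reported, UI is not.
rates = {"wages": 0.95, "ui": 0.60}

# Same total income in both years, but more of it arrives as UI in the downturn.
normal_year = {"wages": 980.0, "ui": 20.0}
downturn_year = {"wages": 900.0, "ui": 100.0}

normal_missing = missing_share(normal_year, rates)
downturn_missing = missing_share(downturn_year, rates)
print(f"normal year:   {normal_missing:.1%} of income missing")
print(f"downturn year: {downturn_missing:.1%} of income missing")
```

With these invented inputs the missing share rises from about 5.7% to about 8.5%, purely because of the shift in the income mix.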
14.07.2025 12:38
We said pandemic nonresponse bias started affecting the data in 2020, but how do we know this? We can look at the results after each step. Our weighting adjustment only starts affecting our estimates for surveys conducted in 2020, affecting income estimates from 2019 forward.
14.07.2025 12:38
We do several things, including 1) weighting to adjust for nonresponse bias (income may be correlated with survey response), 2) imputation (not everyone answers income questions on surveys), and 3) combining survey and administrative record (adrec) data (what's the right number when they disagree?)
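As a rough illustration of step 1, here is a minimal weighting-class sketch in Python. This is a generic textbook adjustment, not the NEWS weighting method, and the cells, weights, and function name are hypothetical. It assumes some characteristic is known for respondents and nonrespondents alike (for example, an income group from linked records), and it scales up respondents in each cell so they also represent that cell's nonrespondents.

```python
from collections import defaultdict


def nonresponse_weights(sample):
    """Weighting-class nonresponse adjustment (illustrative only).

    sample: list of dicts with keys 'cell', 'base_weight', 'responded'.
    Returns respondents only, each with an adjusted 'weight'.
    """
    total = defaultdict(float)      # base weight of everyone sampled, by cell
    responding = defaultdict(float)  # base weight of respondents, by cell
    for unit in sample:
        total[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["base_weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            # Inverse of the weighted response rate within the cell.
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append({**unit, "weight": unit["base_weight"] * factor})
    return adjusted


# Hypothetical sample: low-income units respond at half the rate of high-income units.
sample = (
    [{"cell": "low", "base_weight": 100.0, "responded": r} for r in (True, True, False, False)]
    + [{"cell": "high", "base_weight": 100.0, "responded": True} for _ in range(4)]
)

respondents = nonresponse_weights(sample)
low_weights = [u["weight"] for u in respondents if u["cell"] == "low"]
total_weight = sum(u["weight"] for u in respondents)
```

In this toy sample each low-income respondent ends up with double the weight of a high-income respondent, and the total weight still equals the full sample's. Without the adjustment, the responding sample would overrepresent high-income units.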
14.07.2025 12:38
Our post-tax income+in-kind transfer measure mirrors the resource measure used to calculate the Supplemental Poverty Measure (SPM).
The NEWS SPM rate is 1.7 to 3.5pp lower than the survey, depending on the year, with as many as 11.5 million fewer people in poverty.
Split the effect by age, and we see the biggest change is among seniors, who tend to underreport other sources of retirement income (from 2018).
14.07.2025 12:38
We estimate three income measures: money income, post-tax income, and post-tax income+in-kind transfers (excluding health insurance).
Relative to the survey, our estimates of all three measures increase across the income distribution (shown from 2018).
In the prior release, we expanded the resource measures we estimate to include taxes, credits, and in-kind benefits. We use linked adrecs to address survey underreporting of multiple safety net programs and linked tax returns to improve estimates of taxes and filing behavior.
14.07.2025 12:38
Beyond the tl;dr! We use CPS ASEC (source of official income and poverty estimates), 1040s, W2s, info tax returns, LEHD, ACS, census, OASDI and SSI payments, federal and state safety net data (housing assistance, SNAP, TANF, and WIC), firm data, and commercial data on home values.
14.07.2025 12:38
This is the third release of the National Experimental Wellbeing Statistics (NEWS) Project at Census. In this release, we use the same methods as the prior one, but cover additional years.
Latest release here: www.census.gov/data/experim...
Prior release here: bsky.app/profile/jlro...
Lots of estimates by group and year, some examples:
• Pre-tax income: jrothbaum.github.io/news/income/...
• + Taxes and credits: jrothbaum.github.io/news/income/...
• Official poverty: jrothbaum.github.io/news/poverty...
• Supplemental poverty: jrothbaum.github.io/news/poverty...
This varies by group. Parents have mostly wage and salary earnings, which is well reported: not much normal bias, but lots of underreporting of UI in 2020. Those 65+ have lots of retirement income: lots of normal bias, but not much change in 2020 or 2021.
14.07.2025 12:38
But the bias varies by year: 1) in each year there's "normal" bias from underreporting of income, like pensions; 2) from 2020 on, high-income households respond at higher rates; 3) some income is reported better than others, and UI is not well reported, so UI ↑ in 2020 => bias ↑
14.07.2025 12:38
The difference between NEWS and official survey estimates can be large. In 2020, the NEWS estimate of official poverty is 2.4pp lower than the survey, with 8 million fewer people in poverty.
14.07.2025 12:38
New research from the NEWS project on income and poverty from 2016-2021
Using survey, census, admin and commercial data, we show that survey estimates understate income and overstate poverty.
Release at www.census.gov/data/experim..., interactive plots at jrothbaum.github.io/news.html
In my testing it's 2-5x faster than my best attempt to do this using Stata's Python integration (and my Python-based solution is way faster than the standard code for large files, with batched file handling and multiprocessing).
25.05.2025 14:04
I've done benchmarks (see github.com/jrothbaum/st...). Stata is efficient at loading dta files and I can't match that, but parquet files are faster to load if the file is large and you only need a subset of columns - that's where parquet shines
25.05.2025 13:53
Parquet is a standard file format with some big advantages: standard format for use in R and Python (for multilanguage projects), super compressed relative to dta files (5-10% the size on disk). www.databricks.com/glossary/wha...
25.05.2025 13:53
If you use Stata and parquet files, I developed a Stata plugin that could help: ideas.repec.org/c/boc/bocode.... You can read/write parquet files directly from Stata (Caveat: tested on Windows and Linux, but not Mac)
25.05.2025 13:53
Reposting the image in a different format...
29.01.2025 15:28
2) estimating income and poverty when many or all of the adrecs are not yet available, both for timely estimates (some adrecs arrive with months or years of lag) and going back in time, and 3) estimating income and poverty at lower geographic levels (state, county, tract). (n/n)
29.01.2025 15:22
We plan to release more years of data later in 2025. And we're on to the next set of goals: 1) addressing underreporting (in the survey and adrecs) of other items such as self-employment earnings and rental income. (19/n)
29.01.2025 15:22
We made lots of other programming improvements to make our estimates better, faster, and more replicable - detailed in the paper. (18/n)
29.01.2025 15:22
This allows us to match the distribution of earnings conditional on worker characteristics and the unconditional distribution of earnings (conditional on our model assumptions). That was not guaranteed in our prior approach. (17/n)
29.01.2025 15:22
For most respondents, we use their adrec earnings (consistent with the audit results). However, for the others, we either use the survey or impute a "true" earnings value drawn from the estimated posterior distribution for that worker. (16/n)
29.01.2025 15:22
We improve on prior work with two pieces of info: 1) estimates from randomized audits of how often reported wage and salary earnings are adjusted at audit and 2) more people's survey and adrec earnings agree than you'd expect from a model of random noise in survey reports. (15/n)
29.01.2025 15:22
We updated how we combine survey and adrec wage and salary earnings. Before, we picked the survey or adrec based on which seemed more accurate for each group of workers. Now, we directly model posterior distributions of earnings for each worker. (14/n)
29.01.2025 15:22
Still here? There's more! We made lots of changes under the hood to improve our estimates and our process. (13/n)
29.01.2025 15:22