@tcarpenter.bsky.social
Data science, survey science, social science | Director of Data Science @ Microsoft Garage [Posts do not represent my employer] | Stats, R, python | Science, Research: measurement, social biases, emotion. Ex-academic but scientist at heart
45. Academia doesn't reward building useful tools nearly as much as it should
28.07.2025 00:44
14. We mostly evaluate latent variable models with the equivalent of Rorschach tests
27.07.2025 16:20
5. You should use a precision-recall curve for a binary classifier, not an ROC curve
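
The post does not give its reasoning, so the following is only a sketch of one common argument for preferring precision-recall: under heavy class imbalance, ROC AUC can look strong while precision stays low. The data are synthetic, and the 1% positive rate and the score distributions are assumptions made purely for illustration. Both metrics are hand-rolled here rather than taken from any library.

```python
import random

def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive (step-wise PR area)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical imbalanced data: 50 positives, 4950 negatives (assumed for illustration).
rng = random.Random(0)
labels = [1] * 50 + [0] * 4950
# Assumed score model: positives score higher on average, but the classes overlap.
scores = [rng.gauss(1.5, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC:           {auc:.3f}")
print(f"Average precision: {ap:.3f}")
```

On data like this, the ROC summary rewards ranking positives above the large pool of negatives, while average precision reflects how many of the top-scored cases are actually positive. Whether that makes the PR curve the better default depends on the application.
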
27.07.2025 13:42
Wow. Scientists have edited mosquito DNA to prevent the spread of malaria to humans "while supporting essential physiological functions... and negligible fitness costs" to the mosquito population.
Potentially ending the mosquito-borne spread of malaria to humans.
www.nature.com/articles/s41...
… set of paths consistently supported by the data. Even getting that down is a trick. And making sense of it is fraught and doesn't get you much further than one would get from regression. But at least then we would have some confidence we understand the correlational relationships!
23.07.2025 16:18
… the model is correct and then gives you what the path would be under that specification. There's nothing different when we go to SEM other than that your ability to p-hack goes up exponentially. IMO this would be a great place to use machine learning approaches to train / tune models to find …
23.07.2025 16:18
… all those hypotheses together (in the same way that ANOVA tests many comparisons at once). There's nothing different between this and running a bunch of regressions and claiming the results support the way you specified those models. In reality, it's the reverse. Regression assumes …
23.07.2025 16:18
Yes, and I see this a lot in social too. Proper use of SEM implies a particular philosophy of hypothesis testing in regression contexts. An omitted path hypothesizes that the path is exactly 0. A non-omitted path hypothesizes that it is non-zero. Model fit is effectively the joint set of …
23.07.2025 16:18
Yikes!
23.07.2025 03:52
… SEM for causal discovery. However, if you have a good read on the causal process, it can be great for estimating parameters such as factor loadings or paths with latent variables
23.07.2025 03:35
This is probably not anything you don't already know …. But I did a lot of SEM work and will repeat it anyway. The model assumes you know the causal structure. Fit indices will confirm that the model is a fit to the data, but many incorrect models can fit the data. So I would not use …
23.07.2025 03:34
Curious how this compares to the cost of living per state
19.07.2025 20:27
Plot that depicts the average importance people in my data assign to their friendships (y-axis, on a scale from 1 to 5, depicted with 95% confidence intervals) by their age (x-axis, from 18 to 60). Depicted are 3 different ways to model importance of friends as a function of age. Using age as a linear predictor: this imposes a linear trajectory which comes with very tight confidence intervals (i.e., uncertainty is low). Using age as a categorical predictor: this imposes no trajectory whatsoever but instead simply reproduces the means by age. The confidence intervals are very wide, in particular for those ages not well represented in the data (i.e., uncertainty is high). Age splines: This results in a smooth trajectory that follows some of the bumps in the data, but not all of them. The confidence intervals are somewhere between the linear and the categorical case (i.e., uncertainty is medium).
Let's say you want to include age as a predictor in your model. How do you do that?
Here's an illustration of three options -- it's for a paper I'm working on (so if you feel like anything could be tweaked...).
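
A rough sketch of how the three age encodings differ as design-matrix rows, which is one way to see why their confidence intervals rank the way the plot description says: the more coefficients a model has to estimate from the same data, the wider the intervals tend to be. The function names and knot positions are hypothetical. The spline here is a simple piecewise-linear (hinge) basis swapped in for brevity, which is not necessarily the spline type used in the plot.

```python
def linear_row(age):
    """Intercept plus one slope: a single straight line across all ages."""
    return [1.0, float(age)]

def categorical_row(age, levels):
    """Intercept plus one dummy per age level (the first level is the reference)."""
    return [1.0] + [1.0 if age == level else 0.0 for level in levels[1:]]

def linear_spline_row(age, knots):
    """Intercept, slope, and one hinge term per knot that lets the slope change there."""
    return [1.0, float(age)] + [max(0.0, float(age - k)) for k in knots]

ages = list(range(18, 61))   # age range taken from the plot description
knots = [30, 40, 50]         # hypothetical knot positions, chosen only for illustration

# Compare how many coefficients each encoding asks the model to estimate.
for name, row in [
    ("linear", linear_row(45)),
    ("categorical", categorical_row(45, ages)),
    ("spline", linear_spline_row(45, knots)),
]:
    print(f"{name:12s} {len(row):2d} coefficients")
```

With these assumed settings the linear encoding has 2 coefficients, the categorical one 43, and the spline 5, so the spline sits between the other two in flexibility.
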
There should be a corner at Home Depot where a guy with a table saw will slice you off custom lengths of hot dog from an infinite hot dog coming out of the wall
05.07.2025 00:15
Adele shattering a glass in her hand
There were two girls at Wawa just now talking about funny movies and one said, "Have you ever seen the movie Office Space? It's an old people movie but it's funny"
25.06.2025 01:10
Counterpoint: the ability to chat with an article or literature and find patterns in our own work that perhaps we missed I think has a lot of potential to augment our scientific work
24.06.2025 19:09
Thrilled to share our latest work
Have you ever wondered what LLMs know but they are not saying?
We built an auditing framework to study information suppression in LLMs, and demonstrated it to quantify and characterize censorship in DeepSeek.
Read more:
arxiv.org/abs/2506.12349
Pleased to share our ICML Spotlight with @eberleoliver.bsky.social, Thomas McGee, Hamza Giaffar, @taylorwwebb.bsky.social.
Position: We Need An Algorithmic Understanding of Generative AI
What algorithms do LLMs actually learn and use to solve problems? 🧵1/n
openreview.net/forum?id=eax...
building intuition around systems of matrix data and how we manipulate them should be right after (or right before) basic calc (integrals, derivatives, partial derivatives)
18.06.2025 05:29
Why?
18.06.2025 03:50
Someone please explain why linear algebra isn't taught more in high school in the United States? Seems like maybe if you're lucky you get a few lectures on matrices and that's it.
18.06.2025 03:46
Is it 2015 or 2025?
17.06.2025 15:26
"Data available upon request"
17.06.2025 15:25
Under the Trump agenda, energy will cost more. And when energy costs more, everything costs more.
www.nytimes.com/2025/06/04/c...
Many academics are taught "do more" instead of "prioritize". They are taught the academic superhero myth, that "truly great/smart" scholars can handle it. So they sabotage their own success in service to the cult of personality
05.06.2025 14:00
Galaxy brain: what does anything measure?
05.06.2025 13:35
I love this technique because it gives a cool way to isolate change and stability components of within-person measurement using latent variables. The stable portion of IATs is far more predictive of other individual-difference measures than one would think given traditional scoring/analyses
04.06.2025 22:59