🚨Free data alert!! 🚨 Please share.
Large new dataset of Amazon product reviews, including full text and photos and product characteristics, with individual *reviews labeled as fake reviews*.
I believe this is the first publicly available data of this kind.
github.com/bretthollenb...
11.07.2025 21:17 —
👍 127
🔁 42
💬 1
📌 2
Designing better out-of-the-box histograms
Given how common histograms are in BI tools, you might think they’re easy to design. Think again. We share challenges we encountered, and how we handled them, while designing better out-of-the-box…
Histograms are incredibly useful, interpretable, and common in BI. But building histograms that work well out of the box — no matter the data — is trickier than it sounds. We share some of the challenges faced, and decisions made, when designing histograms for Observable Canvases:
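One reason defaults are hard is that the bin count has to adapt to whatever data arrives. As a rough sketch only, here is one common textbook heuristic, the Freedman-Diaconis rule with a Sturges fallback. This is an assumption chosen for illustration and is not claimed to be what Observable Canvases uses; the function name `suggest_bin_count` is hypothetical.

```python
import math
from statistics import quantiles


def suggest_bin_count(values):
    """Suggest a histogram bin count for a list of numbers.

    Uses the Freedman-Diaconis rule (bin width = 2 * IQR / n^(1/3)) and
    falls back to Sturges' rule when the interquartile range is zero.
    Illustrative only; not taken from the linked article.
    """
    n = len(values)
    if n < 2:
        return 1  # nothing to bin
    lo, hi = min(values), max(values)
    if lo == hi:
        return 1  # constant data fits in a single bin
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    if iqr > 0:
        width = 2 * iqr / n ** (1 / 3)
        return max(1, math.ceil((hi - lo) / width))
    # IQR is zero (for example, mostly repeated values plus an outlier)
    return math.ceil(math.log2(n)) + 1


print(suggest_bin_count(list(range(100))))
print(suggest_bin_count([0] * 7 + [100]))
```

Even this small sketch needs special cases for tiny, constant, and heavily tied data, which hints at why out-of-the-box behavior takes real design work.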
05.06.2025 19:30 —
👍 26
🔁 7
💬 0
📌 1
What cyborg work looks like as an academic, by Robert Ghrist, a mathematician and associate dean of undergraduate education at the University of Pennsylvania.
01.05.2025 15:36 —
👍 88
🔁 18
💬 6
📌 1
Model to Meaning: How to interpret statistical models with marginaleffects for R and Python
📚😅🎉
Yay!! I just submitted the complete manuscript of my upcoming book to the publisher!
Learn to easily and clearly interpret (almost) any stats model w/ R or Python. Simple ideas, consistent workflow, powerful tools, detailed case studies.
Read it for free @ marginaleffects.com
#RStats #PyData
10.04.2025 19:06 —
👍 592
🔁 147
💬 21
📌 9
We streamlined six new DID-like estimators and created this tutorial for implementation in R.
yiqingxu.org/packages/fec...
Hope you no longer need to spend months figuring out what these estimators are and how to use them.
21.02.2025 04:41 —
👍 327
🔁 97
💬 6
📌 6
The Rproj File – Positron
We've added an article about RStudio's Rproj files and how to adapt related workflows, if you're starting to kick the tires on Positron. If this interests you, check it out 👀
positron.posit.co/rstudio-rpro...
#rstats #rstudio #positron
22.01.2025 17:43 —
👍 135
🔁 31
💬 3
📌 4
📊 vs. 🥧
I made a tiny teaching tool to help me interactively demo + share differences between 📊 and 🥧
Play: I find that tinkering with data + visuals in class reinforces understanding far more than slides or readings
Save + share: Copy the url to link the current data
👉 barvpie.netlify.app
25.11.2024 16:28 —
👍 90
🔁 23
💬 6
📌 2
My PhD syllabus for Introduction to Quantitative Marketing @rotmanschool. Updated for 2025. Comments welcome.
Feel free to suggest additional papers. Self-promotion encouraged! All University of Toronto PhD students welcome to audit. Please get in touch.
15.12.2024 14:22 —
👍 23
🔁 6
💬 2
📌 2
The enmity paradox - Scientific Reports
In 24,678 people in 176 rural Honduras villages, we found that villagers have an average of 6.89 (SD 3.79) friends, and these friends have 8.40 (SD 2.52) friends.
Villagers have an average of 1.26 (SD 1.70) enemies, and these enemies have 3.40 (SD 2.11) enemies.
www.nature.com/articles/s41... 7/
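The pattern in these numbers (friends have more friends than the average person, and enemies have more enemies) can be reproduced on a toy network. The sketch below uses four invented people and invented ties, and it averages a friend's tie count over every (person, friend) pair. That is one simple way to define "friends of friends" and is an assumption here, not necessarily the estimator used in the paper.

```python
from statistics import mean

# Hypothetical undirected ties among four people (invented for illustration).
ties = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def degree_summary(edges):
    """Return (mean ties per person, mean ties of a person's contacts)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    degree = {person: len(contacts) for person, contacts in neighbors.items()}
    mean_own = mean(degree.values())
    # Average the contact's degree over every (person, contact) pair.
    # Well-connected people appear in many pairs, which pulls this mean up.
    mean_of_contacts = mean(
        degree[c] for contacts in neighbors.values() for c in contacts
    )
    return mean_own, mean_of_contacts


own, of_contacts = degree_summary(ties)
print(f"mean ties per person: {own}")
print(f"mean ties of contacts: {of_contacts}")
```

In this toy network the average person has 2 ties while the average contact has 2.25, because the best-connected person is counted once for each of their ties. The same logic applies whether a tie means friendship or enmity.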
23.11.2024 15:20 —
👍 12
🔁 4
💬 1
📌 0
Mirrored histogram showing “weird” parts of the population: treated people who were unlikely to be treated, and untreated people who were likely to be treated
Mirrored histogram showing pseudo-populations of treated and untreated people that have been reweighted to be more comparable and unconfounded
Table showing potential and realized outcomes for 9 simulated people
Before we calculate these different treatment effects with the realized outcomes instead of the hypothetical potential outcomes, let's look really quick at the practical difference between the true ATE, ATT, and ATU. All three estimands are useful for policymaking!
The ATE is -15, implying that mosquito nets cause a 15 point reduction in malaria risk for every person in the country. This includes people who live at high elevations where mosquitoes don't live, people who live near mosquito-infested swamps, people who are rich enough to buy Bill Gates's mosquito laser, and people who can't afford a net but would really like to use one. If we worked in the Ministry of Health and wanted to know if we should make a new national program that gave everyone a free bed net, the overall reduction in risk is -15, which is probably pretty good!
The ATT is -16.29, which is bigger than the ATE. The effect of net usage is bigger for people who are already using the nets. This is because of underlying systematic reasons, or selection bias. Those using nets want to use them because they need them more or can access them more easily: they might live in areas more prone to mosquitoes, or they can afford to buy their own nets, or something else. They know themselves and understand some notion of their personal individual causal effect and seek out nets. If we removed access to their nets, it would have a strong effect.
The ATU is -13.63, which is smaller than the ATE. The effect of net usage is smaller for people who aren't using the nets. Again, this is because of selection bias. Those not using nets are likely not using them for systematic reasons: they live far away from mosquitoes, they've received a future malaria vaccine, they have some other form of mosquito abatement, or something else. Because they can read their own minds, they know that mosquito net use won't do much for them personally, so they don't seek out nets. If we expanded access to nets to them, they wouldn't benefit
From the archives: Have you (like me!) wondered what the ATT means and how it's different from average treatment effects? I use #rstats to explore why we care about (and how to calculate) the ATE, ATT, and ATU #polisky #episky #econsky www.andrewheiss.com/blog/2024/03...
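For readers who want the three estimands in concrete terms, here is a minimal sketch of how ATE, ATT, and ATU fall out of a potential-outcomes table. The linked post uses R; this sketch uses Python, and every number, field name (`uses_net`, `y1`, `y0`), and the `average_effects` helper is invented for illustration rather than taken from the post.

```python
from statistics import mean

# Hypothetical potential outcomes for 9 simulated people (invented values).
# y1 = malaria risk if the person uses a net, y0 = risk if they do not.
people = [
    {"uses_net": True,  "y1": 30, "y0": 50},
    {"uses_net": True,  "y1": 32, "y0": 50},
    {"uses_net": True,  "y1": 39, "y0": 55},
    {"uses_net": True,  "y1": 46, "y0": 60},
    {"uses_net": False, "y1": 28, "y0": 40},
    {"uses_net": False, "y1": 35, "y0": 45},
    {"uses_net": False, "y1": 24, "y0": 38},
    {"uses_net": False, "y1": 30, "y0": 42},
    {"uses_net": False, "y1": 33, "y0": 45},
]


def average_effects(rows):
    """Average individual effects (y1 - y0) over three different groups."""
    effects = [(r["uses_net"], r["y1"] - r["y0"]) for r in rows]
    return {
        "ATE": mean(e for _, e in effects),             # everyone
        "ATT": mean(e for t, e in effects if t),        # net users only
        "ATU": mean(e for t, e in effects if not t),    # non-users only
    }


for name, value in average_effects(people).items():
    print(f"{name}: {value:.2f}")
```

With these made-up values the ATT is -17, the ATU is -12, and the ATE is about -14.22, the group-size-weighted average of the other two. This only works because both potential outcomes are visible in a simulation; with real data only one outcome per person is observed, which is why estimation needs more machinery.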
22.11.2024 14:50 —
👍 205
🔁 44
💬 8
📌 5
A few things I've been working on lately:
elmer, elmer.tidyverse.org, is a new package to make it easier to work with LLMs (hosted and local) from #rstats. It includes helpers for structured data extraction and tool calling, and an easy way to upload a plot. Joint work with Joe Cheng.
29.10.2024 22:13 —
👍 229
🔁 55
💬 10
📌 5