Data Science Weekly - Issue 637, by @DataSciNews open.substack.com/pub/datascie...
05.02.2026 22:55
@datascienceweekly.bsky.social
On a positive note, here's a new blog post highlighting some polyglot data science tools in R and Python that I've enjoyed lately
#rstats #pydata
www.practicalsignificance.com/posts/favori...
The first data science book that has a chapter on monads reproducible-data-science.dev
Learn how to build robust #DataScience pipelines with #RStats, #Python , #Julia and #Nix !
Are you a #Stata user? Maybe you work with one?
Have you ever found yourself copy-pasting from the results window?
It's annoying as hell! And terrible practice. So I wrote a blog post on using #rstats to extract results from Stata log files
benharrap.com/post/2026-02...
On this page: What's the difference between statistical significance and substantial significance? Can we measure substantial significance with statistics? What are all the different ways we can look at model coefficients? Print the object name; Use summary(); Use tidy() from the {broom} package; Use model_parameters() and model_details() from the {parameters} and {performance} packages; Make nice polished side-by-side regression tables with {modelsummary}; Make automatic coefficient plots with modelplot() from {modelsummary}; Plot model predictions and marginal effects; Automatic interpretation with {report}
Posted a helpful little set of FAQs about regression for my causal inference class, including illustrations of statistical vs. substantive significance and all the different things you can do with #rstats model objects
evalsp26.classes.andrewheiss.com/news/2026-02...
A new creative writing style bench and leaderboard for LLMs surgehq.ai/blog/hemingw...
05.02.2026 07:38
Did you learn differential privacy (in part or in whole) from my course? Either the videos, lecture notes, or some combination? Please send me a DM or an email, I'm trying to gather some info.
(In case you missed it, here's the course: www.gautamkamath.com/CS860-fa2020..., ft full notes & videos)
Data Science Weekly - Issue 636, by @DataSciNews open.substack.com/pub/datascie...
29.01.2026 13:50
Things are grim. But in more frivolous news...
@jamesbrandecon.bsky.social and I have been chipping away at `dbreg`, a 📦 for running big regression models on database backends. For the right kinds of problems, the speed-ups are near magical.
Website: grantmcdermott.com/dbreg/
#rstats
[1/2]
I've been thinking about how to find R packages by functionality when you don't already know the package name.
So over the holidays, Claude Code and I built The Warehouse: a functionality-first R package directory that helps you find packages by what they do.
rwarehouse.netlify.app
#rstats
Two approaches to generating pipelines:
Parametric: define parameters, tool generates SQL
Template-based: write SQL templates with variables
dbt took templates. Automation tools took parametric. Neither is wrong - they optimize for different teams.
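A minimal sketch of the two approaches in Python. Every name here (`parametric_sql`, `template_sql`, the `orders` table and its columns) is hypothetical and not taken from dbt or any automation tool. dbt uses Jinja templates; the standard library's `string.Template` stands in for that here.

```python
from string import Template


def parametric_sql(table: str, columns: list[str], where: dict[str, str]) -> str:
    """Parametric: the caller supplies parameters, the tool writes the SQL."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        # Illustration only: a real tool should quote identifiers and bind values.
        conditions = " AND ".join(f"{col} = '{val}'" for col, val in where.items())
        sql += f" WHERE {conditions}"
    return sql


# Template-based: the author writes the SQL and leaves variables to fill in.
TEMPLATE = Template("SELECT id, amount FROM $table WHERE status = '$status'")


def template_sql(**variables: str) -> str:
    """Fill the hand-written SQL template with the given variables."""
    return TEMPLATE.substitute(**variables)


parametric = parametric_sql("orders", ["id", "amount"], {"status": "paid"})
templated = template_sql(table="orders", status="paid")
print(parametric)
print(templated)
```

Both calls produce the same query. The difference is who owns the SQL text: the tool in the parametric case, the author in the template case.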
Data Science Weekly - Issue 635, by @DataSciNews open.substack.com/pub/datascie...
22.01.2026 21:19
crying and laughing at the same time is good for the soul
22.01.2026 20:58
Don't know who needs to hear this, but don't be afraid of starting a blog in 2026. SEO will be brutal at first, but you can find an audience through BlueSky
Regarding stack options, Quarto is good if you are doing an R/Python heavy data science site. Or for a modern CMS I recommend Statamic
For my #popgen nerds out there, I've finished building a Python package called hapnet - an easy-to-use, straight-out-of-the-box tool for building #haplotype networks. Just plug your data in, run a single line of code and voila! A pretty color-coded network. Still beta testing!
pypi.org/project/hapn...
We just published a JOSIS paper on what spatial data science languages have in common and what they still need. Insights from across the R, Python & Julia ecosystems.
URL: doi.org/10.5311/JOSI...
#SpatialDataScience #GISchat #OpenSource #RSpatial #GeoPython #JuliaGeo
Dark themed map of Calgary
Light themed map of Calgary
Been playing with this Python script to generate cool looking maps github.com/originalanku...
They're fairly high res, 3630x4830. Uses OpenStreetMap data
1/6 #Calgary #yyc #maps
OPPORTUNITY: The Space Telescope Science Institute in Baltimore, MD, is searching for an Archive Analyst, with a focus on user-facing documentation and tutorials, to help advance our state-of-the-art astronomical data archive. Python experience required: https://bit.ly/3YQDM7y
21.01.2026 20:25
Some thoughts on forecast accuracy using rolling scaled measures robjhyndman.com/hyndsight/ro... #rstats
20.01.2026 23:38
New update to the LLMs+R guide (ENG+SPA)
14 new pkgs plus reading materials
* AI assistants in RStudio
* Ralph Wiggum Coding in R
* local model tools
* auto-routing to the most cost-efficient model
* manage prompts w/ external files
* translations in the IDE
#rstats
luisdva.github.io/llmsr-book/
Hey #RStats and #PoliticalScience folks, an update on a resource I've maintained:
American political data & R guide:
Presidential results 1864-2024
Senate/House composition since 1920s
Historical margins & ideology scores
Fully reproducible
github.com/jaytimm/amer...
#OpenData #DataViz #Elections
such a classic!
22.01.2026 09:23
I'm glad this isn't my job. #rstats #statsky #mathjokes
21.01.2026 22:40
Event-study plots in DiD are incredibly persuasive but, honestly, not always honest.
Why?
If parallel trends or no anticipation fail, DiD estimates are biased. Testing against a zero-effect null then becomes misleading.
🚨 New paper: arxiv.org/abs/2512.06804
#CausalInference #EconSky #StatsSky #rstats
How to create a more accessible line chart nrennie.rbind.io/blog/accessi... another great #Rstats post from @nrennie.bsky.social as usual
21.01.2026 10:24
I'm super excited to share the next version of plumber2 with all of you. Headlining this release is support for OpenTelemetry throughout the routing logic, built-in authentication logic, and much improved performance.
Try it out and report back!
#rstats
rspatialdata: a repository of spatial datasets & tutorials for spatial analysis & visualization in #rstats, supporting real-world applications such as estimating air pollution, quantifying disease burden, and monitoring progress toward the SDGs
rspatialdata.github.io
If you're teaching first year statistics (particularly for biology/ag), the data package `smbdata`, featuring all data from the Welham et al (2015) book, is now published on CRAN. Thank you to the authors for their permission to put it on CRAN!
cran.r-project.org/web/packages... #rstats
I use the Zettelkasten method for all my technical notes.
The insight: don't organize by folder, organize by connection. Instead of deciding "does this note about data quality go in the 'pipelines' folder or the 'governance' folder?", you just link.
After five years, I have thousands of connected notes.