
@datascienceweekly.bsky.social

62 Followers  |  349 Following  |  36 Posts  |  Joined: 13.11.2024  |  1.9492

Latest posts by datascienceweekly.bsky.social on Bluesky

Data Science Weekly - Issue 637 Curated news, articles and jobs related to Data Science, AI, & Machine Learning

Data Science Weekly - Issue 637, by @DataSciNews open.substack.com/pub/datascie...

05.02.2026 22:55 - 👍 0    🔁 0    💬 0    📌 0
Some Favorite Data Science Tools Going into 2026 – Practical Significance A blog post highlighting some of the data science tools I’m excited about going into the new year.

On a positive note, here's a new blog post highlighting some polyglot data science tools in R and Python that I've enjoyed lately

#rstats #pydata

www.practicalsignificance.com/posts/favori...

23.01.2026 00:00 - 👍 15    🔁 2    💬 2    📌 2

The first data science book that has a chapter on monads reproducible-data-science.dev

Learn how to build robust #DataScience pipelines with #RStats, #Python, #Julia and #Nix!

01.02.2026 11:47 - 👍 27    🔁 10    💬 0    📌 2
Using R to extract results from Stata log files – Ben Harrap

Are you a #Stata user? Maybe you work with one?

Have you ever found yourself copy-pasting from the results window?

It's annoying as hell! And terrible practice. So I wrote a blog post on using #rstats to extract results from Stata log files

benharrap.com/post/2026-02...

04.02.2026 04:34 - 👍 11    🔁 5    💬 3    📌 1
On this page
What’s the difference between statistical significance and substantive significance?
Can we measure substantive significance with statistics?
What are all the different ways we can look at model coefficients?
Print the object name
Use summary()
Use tidy() from the {broom} package
Use model_parameters() and model_performance() from the {parameters} and {performance} packages
Make nice polished side-by-side regression tables with {modelsummary}
Make automatic coefficient plots with modelplot() from {modelsummary}
Plot model predictions and marginal effects
Automatic interpretation with {report}


Posted a helpful little set of FAQs about regression for my causal inference class, including illustrations of statistical vs. substantive significance and all the different things you can do with #rstats model objects

evalsp26.classes.andrewheiss.com/news/2026-02...

03.02.2026 19:49 - 👍 69    🔁 11    💬 3    📌 1
Hemingway-bench AI Writing Leaderboard Stop rewarding slop. Hemingway-bench is an AI writing leaderboard that takes real-world writing tasks and puts them in front of master wordsmiths. Our goal: to push AI writing from two-second vibes to...

A new creative writing style bench and leaderboard for LLMs surgehq.ai/blog/hemingw...

05.02.2026 07:38 - 👍 22    🔁 5    💬 0    📌 1
CS 860 - Algorithms for Private Data Analysis- Fall 2020

Did you learn differential privacy (in part or in whole) from my course? Either the videos, lecture notes, or some combination? Please send me a DM or an email, I'm trying to gather some info.

(In case you missed it, here's the course: www.gautamkamath.com/CS860-fa2020..., ft full notes & videos)

05.02.2026 15:23 - 👍 8    🔁 3    💬 1    📌 0
Data Science Weekly - Issue 636 Curated news, articles and jobs related to Data Science, AI, & Machine Learning

Data Science Weekly - Issue 636, by @DataSciNews open.substack.com/pub/datascie...

29.01.2026 13:50 - 👍 0    🔁 0    💬 0    📌 0
dbreg

Things are grim. But in more frivolous news...

@jamesbrandecon.bsky.social and I have been chipping away at `dbreg`, a 📦 for running big regression models on database backends. For the right kinds of problems, the speed-ups are near magical.

Website: grantmcdermott.com/dbreg/

#rstats

[1/2]

26.01.2026 16:57 - 👍 69    🔁 16    💬 4    📌 1
The Warehouse

I’ve been thinking about how to find R packages by functionality when you don’t already know the package name.

So over the holidays, Claude Code and I built The Warehouse: a functionality-first R package directory that helps you find packages by what they do.
rwarehouse.netlify.app

#rstats

27.01.2026 15:12 - 👍 70    🔁 22    💬 3    📌 3
Designing a Declarative Data Stack: From Theory to Practice Explore the journey of building a declarative data stack - from architectural decisions to practical implementation. Learn how to separate business logic from technical implementation using templates, automation, and modern orchestration tools.

Two approaches to generating pipelines:

Parametric: define parameters, tool generates SQL
Template-based: write SQL templates with variables

dbt took templates. Automation tools took parametric. Neither is wrong - they optimize for different teams.

29.01.2026 08:32 - 👍 5    🔁 2    💬 0    📌 0
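
To make the parametric vs. template-based distinction in the post above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `generate_select`, `ORDERS_TEMPLATE`, and the `orders` table are hypothetical names, and the standard library's `string.Template` stands in for a real templating engine (dbt itself uses Jinja, which is not shown here).

```python
from string import Template
from typing import List, Optional


def generate_select(table: str, columns: List[str], where: Optional[str] = None) -> str:
    """Parametric style (hypothetical helper): the caller supplies parameters
    and the tool assembles the SQL text."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql


# Template style: the author writes the SQL and leaves named variables to fill in.
ORDERS_TEMPLATE = Template("SELECT $columns FROM $table WHERE $where")


def render_template(template: Template, **variables: str) -> str:
    """Substitute the named variables into the hand-written SQL template."""
    return template.substitute(**variables)


parametric_sql = generate_select("orders", ["id", "total"], where="total > 100")
templated_sql = render_template(
    ORDERS_TEMPLATE, columns="id, total", table="orders", where="total > 100"
)

print(parametric_sql)
print(templated_sql)
```

Both calls yield the same query here. The difference is who owns the SQL text: the generator in the parametric case, the author in the template case.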
Data Science Weekly - Issue 635 Curated news, articles and jobs related to Data Science, AI, & Machine Learning

Data Science Weekly - Issue 635, by @DataSciNews open.substack.com/pub/datascie...

22.01.2026 21:19 - 👍 1    🔁 1    💬 0    📌 0

crying and laughing at the same time is good for the soul

22.01.2026 20:58 - 👍 0    🔁 0    💬 0    📌 0

Don't know who needs to hear this, but don't be afraid of starting a blog in 2026. SEO will be brutal at first, but you can find an audience through BlueSky

Regarding stack options, Quarto is good if you are doing an R/Python heavy data science site. Or for a modern CMS I recommend Statamic

08.01.2026 16:05 - 👍 13    🔁 2    💬 2    📌 0

For my #popgen nerds out there, I've finished building a Python package called hapnet - an easy-to-use, straight-out-of-the-box tool for building #haplotype networks. Just plug your data in, run a single line of code and voila! A pretty color-coded network. Still beta testing!
pypi.org/project/hapn...

10.01.2026 03:52 - 👍 8    🔁 4    💬 2    📌 0

We just published a JOSIS paper on what spatial data science languages have in common and what they still need. Insights from across the R, Python & Julia ecosystems.

URL: doi.org/10.5311/JOSI...

#SpatialDataScience #GISchat #OpenSource #RSpatial #GeoPython #JuliaGeo

11.01.2026 16:01 - 👍 54    🔁 21    💬 0    📌 2
Dark themed map of Calgary

Light themed map of Calgary

Been playing with this Python script to generate cool looking maps github.com/originalanku...

They're fairly high res, 3630x4830. Uses OpenStreetMap data

1/6 #Calgary #yyc #maps

20.01.2026 09:45 - 👍 13    🔁 2    💬 1    📌 0

OPPORTUNITY: The Space Telescope Science Institute in Baltimore, MD, is searching for an Archive Analyst, with a focus on user-facing documentation and tutorials, to help advance our state-of-the-art astronomical data archive. Python experience required: https://bit.ly/3YQDM7y

21.01.2026 20:25 - 👍 29    🔁 24    💬 0    📌 1
Rolling scaled forecast accuracy – Rob J Hyndman When we compute a MASE or RMSSE using a rolling origin, should the scaling factor be recalculated every time?

Some thoughts on forecast accuracy using rolling scaled measures robjhyndman.com/hyndsight/ro... #rstats

20.01.2026 23:38 - 👍 17    🔁 4    💬 1    📌 1
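
The question in the post above (recompute the scaling factor at every origin, or fix it once?) can be illustrated with a small Python sketch. This is not the linked post's code or its recommendation. It assumes one-step naive forecasts and a toy series purely for illustration, and the function names are hypothetical. The scaling factor is the usual MASE denominator: the mean absolute one-step naive error on the training data.

```python
def naive_scale(train):
    """Mean absolute one-step naive error on the training data (the MASE denominator)."""
    diffs = [abs(b - a) for a, b in zip(train, train[1:])]
    return sum(diffs) / len(diffs)


def scaled_errors_rolling(series, first_origin, recalc_scale):
    """One-step naive forecasts from a rolling origin.

    Returns the scaled absolute error at each origin. With recalc_scale=False the
    scale is computed once from the first training window; with recalc_scale=True
    it is recomputed from the training window at every origin.
    """
    fixed_scale = naive_scale(series[:first_origin])
    scaled = []
    for origin in range(first_origin, len(series)):
        train = series[:origin]
        forecast = train[-1]  # naive forecast: last observed value (illustrative choice)
        scale = naive_scale(train) if recalc_scale else fixed_scale
        scaled.append(abs(series[origin] - forecast) / scale)
    return scaled


series = [10, 12, 11, 15, 14, 18]  # toy data, not from the post
fixed = scaled_errors_rolling(series, first_origin=3, recalc_scale=False)
rolling = scaled_errors_rolling(series, first_origin=3, recalc_scale=True)

mase_fixed = sum(fixed) / len(fixed)
mase_rolling = sum(rolling) / len(rolling)
print(mase_fixed, mase_rolling)
```

On this toy series the two choices give different averages, which is the ambiguity the post raises. The sketch does not settle which one is preferable.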
Large Language Model tools for R

🤖 New update to the LLMs+R guide (ENG+SPA)

14 new pkgs plus reading materials
* AI assistants in RStudio
* Ralph Wiggum Coding in R
* local model tools
* auto-routing to the most cost-efficient model
* manage prompts w/ external files
* translations in the IDE
#rstats
luisdva.github.io/llmsr-book/

21.01.2026 18:06 - 👍 14    🔁 5    💬 1    📌 0

Hey #RStats and #PoliticalScience folks - update on a resource I've maintained:

American political data & R guide:

Presidential results 1864-2024
Senate/House composition since 1920s
Historical margins & ideology scores
Fully reproducible

github.com/jaytimm/amer...
#OpenData #DataViz #Elections

18.01.2026 20:32 - 👍 20    🔁 7    💬 0    📌 0

such a classic!

22.01.2026 09:23 - 👍 1    🔁 0    💬 1    📌 0
Biologist talks to statistician
YouTube video by RunProgramRun

I'm glad this isn't my job. #rstats #statsky #mathjokes

21.01.2026 22:40 - 👍 13    🔁 2    💬 1    📌 0

Event-study plots in DiD are incredibly persuasive - but, honestly, not always honest.

Why?
If parallel trends or no anticipation fail, DiD estimates are biased. Testing against a zero-effect null then becomes misleading.

🚨New 📄: arxiv.org/abs/2512.06804

#CausalInference #EconSky #StatsSky #rstats

21.01.2026 09:29 - 👍 10    🔁 5    💬 1    📌 1

How to create a more accessible line chart nrennie.rbind.io/blog/accessi... another great #Rstats post from @nrennie.bsky.social as usual 🔮🎯

21.01.2026 10:24 - 👍 17    🔁 6    💬 1    📌 1
plumber2 0.2.0 The next version of plumber2 has hit CRAN. Read all about the new features such as OpenTelemetry (OTEL) support, authentication, new tags, and performance improvements here.

I'm super excited to share the next version of plumber2 with all of you. Headlining this release is support for OpenTelemetry throughout the routing logic, built-in authentication logic, and much improved performance.

Try it out and report back!

#rstats

20.01.2026 11:43 - 👍 17    🔁 4    💬 0    📌 0

rspatialdata: a repository of spatial datasets & tutorials for spatial analysis & visualization in #rstats, supporting real-world applications such as estimating air pollution, quantifying disease burden, and monitoring progress toward the SDGs 🌍💻📊

👉 rspatialdata.github.io

18.01.2026 15:19 - 👍 78    🔁 36    💬 0    📌 0

If you're teaching first-year statistics (particularly for biology/ag), the data package `smbdata`, featuring all data from the Welham et al. (2015) book, is now published on CRAN. Thank you to the authors for their permission to put it on CRAN!

cran.r-project.org/web/packages... #rstats

21.01.2026 01:53 - 👍 32    🔁 8    💬 1    📌 0
Zettelkasten Started by [[Niklas Luhmann]], also called “Slip Box”. This approach will further dilute the idea of a librarian who organizes books in main categories and sub-folder and focuses only on the thought a...

I use the Zettelkasten method for all my technical notes.

The insight: don't organize by folder, organize by connection. Instead of deciding "does this note about data quality go in the 'pipelines' folder or the 'governance' folder?", you just link.

After five years, I have 1000s of connected notes.

22.01.2026 08:14 - 👍 7    🔁 1    💬 0    📌 0

@datascienceweekly is following 20 prominent accounts