Building notes: Linear regression viz in 3D
Listening to Chappell Roan's new single "Subway" on an hour-long loop; can't tell if I'm crying because of the song or the XQuartz installation error.
A fool's errand? Making a 3D ggplot? I had Chappell Roan's newest single on loop and tortured myself into getting something complicated to work when I found an easier solution #dataviz #ggplot2 #rstats
05.08.2025 01:23 - 👍 9 🔁 4 💬 2 📌 1
Introducing the Data Explorer in Positron!
Quickly view raw data files (CSV, Parquet, etc.) or dataframes from your #Python / #RStats sessions with a data grid, summary panel, and filter bar.
Learn more: positron.posit.co/data-explore... #Positron
30.07.2025 18:39 - 👍 90 🔁 15 💬 3 📌 2
Air, an extremely fast R formatter
We are thrilled to announce Air, a new R formatter.
Okay, I saw @libbyheeren.bsky.social mention Air recently but did not realize what it was until today's Data Science Hangout. Looks very cool for formatting code!
www.tidyverse.org/blog/2025/02...
31.07.2025 16:16 - 👍 18 🔁 5 💬 1 📌 0
Wrote a new, modern stats curriculum.
Teach about probability and sampling via computational examples / simulations with real data. It's unbelievably helpful for intuition. Everything else follows.
Online and open-source: jrudoler-teaching.github.io/understandin...
31.07.2025 20:20 - 👍 5 🔁 2 💬 0 📌 0
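A minimal sketch of the kind of simulation the post describes, not taken from the linked curriculum. The curriculum uses real data; the population below is synthetic and assumed purely for illustration, and `sample_means` is a hypothetical helper name. It draws many random samples and shows that their means cluster around the population mean with a spread close to sd / sqrt(n).

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Synthetic, right-skewed population (an assumption; swap in real data here).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

# The sample means centre on the population mean...
print("population mean:     ", statistics.fmean(population))
print("mean of sample means:", statistics.fmean(means))

# ...and their spread is close to the theoretical standard error sd / sqrt(n).
print("theoretical SE:      ", statistics.stdev(population) / 50 ** 0.5)
print("simulated SE:        ", statistics.stdev(means))
```

Plotting a histogram of `means` for several values of `sample_size` is a natural next step for building intuition about sampling distributions.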
And if you can't join, raise a toast and share her classic How To Name Files (github.com/jennybc/how-...) with all your teams so she doesn't have to SET YOUR COMPUTER(S) ON FIRE 🔥.
31.07.2025 00:53 - 👍 21 🔁 6 💬 0 📌 0
GitHub - cwida/FastLanes: Next-Gen Big Data File Format
New data oriented file format just dropped.
FastLanes, "like Parquet, but with 40% better compression and 40× faster decoding".
Seems it can exploit correlations between columns and have fully SIMD friendly encodings to help with vectorization.
github.com/cwida/FastLa...
24.07.2025 15:12 - 👍 73 🔁 16 💬 3 📌 2
Revisiting Moneyball
Data, sports, payrolls, and memes
I have finally published my post about Moneyball, as promised (a long time ago). If you're interested in baseball, numbers, or movies, please take a look. I'd love to know what you think.
cc @matsonj.com @alexnoonan.bsky.social
djpardis.medium.com/revisiting-m...
24.07.2025 16:58 - 👍 9 🔁 3 💬 1 📌 2
[Images: four screenshots of the text of the linked blogpost, 1/4 to 4/4]
~~ making sense of academic statistics ~~
i wrote about the confusing relationship between statistics and data analysis, and also about how statistics relates to science
#statistics #rstats #datascience
www.alexpghayes.com/post/making-...
15.07.2025 20:15 - 👍 109 🔁 19 💬 14 📌 8
Importing Data with Python
Importing data is a key step in the data science workflow. Here we compare data import for two key Python data-frame libraries - Polars and Pandas.
Importing data is a key step in the data science workflow. In our latest Python blog post, we compare this process for two key libraries - Polars and Pandas - emphasising how to convert to the correct data-type and why you should validate the structure and content of the imported data.
#rstats
17.07.2025 10:24 - 👍 5 🔁 3 💬 0 📌 0
even if it's fake it's real - sippey.com
Michael Sippey's blog, published semi-regularly since 1995.
sippey.com/2010/11/even... this post was so influential in my understanding of how to think of internet culture, and it's held up so well in everything except the optimistic positivity
10.07.2025 13:03 - 👍 8 🔁 4 💬 2 📌 1
Designing the Tools that Shape Data Science - with Dr Hadley Wickham - The Random Sample
To listen, just search for The Random Sample wherever you get your podcasts, or head to our website: www.therandomsample.com.au/podcast/hadl...
#rstats #rstudio #datascience #statistics #opensource #programming #rstudio @robjhyndman.com @posit.co
02.06.2025 02:15 - 👍 5 🔁 2 💬 0 📌 0
Writing a basic Linux device driver when you know nothing about Linux drivers or USB
i've always been curious about how to write a Linux USB device driver and this blog post looks like a great intro: crescentro.se/posts/writin...
26.06.2025 19:08 - 👍 184 🔁 14 💬 5 📌 1
I wrote a blog post to celebrate 10 years of loo package (R package implementing fast Pareto smoothed importance sampling cross-validation and many other useful methods for cross-validation)
26.06.2025 11:01 - 👍 68 🔁 15 💬 0 📌 1
This week's #ChemSciPicks comes from @graemeday.bsky.social (University of Southampton), @ffmmgg.bsky.social, Chengxi Zhao, Xenphon Evangelopoulos, and @aicooper.bsky.social (University of Liverpool).
Read the full paper here: doi.org/10.1039/D5SC...
#ChemSky
18.06.2025 09:00 - 👍 7 🔁 4 💬 1 📌 1
"71-75: Extremely gross"
That tracks
19.06.2025 02:30 - 👍 15 🔁 1 💬 1 📌 0
Really nice interactives from @rospearce.bsky.social in this deep dive into Glassdoor company-review data
www.economist.com/interactive/...
18.06.2025 21:08 - 👍 31 🔁 6 💬 1 📌 0
Writing Manually (In Times of AI-generated Content)
In times of [[AI Writing]], writing manually is more important than ever. As with many [[Generative AI|AI Generated]] texts, I'd rather see the prompts, it would have more soul and character than the…
Is writing a manual like driving a car manually, instead of automatically?
In my experience with AI-writing, every time I use it for a bigger task (restructuring or telling me the missing chapters), it does things I don't like, most importantly, distracting me from my actual task: writing.
19.06.2025 07:59 - 👍 5 🔁 1 💬 1 📌 0
R Package Quality: Validation and beyond!
Not all R packages are clearly "good" or "risky"; most fall somewhere in between. This post introduces a scoring framework to help users assess package quality, based on documentation, code, maintenance, and popularity. We also share key principles to ensure the scores are useful, fair, and adaptable to different contexts.
At Jumping Rivers, we've developed a scoring framework to help users assess R package quality.
Our latest blog post, "R Package Quality: Validation and Beyond!", walks through this new framework and shares guiding principles that ensure the scores are fair, flexible, and context-aware.
#rstats #R
19.06.2025 13:15 - 👍 8 🔁 4 💬 0 📌 0
YouTube video by Apache Iceberg
Scalable Lakehouse Architecture with Iceberg & Polaris: A Battle-tested Playbook
Scalable Lakehouse Architectures with Iceberg and Polaris!
Simon from Tactile shared insights on tackling bottlenecks in data loading using Apache Iceberg and our open-source dlt library.
youtu.be/gb5fwIO4pX0?...
#databs #iceberg
12.06.2025 07:15 - 👍 3 🔁 1 💬 0 📌 0
I wrote a short blog post about our experimental work on the logic of guesses:
xphi.net/2025/06/03/e...
11.06.2025 15:25 - 👍 13 🔁 3 💬 1 📌 0
GitHub - ryancdotorg/freq: Like `sort | uniq -c | sort -rn` but better
There are now binary builds of my data analysis tool `freq` available for Linux, MacOS and Windows, and I've added a few more features.
It's great for ad-hoc log file analysis.
github.com/ryancdotorg/...
04.06.2025 19:02 - 👍 21 🔁 6 💬 0 📌 0
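For readers unfamiliar with the `sort | uniq -c | sort -rn` idiom the tool is compared to, here is a rough Python sketch of what that pipeline computes: a count of identical lines, most frequent first. This is not how `freq` itself is implemented; the function name `count_lines` and the sample log lines are made up for illustration.

```python
from collections import Counter


def count_lines(lines):
    """Count identical lines and return (count, line) pairs, most frequent first."""
    counts = Counter(line.rstrip("\n") for line in lines)
    # most_common() sorts by descending count; equal counts keep first-seen order,
    # which can differ from the tie order the shell pipeline produces.
    return [(n, line) for line, n in counts.most_common()]


# Hypothetical log lines standing in for a real log file.
sample = ["GET /", "GET /about", "GET /", "POST /login", "GET /"]

for n, line in count_lines(sample):
    # Right-align the count, similar to the output of `uniq -c`.
    print(f"{n:7d} {line}")
```

To use it on a real file, pass an open file handle instead of the list, for example `count_lines(open("access.log"))`, where the file name is again only a placeholder.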
I think the thing I'm most excited to see over the next ~10 years of #dataviz is web-based content that interweaves long-form text and modular interactives.
Not as heavy as scrollytelling and not as aimless as a dashboard, but something in between.
This is what I was going for with the QR project!
04.06.2025 14:46 - 👍 37 🔁 7 💬 3 📌 1
A blog/newsletter of #rstats stuff by Damie Pak (@damiepak.bsky.social). Kinda like a travel blog but it's about transitioning into industry from a postdoc. I create silly things for you to enjoy. Entirely free.
🧮 Statistics & data science
Clinical trials & R&D & Epidemiology
💻 R enthusiast
👩‍💻 Stats @ loyal.com
https://jesslgraves.github.io
Computational geographer. Associate Professor at AMU, Poznan, Poland. Co-author of http://r.geocompx.org, http://py.geocompx.org, and http://tmap.geocompx.org books. #rstats #rspatial #geocompx
https://jakubnowosad.com/
Stats & DS @ UCSB '26, data science intern at the PCCTC.
Currently into: biostatistics, data visualization, summer reading, fencing. Warning for the occasional cat picture.
Professor in computational Bayesian modeling at Aalto University, Finland. Bayesian Data Analysis 3rd ed, Regression and Other Stories, and Active Statistics co-author. #mcmc_stan and #arviz developer.
Web page https://users.aalto.fi/~ave/
Cognitive scientist at the University of Edinburgh. Causality, computation, evolution.
Lab: https://quillienlab.github.io/
Hacker. Enby. Administrative inconvenience. Purveyor of technically sophisticated shitposts.
Suing the UK for more gender, help with my legal bills: https://enby.org.uk/
Mastodon: https://infosec.exchange/@ryanc
Non-binary trans androgynous (they/them)
creative coder, preferrer of democracy
also into coffee shops, sad folk music, basketball, and playing chess poorly
data viz: http://perthirtysix.com
creative coding: http://shrikhalpada.dev/projects
Messing around with boats and databases.
Okta pays the bills. Sometimes data systems, sometimes security, sometimes ai/ml, sometimes a blend of it all.
he/him - writing statistical software at Posit, PBC (née RStudio)
simonpcouch.com, @simonpcouch elsewhere
Assistant Head for Secondary Progress and Digital Strategy | Experienced Computer Science teacher and Senior Examiner | NPQSL | Aspiring astrophotographer | #EdTech #EduSky #STEM #Cybersecurity #DataScience #AI #MIEE |
Indie dev making small games.
Currently making DEEP SPACE EXPLOITATION!
Discord: discord.gg/cQCET4NrgG
Steam: store.steampowered.com/app/3656660/?utm_source=bluesky
I'm a confirmed Episcopalian
I'm a husband
I'm a father
I'm a son and brother
I'm a mentor
I'm a flight test engineer
I'm a glider instructor pilot
I'm a tenor saxophonist
I'm a human factors observer
I'm into R&B (reporting and blocking)
Data, biotech, engineering, biology, aws
Data scientist • #rstats • Stan • PyMC • viz • travel • reuse, repurpose
PhD researcher in Machine Learning at Imperial College. Visiting at University of Oxford.
Interested in all things involving causality and Bayesian machine learning. Recently I have also been interested in scaling theory.
https://anish144.github.io/
I am here for all interesting and funny posts on the social sciences, broadly understood and including open science and meta science, academia, teaching and research. https://linktr.ee/ingorohlfing
Associate Prof of Special Education @KentState.
Ph.D. @MSUCollegeOfEd.
Researcher. Teacher educator. Co-editor https://journals.sagepub.com/home/aei
Neuroscientist studying Parkinson's disease | check out my #rstats website: https://mattkmiecik.com/