Dylan Pieper

@dylanpieper.bsky.social

Data scientist @ Pitt • Dog dad 🐕 • Pilot 🪂 • #rstats • https://dylanpieper.github.io

218 Followers  |  1,602 Following  |  94 Posts  |  Joined: 11.11.2023

Latest posts by dylanpieper.bsky.social on Bluesky

ALT: a colorful background with the words "the more you know" and a star

A little-known fact is that RStudio rendering is powered by users’ electromagnetic fields (i.e., “good vibes”) and the exodus to Positron has severely limited its ability to compile code. #rstats

05.08.2025 11:48 | 👍 9    🔁 1    💬 1    📌 0

Remember this #rstats post? I wasn't the only one talking about it & the tidyverse team was listening 😎 #databs

New #dplyr functions? They're looking for feedback!!
🤔 replace_when, recode_values, replace_values

👀 Read this:
github.com/tidyverse/ti...

🗣️ Comment on PR:
github.com/tidyverse/ti...

04.08.2025 17:30 | 👍 50    🔁 13    💬 4    📌 2
We Need to Talk About Pedocon Theory The connection between Donald Trump and Jeffrey Epstein is no accident, but reveals a deep logic at the heart of reactionary politics.

I think pedocon theory is right. It’s empirically adequate, parsimonious, fits within a broader theoretical framework, and has immense explanatory breadth and depth www.liberalcurrents.com/we-need-to-t...

29.07.2025 11:10 | 👍 76    🔁 27    💬 2    📌 1

I am such a sucker for frivolous uses of AI. Here's an anthem for the tidyverse: suno.com/s/iVMVs4IoyA...

11.07.2025 20:35 | 👍 10    🔁 2    💬 3    📌 1

Modules + Claude Code for simple but labor-intensive edits across files

12.07.2025 11:28 | 👍 0    🔁 0    💬 0    📌 0

Very cool to see authors of this article mentioning the importance of sharing project-, data-, AND variable-level documentation alongside data in a repository, and linking to the templates I've provided on OSF as an example! 🌟

doi.org/10.1515/ling...

08.07.2025 19:05 | 👍 29    🔁 7    💬 2    📌 0

Tbh I relate to that big yellow spike with mad uncertainty around age 30. 🤣

08.07.2025 12:27 | 👍 2    🔁 0    💬 0    📌 0

As a data manager, good documentation not only helps me do my job better, but also helps me annoy you less! 😅

Good documentation about inclusion criteria, READMEs about oddities in the data, consort diagrams and tracking to explain missing data, and so on, are all ways to ensure I bug you less! 🐛🐜🐝

27.06.2025 17:46 | 👍 22    🔁 2    💬 0    📌 1

I think if you’re curious and truly care about problem solving you might have a ~temporary~ feeling of closure or a premature commit. But you will keep iterating (opening/closing) as you explore the problem space and how it works, validate the throughput, and improve the methods. Stay curious!

19.06.2025 13:57 | 👍 3    🔁 0    💬 0    📌 0
Pitfalls of premature closure with LLM assisted coding When LLMs generate clean, professional-looking code, it's tempting to stop exploring alternatives. But therein lie the risks that come with premature closure. So what is premature closure?

New to me is the term "premature closure", where you too quickly latch on to the first solution you see. Always a danger in coding, but particularly so today when LLMs can give you a plausible fix so so quickly.

www.shayon.dev/post/2025/16...

18.06.2025 14:17 | 👍 99    🔁 15    💬 7    📌 4
Compare Similarity Across Text, Factors, or Numbers Compare lists of texts, factors, or numerical values to measure their similarity. The motivating use case is evaluating the similarity of large language model responses across models, providers, or pr...

I would use cosine in stringdist. If you have lists of job descriptions (from two sources with each idx being a similar job), you can use my package samesies. dylanpieper.github.io/samesies/

19.06.2025 11:11 | 👍 3    🔁 0    💬 1    📌 0
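
A minimal sketch of the cosine suggestion above, assuming the stringdist package is installed. The two vectors of job descriptions are made up for illustration, and q = 2 (character bigrams) is an arbitrary choice rather than a recommendation from the post.

```r
library(stringdist)

# Hypothetical paired job descriptions: element i in each vector describes a similar job
source_a <- c("senior data scientist, R and SQL", "registered nurse, pediatric unit")
source_b <- c("sr. data scientist (R, SQL)", "pediatric registered nurse")

# Element-wise cosine similarity on character bigram profiles
# (0 = no shared bigrams, 1 = identical bigram profiles)
sims <- stringsim(source_a, source_b, method = "cosine", q = 2)

# Line the pairs up with their scores for inspection
data.frame(source_a, source_b, similarity = round(sims, 2))
```

Because the comparison is element-wise, both vectors need to be in the same order, with each index pointing at the matching job.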

So cool! Any intros or docs planned for helping people familiar with the futureverse make the leap to mirai?

14.06.2025 10:56 | 👍 1    🔁 0    💬 0    📌 0
Functional Programming Tools A complete and consistent functional programming toolkit for R.

Bleeding edge update for the #tidyverse purrr package with even more seamless #rstats parallel maps.

Introducing our shiniest new adverb: `in_parallel()`. Just wrap your function to take advantage of blazing fast parallel processing via mirai.

pak::pak("tidyverse/purrr")

purrr.tidyverse.org/dev/

13.06.2025 15:32 | 👍 103    🔁 32    💬 6    📌 1
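
A minimal sketch of the adverb described above, assuming a purrr version that exports `in_parallel()` (the development version from the post, or a later release) and the mirai package are installed. The squaring function and the choice of two daemons are illustrative only.

```r
library(purrr)

# Start two local background R processes (daemons) for mirai to dispatch work to
mirai::daemons(2)

# Wrap the function with in_parallel(); it should be self-contained because it
# runs in a separate R process and cannot see objects in the calling session
squares <- map(1:4, in_parallel(\(x) x^2))

# Shut the daemons down when finished
mirai::daemons(0)
```

For a computation this small the overhead of dispatching to other processes outweighs any gain; the pattern pays off when each call is slow.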
Prior Predictive Checks with marginaleffects and brms – Vincent Arel-Bundock

One cool thing you can/should do is sample from priors only, and plot the distribution of the actual quantity of interest (ex: risk ratio). I find this very useful. This is actually super easy with brms. arelbundock.com/posts/margin...

12.06.2025 21:52 | 👍 21    🔁 2    💬 1    📌 0
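
The idea of sampling from priors only and looking at the quantity of interest can be sketched without fitting anything. This is a plain base R simulation, not the brms and marginaleffects workflow from the linked post; the logistic model, the Normal priors, and their scales are all assumptions chosen for illustration.

```r
set.seed(123)
n_draws <- 4000

# Assumed priors for a logistic regression with one binary exposure
intercept <- rnorm(n_draws, mean = 0, sd = 1.5)  # log-odds of the outcome when unexposed
slope     <- rnorm(n_draws, mean = 0, sd = 1)    # log-odds ratio for exposure

# Push each prior draw through the model to get the implied risk in each group
risk_unexposed <- plogis(intercept)
risk_exposed   <- plogis(intercept + slope)

# The quantity of interest: the risk ratio implied by the priors alone
risk_ratio <- risk_exposed / risk_unexposed

# Inspect the implied distribution (log scale keeps the long right tail readable)
quantile(risk_ratio, probs = c(0.025, 0.5, 0.975))
hist(log(risk_ratio), breaks = 50, main = "Prior-implied log risk ratio")
```

If most of the mass sits at risk ratios that are implausible for the subject area, the priors are looser than intended, even if each one looks reasonable on the log-odds scale.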
Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department | Stitch Fix Technology – Multithreaded “What is the relationship like between your team and the data scientists?” This is, without a doubt, the question I’m most frequently asked when conducting i...

This blog post about engineering not doing ETL is nine years old… it’s worth reviewing

multithreaded.stitchfix.com/blog/2016/03...

11.06.2025 01:57 | 👍 20    🔁 5    💬 3    📌 0

The worst is when you write in active voice and then someone tries to edit all of it back into passive. Old habits die hard and the good fight continues.

04.06.2025 12:41 | 👍 1    🔁 0    💬 1    📌 0

You could use surveydown and provide the LLM with the package docs

31.05.2025 13:21 | 👍 1    🔁 0    💬 1    📌 0
screenshot of a code editor showing the following R code:

library(quantmod)
library(ggplot2)
library(lubridate)

startYear <- 2015
startDate <- paste0(startYear, '-01-01')
getSymbols(c('spy', 'btc-usd'), from= startDate)

# function factory that creates a scale function that only shows valid years.
# try to keep code that could change in here!
make_valid_year_scale_function <- function(start_year){
  function(){
    list(
      scale_x_continuous(breaks = seq(start_year, Sys.Date() |> year(), 1)),
      theme(panel.grid.minor.x = element_blank()) # use function after other theme funcs
    )
  }
}

# this makes it so I can add scale_x_valid_years() to any plot
scale_x_valid_years <- make_valid_year_scale_function(startYear)

Here's a functional programming trick for #rstats that I wish I started using sooner:

if you need a #ggplot2 scale to be reusable across multiple plots and dynamically configurable without relying on global state, consider using a function factory (a function that returns a function) to build it

29.05.2025 23:36 | 👍 36    🔁 6    💬 6    📌 0
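
A short usage sketch of the function-factory trick, assuming ggplot2 is installed. The names `make_year_scale`, `scale_x_from_2015`, and `scale_x_from_2020`, the `end_year` argument, and the toy data are hypothetical and not taken from the post's screenshot; the point is only that each generated scale carries its own configuration.

```r
library(ggplot2)

# Function factory: captures start_year and end_year, returns a function that builds the scale
make_year_scale <- function(start_year,
                            end_year = as.integer(format(Sys.Date(), "%Y"))) {
  # Evaluate the arguments now so later changes to outside variables cannot leak in
  force(start_year)
  force(end_year)
  function() {
    list(
      scale_x_continuous(breaks = seq(start_year, end_year, by = 1)),
      theme(panel.grid.minor.x = element_blank())
    )
  }
}

# Two independently configured scales, no global variable needed
scale_x_from_2015 <- make_year_scale(2015)
scale_x_from_2020 <- make_year_scale(2020)

# Toy data for illustration
df <- data.frame(year = 2015:2024,
                 value = cumsum(c(2, 1, 3, 1, 2, 4, 1, 2, 3, 1)))

p_all    <- ggplot(df, aes(year, value)) + geom_line() + scale_x_from_2015()
p_recent <- ggplot(subset(df, year >= 2020), aes(year, value)) +
  geom_line() + scale_x_from_2020()
```

Returning a list works because ggplot2 adds each element of a list to the plot in turn, so the scale and the theme tweak travel together.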
shikokuchuo{net}: mirai 2.3.0 Advancing Async Computing in R

mirai - minimalist async framework for #RStats - released as an 'r-lib' package.

Blog post: Advancing Async Computing in R.
shikokuchuo.net/posts/26-mir...

mirai provides event-driven async for #RShiny and parallel processing for purrr #tidyverse.

Really excited to be working on this at Posit!

23.05.2025 14:11 | 👍 64    🔁 19    💬 1    📌 0
Restoring Gold Standard Science By the authority vested in me as President by the Constitution and the laws of the United States of America, including section 7301 of title 5, United

tl;dr: this EO co-opts the language of open science to implement a system of political control wherein presidential appointees are given broad latitude to designate any number of reasonable scientific activities and inferences as scientific misconduct, and to penalize those involved accordingly.

24.05.2025 21:27 | 👍 2498    🔁 1078    💬 103    📌 109

There's so much polarization around LLMs. They are way overhyped, I agree. But I also use them semi-regularly now.

Here's a thread of genuine use cases where I find them helpful. Please add your own!

20.05.2025 19:51 | 👍 95    🔁 22    💬 7    📌 13
Introducing {shinyfa}: Analyze Large Shiny App Codebases Faster with This R Package | Daly Analytics Discover {shinyfa}, a new R package designed to improve developer experience by analyzing and summarizing the structure of large Shiny applications. Perfect for consultants, teams, and contributors wo...

📦 I’m excited to share a new #rstats package I’ve been working on: {shinyfa} built to help folks working on large or unfamiliar #rshiny apps ✨

The package scans your app folders and extracts details on render*(), reactive() and input$ into a data frame!

📖 www.dalyanalytics.com/blog/shinyfa...

19.05.2025 13:47 | 👍 13    🔁 3    💬 2    📌 0

Playing around with satellite imagery of #madison to make some office art. #Rstats

18.05.2025 13:24 | 👍 5    🔁 1    💬 0    📌 0

✨ Use LLMs from #rstats with ellmer ✨ Version 0.2.0 is on CRAN now. No blog post yet because I'm about to go on vacation, but in the meantime you can check out the release notes: github.com/tidyverse/el....

18.05.2025 14:13 | 👍 69    🔁 14    💬 3    📌 0

The kind of Friday morning content I needed to see. ❤️

16.05.2025 11:35 | 👍 14    🔁 1    💬 0    📌 0
Text: posit conf 2024 Virtual Tickets Available, Atlanta, September 16-18. A drawing outline of the Atlanta skyline and abstract cubes.

Registration for the posit::conf(2025) virtual experience is now open!

Join us virtually, Sept 16–18, and access live-streamed keynotes and 100+ talks, on-demand recordings, Q&A sessions, and our virtual networking platform.

Learn more in the blog post: posit.co/blog/posit-c...

#RStats #Python

15.05.2025 14:59 | 👍 21    🔁 17    💬 1    📌 2
Changelog

In case you missed it, we recently updated some of our packages, including many new features (again) in the #rstats #easystats {modelbased} package:
easystats.github.io/modelbased/n...
The last weeks we were working a lot on improving support and performance for Bayesian models and especially

15.05.2025 18:10 | 👍 14    🔁 5    💬 1    📌 0

Don’t forget you can use distinct_all(), which avoids this problem if you’re looking to drop completely duplicated rows

15.05.2025 11:40 | 👍 4    🔁 0    💬 1    📌 0
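
A small sketch of the tip above, assuming dplyr is installed and using a made-up data frame. In current dplyr releases distinct_all() is superseded, and calling distinct() with no column arguments does the same job.

```r
library(dplyr)

# Toy data with one fully duplicated row (rows 1 and 2 are identical)
visits <- tibble(
  id    = c(1, 1, 2),
  site  = c("A", "A", "B"),
  score = c(10, 10, 7)
)

# Keep only rows that are unique across every column
deduped <- distinct(visits)

# The older spelling from the post gives the same rows
same <- distinct_all(visits)
```

Rows that differ in even one column are kept, so this only removes exact duplicates.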

I'm still thinking about my favorite quote from the Posit Data Science Hangout today. It perfectly sums up what I hope I provide to the researchers I work with: a trusted partner, who is there to support them in their work.

Earn a reputation for being a good person to work with
- Cara Thompson

08.05.2025 18:58 | 👍 25    🔁 5    💬 0    📌 0

Data science = made with ❤️

Data science = made with sugar, spice, and everything nice 🤷🏼‍♂️

We’ll get there someday 😂

08.05.2025 11:37 | 👍 2    🔁 0    💬 0    📌 0

@dylanpieper is following 19 prominent accounts