He doesn’t believe that “brekkie” is real; fair enough I barely believe it myself
17.02.2025 02:04
@drob.bsky.social
Director of Engineering at Heap. #rstats fan. Dad x2. He/him
My son came up with a silly language where you add -ie to the end of each word. Like a shirt is a “shirtie” or milk is “milkie”
So I told him about Australia and he absolutely lost it
Blown away
Using OpenAI’s Deep Research is like collaborating with a PhD student
(It told me it would get right on it then ghosted me)
Thanks for sharing- I might get back into them!
01.02.2025 20:44
By convention, I use all caps for any function or infix operator I want to be passed to SQL
Because that avoids the possibility of conflicting with an R function, which will lead to an error when dbplyr finds it and tries to apply it
(E.g. lowercase extract() would have had a conflict from tidyr)
That’s right, %FROM% isn’t from a package; dbplyr turns any unrecognized infix operator directly into SQL (much like it does with variable names)
Fun fact; %FrOm% would work too
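A minimal sketch of the pass-through behavior described above, using dbplyr's simulated connection so no real database is needed. `FOOFIFY` and the column `x` are made up for illustration, and the exact SQL printed may differ by dbplyr version and backend.

```r
library(dbplyr)

# Simulated connection: lets us look at the generated SQL without a database
con <- simulate_dbi()

# An all-caps function dbplyr does not recognize (FOOFIFY is hypothetical):
# it is passed through to SQL under the same name
fn_sql <- translate_sql(FOOFIFY(x), con = con)

# An infix operator dbplyr does not recognize: the text between the
# percent signs is used as the SQL operator
like_sql <- translate_sql(x %LIKE% "%foo%", con = con)

print(fn_sql)
print(like_sql)
```

The all-caps convention is only there to stay clear of R functions with the same name; the translation itself does not require it.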
Being able to read is OP for a dad
I take my son around the Natural History Museum, he asks about anything, and I rattle off what the plaque says
He thinks I’m a goddamn genius
Here is a talk by @drob.bsky.social at @posit.co's conf 2019. The ideas shaped and voiced there are priceless. I've been suggesting this talk to my @datavizartskill.ikashnitsky.phd students ever since, and hopefully some of them found it as useful and motivating /3
youtu.be/th79W4rv67g?...
Apple Music Replay '24 #1 Taylor Swift: 25,520 #
"How did you spend 2024?"
"I'll tell you how I spent 5% of it"
library(tidyverse)
library(adventdrob)

input <- advent_input(2, 2024)
x <- input$x

is_safe <- function(report) {
  d <- diff(report)
  return((all(d > 0) || all(d < 0)) && all(between(abs(d), 1, 3)))
}

is_safe_part2 <- function(report) {
  return(is_safe(report) || any(map_lgl(seq_along(report), \(i) is_safe(report[-i]))))
}

input$x %>%
  str_split(" ") %>%
  map(as.numeric) %>%
  map_lgl(is_safe_part2) %>%
  sum()
My #rstats solution to Day 2 of #adventofcode
* I feel like half of Advent of Code puzzles need a diff(), especially in the early days!
* Didn't use much tidyverse today (except map_lgl and between, but those could have easily been replaced)
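To make the diff() point concrete, here is a small base-R sketch of the same check on made-up reports. `is_safe_base` is a hypothetical name (not from the post), and it swaps dplyr's between() for plain comparisons.

```r
# Base-R version of the safety check; is_safe_base is a hypothetical name
is_safe_base <- function(report) {
  d <- diff(report)                      # step between neighbouring levels
  monotone <- all(d > 0) || all(d < 0)   # strictly rising or strictly falling
  monotone && all(abs(d) >= 1 & abs(d) <= 3)
}

# Made-up reports for illustration
falling <- c(7, 6, 4, 2, 1)
jumpy   <- c(1, 2, 7, 8, 9)
fixable <- c(1, 3, 2, 4, 5)

diff(falling)               # -1 -2 -2 -1
is_safe_base(falling)       # TRUE
is_safe_base(jumpy)         # FALSE: the step from 2 to 7 is 5
is_safe_base(fixable)       # FALSE: rises, falls, rises again
is_safe_base(fixable[-2])   # TRUE: dropping the second level gives 1 2 4 5
```

The last line is the Part 2 idea in miniature: try the report with each single element removed.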
input %>%
  separate(x, c("first", "second"), convert = TRUE) %>%
  summarize(sum(abs(sort(second) - sort(first))))
When I woke up I realized my Part 1 could have been WAY shorter with sort 🤦‍♂️
01.12.2024 15:44
library(tidyverse)
library(adventdrob)

input <- advent_input(1, 2024)

separated <- input %>%
  separate(x, c("first", "second"), convert = TRUE)

# Part 1
separated %>%
  gather(type, value) %>%
  group_by(type) %>%
  mutate(rank = rank(value, ties.method = "first")) %>%
  ungroup() %>%
  spread(type, value) %>%
  summarize(sum(abs(second - first)))

# Part 2
separated %>%
  count(first = second, sort = TRUE) %>%
  inner_join(separated, by = "first") %>%
  summarize(sum(first * n))
My #rstats solution to Day 1 of #adventofcode
* Fun use of gather and spread (I know I'm supposed to be using pivot_longer and pivot_wider, but old-dog-new-tricks)
* One step I got stuck on was setting a ties.method in rank()
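A quick base-R illustration of the ties.method point, on a made-up vector. My reading (an assumption, not stated in the post) is that the default averaged ranks give tied values the same rank, so rank stops being a unique key for pairing the two columns.

```r
# Made-up values with one tie
values <- c(3, 1, 3, 2)

# Default ties.method = "average": the two 3s share rank 3.5
avg_ranks <- rank(values)                           # 3.5 1.0 3.5 2.0

# ties.method = "first": ties are broken by position, so ranks are unique
first_ranks <- rank(values, ties.method = "first")  # 3 1 4 2

anyDuplicated(avg_ranks) > 0     # TRUE: rank 3.5 appears twice
anyDuplicated(first_ranks) > 0   # FALSE: every rank is distinct
```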
Who is doing #rstats Advent of Code this year?
01.12.2024 02:47
Upfront (allupfront.com) is fixing the childcare industry through accurate, complete data so the government, parents, and providers can all make decisions and operate with optimal results. Our SaaS platform cleans, validates, and provides vital insights on childcare data (price, hours, location, availability, etc.) and serves as a central hub for every stakeholder. A Techstars portfolio company, Upfront has seen rapid growth with customers such as the states of Maryland, Arizona, and North Carolina.
We are looking for an engineer to maintain our system, ingest data from our clients, clean and enrich that data, and integrate it into our production system.
Responsibilities:
* Build a system for ingesting daily batches of data from a client’s API
* Develop and maintain ETL scripts for processing and enriching that data
* Manage data quality and accuracy, such as developing automated tests
We expect you to have:
* The ability to architect a data ingestion platform from the ground up
* Extensive experience with at least one platform for scheduled ETL pipelines
* Extensive experience in dbt, AWS and Snowflake
* Intermediate to advanced proficiency in Python
* Attention to detail and proactivity when it comes to data quality
Extra points for:
* Skill at visualizing and drawing insights from data
* Experience pulling data from public websites
* Scrappy mindset: we're a small but smart team, and nothing is above or below our job title
My wife Dana is hiring a full-time Data Engineer at her company!
Great role for someone with strong experience in Python, dbt, and Snowflake who wants to join a growing startup in the government data space
Please forward to strong data folks you know!
www.linkedin.com/jobs/view/40...
Was there a period where you were using the early tools personally / in teaching before you uploaded them to CRAN?
Did that change over the course of reshape, reshape2, ggplot, ggplot2?
`rm -rf` ❌
• remove rf??? what does that even mean???? (nothing)
• boring
• hard to remember
`rm -fr` ✅
• remove forreal
• makes u smile every time
• will never forget
#rstats is actually fighting about Base versus Tidyverse again on this platform. We are so back
14.11.2024 23:34
ggplot(data, aes(x,y)) +
geom_jitter(velocity = units(17000, "mph"))
help, my data is stuck in LEO
#RStats #ggplot #dataviz
I think about this piece at least once a week: nothinghuman.substack.com/p/the-tyrann...
11.11.2024 17:42
Tired: P-hacking
Wired: Querymandering
I’ve been thinking about this too!!
I think one important shift in the last ten years is that data analysts are much more likely to use SQL + scripting, so “analysts that can program” is no longer a niche that gets its own title
Question for #databs folks:
I am searching for a recent write up of how data careers/titles are evolving. Has anyone written or read something that resonated on this lately?
I’m hoping for a boots-on-the-ground point of view of basically “where have all the data scientists gone”
Remember in Squid Game where the contestants in mortal danger barely managed to reach safety through a popular vote
And then later the dull banality of regular life drove them to voluntarily re-enter mortal danger
Dunno what made me think of that
My fav starter packs so far, a thread:
stats: go.bsky.app/Ki7PjpS
stats: go.bsky.app/7TBN5rX
causal inference: go.bsky.app/FdemGAZ
package devs: go.bsky.app/N1569Qh
data peeps: go.bsky.app/8TdEfdK
medical stats: go.bsky.app/ArqEz36
bioinformatics: go.bsky.app/Ha64Gmv
r-ladies: go.bsky.app/Vgxwa2F
We've got a brand new, baby website for Positron! Take a look if you are interested in getting started, and please let us know how it goes:
positron.posit.co
I miss when bluesky was good
25.05.2023 14:18
Twitter is San Francisco: still tech leader, but experiencing a doom loop
Mastodon is Boise: had a big wfh surge, but nothing actually there
Substack is Cambridge MA: go when you want to learn
This place is Miami