Erika Pullum (she/her)

@erikapullum.bsky.social

Engineer, climber, currently working as Head of Data at Hex. www.erikapullum.com

318 Followers  |  217 Following  |  42 Posts  |  Joined: 19.11.2024

Latest posts by erikapullum.bsky.social on Bluesky

We're looking for someone to join our lovely team at Data Orchard as an Analytics Engineer! This is a really diverse and exciting role working on a wide range of #data projects for our wonderful nonprofit clients and our own data products. Please do apply if this sounds up your street 👩‍💻 #dataSky

04.06.2025 10:45 | 👍 1    🔁 2    💬 0    📌 0
The Data Visualisation Catalogue: A handy guide and library of different data visualization techniques, tools, and a learning resource for data visualization.

This is fun and I haven't seen it before! datavizcatalogue.com/index.html

#dataBS

11.04.2025 17:31 | 👍 4    🔁 1    💬 0    📌 0

The biggest challenge at a startup that's growing quickly is how often your job changes. What matters and where your focus needs to be is always changing.

15.01.2025 21:54 | 👍 2    🔁 0    💬 0    📌 0

I wish there was a way for the dbt Cloud Explore DAG to filter to content that exists *between* two nodes. Anyone know of a fun work-around for this with other tools?

#dataBS

15.01.2025 17:51 | 👍 2    🔁 1    💬 1    📌 0
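
One possible workaround, sketched outside of dbt Cloud: the set of nodes "between" two nodes is the intersection of the upstream node's descendants and the downstream node's ancestors. The sketch below is a minimal illustration under stated assumptions. The `child_map` dictionary and every model name in it are hypothetical, and `reachable` and `nodes_between` are helper names invented here, not part of dbt.

```python
from collections import deque


def reachable(start, adjacency):
    """Return every node reachable from `start` by following `adjacency` edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def nodes_between(child_map, upstream, downstream):
    """Nodes that are both descendants of `upstream` and ancestors of `downstream`."""
    # Invert the parent -> children mapping so we can also walk upstream.
    parent_map = {}
    for parent, children in child_map.items():
        for child in children:
            parent_map.setdefault(child, []).append(parent)

    descendants = reachable(upstream, child_map)
    ancestors = reachable(downstream, parent_map)
    return descendants & ancestors


# Hypothetical toy lineage: parent model -> list of child models.
child_map = {
    "stg_orders": ["int_orders", "int_refunds"],
    "int_orders": ["fct_orders"],
    "int_refunds": ["fct_refunds"],
    "fct_orders": ["rpt_revenue"],
    "stg_customers": ["fct_orders"],
}

between = nodes_between(child_map, "stg_orders", "rpt_revenue")
print(sorted(between))
```

The two endpoints are excluded from the result; add them back if the filtered view should include them. dbt's documented node selection syntax also has `+` graph operators and comma-separated intersections, so a selector along the lines of `model_a+,+model_b` may express the same idea from the CLI. Treat that as something to verify rather than a confirmed answer.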

What's on my mind this week (as Head of Data at a growth stage startup)

* Hiring 🔜
* Upcoming vendor contract negotiations
* Priorities pulse check with our stakeholders
* Couple lil IC tasks to standardize a metric a bit better

#dataBS

13.01.2025 20:09 | 👍 11    🔁 0    💬 1    📌 0

Naming is hard.

#dataBS

13.01.2025 16:53 | 👍 16    🔁 0    💬 1    📌 0

Perpetually flip flopping on self serve analytics

Some days it feels like it could be the holy grail - enabling downstream users to do their own exploration and analysis on governed metrics. Fewer ad hoc requests! Reporting automation!

Other days: So what? Does that actually move the needle?

07.01.2025 18:53 | 👍 3    🔁 1    💬 2    📌 0

The questions to decide are:
* Where is the inflection point where ROI and effort are aligned -- aka how deep to invest
* For which functions should it be prioritized
* What does the support model look like
* What is the process for getting more help when needed

07.01.2025 20:36 | 👍 1    🔁 0    💬 0    📌 0

As a leader, you can't ignore it and it's impossible to universally achieve. When making consequential decisions stakeholders want consensus and a partner.

When assets exist you still have to teach new people what they're called and where to find them. Metrics have to keep up with the biz.

07.01.2025 20:36 | 👍 2    🔁 0    💬 1    📌 0

Don't think of it as whether the data has error or not. Most of the time it will.

Think of it as whether the error in the data will cause you to make a different decision than if the data was perfectly clean. Most of the time it won't.

02.01.2025 16:49 | 👍 52    🔁 9    💬 0    📌 1

One of the biggest tips I have for anyone doing data analysis, especially data from people, is to spend some time drilling down to the most granular data and just looking at individual records. You will find the craziest shit you never imagined and your analysis will be better for it #databs

02.01.2025 14:40 | 👍 42    🔁 6    💬 1    📌 1

My #dataBS new year's resolution is to stop feeling guilty when I tell someone to go self serve their QQ. What's yours?

02.01.2025 20:39 | 👍 6    🔁 0    💬 0    📌 0

Dear Santa, All I want for Christmas is for any of the abandoned GitHub repos that half-implemented the Hemingway writing app as a VS Code extension to finish the job. Please put your best TypeScript dev elf on this 🙏

20.12.2024 01:18 | 👍 24    🔁 3    💬 0    📌 0

Watching Hex's internal hack week demos is like Christmas early

13.12.2024 21:09 | 👍 2    🔁 0    💬 0    📌 0

Has anyone cracked the code on the most efficient way to align on metric definitions with your customers - key word being efficient?

13.12.2024 16:01 | 👍 0    🔁 1    💬 0    📌 0

I'm going to cite this as evidence of how great our team is next time we get to hire, watch me.

12.12.2024 21:15 | 👍 1    🔁 0    💬 0    📌 0

turning your pain into art I see 😆

12.12.2024 19:15 | 👍 2    🔁 0    💬 1    📌 0

a poem by poetry:

poetry run dbt compile
command not found: dbt

poetry show | grep dbt
dbt 1.0.0.38.22

pip freeze | grep dbt
dbt==1.0.0.38.22

which dbt
dbt not found

(look, no one said it was a happy poem)
#dataBS

12.12.2024 14:50 | 👍 30    🔁 4    💬 3    📌 0

Which is a bummer because unless you carefully guard the inputs the recursion can run away forever!

11.12.2024 22:52 | 👍 0    🔁 0    💬 0    📌 0

Fascinating! Confirmed this works in DuckDB, but would need some concepts ported to Snowflake to work in our warehouse. Nice work on this! #dataBS

11.12.2024 18:40 | 👍 5    🔁 0    💬 0    📌 0

Wait I found a solution in pure SQL using list transformations (works in DuckDB): gist.github.com/aranke/74206...

11.12.2024 16:44 | 👍 6    🔁 1    💬 1    📌 2

It nerd sniped our whole team in the best way! We used a recursive solution!

11.12.2024 14:22 | 👍 1    🔁 0    💬 1    📌 0

The Boolean is the correct answer! The challenge is to create it. I haven't found a way to do it in SQL with lag functions but I'd love to see one!

11.12.2024 14:21 | 👍 2    🔁 0    💬 1    📌 0

And full confession I did not solve it

10.12.2024 20:13 | 👍 2    🔁 0    💬 0    📌 0

to make the base
select '2024-06-10' as d

union all

select '2024-08-20' as d

union all

select '2024-08-22' as d

union all

select '2024-09-17' as d

union all

select '2024-09-19' as d

union all

select '2024-11-01' as d

union all

select '2024-12-11' as d

union all

select '2024-12-21' as d

10.12.2024 20:12 | 👍 3    🔁 0    💬 1    📌 0
A table of dates. 2024-06-10, 2024-09-17, and 2024-12-21 are labeled true. 2024-08-20, 2024-08-22, 2024-09-19, 2024-11-01, and 2024-12-11 are labeled false for is_after_cooldown

Super #SQL brainteaser for your Tuesday that kept a few folks on our team thinking pretty hard until someone made an elegant solution.

The desired behavior per unit:
* First event qualifies
* Subsequent events qualify only if it's been more than 90 days since the last qualifying event

Sample dates below #dataBS

10.12.2024 20:12 | 👍 22    🔁 1    💬 4    📌 3
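
The rule can be sketched procedurally to show why it resists a plain lag: each row's answer depends on the last date that qualified, not on the previous row, so state has to be carried forward. This is a minimal Python sketch, not the team's solution. The function name `flag_after_cooldown` and its `cooldown_days` parameter are invented for illustration, and it assumes the cooldown is measured from the last qualifying event, which is the reading consistent with the labels in the table of sample dates.

```python
from datetime import date, timedelta


def flag_after_cooldown(dates, cooldown_days=90):
    """Return (date, is_after_cooldown) pairs for one unit's events."""
    cooldown = timedelta(days=cooldown_days)
    last_qualifying = None  # carried state: most recent date that qualified
    flagged = []
    for d in sorted(dates):
        # The first event always qualifies; later ones need a gap of more
        # than `cooldown_days` since the last qualifying event.
        qualifies = last_qualifying is None or (d - last_qualifying) > cooldown
        if qualifies:
            last_qualifying = d
        flagged.append((d, qualifies))
    return flagged


# Sample dates from the thread.
sample = [
    date.fromisoformat(s)
    for s in [
        "2024-06-10", "2024-08-20", "2024-08-22", "2024-09-17",
        "2024-09-19", "2024-11-01", "2024-12-11", "2024-12-21",
    ]
]

for d, is_after_cooldown in flag_after_cooldown(sample):
    print(d.isoformat(), is_after_cooldown)
```

On the sample dates this flags 2024-06-10, 2024-09-17, and 2024-12-21 as true and the rest as false. Comparing each row only to the row before it would mark 2024-09-17 false, since it falls 26 days after 2024-08-22.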
A response from github user caitlinmoorman that says I.love.it.

aaaand it worked

09.12.2024 22:40 | 👍 5    🔁 0    💬 0    📌 0
A PR title for #1607 that reads `Erika/delight caitlin`

Continuing my theme of banger PRs -- this one improved documentation for a field confusing to a lot of folks

#dataBS

09.12.2024 22:35 | 👍 6    🔁 0    💬 1    📌 0

Just had a great chat about being the first data person at a startup.

Pros
* Building at high velocity is fun
* Work with smart people

Challenges
* Always more than you can do
* Hard to know how good to build when

Folks that have done it, what's on your pro/challenge list?

#datasky #dataBS

06.12.2024 16:54 | 👍 14    🔁 2    💬 5    📌 0