Hey, thank you so much for your answer - forgot to get back to you.
09.01.2025 16:52
Absolutely - I'm thinking about that, too: how does one make sure that the analytics or data engineer has explored the data of the model they have built?
Asking for descriptive stats seems like a good way to do so, as even in new tables there aren't that many "different-from-usual" columns
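To make that concrete, the audit I have in mind is nothing fancy - a quick pass over row counts, null rates and ranges before the PR goes up. Something along these lines (the model and column names here are made up for illustration):

```sql
-- Minimal "did you actually look at it?" pass over a new model.
-- fct_orders, order_id, ordered_at and amount are hypothetical names.
select
    count(*)                                          as row_count,
    count(distinct order_id)                          as distinct_order_ids,
    sum(case when order_id is null then 1 else 0 end) as null_order_ids,
    min(ordered_at)                                   as first_ordered_at,
    max(ordered_at)                                   as last_ordered_at,
    avg(amount)                                       as avg_amount
from fct_orders;
```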
Thanks for the recommendation!! Currently in the process of implementing dbt-checkpoint in our CI for making sure that our style guide is followed through
20.12.2024 15:11
@nicoritschel.com hey Nico! I'm finding out about sidequery and it looks so cool.
Is it possible to query s3 parquet files? Thinking it might be a good way to explore data transformations that come out of dbt in a qualitative way
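For context, this is the rough shape of what I'm after, sketched with plain DuckDB rather than sidequery itself (I don't know what sidequery exposes; the bucket and path are made up, and it assumes the httpfs extension and S3 credentials are already configured):

```sql
-- Query the parquet files dbt materialised to S3, just to eyeball
-- the output of a transformation. Paths are hypothetical.
install httpfs;
load httpfs;

select *
from read_parquet('s3://my-bucket/dbt_output/fct_orders/*.parquet')
limit 100;
```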
I can think of adding a PR template that makes the developer say "yeah, I audited the data", and that might be a good first practice. It's just so contingent on the business requirements to make sure that the data can be useful
19.12.2024 08:15
How do folks handle developers validating data quality before opening a PR? I'm talking about the small things, such as not naming a string column full of unrelated content "something_id", and such.
Impossible to check programmatically (afaik), but it has a huge impact on data consumption #databs
"I meet lots of idealistic folks who think that all theyโre missing is money, or credentials, or access to the levers of power. More often, what theyโre really missing is friends."
14.12.2024 15:39
After reading Benn's latest piece, open.substack.com/pub/benn/p/i..., I'm curious: what cliffs to climb are there for data people?
I can see growth roles, as pointed out by Abhi, product roles, finance roles.
I guess it's really open, the harder question being: how does one get there? #databs
More and more convinced that, at sales-driven companies, departments other than Sales end up becoming cynical about the product. There's this sense that you're bluffing the customer, as if in the process of zeroing in on selling more you expel the care and soul from the product
13.12.2024 13:05
Alright, so now most of my followers are porn bots.
I wanted to become Bluesky famous, but not like this
Anyone using the Activity Schema in practice? How does it hold up against better-known ones such as Dimensional Modeling?
Asking because we are starting a project from scratch and I'm curious about how it holds up
#databs
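For anyone who hasn't seen it: as I understand the spec, the Activity Schema puts everything into one append-only activity stream per entity instead of facts and dimensions per subject area. Roughly this shape (a sketch from memory, not the definitive DDL):

```sql
-- One row per activity per customer; questions get answered by windowing or
-- self-joining this single stream ("first order after signup", etc.).
create table customer_stream (
    activity_id     varchar,    -- unique id for the activity record
    ts              timestamp,  -- when the activity happened
    customer        varchar,    -- the entity the stream is about
    activity        varchar,    -- e.g. 'completed_order', 'submitted_ticket'
    feature_1       varchar,    -- loosely typed metadata slots
    feature_2       varchar,
    feature_3       varchar,
    revenue_impact  double,
    link            varchar     -- link back to the source record
);
```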
New job starting today. Super excited.
Wish me luck!
So glad you did!
02.12.2024 07:25
Data Work would be akin to using the Hunting Horn in Monster Hunter: you're not essential for the hunt, but you get to enjoy being the coolest while gifting buffs to the ones that can hit the hardest
28.11.2024 19:54
well this looks super cool! congrats, will get it for sure :)
27.11.2024 09:59
What I'm most excited about, though, is the focus on delivering value & the concrete ways to do so brought by people such as @cedricchin.bsky.social + XmR Charts & @abhisivasailam.bsky.social + Metric Trees.
"Identifying the levers of the business & helping pulling them" is so hot for the data people
little update: uv added ~3s of overhead, so 4s it is.
Point remains, though: `sdf lint` seems like an actual contender to displace sqlfluff in the linter category just because of how fast it is
and I say this without wanting to throw any shade at sqlmesh at all. They've made so many amazing design choices.
However, following the JS ecosystem's trend of using Rust (or Zig, or whatever) as the tooling language seems like the right decision. Speed matters. It seems to unlock other cool stuff along the way too.
What's cool about sdf is that it is the first transformation tool in the data space that feels fast. A pity that it's not open source.
Sqlmesh, in comparison, is really slow. Running `uv run sqlmesh plan` in a small personal project takes ~7s, which is great if I compare it to dbt, but still.