It was the worst of times, it was *tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab tab* is a far, far better thing that I do, than I have ever done; it is a far, far better rest that I go to than I have ever known.
17.01.2025 01:02
As we work through integration details we'll share more about how this will work. But if you use dbt today, you'll be able to use this new tech.
14.01.2025 14:22
While SDF won't be included as part of the Apache 2.0 code base, we plan to make meaningful parts of SDF's capabilities available to all dbt users, whether you're using dbt Core or dbt Cloud.
14.01.2025 14:22
So what does this mean for dbt users? The first goal is to get SDF's SQL parsing capabilities integrated into dbt.
14.01.2025 14:22
Local Execution: Instead of having to hit your data platform in development, you can take that logical plan and execute it in a local environment.
14.01.2025 14:22
Lineage: SDF has both the highest-fidelity and the highest-performing SQL parsing on the market. And lineage and metadata are, of course, at the heart of the entire data control plane.
14.01.2025 14:22
Because SDF understands your SQL, it can detect errors without connecting to the remote database. Troubleshooting all of a sudden becomes far faster, as errors get caught as you are typing, not when you do a dbt run.
14.01.2025 14:22
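A minimal sketch of the idea in the post above, not SDF's implementation: once a query is held as typed objects rather than as a string, a checker can validate it against a local copy of the schema and report errors with no database connection. `CATALOG`, the node classes, and `check` are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass

# Hypothetical local schema catalog: table -> {column: type}.
# This is an assumption for the sketch, not SDF's metadata format.
CATALOG = {"orders": {"id": "int", "amount": "float", "status": "text"}}
NUMERIC = {"int", "float"}
LITERAL_TYPES = {int: "int", float: "float", str: "text"}


@dataclass(frozen=True)
class Column:
    table: str
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Compare:
    left: object
    right: object


def type_of(node, catalog):
    """Return the type of a node, raising ValueError on a static error."""
    if isinstance(node, Column):
        columns = catalog.get(node.table)
        if columns is None:
            raise ValueError(f"unknown table: {node.table}")
        if node.name not in columns:
            raise ValueError(f"unknown column: {node.table}.{node.name}")
        return columns[node.name]
    if isinstance(node, Literal):
        literal_type = LITERAL_TYPES.get(type(node.value))
        if literal_type is None:
            raise ValueError(f"unsupported literal: {node.value!r}")
        return literal_type
    if isinstance(node, Compare):
        left = type_of(node.left, catalog)
        right = type_of(node.right, catalog)
        # Numeric types may be compared with each other; otherwise types must match.
        if left != right and not {left, right} <= NUMERIC:
            raise ValueError(f"cannot compare {left} with {right}")
        return "bool"
    raise ValueError(f"unsupported node: {node!r}")


def check(node, catalog=CATALOG):
    """Return an error message, or None if the expression type-checks."""
    try:
        type_of(node, catalog)
    except ValueError as error:
        return str(error)
    return None
```

Calling `check(Compare(Column("orders", "amont"), Literal(10)))` reports the misspelled column using only the local catalog, which is the kind of feedback an editor could surface while you type.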
SDF's ability to understand SQL means that it can power IntelliSense in your IDE of choice. With every keystroke, SDF understands what you are typing and can automatically suggest what comes next, including suggesting table and column names.
14.01.2025 14:22
Developer experience: There are many things that will eventually go into this bucket, but here are two great examples:
14.01.2025 14:22
SDF parses and compiles dbt projects really, really fast: Because it's built in Rust, it simply runs faster than Python. As a result, SDF compiles the same dbt project multiple orders of magnitude faster than dbt Core. If you're working in a large dbt project, this will very much impact your productivity.
14.01.2025 14:22
Benefits for developers: faster, a better developer experience, lineage and local execution
14.01.2025 14:22
Integration is easy. SDF has adopted dbt's syntax, configuration, libraries, and Jinja natively, as part of the SDF runtime. As a result, for most dbt projects there will be no code changes required to take full advantage of SDF's capabilities!
14.01.2025 14:22
Unlike dbt historically (which has treated SQL as strings), SDF sees objects and types and syntax and semantics. In the same way that Virtual Machines (VMs) emulate physical hardware, SDF emulates the SQL compilers native to the data platforms you use.
14.01.2025 14:22
The toolchain is powered by a state-of-the-art development in SQL understanding. SDF represents each SQL dialect (Snowflake, Redshift, BigQuery, etc.) as a complete ANTLR grammar with definitions for all datatypes, coercion rules, functions, scoping intricacies and more.
14.01.2025 14:22
It is written in Rust, highly parallelized, and designed for scale.
14.01.2025 14:21
What is SDF? SDF is a high-performance toolchain for SQL development packaged into one CLI: a multi-dialect SQL compiler, type system, transformation framework, linter, and language server.
14.01.2025 14:21
Screen shot of article about podcast "The intersection of UI, exploratory data analysis and SQL"
An excellent overview of the evolution of EDA and data viz: from Tukey to BI tools, Python/R/JS tools, in-process databases e.g. DuckDB, WASM... and the exciting future changes to how data is done. Thanks @jthandy.bsky.social @hamilton.bsky.social
roundup.getdbt.com/p/the-inters...
23.12.2024 06:27
jealous. i never got this upgrade. i just rely on my 'doesn't need that much sleep' superpower :P
25.11.2024 19:15
Yep, agree. Unsurprisingly I see a lot of this as an ecosystem problem and think SWE is ahead because of the persistent compounding effects of OSS over the course of multiple decades. I think data people are constantly forced to make this shit choice when they should be able to have both.
22.11.2024 21:41
sorry, legit not meant as a snipe, i found your post provocative.
22.11.2024 01:59
why can't we have both?
consistent underlying platform, different development experiences... right?
C developers still fight about vim / emacs, they're still united by language.
i think we often make things in data very hard that should be very easy.
22.11.2024 01:58
What is next? Do you go into prod also in Duck? or..?
Real question, curious about the workflow you're cooking on.
19.11.2024 18:02
where do YOU keep your last 7 years of tax returns!?
11.11.2024 17:14
well now i'm one post in and zero toxicity. good start! lol
11.11.2024 17:10
Dipping my toe in here. Deeply skeptical of social at this point in my life but at the same time in search of that Twitter golden age. Trying to be open. Hello world.
11.11.2024 14:41
Engineer, climber, currently working as Head of Data at Hex.
www.erikapullum.com
Publishes http://commoncog.com. Tweets about books & the art of business, from the perspective of an operator. Also: https://warpcast.com/cedric
Arch Data (@arch.dev) Founder/CEO
Meltano maintainer
ex-Data @ GitLab, Concert Genetics
PhD @ Vanderbilt
Paid Data Influence / Actor aka "The Peter Drucker of Data Shitposts"
Husband and Dad
Hobby consultant
Quant UX Researcher @ GCP (posts are my own)
Writes a data newsletter: https://www.counting-stuff.com/
https://linktr.ee/randy_au
Languages: EN/JP/CN
Data nerd, "recovering data scientist", author, podcaster, occasional athlete
Data Engineer @Supabase. Previously @Nasdaq. Creator of pypacktrends.com. Peeling back the layers of abstraction.
Learn more about me: tylerhillery.com
Writer @ davidsj.substack.com
VP of AI @ cube.dev
Software, Data and Analytics Engineering. Principal Engineer at Bobsled.
Dividing my time between Berlin and Bangalore.
Formerly led Data and Eng at Beat Mobility, Omio and Thoughtworks.
Data engineer in practice, data librarian at heart.
Knitter, dog momma, board gamer, and consumer of sci-fi/fantasy of all mediums.
Civic tech, data for good, peace/conflict data, and philosophizing thru data modeling
https://jennajordan.me
@107wins.club
@thedataroom.app
@generalfolders.com
No one of consequence
Solutions Engineer
Industrial-Organizational psych
Stage 4 endo & stroke survivor
Geek dad with a clipboard. data / AI, open source, community building. Interested in ideas with high creative surface area
blog.abegong.com
Apache Arrow & DataFusion PMC Member. Original creator of Apache DataFusion.
Dad, data, process, philosophy | Sales Eng @dagster | My wife says my headstone will read "sorry I'm late"
Works with data, runs with swords.
Write at lethain.com. Author of An Elegant Puzzle, Staff Engineer, and An Engineering Executive's Primer. Worked some places.
CEO @prefect.io. Building FastMCP. Mostly harmless.
VP DevRel at Confluent. Relentless hobbyist. Married to @swimflythrive.bsky.social. Father of three, grandfather of four. Believer in Christ. Opinions should be your own.
Available for coffee dates
Data infra AI
Ex MotherDuck Firebolt BigQuery. Papa. Husband to
@emilyvajda
Ultra-competitive C class cyclist