"Saying 'you are wrong' is not curious. Saying 'why are your priors different than what the data is showing' is curious."
Bryan Bischof on Data Renegades
#DataScience #DataCulture
15.02.2026 17:11
The Data Valentine Challenge | Recce
Join the Data Valentine Challenge! 5 days of quick, actionable challenges led by experts from Recce, Greybeam, dltHub, Database Tycoon, and Bauplan.
Happy Valentine's Day to everyone who spent this week falling back in love with their data stack.
Five days. Five companies. No slides. No safety nets.
Thank you Greybeam, dltHub, Database Tycoon, & bauplan for doing this with us and shipping in front of an audience.
14.02.2026 17:21
Data Renegades | Ep. #6, From Big Data to Curiosity-Driven Insight with Roger Magoulas | Heavybit
On episode 6 of Data Renegades, CL Kao and Dori Wilson of Recce speak with Roger Magoulas about the real bottlenecks holding data organizations back.
"The best analytics asks more questions than it answers."
Roger Magoulas on why dashboards shipped is the wrong success metric. The real measure is whether the work generates questions worth investigating.
🎧 www.heavybit.com/library/podc...
#dataengineering #analytics
14.02.2026 17:05
Aldrin from Bauplan closed the Data Valentine Challenge with a demo where he didn't write a single line of pipeline code. Claude Code did all of it: importing satellite telemetry into a lakehouse,…
An AI Agent Built My Entire Data Pipeline. Here's How I Kept It From Breaking Production
Five days. Five companies. Five chances to fall back in love with your data stack. That was the Data Valentine Challenge
Full replays: youtu.be/yzX05Z8FlYw
#DataEngineering #DataValentineChallenge
13.02.2026 20:26
The prompting lesson: "It's better to explicitly say 'don't use Pandas' rather than just encouraging other libraries." Ban what you don't want. Don't just suggest what you prefer. Works for AI agents. Works for code reviews too.
13.02.2026 20:26
The AI agent made real mistakes along the way. Hallucinated a namespace decorator. Tried to write directly to main; bauplan blocked it. Defaulted to Pandas instead of PyArrow. Every mistake happened on a staging branch. Production never saw any of it
13.02.2026 20:26
Act 3: Write-Audit-Publish. The agent moved validation into the ingestion pipeline itself. Expectations ran inline. Bad rows filtered before reaching silver. Row count dropped by half. All validations passed. Merge to main went through clean
13.02.2026 20:26
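A minimal sketch of the write-audit-publish idea in plain Python. This is not bauplan's API: the lists standing in for the staging branch and main, the column names, and the `audit` helper are all assumptions made for illustration.

```python
def audit(rows):
    """Keep rows whose reading is numeric and drop repeats of the same key."""
    seen = set()
    clean = []
    for row in rows:
        key = (row["satellite_id"], row["ts"])
        try:
            value = float(row["temp_c"])  # string-typed numbers are cast here
        except (TypeError, ValueError):
            continue  # not numeric: filtered out before publish
        if key in seen:
            continue  # duplicate key: filtered out before publish
        seen.add(key)
        clean.append({**row, "temp_c": value})
    return clean


def write_audit_publish(incoming, main_table):
    staging = list(incoming)  # write: land the batch away from main
    audited = audit(staging)  # audit: validate and filter inline
    keys = [(r["satellite_id"], r["ts"]) for r in audited]
    if len(keys) != len(set(keys)):
        # gate: refuse to publish a batch that still violates uniqueness
        raise ValueError("uniqueness check failed; nothing published")
    main_table.extend(audited)  # publish: only audited rows reach main
    return len(staging), len(audited)


# Hypothetical batch in which every record arrives twice.
incoming = [
    {"satellite_id": "sat-1", "ts": "2026-02-01T00:00:00", "temp_c": "21.5"},
    {"satellite_id": "sat-1", "ts": "2026-02-01T00:00:00", "temp_c": "21.5"},
    {"satellite_id": "sat-2", "ts": "2026-02-01T00:00:00", "temp_c": "19.0"},
    {"satellite_id": "sat-2", "ts": "2026-02-01T00:00:00", "temp_c": "19.0"},
]
main_table = []
written, published = write_audit_publish(incoming, main_table)
print(written, published)
```

With this made-up batch, half the rows are dropped during the audit step, and main only ever receives the audited half.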
Act 2: Validation. The agent wrote a separate validation pipeline using bauplan's expectations framework. Null checks, numeric type compatibility, uniqueness. The uniqueness check failed. Duplicate rows in the silver table. Problem visible
13.02.2026 20:26
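The three checks named in that post can be sketched in plain Python. This is an illustration only, not bauplan's expectations framework: the function names, the sample rows, and the column names are hypothetical.

```python
from collections import Counter

# Hypothetical telemetry rows; the second row repeats the first.
rows = [
    {"satellite_id": "sat-1", "ts": "2026-02-01T00:00:00", "temp_c": "21.5"},
    {"satellite_id": "sat-1", "ts": "2026-02-01T00:00:00", "temp_c": "21.5"},
    {"satellite_id": "sat-2", "ts": "2026-02-01T00:00:00", "temp_c": "19.0"},
]


def no_nulls(rows, column):
    """True when every row has a non-None value in the column."""
    return all(row.get(column) is not None for row in rows)


def numeric_compatible(rows, column):
    """True when every value in the column can be parsed as a float."""
    for row in rows:
        try:
            float(row[column])
        except (TypeError, ValueError):
            return False
    return True


def unique_on(rows, key_columns):
    """True when no combination of the key columns appears more than once."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return all(n == 1 for n in counts.values())


results = {
    "no_nulls": no_nulls(rows, "temp_c"),
    "numeric_compatible": numeric_compatible(rows, "temp_c"),
    "unique": unique_on(rows, ["satellite_id", "ts"]),
}
print(results)
```

On this sample, the null and numeric checks pass while the uniqueness check fails, which mirrors the outcome described in the post.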
Act 1: Naive pipeline. The AI agent imported satellite telemetry into bronze, passed it through to silver, merged to main. Data landed. But it had duplicates and string-typed numeric columns. An anomaly detection system would break silently on this
13.02.2026 20:26
Intella builds anomaly detection for satellite fleets. The lakehouse runs Iceberg on S3 with a git-for-data catalog. Every pipeline runs on a staging branch. Nothing touches main until it passes validation and gets explicitly merged
13.02.2026 20:26
An AI agent tried to write to production. The lakehouse said no. Day 5 of Data Valentine Challenge: @BauplanLabs brought the finale. Aldrin let Claude Code build his entire pipeline from scratch, on transactional branches that caught every mistake 🧵
13.02.2026 20:26
Full replay: youtu.be/2snf_AY94-A
Tomorrow is the finale: Bauplan, Let AI Build Your Pipelines Without Breaking Your Heart (or Production)
Catch the full week: reccehq.com/data-valentine-week-challenge
#DataEngineering #dbt #DataValentineChallenge
12.02.2026 19:19
Chloe's mom's rule: when you're getting dressed, always take one accessory off. Then take another one off.
That was the whole session. Don't build just-in-case models. Delete fearlessly. Git remembers everything.
`dbt docs serve` is free. Spin it up and let the graph show you where the mess is.
12.02.2026 19:19
After cleanup: the DAG went from overwhelming to clean. The cleanup revealed what was still wrong. Models without sources. Staging tables with no downstream path. Cleaning up made the real problems obvious.
12.02.2026 19:19
Three dimension tables, dim_date, dim_borough, dim_day_type, that nothing joins to.
Built for a star schema that never materialized. Delete until you actually need them.
12.02.2026 19:19
An intermediate model called "stops with routes." What is it?
A SELECT * from each source. Cross joined. Every stop times every route. A Cartesian product that nothing downstream used.
"You're not gonna need it."
"Less is more. Heard."
12.02.2026 19:19
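To see why a cross join of stops and routes gets expensive, here is a small sketch using Python's built-in `sqlite3`. The table names, column names, and row counts are invented for illustration and are not taken from the project in the session.

```python
import sqlite3

# In-memory database with two tiny, made-up source tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stops (stop_id INTEGER)")
conn.execute("CREATE TABLE routes (route_id INTEGER)")
conn.executemany("INSERT INTO stops VALUES (?)", [(i,) for i in range(300)])
conn.executemany("INSERT INTO routes VALUES (?)", [(i,) for i in range(40)])

# A cross join pairs every stop with every route, so the result has
# (number of stops) * (number of routes) rows.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM stops CROSS JOIN routes"
).fetchone()[0]
print(cross_rows)
conn.close()
```

Even at 300 stops and 40 routes the model holds 12,000 rows, and the count grows multiplicatively as either source grows.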
Three intermediate models with zero downstream dependencies. Built because someone thought they'd be useful someday. Classic "just in case" modeling.
Nothing consumed them. Out.
12.02.2026 19:19
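One way to spot models that nothing consumes is to walk the lineage and list every model that is neither a parent of another model nor read directly by a consumer. The sketch below assumes a hand-written dependency map; the model names and the `exposed` set are hypothetical, and in a real dbt project the lineage would come from the project's own metadata rather than a literal dict.

```python
# Hypothetical lineage: each model maps to the models it selects from.
parents = {
    "stg_routes": [],
    "stg_stops": [],
    "int_stops_with_routes": ["stg_stops", "stg_routes"],
    "int_route_summary": ["stg_routes"],
    "fct_ridership": ["stg_stops"],
}

# Models that dashboards or exports read directly, so they count as used
# even though no other model selects from them.
exposed = {"fct_ridership"}


def unused_models(parents, exposed):
    """Return models with no downstream model and no direct consumer."""
    consumed = {p for deps in parents.values() for p in deps}
    return sorted(m for m in parents if m not in consumed and m not in exposed)


candidates = unused_models(parents, exposed)
print(candidates)
```

The result is a list of deletion candidates to review, not an automatic verdict: a model can look unused here simply because its consumer is missing from the `exposed` set.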
First pass: duplicate staging models (GTFS routes and MTA bus routes, same raw data). One handled borough name conversion and service type logic. The other was a bare select.
Pick the one doing more. Delete the other. Consolidate.
12.02.2026 19:19
"The best code is the code you don't write. Or in this case, the code you delete."
Day 4 of Data Valentine Challenge: Database Tycoon ran a live dbt makeover. Stephen volunteered his NYC transit project. Chloe walked the lineage and cut everything dead.
12.02.2026 19:19
Build and run a data pipeline without writing a single line of Python.
In this Data Valentine Challenge session, Ashish from dltHub walks through the dltHub workspace workflow: load data from a REST…
Build a Data Pipeline With Zero Python - dltHub Workspace | Data Valentine Challenge Day 3
Full replay: youtu.be/NZhvYBJezdM
Tomorrow: Database Tycoon, From Hot Mess to Happily Ever After: A dbt Glow Up
Register for the week: reccehq.com/data-valentine-week-challenge
#DataEngineering #dltHub #DataPipelines
11.02.2026 17:39
"You can play to your strengths." SQL people write SQL. Python people write Python. Same data, same destination
LLMs hallucinate. In this workflow, the rules + YAML give enough context that errors stay small and fixable. No ghosting. No broken schemas.
Pipelines that don't ghost you.
11.02.2026 17:39
Step 3: Marimo + Ibis. Attach the pipeline. DuckDB shows up in the notebook. Write SQL or Python. Your call.
He built two charts: commits per month (line), commits by contributor (bar). Altair, interactive. Still zero lines of pipeline code
11.02.2026 17:39
The rules teach the agent pagination (and more). So when it slips, the fix is one round.
Step 2: DLT Dashboard. Schema, child tables, SQL preview, pipeline state, all in the browser. Validate before you build
11.02.2026 17:39
You say what you want (commits, contributors, repo). Agent fills it. You run.
First run: pagination error.
Old move: open the code, find the bug, fix it
Ashish's move: paste the error into the chat. Agent fixes it. Run again. Pipeline runs.
11.02.2026 17:39
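For readers who have not hit a pagination bug before, this is roughly the loop a pipeline has to get right. It is a generic sketch with a fake in-memory API and page-number pagination. It is not the code dlt generates and not how GitHub's API actually paginates; `fetch_page` and `fetch_all` are hypothetical names.

```python
def fetch_page(page, per_page=2):
    """Stand-in for an HTTP call: returns one page of fake commit records."""
    commits = [{"sha": f"c{i}"} for i in range(5)]
    start = (page - 1) * per_page
    return commits[start:start + per_page]


def fetch_all(fetch, per_page=2):
    """Request pages until one comes back short, then stop."""
    page, items = 1, []
    while True:
        batch = fetch(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:  # a short or empty page is the last one
            break
        page += 1
    return items


all_commits = fetch_all(fetch_page)
print(len(all_commits))
```

A typical bug is stopping after the first page or never stopping at all; the short-page check is what ends the loop in this sketch.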
The command: `dlt init dlthub:github duckdb`
That creates the project + pipeline script + the guardrails: Cursor rules and a GitHub docs YAML so the LLM can fill your config without you opening GitHub's docs
11.02.2026 17:39
"We didn't write a single line of Python."
Day 3 of Data Valentine Challenge: Ashish from dltHub walked through the workspace workflow: GitHub API to DuckDB to reports, no code.
One command. One prompt. Pipeline runs
🧵
11.02.2026 17:39
"Why don't these numbers match?"
Every data person has gotten that message from the CEO. In this Data Valentine Challenge session, Kyle from Greybeam shows how to reconcile data across Snowflake,…
Dear Snowflake, I Want an Open Relationship | Data Valentine Challenge Day 2
Full replay: youtu.be/9IciVGA9kew
Tomorrow: dltHub, Pipelines That Don't Ghost You 👻
Register for the week: reccehq.com/data-valentine-week-challenge
#DataEngineering #Snowflake #DuckDB
10.02.2026 18:40
The Greybeam layer: route queries to DuckDB automatically while connected to Snowflake.
`read_parquet()` in your BI tool. Google Sheets joins in Hex. No warehouse spin-up for small data. 86% cost savings on average.
10.02.2026 18:40
Kyle's reflection:
"Back in my day, I'd have to pull data from Snowflake into CSV, download the Google Sheet, get the raw feed somehow... and Excel can only handle 1M rows."
4 million taxi records? DuckDB handles it. Excel doesn't.
10.02.2026 18:40