
Recce - Trust, Verify, Ship

@datarecce.bsky.social

Helping data teams preview, validate, and ship data changes with confidence. https://datarecce.io

44 Followers  |  153 Following  |  213 Posts  |  Joined: 27.11.2023

Latest posts by datarecce.bsky.social on Bluesky


"Saying 'you are wrong' is not curious. Saying 'why are your priors different than what the data is showing' is curious."

Bryan Bischof on Data Renegades

#DataScience #DataCulture

15.02.2026 17:11
Five Days, Five Data Problems, Five Fixes: What the Data Valentine Challenge Revealed
Five companies tackled real data problems live. Agent benchmarks, DuckDB reconciliation, no-code pipelines, dbt cleanup, and data versioning in one week.

All sessions are up on our YouTube channel.

Or check out our blog summarizing the week: blog.reccehq.com/data-valenti...

#DataEngineering #DataValentineChallenge

14.02.2026 17:21
The Data Valentine Challenge | Recce
Join the Data Valentine Challenge! 5 days of quick, actionable challenges led by experts from Recce, Greybeam, dltHub, Database Tycoon, and Bauplan.

Happy Valentine's Day to everyone who spent this week falling back in love with their data stack.

Five days. Five companies. No slides. No safety nets.

Thank you Greybeam, dltHub, Database Tycoon, & bauplan for doing this with us and shipping in front of an audience.

14.02.2026 17:21
Data Renegades | Ep. #6, From Big Data to Curiosity-Driven Insight with Roger Magoulas | Heavybit
On episode 6 of Data Renegades, CL Kao and Dori Wilson of Recce speak with Roger Magoulas about the real bottlenecks holding data organizations back.

"The best analytics asks more questions than it answers."

Roger Magoulas on why "dashboards shipped" is the wrong success metric. The real measure is whether the work generates questions worth investigating.

🎧 www.heavybit.com/library/podc...

#dataengineering #analytics

14.02.2026 17:05
An AI Agent Built My Entire Data Pipeline. Here's How I Kept It From Breaking Production
Aldrin from Bauplan closed the Data Valentine Challenge with a demo where he didn't write a single line of pipeline code. Claude Code did all of it — importing satellite telemetry into a lakehouse,…

Five days. Five companies. Five chances to fall back in love with your data stack. That was the Data Valentine Challenge

Full replays: youtu.be/yzX05Z8FlYw

#DataEngineering #DataValentineChallenge

13.02.2026 20:26

The prompting lesson: "It's better to explicitly say 'don't use Pandas' rather than just encouraging other libraries." Ban what you don't want. Don't just suggest what you prefer. Works for AI agents. Works for code reviews too.

13.02.2026 20:26
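That ban-list advice can be enforced mechanically, not just in the prompt. A minimal stdlib sketch (the `violates_ban` helper and ban list are hypothetical, not part of any agent framework) that rejects generated code importing a banned module:

```python
import re

# Hypothetical guardrail: instead of hoping the agent "prefers" PyArrow,
# explicitly reject generated code that imports what you banned.
BANNED_IMPORTS = {"pandas"}  # the ban list, per the session's advice

def violates_ban(generated_code: str) -> list[str]:
    """Return the banned modules the generated code tries to import."""
    hits = []
    for mod in BANNED_IMPORTS:
        # matches `import pandas`, `import pandas as pd`, `from pandas import ...`
        if re.search(rf"^\s*(import\s+{mod}\b|from\s+{mod}\b)", generated_code, re.M):
            hits.append(mod)
    return hits

snippet = "import pandas as pd\ndf = pd.DataFrame()"
print(violates_ban(snippet))  # the pandas import is flagged
```

The same check works as a code-review lint: an explicit deny list fails fast, where a "preferred libraries" note gets ignored.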

The AI agent made real mistakes along the way. Hallucinated a namespace decorator. Tried to write directly to main — bauplan blocked it. Defaulted to Pandas instead of PyArrow. Every mistake happened on a staging branch. Production never saw any of it.

13.02.2026 20:26

Act 3 β€” Write-Audit-Publish. The agent moved validation into the ingestion pipeline itself. Expectations ran inline. Bad rows filtered before reaching silver. Row count dropped by half. All validations passed. Merge to main went through clean

13.02.2026 20:26
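The write-audit-publish flow can be sketched in plain Python (hypothetical names and checks, not bauplan's API): rows land in staging, expectations run inline, and only clean rows reach the published table.

```python
# Write-audit-publish in miniature: staging first, audit inline, publish last.

def audit(row: dict) -> bool:
    # Inline expectations: key present, numeric column actually numeric.
    return row.get("sat_id") is not None and isinstance(row.get("altitude_km"), (int, float))

def write_audit_publish(incoming: list[dict], main: list[dict]) -> list[dict]:
    staging = list(incoming)                  # write: everything hits staging first
    clean = [r for r in staging if audit(r)]  # audit: filter bad rows before silver
    main.extend(clean)                        # publish: only validated rows merge
    return main

main_table: list[dict] = []
rows = [
    {"sat_id": "A1", "altitude_km": 550.0},
    {"sat_id": None, "altitude_km": 550.0},   # fails the null check
    {"sat_id": "B2", "altitude_km": "550"},   # fails the type check
]
write_audit_publish(rows, main_table)
print(len(main_table))  # only the clean row survives
```

The row-count drop the session saw is exactly this filter doing its job before the merge.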

Act 2 β€” Validation. The agent wrote a separate validation pipeline using bauplan's expectations framework. Null checks, numeric type compatibility, uniqueness. The uniqueness check failed. Duplicate rows in the silver table. Problem visible

13.02.2026 20:26
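Those three expectation checks (nulls, numeric types, uniqueness) boil down to a few lines each. A stdlib sketch with made-up column names, where duplicate silver rows trip the uniqueness check just as in the demo:

```python
from collections import Counter

def run_expectations(rows: list[dict], key: str, numeric_cols: list[str]) -> dict:
    """Toy expectations: count null-bearing rows, non-numeric rows, duplicated keys."""
    failures = {"nulls": 0, "non_numeric": 0, "duplicate_keys": []}
    for row in rows:
        if any(row.get(col) is None for col in row):
            failures["nulls"] += 1
        if any(not isinstance(row.get(col), (int, float)) for col in numeric_cols):
            failures["non_numeric"] += 1
    counts = Counter(r[key] for r in rows)
    failures["duplicate_keys"] = [k for k, n in counts.items() if n > 1]
    return failures

silver = [
    {"sat_id": "A1", "altitude_km": 550.0},
    {"sat_id": "A1", "altitude_km": 550.0},  # duplicate row: uniqueness fails
]
print(run_expectations(silver, key="sat_id", numeric_cols=["altitude_km"]))
```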

Act 1 β€” Naive pipeline. The AI agent imported satellite telemetry into bronze, passed it through to silver, merged to main. Data landed. But it had duplicates and string-typed numeric columns. An anomaly detection system would break silently on this

13.02.2026 20:26

Intella builds anomaly detection for satellite fleets. The lakehouse runs Iceberg on S3 with a git-for-data catalog. Every pipeline runs on a staging branch. Nothing touches main until it passes validation and gets explicitly merged

13.02.2026 20:26
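The branch-then-merge discipline is the core idea. A toy illustration (the `BranchingCatalog` class is hypothetical, not the actual catalog API) where main only changes through a merge gated on validation:

```python
class BranchingCatalog:
    """Git-for-data in miniature: writes go to branches, main changes only via merge."""

    def __init__(self):
        self.branches = {"main": {}}

    def branch(self, name: str, source: str = "main"):
        self.branches[name] = dict(self.branches[source])  # cheap copy-on-branch

    def write(self, branch: str, table: str, rows: list):
        if branch == "main":
            raise PermissionError("direct writes to main are blocked")
        self.branches[branch][table] = rows

    def merge(self, branch: str, validate) -> bool:
        if all(validate(rows) for rows in self.branches[branch].values()):
            self.branches["main"].update(self.branches[branch])
            return True
        return False  # main untouched when validation fails

cat = BranchingCatalog()
cat.branch("staging")
cat.write("staging", "telemetry", [{"sat_id": "A1"}])
merged = cat.merge("staging", validate=lambda rows: len(rows) > 0)
print(merged, "telemetry" in cat.branches["main"])
```

The point of the design: an agent can make every mistake it wants on `staging`; `main` only ever sees states that passed the gate.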

An AI agent tried to write to production. The lakehouse said no. Day 5 of Data Valentine Challenge: @BauplanLabs brought the finale. Aldrin let Claude Code build his entire pipeline from scratch β€” on transactional branches that caught every mistake 🧡

13.02.2026 20:26

Full replay: youtu.be/2snf_AY94-A

Tomorrow is the finale: Bauplan, Let AI Build Your Pipelines Without Breaking Your Heart (or Production)

Catch the full week: reccehq.com/data-valentine-week-challenge

#DataEngineering #dbt #DataValentineChallenge

12.02.2026 19:19

Chloe's mom's rule: when you're getting dressed, always take one accessory off. Then take another one off.

That was the whole session. Don't build just-in-case models. Delete fearlessly. Git remembers everything.

`dbt docs serve` is free. Spin it up and let the graph show you where the mess is.

12.02.2026 19:19

After cleanup: the DAG went from overwhelming to clean. The cleanup revealed what was still wrong. Models without sources. Staging tables with no downstream path. Cleaning up made the real problems obvious.

12.02.2026 19:19

Three dimension tables (dim_date, dim_borough, dim_day_type) that nothing joins to.

Built for a star schema that never materialized. Delete until you actually need them.

12.02.2026 19:19

An intermediate model called "stops with routes." What is it?

A SELECT * from each source. Cross joined. Every stop times every route. A Cartesian product that nothing downstream used.

"You're not gonna need it."
"Less is more. Heard."

12.02.2026 19:19
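The cost of that accidental cross join is easy to reproduce: a Cartesian product multiplies row counts. A tiny stdlib illustration with made-up table sizes:

```python
from itertools import product

# Hypothetical sizes: 500 stops, 300 routes.
stops = [f"stop_{i}" for i in range(500)]
routes = [f"route_{i}" for i in range(300)]

# A cross join pairs every stop with every route.
cross = list(product(stops, routes))
print(len(cross))  # 500 * 300 = 150,000 rows nothing downstream asked for
```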

Three intermediate models with zero downstream dependencies. Built because someone thought they'd be useful someday. Classic "just in case" modeling.

Nothing consumed them. Out.

12.02.2026 19:19
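One way to spot such models programmatically: dbt's manifest.json artifact includes a child_map from each node to its downstream dependents, so a model with an empty entry has no consumers. A sketch (the sample child_map below is made up):

```python
def unused_models(child_map: dict) -> list[str]:
    """Return model nodes with no downstream dependents in a dbt child_map."""
    return [
        node for node, children in child_map.items()
        if node.startswith("model.") and not children
    ]

# Shape matches manifest.json's child_map; contents are hypothetical.
child_map = {
    "model.transit.stg_routes": ["model.transit.fct_trips"],
    "model.transit.int_stops_with_routes": [],   # nothing downstream
    "model.transit.dim_day_type": [],            # nothing downstream
}
print(unused_models(child_map))
```

In a real project you would load `target/manifest.json` after `dbt compile` and feed its `child_map` to the same function.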

First pass: duplicate staging models (GTFS routes and MTA bus routes, same raw data). One handled borough name conversion and service type logic. The other was a bare select.

Pick the one doing more. Delete the other. Consolidate.

12.02.2026 19:19

"The best code is the code you don't write. Or in this case, the code you delete."

Day 4 of Data Valentine Challenge: Database Tycoon ran a live dbt makeover. Stephen volunteered his NYC transit project. Chloe walked the lineage and cut every dead model.

12.02.2026 19:19
Build a Data Pipeline With Zero Python — dltHub Workspace | Data Valentine Challenge Day 3
Build and run a data pipeline without writing a single line of Python. In this Data Valentine Challenge session, Ashish from dltHub walks through the dltHub workspace workflow: load data from a REST…

Full replay: youtu.be/NZhvYBJezdM

Tomorrow, Database Tycoon β€” From Hot Mess to Happily Ever After: A dbt Glow Up

Register for the week: reccehq.com/data-valentine-week-challenge

#DataEngineering #dltHub #DataPipelines

11.02.2026 17:39

"You can play to your strengths." SQL people write SQL. Python people write Python. Same data, same destination

LLMs hallucinate. In this workflow, the rules + YAML give enough context that errors stay small and fixable. No ghosting. No broken schemas.

Pipelines that don't ghost you. 💕

11.02.2026 17:39

Step 3: Marimo + Ibis. Attach the pipeline. DuckDB shows up in the notebook. Write SQL or Python. Your call.

He built two charts: commits per month (line), commits by contributor (bar). Altair, interactive. Still zero lines of pipeline code

11.02.2026 17:39

The rules teach the agent pagination (and more). So when it slips, the fix is one round.

Step 2: DLT Dashboard. Schema, child tables, SQL preview, pipeline state β€” all in the browser. Validate before you build

11.02.2026 17:39

You say what you want (commits, contributors, repo). Agent fills it. You run.

First run: pagination error.

Old move: open the code, find the bug, fix it

Ashish's move: paste the error into the chat. Agent fixes it. Run again. Pipeline runs.

11.02.2026 17:39

The command: `dlt init tlthub github duckdb`

That creates the project + pipeline script + the guardrails: Cursor rules and a GitHub docs YAML so the LLM can fill your config without you opening GitHub's docs

11.02.2026 17:39

"We didn't write a single line of Python."

Day 3 of Data Valentine Challenge: Ashish from dltHub walked through the workspace workflow β€” GitHub API to DuckDB to reports, no code.

One command. One prompt. Pipeline runs

🧡

11.02.2026 17:39
Dear Snowflake, I Want an Open Relationship | Data Valentine Challenge Day 2
"Why don't these numbers match?" Every data person has gotten that message from the CEO. In this Data Valentine Challenge session, Kyle from Greybeam shows how to reconcile data across Snowflake,…

Full replay: youtu.be/9IciVGA9kew

Tomorrow: dltHub — Pipelines That Don't Ghost You 👻

Register for the week: reccehq.com/data-valentine-week-challenge

#DataEngineering #Snowflake #DuckDB

10.02.2026 18:40

The Greybeam layer: route queries to DuckDB automatically while connected to Snowflake.

ReadParquet() in your BI tool. Google Sheets joins in Hex. No warehouse spin-up for small data. 86% cost savings on average.

10.02.2026 18:40
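The routing idea can be sketched as a size threshold (the threshold, function name, and engine labels below are hypothetical, not Greybeam's implementation): small scans go to an in-process engine, big ones stay on the warehouse.

```python
# Assume roughly 1 GiB of scanned data fits comfortably in-process (made-up cutoff).
SMALL_SCAN_BYTES = 1 * 1024**3

def pick_engine(estimated_scan_bytes: int) -> str:
    """Route a query by estimated scan size: in-process engine vs. warehouse."""
    return "duckdb" if estimated_scan_bytes <= SMALL_SCAN_BYTES else "snowflake"

print(pick_engine(200 * 1024**2))   # 200 MiB dashboard query stays local
print(pick_engine(500 * 1024**3))   # 500 GiB backfill goes to the warehouse
```

The claimed cost savings come from the left branch: most BI queries scan far less data than a warehouse cluster is sized for.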

Kyle's reflection:

"Back in my day, I'd have to pull data from Snowflake into CSV, download the Google Sheet, get the raw feed somehow... and Excel can only handle 1M rows."

4 million taxi records? DuckDB handles it. Excel doesn't.

10.02.2026 18:40
