
Ananth Packkildurai

@ananthdurai.bsky.social

Editor, Data Engineering Weekly; subscribe at www.dataengineeringweekly.com. In progress: LakeByte

3,539 Followers  |  579 Following  |  246 Posts  |  Joined: 22.06.2023  |  2.007

Latest posts by ananthdurai.bsky.social on Bluesky

Post image

The open source companies built their success on top of open-source platforms, benefited from community contributions and adoption, but now must abandon open-source principles to survive commercially.

10.11.2025 02:47 - 👍 0    🔁 0    💬 0    📌 0
Preview
Data Engineering Weekly #244 The Weekly Data Engineering Newsletter

🚀 The 244th edition of Data Engineering Weekly dives into:

AI agents as execution engines, LLM inference economics, databases for AI, personalization, and product evidence.

Read more 👉 www.dataengineeringw...

#DataEngineering #AI #LLMs

03.11.2025 09:29 - 👍 1    🔁 0    💬 0    📌 0
Post image

Cricket has been India’s greatest force in overcoming centuries of colonial suppression. Today’s Women’s World Cup win echoes the spirit of 1983, a triumph that will inspire generations to come. 🇮🇳🏆

03.11.2025 00:40 - 👍 0    🔁 0    💬 0    📌 0
Preview
Thinking Like a Data Engineer: A Journey Beyond Code, Toward Systems, Curiosity, and Confidence

This is the most personal essay that I have written in Data Engineering Weekly. I shared a few key moments in my life and how fortunate I was to meet mentors along my professional journey, which shaped my career.

23.10.2025 00:25 - 👍 9    🔁 0    💬 0    📌 1
Preview
Revisiting Medallion Architecture: Data Vault in Silver, Dimensional Modeling in Gold How to Balance Flexibility and Performance in a Modern Data Platform

🚀 Data Vault vs. Dimensional Modeling vs. Medallion Architecture: when viewed through a modern enterprise data lens, these techniques interlock.

I break down how in Part 2 of my “Revisiting the Medallion Architecture” series.

17.10.2025 14:54 - 👍 4    🔁 0    💬 0    📌 0

Fivetran and dbt form a strong foundation for modern data infrastructure, known for bringing simplicity to complex engineering workflows. That said, calling it “open” data infrastructure feels like a stretch.

17.10.2025 12:02 - 👍 5    🔁 0    💬 3    📌 0

Should we update the definition of an "Analytical Engineer"?

13.10.2025 17:53 - 👍 4    🔁 0    💬 0    📌 0
Preview
Engineering Growth: The Data Layers Powering Modern GTM Building privacy-preserving pipelines that unify zero-, first-, second-, third-, and fourth-party data into a coherent GTM ecosystem.

As a data engineer, you can't treat zero-party (consent) and third-party (inferred) data the same way. This distinction is critical for building systems that are scalable, private, and trustworthy.

Here’s my guide:

09.10.2025 00:35 - 👍 5    🔁 0    💬 0    📌 0

Could be. Composable CDP has not gained significant market share, as identity resolution is a key component that is often proprietary.

04.10.2025 16:34 - 👍 1    🔁 0    💬 0    📌 0

With Census already in with Fivetran and with dbt, it is most likely to evolve into a composable CDP.

04.10.2025 02:11 - 👍 1    🔁 0    💬 1    📌 0
Post image

Airbnb: Real-Time Key-Value Store

Airbnb’s next-gen key-value store supports real-time ingestion and bulk uploads with sub-second latency, powering feature stores and fraud detection.

Read the full story here: www.dataengineeringw...

02.10.2025 13:00 - 👍 1    🔁 0    💬 0    📌 0
Preview
Data Engineering Weekly #239 The Weekly Data Engineering Newsletter

Read the full story here:

01.10.2025 13:00 - 👍 0    🔁 0    💬 0    📌 0
Post image

Grab: Partner Gateway Metrics at Sub-Second Speed
Real-time partner analytics at scale is tough. Grab uses Apache Pinot, Kafka–Flink ingestion, partitioning, and Star-tree indexing to cut query latency to <300 ms, enabling efficient API monitoring and fast issue resolution.

01.10.2025 13:00 - 👍 0    🔁 0    💬 1    📌 0
Preview
Data Engineering Weekly #239 The Weekly Data Engineering Newsletter

💡 Read the full story →

30.09.2025 12:33 - 👍 1    🔁 0    💬 0    📌 0
Post image

Netflix Muse: Scaling Analytics at Trillion-Row Scale
Netflix evolved its Muse architecture to handle huge datasets efficiently: HyperLogLog sketches, Hollow in-memory feeds, and Druid optimizations cut query latency by ~50% and reduced concurrency load.

30.09.2025 12:33 - 👍 0    🔁 0    💬 1    📌 0
Preview
Data Engineering Weekly #239 The Weekly Data Engineering Newsletter

🔗 Link in bio:

29.09.2025 12:33 - 👍 0    🔁 0    💬 0    📌 0
Post image

⚡ Latency Every Data Streaming Engineer Should Know

“Real-time” has limits: disk, network, and replication delays add up. StreamNative explains latency tiers, common costs, and tuning levers like batching & async processing.
💡 Must-read for data streaming engineers!

29.09.2025 12:33 - 👍 0    🔁 0    💬 1    📌 0
Preview
What “Supporting Our AI Overlords” and “Semantic Spacetime” Tell Us About the Future of Data Infrastructure Connecting agent-first and universal semantic grammar to reimagine data infrastructure beyond the relational model.

I enjoyed this post by @ananthdurai.bsky.social. Does a great job tying a bunch of recent papers and concepts together.

27.09.2025 17:46 - 👍 8    🔁 1    💬 0    📌 0
Preview
This MCP Server Could Have Been a JSON File There's a lot of buzz around MCP. I'm not convinced it needs to exist.

🔗 Full story:

27.09.2025 13:00 - 👍 2    🔁 0    💬 0    📌 0
Post image

MCP (Model Context Protocol) promises a new way for LLMs to use tools.

Chris Riccomini argues it mostly reinvents OpenAPI, gRPC & CLIs.
Resources = docs
Tools = RPC
Prompts = configs

So… could MCP have just been a JSON file?

💡 More insights: www.dataengineeringw...

27.09.2025 13:00 - 👍 3    🔁 2    💬 1    📌 1
Post image

How Tables Got Smarter: Iceberg → DuckLake. From static snapshots to stream-native updates and catalog-first metadata, tables are evolving fast. Choose by intent, not hype.

Subscribe → www.dataengineeringw...

Full story → medium.com/fresha-da...

26.09.2025 12:33 - 👍 3    🔁 0    💬 0    📌 0
Preview
What “Supporting Our AI Overlords” and “Semantic Spacetime” Tell Us About the Future of Data Infrastructure Connecting agent-first and universal semantic grammar to reimagine data infrastructure beyond the relational model.

I wrote my thoughts on Supporting Our AI Overlords.

25.09.2025 13:15 - 👍 4    🔁 0    💬 0    📌 0
Post image

How Tables Grew a Brain: Iceberg → DuckLake
Snapshots → incremental → stream-native → catalog-first.
Metadata is the bottleneck.

More insights → www.dataengineeringw...

Full story → medium.com/fresha-da...

25.09.2025 12:33 - 👍 2    🔁 0    💬 0    📌 0
Post image

BlaBlaCar scales like a pro!

dbt Core → Transform like a champ

Airflow → Orchestrate effortlessly

CI/CD → Deploy instantly

Dev Containers → Standardized dev

📖 Full story → medium.com/blablacar...

💡 More insights → Subscribe to DEW

#DataEngineering #dbt #Airflow #CICD #DevContainers

24.09.2025 12:33 - 👍 1    🔁 0    💬 0    📌 0
Post image

🚀 AI adoption is booming, but most data isn’t ready!

AI-ready data is:

Unified

Real-time

Human-verified

Governed

Without it, AI can confidently fail. With it? Reliable, scalable results.

📖 Read More

💡 More insights → Data Engineering Weekly
#AI #AIReady #DataEngineering

23.09.2025 08:56 - 👍 0    🔁 0    💬 0    📌 0
Preview
How we built it: Real-time analytics for Stripe Billing Among global business leaders surveyed, 84% agree that adapting pricing quickly will be a key competitive advantage. Our new real-time analytics system for Stripe Billing helps them spot customer trends just as they emerge.

💡 More insights → Data Engineering Weekly
💡 Learn more: stripe.com/blog/how-...

#DataEngineering #Stripe #RealTimeAnalytics #ApacheFlink

22.09.2025 13:00 - 👍 0    🔁 0    💬 0    📌 0

Stripe’s Real-Time Billing Analytics ⚡
Content:
Stripe wanted real-time visibility into subscriptions.
Traditional batch systems weren’t fast enough. ⏱️
They built a pipeline using Flink, Spark, and Pinot v2.
Now, analytics arrive in minutes, not hours. Queries return in <300ms. 🚀

22.09.2025 13:00 - 👍 2    🔁 1    💬 1    📌 0
Post image

The 238th edition of Data Engineering Weekly is available, featuring exciting Data & AI articles.

Read more:
www.dataengineeringw...

22.09.2025 02:17 - 👍 5    🔁 0    💬 1    📌 0
Post image

Apache Iceberg is now entering the classic paradox.

Reference:

www.dataengineeringw...

www.warpstream.com/b...

18.09.2025 03:28 - 👍 8    🔁 1    💬 0    📌 0
Preview
When Dimensions Change Too Fast for Iceberg Why Iceberg Struggles with Fast-Changing Dimensions, and What Comes Next

open.substack.com/pub/dataengi...

16.09.2025 17:20 - 👍 2    🔁 0    💬 0    📌 0
