Data ingestion with dlt and Dagster: An end-to-end pipeline tutorial
Ingest Data from Bluesky API to AWS S3 Using dlt and deploy it on Dagster in Just 15 Minutes.
Curious, like us, to see what people are sharing under #dataBS and #datasky? Check out this post to learn how to do it using dlt!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social
#dlt
19.12.2024 11:00
Week 0/32 - A Comprehensive Data Engineering Interview Preparation Guide
Join us every Saturday on This New Journey
We are starting a 32-week Data Engineering Interview Guide program, covering everything from fundamentals to advanced topics, with sessions every Saturday.
Do you think we're missing any critical topics? We're curious about your opinions.
#dataBS
#datasky
08.12.2024 11:06
As a Data Engineer, understanding the data storage lifecycle and data retention policies is critical for designing efficient, cost-effective, and compliant data systems.
@joereis.bsky.social
#dataBS #datasky
substack.com/@pipeline2in...
04.12.2024 12:11
10 Pipeline Design Patterns for Data Engineers
How to leverage Design Patterns for scalable and efficient data pipelines
In our new post, we've covered 10 of the most popular data pipeline design patterns.
We'd love to hear your thoughts. For more details, please check out the full post created by @hgeren.bsky.social and @hopefanhe.bsky.social: open.substack.com/pub/pipeline...
#dataBS #datasky
03.12.2024 10:19
As a Data Engineer and Monster Hunter fan, love this metaphor!
01.12.2024 12:10
Introduction to data load tool (dlt): A Python Library for Simple Data Ingestion
Discover the basics of dlt and its role in modern data engineering workflows
Discover how dlt simplifies data ingestion.
Learn its origins and real-world use cases. Follow a step-by-step guide to build your first pipeline and join the growing dlt community!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social
#dataBS #datasky
01.12.2024 10:44
Hi, wishing everyone a great Thanksgiving!
Recently we wrote about how SQL queries are executed behind the scenes.
If you are interested, check out our post: open.substack.com/pub/pipeline...
#dataBS #datasky
28.11.2024 12:23
Storage Fundamentals For Data Engineers
Why is organised and durable storage the cornerstone of Data Engineering?
Storage is at the heart of Data Engineering.
In this post, we explore the hierarchy of data storage from the ground up, drawing inspiration from Fundamentals of Data Engineering by @joereis.bsky.social and Matt Housley, as well as insights from the DE Professionals on Coursera.
#dataBS #datasky
26.11.2024 10:59
Thank you so much! I am also planning to study the cost estimation step in detail soon, so I will definitely write about it once I've deepened my knowledge.
19.11.2024 22:33
SQL Behind the Curtain: How Are Queries Executed?
Explore the journey of your SQL query guided by execution plans
Hey #dataBS and #datasky folks,
Our new post on how understanding Big O notation and execution plans can help optimize SQL queries is now live.
Check it out if you're interested, and we'd love to hear your thoughts! @hopefanhe.bsky.social
open.substack.com/pub/pipeline...
19.11.2024 10:45
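For anyone who wants to poke at execution plans before reading the post, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the post: the `titles` table, the `idx_titles_year` index, and the `query_plan` helper are all made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); only the detail text is kept.
    return [row[3] for row in rows]


# Hypothetical table, filled with synthetic rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (id INTEGER PRIMARY KEY, name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO titles (name, year) VALUES (?, ?)",
    [(f"title-{i}", 1950 + i % 70) for i in range(10_000)],
)

sql = "SELECT name FROM titles WHERE year = ?"

# Without an index the planner has to scan every row: O(n) in the table size.
plan_before = query_plan(conn, sql, (1999,))

# With an index on the filter column, the planner can search a B-tree instead,
# which takes roughly O(log n) to reach the first matching row.
conn.execute("CREATE INDEX idx_titles_year ON titles (year)")
plan_after = query_plan(conn, sql, (1999,))

print(plan_before)
print(plan_after)
```

The first plan should report a scan of `titles` and the second a search using `idx_titles_year`, which is the scan-versus-seek difference that Big O reasoning about queries usually comes down to.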
yeah you are right, it was posted about 10 days ago
16.11.2024 13:17
Yeah, maybe Data Science can also be the navigation system with its predictive capabilities, and Data Analytics can be the driving assistant, while Data Engineering ensures the overall coordination.
09.11.2024 12:24
Hey #dataBS, I've been thinking of an analogy for Data Teams' roles.
Imagine a company as a vehicle. How would you map Data Engineering, Analytics, and Science to vehicle parts? Teams could have multiple parts or overlap with other Teams.
Curious about your thoughts!
08.11.2024 22:46
YouTube video by Rill Data
Data Talks on the Rocks 5 - Hannes Mühleisen, DuckDB
Looking for a distraction? Try this great interview between @hannes.muehleisen.org and @medriscoll.bsky.social covering all things @duckdb.org. I especially enjoyed the philosophy around improving SQL usability. www.youtube.com/watch?v=a-Rm... #databs
07.11.2024 23:16
#dataBS can you repost?
Filled up the first 150 and so am creating a second starter pack! Let's all keep finding each other and make this place the best for all things data.
07.11.2024 12:39
Week #1: 100 Days of SQL Optimisation
How Small Tweaks Transformed Our Queries, Saving Time and Resources
Week 1 of "100 Days of SQL Optimisation" covered key techniques like column selection, multicolumn indexes, filtering, window functions, Rank, CTE and composite indexes with IMDb data.
Check out the full post for more!
@hgeren.bsky.social
#dataBS #datasky
07.11.2024 12:01
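To give a feel for a few of the Week 1 techniques, here is a small sketch that combines explicit column selection, a CTE, and the RANK window function. It is not the code from the post: the `movies` table and its rows are invented stand-ins for the IMDb data, and it assumes the SQLite bundled with your Python is 3.25 or newer, since older versions lack window functions.

```python
import sqlite3

# Hypothetical stand-in for the IMDb data used in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER, rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("A", 2000, 7.1),
        ("B", 2000, 8.3),
        ("C", 2001, 6.5),
        ("D", 2001, 9.0),
        ("E", 2001, 7.7),
    ],
)

# The CTE ranks movies within each year by rating (highest first), and the
# outer query keeps only the top-ranked row per year. Only the columns that
# are actually needed are selected, rather than SELECT *.
top_per_year = conn.execute(
    """
    WITH ranked AS (
        SELECT title, year, rating,
               RANK() OVER (PARTITION BY year ORDER BY rating DESC) AS rnk
        FROM movies
    )
    SELECT year, title, rating
    FROM ranked
    WHERE rnk = 1
    ORDER BY year
    """
).fetchall()

print(top_per_year)  # [(2000, 'B', 8.3), (2001, 'D', 9.0)]
```

Note that RANK gives tied rows the same rank, so a year with two equally rated top movies would return both.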
I made an infra engineer starter pack. Folks posting about databases, stream processing, durable execution, orchestrators, service meshes, and more.
go.bsky.app/SCZe42X
25.10.2024 01:16
Hello everyone! I'm Hasan.
I transitioned from Industrial Engineering to Data Science, then found my passion in Data Engineering. Currently, doing a PhD in distributed stream processing while working as a Data Engineer.
Looking forward to connecting with fellow data enthusiasts to learn and share.
07.11.2024 03:42
I'd say SQL
07.11.2024 00:57
Just joined and heard #dataBS and #datasky are where the cool kids hang.
Wanted to introduce our blog where we regularly write about Data Engineering concepts, news, and tools.
pipeline2insights.substack.com
06.11.2024 12:49
CEO/co-founder dltHub, the makers of OSS Python library dlt. At the intersection of single node compute, open storage, Python/in-memory & some others
Finland's Most Famous Data Engineer You've Never Heard Of
Founder @helsinkidataweek.bsky.social
Co-Founder, Lead Data Engineer @ Invinite
Photographer | Creative Director
A broad board member @ TIVIA
#helsinkidataweek #databs
Data, Streaming, coding, (maybe) AI?
building a real time streaming engine.
https://www.denormalized.io/
Someone who develops something that will increase skills and help people's needs - Fullstack Developer
*Powered by coffee, beer, gin, hockey, and sarcasm. Opinionated, fact-based data nerd. Probably from .ca in a previous life. Expect a wide variety of posts #ADHD.
NAFO Feline Corps, Clandestine Division. Fella not Felon.
Squirrel
Anglican dad, programmer, data person. An Andrew of all trades and master of none. Not popular enough to say "views are my own and not those of my employer."
Solutions Architect at Qubika
Distributed Systems
Data Engineer... loading
Stream processing, data infra, Table formats and Pickleball.
https://datapapers.substack.com/
Co-founder @dbos.dev • Stanford CS PhD
Co-organizer @southbaysystems.xyz
Working on Database + Systems + AI
Amateur bird watcher 🐦
Personal site: qianli.dev
She/her.
CEO @paradedb.bsky.social • H'20 • 🇫🇷🇨🇦
https://philippemnoel.posthaven.com
Messing around with boats and databases.
Sometimes data systems, sometimes security, sometimes ai/ml, sometimes a blend of it all.
Founding Engineer @ Tigris Data, CouchDB PMC - databases and distributed systems. Father, mountain biker, climber and musician
Distributed Systems & databases person. Works at Microsoft on Orleans & Aspire
Distributed and Storage Systems. Apache Cassandra Committer and PMC member. Author of Database Internals. Mountain person. http://databass.dev/
American Homeowner | Battle Born | Go Bills | Software Engineer at Confluent | GM of KafkaStreams 🦦 | ASF Member, Committer and PMC member (Apache Kafka, Apache Flink, Apache Storm) | Reno. Home Means Nevada.
PRQL Evangelist, Chief Excel Officer, MAD* Scientist, Pythonista, Rustacian (*: ML, AI, Data)
CEO & Co-founder @electric-sql.com
Works with data, runs with swords.