Julien Hurault

@hachej.bsky.social

Freelance Data | Weekly Data Eng. Newsletter 📨 juhache.substack.com - 4k+ readers

122 Followers 298 Following 45 Posts Joined Nov 2024
1 year ago

Indeed, it's not simple, unfortunately... just a way to get started quickly atm.

2 0 0 0
1 year ago

❤️❤️

0 0 0 0
1 year ago

For an Iceberg catalog, it's hard to find a simpler setup.

1 0 1 0
1 year ago

Nice! Can you orchestrate lambda or ecs tasks that way?

1 0 1 0
1 year ago

Just use PyIceberg with AWS Glue, probably the fastest way to get started.

2 0 1 0
1 year ago

In terms of the volume of data exchanged over the marketplace? No idea.

1 0 0 0
1 year ago

An SF sales rep told me that the marketplace was THE feature that helped a lot to convert.

3 0 1 0
1 year ago

Saw it a lot in finance!

3 0 1 0
1 year ago

Multiple pipelines will probably get tricky, no?

0 0 1 0
1 year ago

Dlt + duck + evidence + @blef.fr's baby?

3 0 1 0
1 year ago
GCP & Iceberg Ju Data Engineering Weekly - Ep 77

For those lost in GCP terminology:
I wrote a summary of Iceberg integration in GCP a couple of weeks ago:
juhache.substack.com/p/gcp-and-ic...

1 0 0 0
1 year ago
0$ Data Distribution Ju Data Engineering Weekly - Ep 78

New blog post: Building a 0$ Data Distribution System.

juhache.substack.com/p/0-data-dis...

5 0 0 0
1 year ago

30 million rows is just one month of data, right?

0 0 1 0
1 year ago
Home

check catalog.boringdata.io/dashboard/in...

2 0 0 0
1 year ago

Do you support Swiss banks by any chance?

2 0 0 0
1 year ago

"bash / make knowledge, a single instance SQL processing engine (DuckDB, CHDB or a few python scripts), a distributed file system, git and a developer workflow (CI/CD)": what's your best option to orchestrate SQL models in such a setup?

0 0 1 0
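
For readers wondering what orchestrating SQL models could look like in such a minimal stack, here is a rough sketch, not anything proposed in the thread. It assumes each model declares its upstream models by hand, and it uses Python's built-in `sqlite3` as a stand-in for a single-instance engine such as DuckDB. The model names and queries are hypothetical.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical models: name -> (upstream model names, SQL that builds the model).
MODELS = {
    "raw_orders": (
        set(),
        "CREATE TABLE raw_orders AS "
        "SELECT 1 AS id, 10 AS amount UNION ALL SELECT 2, 32",
    ),
    "stg_orders": (
        {"raw_orders"},
        "CREATE TABLE stg_orders AS "
        "SELECT id, amount FROM raw_orders WHERE amount > 0",
    ),
    "total_revenue": (
        {"stg_orders"},
        "CREATE TABLE total_revenue AS "
        "SELECT SUM(amount) AS revenue FROM stg_orders",
    ),
}


def run_models(conn, models):
    """Run every model after its upstream models; return the order used."""
    graph = {name: deps for name, (deps, _sql) in models.items()}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        conn.execute(models[name][1])
    conn.commit()
    return order


conn = sqlite3.connect(":memory:")  # stand-in for a local DuckDB file
execution_order = run_models(conn, MODELS)
revenue = conn.execute("SELECT revenue FROM total_revenue").fetchone()[0]
```

The same ordering could probably be expressed as `make` targets, one per model, which would stay closer to the bash / make spirit of the quote.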
1 year ago
javisantana.com

Some learnings after helping +50 companies in high performance data engineering projects

javisantana.com/2024/11/30/l...

38 12 4 3
1 year ago

Super good thx!
"immutable workflow + atomic operation" 100%!

1 0 1 0
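
One common reading of "atomic operation" for file-based pipelines is to write to a temporary file and then rename it into place, so readers never see a half-written output. The sketch below shows that pattern under this assumption. It is not taken from the linked article, and `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Write data to path so that readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing the file
        os.replace(tmp_path, path)  # atomic rename on POSIX and Windows
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Writing each run to a new, uniquely named output with this helper, instead of updating files in place, is one way to approximate the "immutable workflow" half of the quote.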
1 year ago

ATTACH 'url_to_your_database.duckdb';

1 0 1 0
1 year ago

Oh yeah

wiki.postgresql.org/wiki/Don%27t...

3 0 0 0
1 year ago

S3 to Snow in the same AWS region is free, no?

0 0 1 0
1 year ago

Niiiice, your view is doing a read_parquet(*) on their bucket? Or do you copy the data?

0 0 1 0
1 year ago
YouTube
DHH discusses SQLite (and Stoicism) YouTube video by Aaron Francis

1 Docker container embedding app code + SQLite DB → live chat app with 10k simultaneous users.

youtu.be/0rlATWBNvMw

1 0 0 0
1 year ago

Where is the data stored, then? In DuckDB itself?
So, if you have a 1GB dataset, does that mean you'll share a single .duckdb file containing the entire dataset? Or rather a view pointing to Parquet files: CREATE VIEW... AS read_parquet('*.parquet')?

0 0 0 0
1 year ago

Do you see DuckDB as a format?

For me:
• Parquet = Standard storage format
• Iceberg = Standard metadata format
• DuckDB = One possible distribution vector

0 0 1 0
1 year ago

Yup, there are almost fifteen million SQLite databases on Bluesky's PDS servers. It's wildly efficient and simple, but not without trade-offs, of course.

Makes sense for this use case, in large part because each user's atproto repository is self-contained, with links to other repos, like a website.

40 6 2 4
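
To make the "one SQLite database per user" idea concrete, here is a toy sketch using Python's built-in `sqlite3`. It is not how the PDS actually lays out its data: the `records` table, the file naming, and the helper functions are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile


def open_user_db(root, user_id):
    """Open (and create if needed) the self-contained database file of one user."""
    conn = sqlite3.connect(os.path.join(root, f"{user_id}.sqlite"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def put_record(root, user_id, key, value):
    conn = open_user_db(root, user_id)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
                (key, value),
            )
    finally:
        conn.close()


def get_record(root, user_id, key):
    conn = open_user_db(root, user_id)
    try:
        row = conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


root = tempfile.mkdtemp()  # stand-in for a server's data directory
put_record(root, "alice", "post/1", "hello")
# A record in one repo can link to another repo by reference, like a hyperlink.
put_record(root, "bob", "post/1", "reply to alice/post/1")
```

Because each file is independent, a user's data can be backed up, moved, or deleted without touching anyone else's. The trade-off is that queries spanning many users have to open many files.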
1 year ago
Home

just do both -> catalog.boringdata.io/dashboard/in...

1 0 1 0
1 year ago
A screenshot of .zshrc code with the following content:

# Function to auto-activate virtual environment
function auto_activate_virtualenv() {
    if [ -d ".venv" ]; then
        source .venv/bin/activate
    fi
}

# Hook the function to the 'chpwd' event (triggered when you change directories)
autoload -U add-zsh-hook
add-zsh-hook chpwd auto_activate_virtualenv

# Also call the function for the initial directory when the terminal starts
auto_activate_virtualenv

Here's what I put in my ~/.zshrc file to make sure my virtualenv auto-activates when I move to a directory with a .venv directory. Works well for me so far. Do the rest of you do something like this?

#Python #DataBS

38 2 8 0
1 year ago

Prediction: people will monetize custom feeds
github.com/bluesky-soci...

0 0 0 0
1 year ago

Something interesting is brewing in Iceberg-on-S3 land. 👀

lists.apache.org/thread/v7x65...

cc @eatonphil.bsky.social

29 5 3 2