Ritchie Vink @ritchie46 - Bluesky Profile

In 1-2 weeks we land live query profiling in Polars Cloud.

See exactly how many rows are consumed and produced per operation, which operation takes the most runtime, and watch the data flow through live, like water. 😍

02.02.2026 14:43 — 👍 9 🔁 1 💬 0 📌 0

ClickBench now runs the Polars streaming engine.

Polars is the fastest solution on that benchmark on Parquet file(s) 😎

The speed is there. This year, we will tackle out-of-core (spill to disk) and distributed execution to truly handle scale.

benchmark.clickhouse.com#system=-ahi|...

14.01.2026 11:13 — 👍 7 🔁 0 💬 0 📌 0

I am sorry about that. We do welcome contributions, however design-wise we have to be strict. I would always recommend picking accepted issues and asking if now is the time to implement them before putting in the effort.

10.01.2026 18:05 — 👍 0 🔁 0 💬 1 📌 0

Release Python Polars 1.36.0-beta.2 · pola-rs/polars 🏆 Highlights Add Extension types (#25322) 🚀 Performance improvements Reduce HuggingFace API calls (#25521) Use strong hash instead of traversal for CSPE equality (#25537) Fix panic in is_between...

The pre-release of Polars 1.36 is out. Please give it a try so that we can ensure a stable final release with minimal regressions.

It lands a lot of goodies:

- Extension types
- Lazy pivots
- Streaming group_by_dynamic
- Float16 support
- Nested .over() expressions

github.com/pola-rs/pola...

02.12.2025 13:07 — 👍 7 🔁 1 💬 0 📌 0

Polars 1.34.0 is out!

Any Polars query can be turned into a generator!

Aside from that, Polars now properly supports decimal types, `scan_iceberg` is completely native, cross joins can maintain order, and much more.

Changelogs here:

- github.com/pola-rs/pola...
- github.com/pola-rs/pola...

03.10.2025 14:34 — 👍 6 🔁 0 💬 0 📌 0

Struct:

Think of a tuple with named fields. Or multiple columns in a single column.

16.09.2025 17:49 — 👍 0 🔁 0 💬 0 📌 0

Release Python Polars 1.32.0 · pola-rs/polars 🏆 Highlights Make Selector a concrete part of the DSL (#23351) Rework Categorical/Enum to use (Frozen)Categories (#23016) 🚀 Performance improvements Lower Expr.slice to streaming engine (#23683)...

4/4
Many new expressions are lowered to the streaming engine. This means you can run more queries faster!

See the full changelog here:

github.com/pola-rs/pola...

11.08.2025 15:02 — 👍 0 🔁 0 💬 0 📌 0

3/4
More joins will be lowered to stricter variants based on predicates. This can save a lot of intermediate rows!

11.08.2025 15:02 — 👍 0 🔁 0 💬 1 📌 0

refactor: Rework Categorical/Enum to use (Frozen)Categories by orlp · Pull Request #23016 · pola-rs/polars NoteTLDR Categoricals are completely reimplemented to be streaming compatible and fit better into the Polars Data model. They should generally be faster, more stable and more reliable. Physical ord...

2/4
The Categorical type is now streaming! No `StringCache` anymore and working Categoricals in distributed Polars.

github.com/pola-rs/pola...

11.08.2025 15:02 — 👍 0 🔁 0 💬 1 📌 0

Polars 1.32 is out and it lands a lot!

Let's go through a few:

1/4
Selectors are now implemented in Rust and we can finally select arbitrary nested types:

11.08.2025 15:02 — 👍 17 🔁 1 💬 1 📌 1

Polars Meetup - Polars Cloud and Acceleration · Luma Join the second edition of our Polars Meetup with talks from Ritchie Vink (Polars) and Vyas Ramasubramani (NVIDIA) to discuss accelerating and scaling…

Join me on the 24th in SF for a @pola.rs meetup!

I will be giving a talk about Polars, Polars Cloud and the upcoming distributed engine.

NVIDIA will also be giving a talk about their GPU acceleration with Polars-CuDF.

Hope to see you there!

lu.ma/60b6wfs8

14.07.2025 17:42 — 👍 1 🔁 0 💬 1 📌 0

No more `with pl.StringCache()`

Soon... 🌈

04.07.2025 15:05 — 👍 0 🔁 0 💬 0 📌 0

GP1085: Scaling DataFrames With Polars - Theme: undefined Location: NVIDIA GTC PARIS - Pavillon 7 - June 12 1:00 PM 1:45 PM - CET | Resume: Room: N03 Polars is a query engine with a DataFrame frontend designed for fast, efficient data processing. This sess...

This Thursday I will join Lawrence Mitchell from @nvidia on the podium during the NVIDIA GTC in Paris.

We'll discuss how we made Polars work on the GPU and how it will scale to multi-GPU in the future.

See you there!

vivatechnology.com/sessions/ses...

10.06.2025 13:58 — 👍 2 🔁 0 💬 0 📌 0

Polars has gotten 4x faster than Polars! 🚀

In the last months, the team has worked incredibly hard on the new streaming engine and the results are paying off. It is incredibly fast, and beats the Polars in-memory engine by a factor of 4 on a 96 vCPU machine.

01.05.2025 14:05 — 👍 16 🔁 3 💬 4 📌 0

** Sponsor announcement ** Polars is a Supporter of RustWeek!
Find out more about them here: pola.rs

Thank you @pola.rs for your support! 🙏

More info about RustWeek and tickets: rustweek.org

#rustweek #rustlang

18.04.2025 12:07 — 👍 2 🔁 1 💬 0 📌 0

Yeah, or even for single machine remotely. E.g. let's say you run a very small node as airflow orchestrator, but need a big VM for the ETL job. That orchestrator can initiate the remote query and doesn't have to worry about hardware setup/teardown.

17.02.2025 13:44 — 👍 1 🔁 0 💬 1 📌 0

Already got all TPC-H queries running distributed!

12.02.2025 14:16 — 👍 7 🔁 0 💬 0 📌 0

He is a notorious Polars hater and has tweeted he wants the project to fail.

He fears that Polars' deviation from the pandas API will splinter the landscape and doesn't appreciate new API development. I haven't read a technical reason from his side.

03.02.2025 11:48 — 👍 1 🔁 0 💬 2 📌 0

That was all pre 1.0.

We released 1.0 in July last year. The API is stable now.

28.01.2025 19:36 — 👍 2 🔁 0 💬 1 📌 0

I would call that a weakness of the AI. 😅

28.01.2025 17:09 — 👍 1 🔁 0 💬 1 📌 0

In this week's Polars release we shipped initial Unity Catalog support. This makes integration with Databricks much smoother.

Writing features are under development and will follow soon. Full release notes: github.com/pola-rs/pola...

25.01.2025 11:47 — 👍 11 🔁 4 💬 0 📌 0

Modern Polars A side-by-side comparison of the Polars and Pandas libraries.

Learning polars has been ... actually a joy? It just makes sense to my #rstats #dplyr trained data muscles #databs #python

kevinheavey.github.io/modern-polars/

23.01.2025 00:01 — 👍 44 🔁 12 💬 2 📌 2

Release Python Polars 1.20.0 · pola-rs/polars ⚠️ Deprecations Make parameter of str.to_decimal keyword-only (#20570) 🚀 Performance improvements Extend functionality on BitmapBuilder and use in Growables (#20754) Specialize first/last agg fo...

This week's Polars release has a huge improvement for window functions. They can be an order of magnitude faster.

And we can run 20/22 TPC-H queries on the new streaming engine and all of them on Polars Cloud. More will follow soon! ;)

See the full release docs here:

github.com/pola-rs/pola...

19.01.2025 08:57 — 👍 20 🔁 2 💬 0 📌 0

A screenshot of a Pyodide REPL executing Polars code: import polars as pl import requests r = requests.get("https://raw.githubusercontent.com/pola-rs/polars/refs/heads/main/examples/datasets/foods2.csv") pl.read_csv(r.content).group_by("category").mean()

A screenshot of a Quarto Live code cell executing Polars code: import polars as pl import requests r = requests.get("https://raw.githubusercontent.com/pola-rs/polars/refs/heads/main/examples/datasets/foods2.csv") pl.read_csv(r.content).group_by("category").mean()

A screenshot of a Shinylive app using Polars code: from shiny import App, render, ui import polars as pl from pathlib import Path app_ui = ui.page_fluid( ui.input_select("cyl", "Select Cylinders", choices=["4", "6", "8"]), ui.output_data_frame("filtered_data") ) def server(input, output, session): df = pl.read_csv(Path(__file__).parent / "mtcars.csv") @output @render.data_frame def filtered_data(): return (df .filter(pl.col("cyl") == int(input.cyl())) .select(["mpg", "cyl", "hp"])) app = App(app_ui, server)

Recently I've been working on getting #polars running in #pyodide. This was a fun one, even requiring patches to LLVM's #wasm writer! Everything has now been upstreamed and earlier this week Pyodide v0.27.0 was released, including a Wasm build of Polars usable in Pyodide, Shinylive and Quarto Live 🎉

04.01.2025 11:59 — 👍 49 🔁 9 💬 0 📌 0

demo of dt.replace

✨ New temporal feature in the next Polars release!

⏲️ dt.replace lets you replace components of Date / Datetime columns

⚡🦀 It's an expressified vectorised rustified version of the Python standard library datetime.replace

20.12.2024 16:25 — 👍 9 🔁 1 💬 0 📌 1

We removed serde from our Series struct and saw a significant drop in Polars' binary size (with all features activated). The amount of codegen is huge. 😮

18.12.2024 13:32 — 👍 5 🔁 0 💬 0 📌 0

We finally support writing to cloud storage natively and seamlessly!

12.12.2024 11:04 — 👍 18 🔁 2 💬 0 📌 1

YouTube video by probabl Making a recommender by just using Polars!

Join us this Friday if you're eager to see what it can be like to design a recommender while limiting ourselves to just a DataFrame API. It is somewhat unconventional, but a great excuse to show off a Polars trick or two.

www.youtube.com/watch?v=U3Fi...

10.12.2024 10:50 — 👍 14 🔁 2 💬 0 📌 1

Interesting. Is it an authentication error with Azure Storage?

08.12.2024 06:11 — 👍 0 🔁 0 💬 1 📌 0

Nice post.

For the benchmarks, I think it would be fairer if you used `scan_csv`, as `read_csv` forces full materialization.

P.S. Why use pyarrow to read instead of our native reader?

05.12.2024 06:00 — 👍 1 🔁 0 💬 1 📌 0
