Polars Meetup - Polars Cloud and Acceleration · Luma
Join the second edition of our Polars Meetup with talks from Ritchie Vink (Polars) and Vyas Ramasubramani (NVIDIA) to discuss accelerating and scaling…
Join me on the 24th in SF for a @pola.rs meetup!
I will be giving a talk about Polars, Polars Cloud and the upcoming distributed engine.
NVIDIA will also be giving a talk about their GPU acceleration with Polars-CuDF.
Hope to see you there!
lu.ma/60b6wfs8
14.07.2025 17:42
No more `with pl.StringCache()`
Soon...
04.07.2025 15:05
Polars has gotten 4x faster than Polars!
Over the last months, the team has worked incredibly hard on the new streaming engine, and the results are paying off. It is incredibly fast, and beats the Polars in-memory engine by a factor of 4 on a 96 vCPU machine.
01.05.2025 14:05
** Sponsor announcement ** Polars is a Supporter of RustWeek!
Find out more about them here: pola.rs
Thank you @pola.rs for your support!
More info about RustWeek and tickets: rustweek.org
#rustweek #rustlang
18.04.2025 12:07
Yeah, or even for a single remote machine. E.g. let's say you run a very small node as an Airflow orchestrator, but need a big VM for the ETL job. That orchestrator can initiate the remote query and doesn't have to worry about hardware setup/teardown.
17.02.2025 13:44
Already got all TPC-H queries running distributed!
12.02.2025 14:16
He is a notorious Polars hater and has tweeted he wants the project to fail.
He fears that Polars' deviation from the pandas API will splinter the landscape, and he doesn't appreciate new API development. I haven't read a technical reason from his side.
03.02.2025 11:48
That was all pre 1.0.
We released 1.0 in July last year. The API is stable now.
28.01.2025 19:36
I would call that a weakness of the AI.
28.01.2025 17:09
In this week's Polars release we shipped initial Unity Catalog support. This makes integration with Databricks much smoother.
Write support is under development and will follow soon. Full release notes: github.com/pola-rs/pola...
25.01.2025 11:47
Modern Polars
A side-by-side comparison of the Polars and Pandas libraries.
Learning polars has been ... actually a joy? It just makes sense to my #rstats #dplyr trained data muscles #databs #python
kevinheavey.github.io/modern-polars/
23.01.2025 00:01
Release Python Polars 1.20.0 · pola-rs/polars
⚠️ Deprecations
Make parameter of str.to_decimal keyword-only (#20570)
🚀 Performance improvements
Extend functionality on BitmapBuilder and use in Growables (#20754)
Specialize first/last agg fo...
This week's Polars release brings a huge improvement for window functions. They can be an order of magnitude faster.
And we can run 20 of the 22 TPC-H queries on the new streaming engine, and all of them on Polars Cloud. More will follow soon! ;)
See the full release docs here:
github.com/pola-rs/pola...
19.01.2025 08:57
A screenshot of a Pyodide REPL executing Polars code:
import polars as pl
import requests
r = requests.get("https://raw.githubusercontent.com/pola-rs/polars/refs/heads/main/examples/datasets/foods2.csv")
pl.read_csv(r.content).group_by("category").mean()
A screenshot of a Quarto Live code cell executing Polars code:
import polars as pl
import requests
r = requests.get("https://raw.githubusercontent.com/pola-rs/polars/refs/heads/main/examples/datasets/foods2.csv")
pl.read_csv(r.content).group_by("category").mean()
A screenshot of a Shinylive app using Polars code:
from shiny import App, render, ui
import polars as pl
from pathlib import Path

app_ui = ui.page_fluid(
    ui.input_select("cyl", "Select Cylinders", choices=["4", "6", "8"]),
    ui.output_data_frame("filtered_data")
)

def server(input, output, session):
    df = pl.read_csv(Path(__file__).parent / "mtcars.csv")

    @output
    @render.data_frame
    def filtered_data():
        return (df
            .filter(pl.col("cyl") == int(input.cyl()))
            .select(["mpg", "cyl", "hp"]))

app = App(app_ui, server)
Recently I've been working on getting #polars running in #pyodide. This was a fun one, even requiring patches to LLVM's #wasm writer! Everything has now been upstreamed, and earlier this week Pyodide v0.27.0 was released, including a Wasm build of Polars usable in Pyodide, Shinylive and Quarto Live.
04.01.2025 11:59
demo of dt.replace
✨ New temporal feature in the next Polars release!
⏲️ dt.replace lets you replace components of Date / Datetime columns
⚡🦀 It's an expressified vectorised rustified version of the Python standard library datetime.replace
20.12.2024 16:25
We removed serde from our Series struct and saw a significant drop in Polars' binary size (with all features activated). The amount of codegen is huge.
18.12.2024 13:32
We finally support writing to cloud storage natively and seamlessly!
12.12.2024 11:04
YouTube video by probabl
Making a recommender by just using Polars!
Join us this Friday if you're eager to see what it can be like to design a recommender while limiting ourselves to just a DataFrame API. It is somewhat unconventional, but a great excuse to show off a Polars trick or two.
www.youtube.com/watch?v=U3Fi...
10.12.2024 10:50
Interesting. Is it an authentication error with Azure Storage?
08.12.2024 06:11
Nice post.
For the benchmarks, I think it would be fairer to use `scan_csv`, as `read_csv` forces full materialization.
P.S. Why use pyarrow to read instead of our native reader?
05.12.2024 06:00
A diagram shows how to use str.split, list.first / list.last, cast, pl.int_ranges, and explode, all together, to turn a dataframe where a column may contain ranges like "3-5" into a similar dataframe where all ranges have been expanded, or exploded, across multiple rows.
The full code is:
range_start = pl.col("nrs").str.split("-").list.first().cast(pl.Int64)
range_end = pl.col("nrs").str.split("-").list.last().cast(pl.Int64)
df.with_columns(pl.int_ranges(range_start, range_end + 1)).explode("nrs")
How to “expand” ranges like "3-5" across new rows with the values 3, 4, 5?
This comes straight from our Discord server (discord.com/invite/4UfP5...)
21.11.2024 14:36
Diagram showing how `value_counts` produces a column with struct values, mapping column values to their counts.
We then show how to use `.struct.field` to extract a single field from the struct and how to use `.struct.unnest` to extract all fields into corresponding columns.
Why is there a `struct` data type?
A single expression produces a single column, so expressions like `value_counts` need to output structs to map the values to their counts.
With that said, do you understand why `.struct.unnest` doesn't break the 1 expr = 1 column principle?
20.11.2024 11:14