Mihail Stoian

@mihailstoian.bsky.social

Second-year database PhD student @UTN. Interned @Oracle, @AWS

16 Followers  |  14 Following  |  8 Posts  |  Joined: 02.12.2024

Latest posts by mihailstoian.bsky.social on Bluesky

Andi Zimmerer | Pruning in Snowflake: Working Smarter, Not Harder Modern cloud-based data analytics systems must efficiently process petabytes of data residing on cloud storage. A key optimization technique in state-of-the-art systems like Snowflake is partition pru...

"The fastest way of processing data is to not process it."

Our SIGMOD 2025 paper shows how Snowflake skips 99.4% of data with new pruning techniques for LIMIT, top-k, and JOIN queries.

Blog: snowflakepruning.github.io
Paper: arxiv.org/abs/2504.11540

@sigmod2025.bsky.social

05.05.2025 05:09 - 👍 5    🔁 1    💬 0    📌 1

SIGMOD BEST PAPER Honorable Mentions
🥇 CRDV: Conflict-free Replicated Data Views
Nuno Faria (INESCTEC & U. Minho)*; José Pereira (U. Minho & INESCTEC)
🥇 DPconv: Super-Polynomially Faster Join Ordering
Mihail Stoian (UTN)*; Andreas Kipf (UTN)

22.04.2025 19:29 - 👍 2    🔁 1    💬 0    📌 0

From top to bottom in the list of accepted papers (2025.sigmod.org/sigmod_paper...):

Yannakakis+
RPT
Galley
LpBound [my favorite]
PDX [my 2nd favorite]
GFTR
Libdbos [my 3rd favorite]
MementoFilter
Spilly

10.04.2025 17:57 - 👍 2    🔁 0    💬 0    📌 0

DPconv just won a SIGMOD'25 Honorable Mention! 🥁

I was quite impressed, given this year's high-quality papers. Let's see who won the big prize.

My list of candidates in the thread below 🧡.

Paper: dl.acm.org/doi/10.1145/...
Slides: stoianmihail.github.io/assets/dpcon...

10.04.2025 17:57 - 👍 3    🔁 0    💬 1    📌 1

🔺 Redbench is now live: github.com/utndatasyste....

Let's see how workload-aware your system really is.

09.04.2025 22:11 - 👍 1    🔁 0    💬 0    📌 0

Thrilled to share that we've received the Best Demonstration Award 🏆 at EDBT 2025!

Congratulations to my students @mihailstoian.bsky.social and Ping-Lin Kuo for their excellent work and dedication over the past few weeks. Well deserved!

Paper: openproceedings.org/2025/conf/ed...

28.03.2025 13:39 - 👍 4    🔁 1    💬 0    📌 0

Idea: LinDP's limitation lies in its cubic-time DP enumeration strategy, which may enumerate invalid subplans.

Our fix:

1. We output only the valid subplans inspired by DPccp.

2. We also transfer DP-states across linearizations by exploiting how IKKBZ creates them.

13.01.2025 07:20 - 👍 1    🔁 0    💬 0    📌 0
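
To make the "cubic-time DP enumeration" in the post above concrete, here is a minimal Python sketch of an interval DP over one fixed linearization of the relations. It is not the authors' implementation: the function name `linearized_dp`, the inputs `card` and `sel`, and the toy cost model (sum of intermediate result sizes) are all assumptions made for illustration. It only counts how many of the enumerated splits are valid, that is, join two connected sub-intervals linked by at least one join edge.

```python
import math


def linearized_dp(order, card, sel):
    """Interval DP over a fixed linear order of relations (illustrative only).

    order: list of relation names (the linearization)
    card:  dict name -> base cardinality (hypothetical input)
    sel:   dict (name, name) -> join selectivity, one entry per join edge
    Returns (best cost, splits tried, splits that were valid).
    """
    n = len(order)
    pos = {r: i for i, r in enumerate(order)}
    # Join edges expressed as (left position, right position, selectivity).
    edges = [(min(pos[a], pos[b]), max(pos[a], pos[b]), s)
             for (a, b), s in sel.items()]

    def size(i, j):
        # Toy estimate: product of base cardinalities times the
        # selectivities of all edges that lie inside the interval.
        out = 1.0
        for p in range(i, j + 1):
            out *= card[order[p]]
        for a, b, s in edges:
            if i <= a and b <= j:
                out *= s
        return out

    def crosses(i, k, j):
        # True if some edge connects [i..k] with [k+1..j].
        return any(i <= a <= k < b <= j for a, b, _ in edges)

    cost = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = 0.0

    tried = valid = 0
    # The triple loop below is the cubic enumeration; the helper scans
    # in size() and crosses() are kept naive for readability.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                tried += 1
                left, right = cost[i][k], cost[k + 1][j]
                if math.isinf(left) or math.isinf(right) or not crosses(i, k, j):
                    continue  # invalid subplan: would need a cross product
                valid += 1
                cost[i][j] = min(cost[i][j], left + right + size(i, j))
    return cost[0][n - 1], tried, valid


# Star query: hub H joined with leaves A, B, C (made-up statistics).
card = {"H": 1000, "A": 10, "B": 10, "C": 10}
sel = {("H", "A"): 0.1, ("H", "B"): 0.1, ("H", "C"): 0.1}
best, tried, valid = linearized_dp(["H", "A", "B", "C"], card, sel)
print(f"cost={best}, splits tried={tried}, valid={valid}")
```

On this four-relation star, the loop tries 10 splits but only 3 of them are valid, which is the kind of wasted enumeration the post refers to. Emitting only the valid subplans, as the post describes, is not shown here.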

📎 Paper: db.in.tum.de/~birler/pape...

Code: github.com/umbra-db/ada...

Appears at BTW: btw2025.gi.de (the German database conference where "Unnesting Arbitrary Queries" appeared).

13.01.2025 07:20 - 👍 1    🔁 0    💬 1    📌 0

Umbra's DP optimizer for queries of ~100 relations ran in cubic time.

AWS Redshift's Redset captures a 2,296-relation query.

Our revamped DP enumeration optimizes tree queries like snowflakes of *millions* of relations within 1 sec. 🛸

Joint work w/ Altan Birler & Thomas Neumann.

13.01.2025 07:20 - 👍 1    🔁 0    💬 1    📌 0
Lightweight Correlation-Aware Table Compression
YouTube video by Table Representation Learning

If you don't manage to come by, do check out our 3-min presentation:

📹 www.youtube.com/watch?v=B0bU...
📎 openreview.net/forum?id=z7e...

14.12.2024 01:06 - 👍 1    🔁 0    💬 0    📌 0

Are you a fan of Parquet and at #NeurIPS2024 tomorrow? Let's meet at our poster at @trl-research.bsky.social to see how you can reduce your Parquet file sizes by up to 40%.

Virtual compresses tables via functions while ensuring fast column scans.

⏰ 2.30pm
📍 East Meeting Room 11 & 12

14.12.2024 01:06 - 👍 5    🔁 3    💬 1    📌 0
Thumbnail: DataLoom: Simplifying Data Loading with LLMs

Vol:17 No:12 → DataLoom: Simplifying Data Loading with LLMs
👥 Authors: Alexander Van Renen, Mihail Stoian, Andreas Kipf
📄 PDF: https://www.vldb.org/pvldb/vol17/p4449-renen.pdf

02.12.2024 05:00 - 👍 3    🔁 2    💬 0    📌 0

@mihailstoian is following 14 prominent accounts