
Ryan Marcus

@ryanmarc.us.bsky.social

Assistant professor at UPenn. Database systems. https://RyanMarc.us I'm mostly on Mastodon, https://discuss.systems/@ryanmarcus

229 Followers  |  119 Following  |  3 Posts  |  Joined: 21.11.2023

Latest posts by ryanmarc.us on Bluesky

Infographic describing BayesQO, an offline, multi-iteration learned query optimizer. On the left, it shows a Variational Autoencoder (VAE) being pretrained to reconstruct query plans from vectors, using orange-colored plan diagrams. The decoder part of the VAE is retained. In the center and right, the image shows Bayesian optimization being performed in the learned vector space: new vectors are decoded into query plans, tested for latency, and refined iteratively. At the bottom, a library of optimized query plans is used to train a robot labeled “LLM,” which can then generate new plans directly. The caption reads: "We get a fast query, but also a library of high-quality plans. We can train an LLM to speed up the process for next time!" The image credits Jeff Tao et al., SIGMOD '25, and links to https://rm.cab/bayesqo

For that one query that must go 𝑟𝑒𝑎𝑙𝑙𝑦 𝑓𝑎𝑠𝑡, BayesQO (by Jeff Tao) finds superoptimized plans using Bayesian optimization in a learned plan space. It’s costly, but the results can train an LLM to speed things up next time.

📄 https://rm.cab/bayesqo

03.06.2025 19:34 · 👍 3    🔁 0    💬 1    📌 0
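
A minimal sketch of the BayesQO-style loop the post and infographic describe, using scikit-learn's Gaussian process as the surrogate. `decode_plan` and `measure_latency` are hypothetical stand-ins for the retained VAE decoder and for actually executing the decoded plan; this is an illustration of the idea, not the authors' code.

```python
# Sketch of Bayesian optimization in a learned plan space (assumptions noted below).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

LATENT_DIM = 8      # dimensionality of the learned plan space (made-up value)
N_ITERATIONS = 30   # offline budget: each iteration executes one candidate plan
N_CANDIDATES = 256  # latent vectors scored by the surrogate per iteration

def decode_plan(z: np.ndarray) -> str:
    """Hypothetical stand-in for the retained VAE decoder (vector -> query plan)."""
    return f"plan<{np.round(z, 2).tolist()}>"

def measure_latency(plan: str, z: np.ndarray) -> float:
    """Hypothetical stand-in for running the plan and timing it (toy objective)."""
    return float(np.sum(z ** 2) + 0.1 * np.random.randn())

# Warm up with a few random latent vectors and their observed latencies.
Z = np.random.randn(5, LATENT_DIM)
y = np.array([measure_latency(decode_plan(z), z) for z in Z])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(N_ITERATIONS):
    gp.fit(Z, y)

    # Propose candidate vectors and pick the one with the best expected
    # improvement over the fastest plan observed so far (minimization).
    candidates = np.random.randn(N_CANDIDATES, LATENT_DIM)
    mu, sigma = gp.predict(candidates, return_std=True)
    improvement = y.min() - mu
    z_score = improvement / (sigma + 1e-9)
    ei = improvement * norm.cdf(z_score) + sigma * norm.pdf(z_score)
    z_next = candidates[np.argmax(ei)]

    # Decode, execute, and fold the observed latency back into the surrogate.
    latency = measure_latency(decode_plan(z_next), z_next)
    Z = np.vstack([Z, z_next])
    y = np.append(y, latency)

# The (plan, latency) pairs gathered here form the "library" the post says
# can later train an LLM to propose good plans directly.
print("best latency found:", y.min())
```
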
Infographic describing LimeQO, a workload-level, offline, learned query optimizer. On the left, it shows a workload consisting of multiple queries (q₁ to q₄), each with a default execution time (3s, 9s, 12s, 22s respectively). On the right, alternate plans (h₁, h₂, h₃) show varying execution times for each query, with some entries missing (represented by question marks). For example, q₁ takes 1s under h₂, much faster than the 3s default. A specific callout highlights that for q₃, plan h₃ reduced the time from 12s to 3s, but took 18s to find, resulting in a benefit of 9s gained / 18s search. The image poses the question: “Where should we explore next to maximize benefit?” The image credits Zixuan Yi et al., SIGMOD '25, and provides a link: https://rm.cab/limeqo

LimeQO (by Zixuan Yi), a 𝑤𝑜𝑟𝑘𝑙𝑜𝑎𝑑-𝑙𝑒𝑣𝑒𝑙 approach to query optimization, can use neural networks or simple linear methods to find good query hints significantly faster than a random or brute-force search.

📄 https://rm.cab/limeqo

03.06.2025 19:34 · 👍 2    🔁 0    💬 1    📌 0
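
A minimal sketch of the exploration step the LimeQO infographic poses ("where should we explore next?"), using a simple rank-1 completion of the partially observed latency matrix. The values shown in the infographic (q₁/h₂ = 1s, q₃/h₃ = 3s, and the defaults) are used as-is; the q₂/h₁ and q₄/h₂ entries are invented for the example, and the paper's weighing of search cost against time saved is omitted for brevity.

```python
# Sketch of picking the next (query, hint) pair to measure via low-rank completion.
import numpy as np

# Rows are queries q1..q4, columns are hint sets h1..h3; entries are observed
# latencies in seconds, NaN = not yet measured.
defaults = np.array([3.0, 9.0, 12.0, 22.0])   # default plan latency per query
observed = np.array([
    [np.nan, 1.0,    np.nan],
    [8.0,    np.nan, np.nan],   # invented value for illustration
    [np.nan, np.nan, 3.0   ],
    [np.nan, 20.0,   np.nan],   # invented value for illustration
])

# Complete the matrix with a rank-1 approximation: seed missing entries with
# column means, then alternate truncated SVD and re-imputation.
filled = observed.copy()
col_means = np.nanmean(observed, axis=0)
missing = np.isnan(observed)
filled[missing] = np.take(col_means, np.where(missing)[1])
for _ in range(50):
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    low_rank = s[0] * np.outer(U[:, 0], Vt[0, :])
    filled[missing] = low_rank[missing]

# Best latency currently known for each query (default plan or an observed hint).
best_known = np.fmin(defaults, np.nanmin(observed, axis=1))

# Predicted time saved if we measured each still-unobserved (query, hint) pair;
# already-measured cells are excluded with -inf.
predicted_savings = np.where(missing, best_known[:, None] - filled, -np.inf)
q, h = np.unravel_index(np.argmax(predicted_savings), predicted_savings.shape)
print(f"explore next: query q{q + 1} with hint set h{h + 1} "
      f"(predicted saving ~{predicted_savings[q, h]:.1f}s)")
```
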

OLAP workloads are dominated by repetitive queries -- how can we optimize them?

A promising direction is to do 𝗼𝗳𝗳𝗹𝗶𝗻𝗲 query optimization, allowing for a much more thorough plan search.

Two new SIGMOD papers! 🧵

03.06.2025 19:34 · 👍 6    🔁 0    💬 1    📌 0
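
A toy illustration, not from either paper, of why repetition makes the extra offline search pay off: the expensive search runs once per query template, and every later arrival of the same template reuses the result. The template hashing and queue here are hypothetical stand-ins.

```python
# Sketch of reusing offline-optimized plans for repetitive OLAP queries.
import hashlib

plan_cache: dict[str, str] = {}   # query template -> best plan found offline
offline_queue: list[str] = []     # templates still awaiting a thorough search

def template_key(sql: str) -> str:
    """Hypothetical canonicalization: hash normalized SQL text as a stand-in
    for real parameter stripping / query fingerprinting."""
    return hashlib.sha256(" ".join(sql.lower().split()).encode()).hexdigest()

def plan_for(sql: str) -> str:
    key = template_key(sql)
    if key in plan_cache:
        return plan_cache[key]          # repeat query: reuse the offline result
    offline_queue.append(sql)           # schedule a thorough offline search
    return "default_optimizer_plan"     # serve this run with the normal optimizer

# Once the offline search populates plan_cache[template_key(sql)], subsequent
# arrivals of the same query template skip straight to the optimized plan.
```
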
