Appreciate the kind words!
03.11.2025 23:24
@xiangpeng.systems.bsky.social
Database/storage. Flight/DataFusion/Arrow/Parquet. PhD student @ UW-Madison. https://xiangpeng.systems
Nice to see this getting shared! Now I'm even more motivated to turn it into a full course.
29.10.2025 19:01
Just like other big cities, Madison is getting its own systems talk series. Come join us!
24.10.2025 20:08
LiquidCache: a distributed pushdown cache for DataFusion, designed to cut down S3 requests for diskless databases.
Code: github.com/XiangpengHao...
Paper (VLDB 2026): github.com/XiangpengHao...
Thank you for sharing! Slides are here: what-is-liquid-cache.xiangpeng.systems
02.09.2025 00:26
Hey Tyler, welcome back! I'd be happy to chat. I work in the data systems space (database + storage + cloud), in the same group that also studies storage faults!
01.08.2025 19:52
Project repo: github.com/XiangpengHao...
16.05.2025 00:53
Join my PhD prelim talk next Monday:
Data-Aware Caching for Cloud Analytics
May 19, 1PM CDT
CS2310 or Zoom: uwmadison.zoom.us/j/3081128886
My manifesto on optimizing SQL and DataFrames in query engines (including an explanation of why Apache DataFusion doesn't have a complex join ordering algorithm):
www.influxdata.com/blog/optimiz... www.influxdata.com/blog/optimiz...
New blog post: "Build your own S3-Select in 400 lines of Rust"
Check it out: blog.xiangpeng.systems/posts/build-...
Credit goes to github.com/excalidraw/e... for making it easy.
14.03.2025 14:00
Here's the PR: github.com/apache/arrow...
13.03.2025 18:36
I submitted a PR that cuts average ClickBench latency by 15% for DataFusion! But reviewing it wasn't straightforward because performance tuning involves complex dynamics, so I wrote a blog post to explain why it works -- check it out: blog.xiangpeng.systems/posts/parque...
13.03.2025 18:36
We are excited to share Fray Debugger (aoli.al/blogs/deadlo...), an IntelliJ plugin that allows you to control concurrent execution deterministically!
We have translated the Deadlock Empire (deadlockempire.github.io) into Java to demonstrate how to use Fray Debugger.
Meanwhile, as a PhD student, I still feel frustrated comparing my systems to many ideas that seem novel but lack practical impact. That said, I find "feet on the ground, head in the clouds" research very inspiring -- it's probably what keeps me motivated to stay in academia.
10.03.2025 19:11
Thanks for the insightful points, Marc! I totally agree that academia is important in many areas. I'm planning a follow-up post discussing the kinds of research that are impactful and beneficial to people, and your examples strongly resonate with what I have in mind!
10.03.2025 19:05
Thanks for sharing your perspective! It's always helpful to hear insights from folks who've spent time in industry. There's definitely room for academia to evolve, and I'm hopeful it will :)
10.03.2025 18:49
@xiangpeng.systems shared a great post about system researchers. I wrote a comment on it and would like to share some thoughts here and offer complementary ideas.
In short: build paper with open source.
xuanwo.io/links/2025/0...
Wrote a blog post reflecting on DeepSeek, NSF funding, and the systems research community in general. Apologies for the bold claims -- I hope they invite some discussion.
blog.xiangpeng.systems/posts/system...
Compiling to WASM is a very interesting idea! I think Fray explored this a bit at some point, but I'm not sure about the current status.
22.02.2025 18:52
Current approaches need to replace std locks with framework-provided locks, like the ones in shuttle: docs.rs/shuttle/late...
I think binary instrumentation like the one in this paper is possible, but I'm not an expert on this. www.microsoft.com/en-us/resear...
I heard from a Fray dev that it is getting a built-in interactive debugger that visualizes what each thread is doing at a given moment. I can see it being incredibly useful!
22.02.2025 18:44
Yes, Loom and shuttle: github.com/awslabs/shut...
They are incredibly useful at identifying and reproducing bugs, but I find it quite hard to use them with a debugger, as lldb frequently needs to jump to different stacks and I soon lose track of what's going on...
Check out the underlying framework: github.com/cmu-pasta/fray
Looking forward to future Rust support!
It uses Gemini free tier API to translate natural language to SQL: ai.google.dev/pricing#1_5f...
24.11.2024 20:01
Reading S3 files (through OpenDAL) is planned for next weekend :-)
24.11.2024 19:59
My weekend project now comes with AI superpowers! Now you can explore Parquet data with natural language! parquet-viewer.haoxp.xyz
24.11.2024 19:58
I helped on the string view part, along with many others!
22.11.2024 02:06
This is amazing -- an open source query engine built on open standards is now the fastest, and it is in Rust! datafusion.apache.org/blog/2024/11...
21.11.2024 23:22
New blog post on the fun new hardware advancements which databases can leverage for great gains, and why the cloud means it doesn't matter that they exist.
transactional.blog/b...