Read the full breakdown of why PDFs are so challenging and how we're tackling it: www.llamaindex.ai/blog/why-re...
06.03.2026 19:01
We built LlamaParse using this hybrid approach: fast text extraction for standard content, vision models for complex layouts. It's how we're solving document processing at scale.
06.03.2026 19:01
• Reading order is guesswork: the order of a content stream is not guaranteed to match the visual flow
• Seventy years of OCR evolution led us to combine text extraction with vision models for optimal results
• PDF text isn't stored as characters: it's glyph shapes positioned at coordinates with no semantic meaning
• Tables don't exist as objects: they're just lines and text that happen to look tabular when rendered
PDFs are the bane of every AI agent's existence: here's why parsing them is so much harder than you think.
Every developer building document agents eventually hits the same wall: PDFs weren't designed to be machine-readable. They're drawing instructions with roots in 1980s PostScript, not structured data.
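To make the "glyphs at coordinates" point concrete, here is a minimal sketch of the kind of heuristic a text extractor has to apply. The `Glyph` class, the `reading_order` function, the tolerance value, and the sample coordinates are all hypothetical illustrations, not LlamaParse's implementation.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    text: str
    x: float  # left edge, in page units
    y: float  # baseline, measured from the top of the page


def reading_order(glyphs, line_tolerance=2.0):
    """Group glyphs into lines by baseline, then sort each line left to right."""
    lines = []  # each entry: [baseline_y, [glyphs on that line]]
    for g in sorted(glyphs, key=lambda g: g.y):
        if lines and abs(lines[-1][0] - g.y) <= line_tolerance:
            lines[-1][1].append(g)
        else:
            lines.append([g.y, [g]])
    return [
        "".join(g.text for g in sorted(line, key=lambda g: g.x))
        for _, line in lines
    ]


# Glyphs listed in an arbitrary "content stream" order, unrelated to visual flow.
stream = [
    Glyph("b", 10, 20), Glyph("H", 0, 0), Glyph("a", 0, 20),
    Glyph("i", 10, 0), Glyph("c", 20, 20),
]
recovered = reading_order(stream)
print(recovered)
```

A single-column heuristic like this breaks as soon as a page has multiple columns, sidebars, or tables, which is where the guesswork comes in.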
LlamaParse vs. The LLMs: a free webinar where we parse the ugliest documents we can find across every leading model and show the results side by side.
Hosted by George, Head of Engineering, LlamaIndex
When: March 26th; 9 AM PST
Register: landing.llamaindex.ai/llamaparsev...
"Just send the PDF to GPT-4o"
Ok. We did. Here's what happened:
• Reading order? Wrong.
• Tables? Half missing.
• Hallucinated data? Everywhere.
• Bounding boxes? Nonexistent.
• Cost at 100K pages? Brutal.
So we're doing it live.
This integration with DBOS removes all the manual snapshot work from durable workflows. Just pass a DBOS runtime to your workflow and get great reliability.
Learn how to build durable agents on our new docs: developers.llamaindex.ai/python/llam...
• Idle release feature frees memory for long-running workflows waiting on human input
• Built-in crash recovery detects and relaunches incomplete workflows automatically
• Every step transition persists automatically - workflows resume exactly where they left off
• Zero external dependencies with SQLite, or scale to multi-replica deployments with Postgres
• Built for replication - each replica owns its workflows, with Postgres coordinating across instances
Creating agent workflows and architecting the logic is one thing; making them durable and fail-safe is another.
New integration for durable agent workflows with @dbos.dev execution: make sure your agents survive crashes, restarts, and errors without writing any checkpoint code.
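For a sense of what "persist every step transition and resume where you left off" means, here is a minimal sketch using only Python's standard `sqlite3` module. It is not the DBOS or LlamaIndex API: the `run_durable` function, the `checkpoints` table, and the step names are hypothetical, and this is exactly the kind of hand-written checkpoint code the integration is meant to replace.

```python
import sqlite3


def run_durable(conn, workflow_id, steps):
    """Run steps in order, persisting progress after each one so a rerun resumes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "workflow_id TEXT PRIMARY KEY, next_step INTEGER NOT NULL)"
    )
    row = conn.execute(
        "SELECT next_step FROM checkpoints WHERE workflow_id = ?", (workflow_id,)
    ).fetchone()
    start = row[0] if row else 0  # resume from the last recorded transition
    executed = []
    for index in range(start, len(steps)):
        steps[index]()
        executed.append(index)
        conn.execute(
            "INSERT OR REPLACE INTO checkpoints (workflow_id, next_step) "
            "VALUES (?, ?)",
            (workflow_id, index + 1),
        )
        conn.commit()
    return executed


conn = sqlite3.connect(":memory:")
log = []
crash_once = {"armed": True}


def parse():
    log.append("parse")


def extract():
    # Fail the first time to simulate a crash in the middle of the workflow.
    if crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash")
    log.append("extract")


def summarize():
    log.append("summarize")


steps = [parse, extract, summarize]
try:
    run_durable(conn, "wf-1", steps)
except RuntimeError:
    pass  # the run died after the first step had been checkpointed

resumed = run_durable(conn, "wf-1", steps)
print(log, resumed)
```

The second call skips `parse` because its completion was already recorded, so only the unfinished steps run again.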
Our DevRel @tuana.dev gave a 30-minute workshop to get participants started on document agents with LlamaParse. We saw some amazing projects being submitted with no lack of creativity and imagination. Congrats to the 3 winning teams, and see you next time!
04.03.2026 20:04
Huge thank you to everyone who joined the Google DeepMind hackathon in NYC with us over the weekend.
04.03.2026 20:04
Watch the full video here:
Or get started right away with the docs (UI + code examples): developers.llamaindex.ai/python/clou...
In this walkthrough, @cle-does-things.bsky.social demonstrates how to configure LlamaSplit to break down Environmental Impact Reports into clearly defined impact categories.
04.03.2026 16:58
With the intuitive UI, you can:
• Define a custom configuration for how your documents should be categorized
• Specify the exact sections or impact types you want extracted
• Run the job and explore the results through an interactive interface
If you need to split complex or composite documents into structured categories or sections, LlamaSplit is built for the job.
04.03.2026 16:58
Read about our evolution and what's next: www.llamaindex.ai/blog/llamai...
03.03.2026 20:04
Our mission is now to provide core infrastructure for automating knowledge work over documents, not just to be connective tissue between LLMs and data.
03.03.2026 20:04
• Real automation potential exists in workflows where humans manually process documents daily: financial analysis, contract review, and insurance underwriting can all become end-to-end agentic processes
03.03.2026 20:04
• LlamaParse now serves 300k+ users, handling 50+ formats for enterprises like Carlyle, CEMEX, and KPMG with multi-agent workflows combining OCR, computer vision, and LLM reasoning
03.03.2026 20:04
• Document understanding remains a massive opportunity: frontier vision models still struggle with complex tables, charts, and long documents at scale
03.03.2026 20:04
LlamaIndex has evolved far beyond a RAG framework: we're now focused on agentic document processing that automates knowledge work.
• Agent orchestration has fundamentally changed with sophisticated reasoning loops, tool discovery through Skills/MCP, and coding agents that write Python for you
When you parse a document with LlamaParse, you also get access to layout data for figures, charts, etc.
Parse the document, opt in to saving layout images, and access those images on the response. Each image is a cropped screenshot of that specific layout element.
Check out the full tutorial: developers.llamaindex.ai/python/clou...
27.02.2026 17:02
• Use the items view to get per-page structured data including tables and figures
We demonstrate this using a 2024 Executive Summary PDF, extracting a fiscal year chart showing Budget Deficit vs Net Operating Cost data spanning 2020-2024, and reproducing the key financial insights.
• Enable specialized chart parsing to convert visual charts into structured table data
• Extract table rows directly from parsed PDF pages and load them into DataFrames
• Perform year-over-year analysis, calculate gaps between metrics, and create visualizations
Turn your PDF charts into pandas DataFrames with specialized chart parsing in LlamaParse!
This tutorial walks you through extracting structured data from charts and graphs in PDFs, then running data analysis with pandas - no manual data entry required.
Read the full tutorial: www.llamaindex.ai/blog/creati...
26.02.2026 17:02
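As a rough sketch of the pandas side of that chart-to-DataFrame tutorial: once the chart has been parsed into table rows, the analysis is ordinary DataFrame work. The row layout and column names below are assumptions, and the numbers are placeholders for illustration only, not values from the report; the LlamaParse call that would produce the rows is deliberately left out.

```python
import pandas as pd

# Hypothetical rows standing in for a parsed chart table. The numbers are
# placeholders for illustration, not values from the report.
rows = [
    {"fiscal_year": 2020, "budget_deficit": 100.0, "net_operating_cost": 130.0},
    {"fiscal_year": 2021, "budget_deficit": 90.0, "net_operating_cost": 120.0},
    {"fiscal_year": 2022, "budget_deficit": 60.0, "net_operating_cost": 150.0},
]

df = pd.DataFrame(rows).set_index("fiscal_year")

# Gap between the two metrics for each year.
df["gap"] = df["net_operating_cost"] - df["budget_deficit"]

# Year-over-year change in the deficit, as a percentage (NaN for the first year).
df["deficit_yoy_pct"] = df["budget_deficit"].pct_change() * 100

print(df)
```

From here, `df.plot()` would cover the visualization step, assuming matplotlib is installed.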