LlamaIndex

@llamaindex.bsky.social

Build AI agents over your documents

1,113 Followers  |  80 Following  |  901 Posts  |  Joined: 02.10.2023

Posts by LlamaIndex (@llamaindex.bsky.social)

Parse Charts in PDFs and Analyze with Pandas

Check out the full tutorial: developers.llamaindex.ai/python/clou...

27.02.2026 17:02 - 👍 0    🔁 0    💬 0    📌 0

⚡ Use the items view to get per-page structured data including tables and figures

We demonstrate this using a 2024 Executive Summary PDF, extracting a fiscal year chart showing Budget Deficit vs Net Operating Cost data spanning 2020-2024, and reproducing the key financial insights.

27.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

📊 Enable specialized chart parsing to convert visual charts into structured table data
🐼 Extract table rows directly from parsed PDF pages and load them into DataFrames
📈 Perform year-over-year analysis, calculate gaps between metrics, and create visualizations

27.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0
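
As a rough illustration of the analysis steps listed above, here is a minimal pandas sketch. It assumes the chart has already been parsed into table rows. The column names and every number are placeholders invented for this example, not values from the PDF, and the rows are typed by hand rather than returned by LlamaParse.

```python
import pandas as pd

# Hypothetical rows standing in for a parsed chart table.
# All values are placeholders, not figures from the actual document.
rows = [
    {"fiscal_year": 2020, "budget_deficit": 10.0, "net_operating_cost": 12.0},
    {"fiscal_year": 2021, "budget_deficit": 8.0, "net_operating_cost": 9.0},
    {"fiscal_year": 2022, "budget_deficit": 4.0, "net_operating_cost": 8.0},
]

df = pd.DataFrame(rows).set_index("fiscal_year").sort_index()

# Gap between the two metrics for each year.
df["gap"] = df["net_operating_cost"] - df["budget_deficit"]

# Year-over-year change; the first year has no prior year, so it is NaN.
df["deficit_yoy"] = df["budget_deficit"].diff()
df["deficit_yoy_pct"] = df["budget_deficit"].pct_change() * 100

print(df)
```

From here, `df.plot()` would be one way to produce the visualizations the post mentions, assuming matplotlib is installed.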

Turn your PDF charts into pandas DataFrames with specialized chart parsing in LlamaParse!

This tutorial walks you through extracting structured data from charts and graphs in PDFs, then running data analysis with pandas - no manual data entry required.

27.02.2026 17:02 - 👍 1    🔁 0    💬 1    📌 0
Creating a Deal Sourcing Agent with LlamaAgents Builder
LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data.

Read the full tutorial: www.llamaindex.ai/blog/creati...

26.02.2026 17:02 - 👍 1    🔁 0    💬 0    📌 0

The tutorial covers prompt engineering best practices, using example files effectively, visualizing agent workflows, and deploying to production. We demonstrate the complete process from initial prompt to testing the deployed application with real deal documents.

26.02.2026 17:02 - 👍 0    🔁 0    💬 2    📌 0

🔧 Iterate and refine your agent through natural language conversations

26.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

🎯 Classify deals into buyout, growth, or minority investment strategies
📊 Extract critical metrics including revenue, EBITDA, growth rates, and debt levels
🚀 Deploy directly to GitHub and get a working UI without writing code

26.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

Build a private equity deal sourcing agent that automatically classifies investment opportunities and extracts key financial metrics using our LlamaAgents Builder.

This step-by-step guide shows you how to create an agent that processes deal files like teasers and financial summaries:

26.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0
OmniDocBench is Saturated, What's Next for OCR Benchmarks?

Read our full analysis: www.llamaindex.ai/blog/omnido...

24.02.2026 17:03 - 👍 1    🔁 0    💬 0    📌 0

We're building parsing models focused on semantic correctness for complex visual documents. If you're scaling OCR workloads in production, LlamaParse handles the edge cases that benchmarks miss.

24.02.2026 17:03 - 👍 0    🔁 0    💬 1    📌 0

The document parsing challenge isn't solved just because benchmark scores look impressive. We need evaluation methods that reward semantic understanding over exact formatting, especially as AI agents become the primary consumers of parsed content.

24.02.2026 17:03 - 👍 1    🔁 0    💬 1    📌 0

⚡ AI agents need semantic correctness, not perfect formatting matches - current benchmarks miss this critical distinction
🔬 The benchmark's 1,355 pages can't capture the full complexity of production document processing needs

24.02.2026 17:03 - 👍 0    🔁 0    💬 1    📌 0

📊 Models are saturating OmniDocBench scores but still struggle with complex financial reports, legal filings, and domain-specific documents
🎯 Rigid exact-match evaluation penalizes semantically correct outputs that differ in formatting (HTML vs markdown, spacing, etc.)

24.02.2026 17:03 - 👍 0    🔁 0    💬 1    📌 0
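
To make the exact-match point concrete, here is a small sketch of how the same table can fail a string comparison yet match once both outputs are reduced to their cell contents. This is not OmniDocBench's scoring code. The helper names (`cells_from_html`, `cells_from_markdown`) and the table contents are hypothetical, and the parsers only handle simple, well-formed tables.

```python
from html.parser import HTMLParser


class _TableCells(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def _clean(cell):
    # Collapse runs of whitespace so spacing differences do not matter.
    return " ".join(cell.split())


def cells_from_html(html):
    parser = _TableCells()
    parser.feed(html)
    return [[_clean(c) for c in row] for row in parser.rows]


def cells_from_markdown(md):
    rows = []
    for line in md.strip().splitlines():
        cells = [_clean(c) for c in line.strip().strip("|").split("|")]
        # Skip the header separator row, e.g. |------|:-----:|
        if all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows


# Placeholder table contents, invented for this example.
html_out = (
    "<table><tr><th>Year</th><th>Revenue</th></tr>"
    "<tr><td>2023</td><td> 1,200 </td></tr></table>"
)
md_out = "| Year | Revenue |\n|------|---------|\n| 2023 | 1,200 |"

exact_match = html_out == md_out
semantic_match = cells_from_html(html_out) == cells_from_markdown(md_out)
print(exact_match, semantic_match)
```

Comparing normalized cells is only one possible notion of semantic equivalence. A real evaluation would also need to handle merged cells, nested markup, and numeric formatting.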

Our latest analysis reveals why OmniDocBench, the go-to standard for document parsing evaluation, is becoming inadequate as models like GLM-OCR @Zai_org achieve 94.6% accuracy while still failing on complex real-world documents.

24.02.2026 17:03 - 👍 1    🔁 0    💬 1    📌 0

Document OCR benchmarks are hitting a ceiling - and that's a problem for real-world AI applications.

24.02.2026 17:03 - 👍 1    🔁 0    💬 1    📌 0
Build Agents From Your Files (with LlamaAgents Builder)
In this quick demo, Clelia walks you through the new file upload feature in LlamaAgents Builder. Upload example files to give the agent a concrete starting p...

🎥 Watch the full walkthrough: youtu.be/5Nk6KZhBDbQ
🦙 Get started with LlamaCloud: cloud.llamaindex.ai/signup

23.02.2026 16:59 - 👍 1    🔁 0    💬 0    📌 0

You can provide example documents as context, and the agent will use them as a starting point to design and tailor your workflow. The result? Applications that better match your real-world use case.

The more representative your sample files, the more accurate your final app will be.

23.02.2026 16:58 - 👍 1    🔁 0    💬 1    📌 0

🚀 LlamaAgents Builder just leveled up: File uploads are here!

Our natural language interface for building agentic document workflows now supports file uploads.

23.02.2026 16:58 - 👍 3    🔁 0    💬 2    📌 0

👩‍💻 GitHub repo: github.com/run-llama/r...
🦙 Get started with LlamaCloud: cloud.llamaindex.ai/signup

20.02.2026 17:03 - 👍 1    🔁 0    💬 0    📌 0

Using our Agent Workflows, the app:
📸 Parses receipt images with LlamaParse (Agentic tier)
🗂 Stores everything locally in an SQLite database
📊 Aggregates your spending monthly
🧠 Uses Gemini 3.1 Pro to analyze trends and generate actionable tips to improve your finances
Check out the demo! 👆

20.02.2026 17:03 - 👍 0    🔁 0    💬 1    📌 0
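
The storage and monthly aggregation steps could look roughly like the sketch below, which uses only Python's standard `sqlite3` module. The `receipts` table, its columns, and the sample rows are assumptions made for illustration and are not taken from the linked repo. An in-memory database stands in for the local file so the example is self-contained.

```python
import sqlite3

# In-memory database for the sketch; a local app would pass a file path instead.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per parsed receipt.
conn.execute(
    """
    CREATE TABLE receipts (
        id INTEGER PRIMARY KEY,
        purchased_on TEXT,   -- ISO date, e.g. 2026-01-05
        merchant TEXT,
        total REAL
    )
    """
)

# Placeholder receipts, invented for this example.
sample = [
    ("2026-01-05", "Grocer", 42.50),
    ("2026-01-19", "Cafe", 7.50),
    ("2026-02-02", "Grocer", 30.00),
]
conn.executemany(
    "INSERT INTO receipts (purchased_on, merchant, total) VALUES (?, ?, ?)",
    sample,
)

# Aggregate spending per calendar month.
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m', purchased_on) AS month, SUM(total) AS spent
    FROM receipts
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, spent in monthly:
    print(month, spent)
```

The monthly totals are the kind of compact summary that could then be passed to a model for trend analysis, rather than sending every raw receipt.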

🚀 Big drop from Google DeepMind: Gemini 3.1 Pro is here, and we built a hands-on demo powered by LlamaCloud to put it to work and turn your receipt photos into real financial insights!

20.02.2026 17:03 - 👍 2    🔁 1    💬 1    📌 0
The Cost of Overthinking: Why Reasoning Models Fail at Document Parsing
Can more reasoning improve document parsing? We tested GPT-5.2 at four thinking levels on complex documents - tables, formulas, multi-column layouts. The result: reasoning doesn't help. Cost and latency jumped 5–8× while accuracy flatlined at ~0.79. Higher thinking actually caused hallucinations and structural errors. A pipeline-based approach like LlamaParse Agentic outperformed all reasoning levels in quality, speed, and cost.

Read the full analysis: www.llamaindex.ai/blog/the-co...

19.02.2026 17:02 - 👍 0    🔁 0    💬 0    📌 0

Our solution uses a pipeline approach - specialized OCR extracts text at native resolution, then LLMs structure what's already been accurately read. Each component plays to its strengths instead of forcing one model to handle everything.

19.02.2026 17:02 - 👍 1    🔁 0    💬 1    📌 0

You can't reason past what you can't see. Vision encoders lose pixel-level information before reasoning even starts, and no amount of thinking tokens can recover that lost detail.

19.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

⚡ Processing time increased 5x with xHigh reasoning (241s vs 47s) while accuracy stayed flat at ~0.79
💰 Our LlamaParse Agentic outperformed all reasoning levels at 18x lower cost and 13x faster speed

19.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

🧠 Reasoning models hallucinate content that isn't there, filling in "missing" table cells with inferred values
📊 They split single tables into multiple sections by overthinking structural boundaries

19.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0

More reasoning doesn't always mean better results - especially for document parsing.

We tested GPT-5.2 at four reasoning levels on complex documents and found that higher reasoning actually hurt performance while dramatically increasing costs and latency.

19.02.2026 17:02 - 👍 0    🔁 0    💬 1    📌 0
Vibe-Code a Document Agent with LlamaAgents
In this video, Senior DevRel Engineer Tuana Celik walks through how to use LlamaAgents Builder to create an end-to-end agent workflow. The example here is an ...

No dragging boxes around. No writing workflow code (unless you want to). Just describe the problem and let it figure out the architecture.

You own the code, it pushes to your GitHub. Clone it, open in Cursor, customize whatever you need.
www.youtube.com/watch?v=0Zh...

18.02.2026 17:37 - 👍 1    🔁 0    💬 0    📌 0

🛠️ From that, the agent builder reasons about which LlamaCloud tools to use, lands on LlamaSplit + LlamaExtract, configures both, iterates on the workflow structure, and gives you a deployable agent with an API and UI.

18.02.2026 17:37 - 👍 0    🔁 0    💬 1    📌 0