LlamaIndex

@llamaindex.bsky.social

Build AI agents over your documents

1,114 Followers  |  80 Following  |  913 Posts  |  Joined: 02.10.2023

Posts by LlamaIndex (@llamaindex.bsky.social)

Getting Started: Introduction to the Split API, a tool for automatically segmenting concatenated PDFs into logical document sections based on content categories.

🎥 Watch the full video here:
📘 Or get started right away with the docs (UI + code examples): developers.llamaindex.ai/python/clou...

04.03.2026 16:58 | 👍 2 | 🔁 1 | 💬 0 | 📌 0

In this walkthrough, @cle-does-things.bsky.social demonstrates how to configure LlamaSplit to break down Environmental Impact Reports into clearly defined impact categories 🌳

04.03.2026 16:58 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

With the intuitive UI, you can:
• Define a custom configuration for how your documents should be categorized
• Specify the exact sections or impact types you want extracted
• Run the job and explore the results through an interactive interface 🔍

04.03.2026 16:58 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

If you need to split complex or composite documents into structured categories or sections, LlamaSplit is built for the job ✂️

04.03.2026 16:58 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
LlamaIndex is more than a RAG Framework. It is Agentic Document Processing.
LlamaIndex started as a RAG framework. It's evolved into something more focused: best-in-class document infrastructure for agentic work automation. As agent reasoning and coding tools advanced, framework abstractions became less critical. What hasn't changed is the need for accurate document understanding: the vast majority of enterprise knowledge lives in PDFs and spreadsheets, and extracting it reliably remains an unsolved, high-value problem.

Read about our evolution and what's next: www.llamaindex.ai/blog/llamai...

03.03.2026 20:04 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Our mission is now providing core infrastructure to automate knowledge work over documents, not just being connective tissue between LLMs and data.

03.03.2026 20:04 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

⚙️ Real automation potential exists in workflows where humans manually process documents every day: financial analysis, contract review, and insurance underwriting can all become end-to-end agentic processes

03.03.2026 20:04 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

🏢 LlamaParse now serves 300k+ users and handles 50+ formats for enterprises like Carlyle, CEMEX, and KPMG, with multi-agent workflows combining OCR, computer vision, and LLM reasoning

03.03.2026 20:04 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

📄 Document understanding remains a massive opportunity - frontier vision models still struggle with complex tables, charts, and long documents at scale

03.03.2026 20:04 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

LlamaIndex has evolved far beyond a RAG framework - we're now focused on agentic document processing that automates knowledge work.

🚀 Agent orchestration has fundamentally changed with sophisticated reasoning loops, tool discovery through Skills/MCP, and coding agents that write Python for you

03.03.2026 20:04 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
llamaparse_images.ipynb Colab notebook

colab.research.google.com/drive/1EqsH...

02.03.2026 18:14 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

When you parse a document with LlamaParse, you also get access to layout data for figures, charts, etc.

Parse the document, enable saving of layout images, and access those images on the response! Each image is a cropped screenshot of that specific layout element.

02.03.2026 18:14 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
Parse Charts in PDFs and Analyze with Pandas

Check out the full tutorial: developers.llamaindex.ai/python/clou...

27.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

⚡ Use the items view to get per-page structured data including tables and figures

We demonstrate this using a 2024 Executive Summary PDF, extracting a fiscal year chart showing Budget Deficit vs Net Operating Cost data spanning 2020-2024, and reproducing the key financial insights.

27.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

📊 Enable specialized chart parsing to convert visual charts into structured table data
🐼 Extract table rows directly from parsed PDF pages and load them into DataFrames
📈 Perform year-over-year analysis, calculate gaps between metrics, and create visualizations

27.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
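
The pandas steps listed above can be sketched roughly as follows. This is only an illustration: the column names and every number below are invented placeholders, not values from the Executive Summary PDF, and the LlamaParse call that would produce the rows is left out because its exact API is not shown in these posts.

```python
import pandas as pd

# Placeholder rows standing in for a table extracted from a parsed chart.
# All values are made up for illustration; they are NOT from the PDF.
rows = [
    {"fiscal_year": 2020, "budget_deficit": 100.0, "net_operating_cost": 120.0},
    {"fiscal_year": 2021, "budget_deficit": 90.0, "net_operating_cost": 99.0},
    {"fiscal_year": 2022, "budget_deficit": 45.0, "net_operating_cost": 135.0},
]

# Load the rows into a DataFrame indexed by fiscal year.
df = pd.DataFrame(rows).set_index("fiscal_year").sort_index()

# Gap between the two metrics for each year.
df["gap"] = df["net_operating_cost"] - df["budget_deficit"]

# Year-over-year percentage change per metric (the first year has no
# predecessor, so pandas leaves it as NaN).
df["deficit_yoy_pct"] = df["budget_deficit"].pct_change() * 100
df["cost_yoy_pct"] = df["net_operating_cost"].pct_change() * 100

print(df.round(1))
```

Swapping the placeholder `rows` for the table rows extracted from a parsed page should leave the rest of the analysis unchanged, provided the columns are numeric.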

Turn your PDF charts into pandas DataFrames with specialized chart parsing in LlamaParse!

This tutorial walks you through extracting structured data from charts and graphs in PDFs, then running data analysis with pandas - no manual data entry required.

27.02.2026 17:02 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
Creating a Deal Sourcing Agent with LlamaAgents Builder
LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data.

Read the full tutorial: www.llamaindex.ai/blog/creati...

26.02.2026 17:02 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

The tutorial covers prompt engineering best practices, using example files effectively, visualizing agent workflows, and deploying to production. We demonstrate the complete process from initial prompt to testing the deployed application with real deal documents.

26.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 2 | 📌 0

🔧 Iterate and refine your agent through natural language conversations

26.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

🎯 Classify deals into buyout, growth, or minority investment strategies
📊 Extract critical metrics including revenue, EBITDA, growth rates, and debt levels
🚀 Deploy directly to GitHub and get a working UI without writing code

26.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
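
One way to picture the output of the classification and extraction steps above is as a small typed record. The sketch below is purely hypothetical: `Strategy`, `DealSummary`, and `ebitda_margin` are names invented here, and nothing in these posts says LlamaAgents Builder produces this schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Strategy(Enum):
    """The three deal categories named in the post (hypothetical enum)."""
    BUYOUT = "buyout"
    GROWTH = "growth"
    MINORITY = "minority"


@dataclass
class DealSummary:
    """Hypothetical record for one deal file; units are left to the caller."""
    strategy: Strategy
    revenue: Optional[float] = None
    ebitda: Optional[float] = None
    growth_rate: Optional[float] = None
    debt: Optional[float] = None

    def ebitda_margin(self) -> Optional[float]:
        """EBITDA divided by revenue, or None when either value is unusable."""
        if self.revenue is None or self.ebitda is None or self.revenue == 0:
            return None
        return self.ebitda / self.revenue


# Example with invented numbers, not taken from any real deal document.
deal = DealSummary(strategy=Strategy.GROWTH, revenue=50.0, ebitda=10.0)
print(deal.strategy.value, deal.ebitda_margin())
```

Keeping every metric optional reflects that a teaser or financial summary may simply omit a figure, in which case the field stays `None` rather than being guessed.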

Build a private equity deal sourcing agent that automatically classifies investment opportunities and extracts key financial metrics using our LlamaAgents Builder.

This step-by-step guide shows you how to create an agent that processes deal files like teasers and financial summaries:

26.02.2026 17:02 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
OmniDocBench is Saturated, What’s Next for OCR Benchmarks?
LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data.

Read our full analysis: www.llamaindex.ai/blog/omnido...

24.02.2026 17:03 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

We're building parsing models focused on semantic correctness for complex visual documents. If you're scaling OCR workloads in production, LlamaParse handles the edge cases that benchmarks miss.

24.02.2026 17:03 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

The document parsing challenge isn't solved just because benchmark scores look impressive. We need evaluation methods that reward semantic understanding over exact formatting, especially as AI agents become the primary consumers of parsed content.

24.02.2026 17:03 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

⚡ AI agents need semantic correctness, not perfect formatting matches - current benchmarks miss this critical distinction
🔬 The benchmark's 1,355 pages can't capture the full complexity of production document processing needs

24.02.2026 17:03 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

📊 Models are saturating OmniDocBench scores but still struggle with complex financial reports, legal filings, and domain-specific documents
🎯 Rigid exact-match evaluation penalizes semantically correct outputs that differ in formatting (HTML vs markdown, spacing, etc.)

24.02.2026 17:03 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
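
To make the exact-match point above concrete, here is a toy comparison in which the same table cell content is written once as HTML and once as markdown. It is a deliberately crude sketch: `normalize`, `exact_match`, and `normalized_match` are hypothetical helpers written for this example, not OmniDocBench's metric and not a real semantic evaluator.

```python
import html
import re


def normalize(text: str) -> str:
    """Reduce parsed output to lowercase words, ignoring markup and spacing."""
    text = html.unescape(text)
    text = re.sub(r"<[^>]+>", " ", text)    # drop HTML tags
    text = re.sub(r"[|*_#`]+", " ", text)   # drop common markdown markers
    return " ".join(text.lower().split())   # collapse whitespace


def exact_match(pred: str, ref: str) -> bool:
    """Strict string equality, sensitive to any formatting difference."""
    return pred == ref


def normalized_match(pred: str, ref: str) -> bool:
    """Equality after stripping markup and whitespace differences."""
    return normalize(pred) == normalize(ref)


html_table = "<table><tr><td>Revenue</td><td>120</td></tr></table>"
md_table = "| Revenue | 120 |"

print(exact_match(html_table, md_table))       # False
print(normalized_match(html_table, md_table))  # True
```

The normalizer is lossy on purpose (it would also strip underscores inside identifiers and ignores markdown separator rows), so it only illustrates why formatting-sensitive scoring and content-sensitive scoring can disagree.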

Our latest analysis reveals why OmniDocBench, the go-to standard for document parsing evaluation, is becoming inadequate as models like GLM-OCR (@Zai_org) achieve 94.6% accuracy while still failing on complex real-world documents.

24.02.2026 17:03 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

Document OCR benchmarks are hitting a ceiling - and that's a problem for real-world AI applications.

24.02.2026 17:03 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
Build Agents From Your Files (with LlamaAgents Builder)
In this quick demo, Clelia walks you through the new file upload feature in LlamaAgents Builder. Upload example files to give the agent a concrete starting p...

🎥 Watch the full walkthrough: youtu.be/5Nk6KZhBDbQ
🦙 Get started with LlamaCloud: cloud.llamaindex.ai/signup

23.02.2026 16:59 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

You can provide example documents as context, and the agent will use them as a starting point to design and tailor your workflow. The result? Applications that better match your real-world use case.

The more representative your sample files, the more accurate your final app will be.

23.02.2026 16:58 | 👍 1 | 🔁 0 | 💬 1 | 📌 0