InspectorRAGet: An Introspection Platform for RAG Evaluation
Large Language Models (LLM) have become a popular approach for implementing Retrieval Augmented Generation (RAG) systems, and a significant amount of effort has been spent on building good models and ...
Working on RAG? Come check out our InspectorRAGet DEMO, presented by Siva Sankalp Patel on May 2 (Friday), 11-12:30 at Demo Session 8 in Hall 3! Looking forward to attending ACL in a few months! #NAACL2025 @naaclmeeting.bsky.social
paper: arxiv.org/abs/2404.17347
github: github.com/IBM/Inspecto...
01.05.2025 01:24
Excited about this collab! Come check out FeeL and help advance multilingual generation in your language! huggingface.co/spaces/feel-...
26.03.2025 13:59
Retrievers (Elser shown here) struggle with later turns and non-standalone questions:
08.01.2025 20:09
SOTA LLMs struggle with later turns and unanswerable questions:
08.01.2025 20:09
Sample Conversation:
08.01.2025 20:09
MTRAG is a challenging benchmark for SOTA LLMs and a great way to evaluate across multiple domains for Retrieval and Generation! MTRAG contains 110 conversations averaging 7.7 turns each across four domains for a total of 842 tasks. We also explore synthetic data and LLM-as-a-judge.
08.01.2025 20:09
GitHub - IBM/mt-rag-benchmark: Multi-Turn RAG Benchmark
New Benchmark!
Do you work on RAG? Are you interested in Multi-Turn conversations? Very excited to share the new MTRAG benchmark we have released!
Data: github.com/ibm/mt-rag-b...
Paper: arxiv.org/abs/2501.03468
08.01.2025 20:08
Anyone else feel like Google Scholar is missing citations lately? I have a recent paper that has 8 citations on Semantic Scholar and only 3 on Google Scholar... and I have two papers that are cited in one paper but only one has the citation.
27.11.2024 01:47
Please just message me on slack
25.11.2024 13:01
Please add me. Thanks!
24.11.2024 14:33
I did a starter pack of people in New York (City) working on ML/AI. Please distribute and feel free to self nominate!
go.bsky.app/BoEtagz
19.11.2024 01:38
GitHub - IBM/InspectorRAGet: The repository contains generative AI analytics platform application code.
If you work on RAG, check out InspectorRAGet - an awesome RAG tool for evaluation. Available on HuggingFace! We provide the interface, you provide the experiments and metrics. Want to know more? Just reach out!
github.com/IBM/Inspecto...
huggingface.co/spaces/kpfad...
arxiv.org/abs/2404.17347
22.11.2024 02:22
Starter pack for IBM Research! Follow awesome IBM researchers! IBMers, let me know and I will add you! go.bsky.app/2SXcRmA
19.11.2024 13:13
GitHub - primeqa/clapnq
Working on RAG? Check out our ClapNQ benchmark (accepted to TACL) to test the full RAG pipeline!
github.com/primeqa/clapnq
arxiv.org/abs/2404.02103
19.11.2024 02:49
Please add me!
19.11.2024 02:44
This is great! Please add me as well!
19.11.2024 02:42