Temporal RAG: Embracing Time for Smarter, Reliable Knowledge Graphs
Thanks to @nicolay.fyi for giving me the opportunity to talk about @trustgraph.bsky.social on How AI Is Built!
Labels such as "facts", "observations", and "assertions" take on new meanings when we begin to consider time. Click 👇 to watch the full episode! 🎙️
youtu.be/VpFVAE3L1nk?
17.02.2025 18:36
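To make the time angle concrete, here is a minimal sketch of a time-scoped assertion, my own illustration rather than TrustGraph's actual schema: every edge carries a validity window, and queries are answered "as of" a point in time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Assertion:
    """A graph edge that is only claimed to hold for a window of time."""
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None = still believed true

def as_of(assertions: list[Assertion], when: datetime) -> list[Assertion]:
    """Return only the assertions that were considered true at `when`."""
    return [
        a for a in assertions
        if a.valid_from <= when and (a.valid_to is None or when < a.valid_to)
    ]

graph = [
    Assertion("Alice", "works_at", "Acme", datetime(2020, 1, 1), datetime(2023, 6, 1)),
    Assertion("Alice", "works_at", "Initech", datetime(2023, 6, 1)),
]
print(as_of(graph, datetime(2022, 1, 1)))  # -> Alice works_at Acme
```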
Dropping some new episodes on @howaiisbuilt.fm. Links below.
31.01.2025 12:44
AI-Powered Search: Context Is King, But Your RAG System Ignores Two-Thirds of It | S2 E21
Trey and I talk about the different techniques for AI-powered search and how we can combine them to build modern search systems.
Spotify: open.spotify.com/episode/1udV...
Apple: podcasts.apple.com/us/podcast/a...
09.01.2025 13:58
AI-Powered Search: Context Is King, But Your RAG System Ignores Two-Thirds of It | S2 E21
You want the exact opposite.
You want layers of tools aligned in a graph that you can tune, debug, and update in isolation.
Today on How AI Is Built, we are talking to one of the OGs of search: Trey Grainger, the author of AI Powered Search.
www.youtube.com/watch?v=6IQq...
09.01.2025 13:58
New episode is out.
The three contexts of search, layered architectures and much more!
09.01.2025 13:59
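As a rough illustration of the "layers of tools aligned in a graph" idea (my own sketch, not Trey's code): each stage of the search pipeline is an isolated, swappable function, so you can tune, debug, and update query rewriting, retrieval, and reranking independently.

```python
from typing import Callable

# Each layer has a narrow contract, so it can be tested, tuned,
# and replaced in isolation.
QueryRewriter = Callable[[str], str]
Retriever = Callable[[str], list[str]]
Reranker = Callable[[str, list[str]], list[str]]

def search_pipeline(query: str, rewrite: QueryRewriter,
                    retrieve: Retriever, rerank: Reranker) -> list[str]:
    rewritten = rewrite(query)
    candidates = retrieve(rewritten)
    return rerank(rewritten, candidates)

# Trivial stand-in layers; in a real system these would be a query
# understanding model, a lexical/vector index, and a cross-encoder.
docs = ["bm25 ranking basics", "vector search tuning", "postgres search"]
results = search_pipeline(
    "Vector Search",
    rewrite=str.lower,
    retrieve=lambda q: [d for d in docs if any(w in d for w in q.split())],
    rerank=lambda q, ds: sorted(ds, key=lambda d: -sum(w in d for w in q.split())),
)
print(results)
```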
Chunking for RAG: Stop Breaking Your Documents Into Meaningless Pieces | S2 E20
Youtube: www.youtube.com/watch?v=trG5...
Spotify: open.spotify.com/episode/6eyT...
Apple: podcasts.apple.com/us/podcast/c...
03.01.2025 11:28
The biggest lie in RAG is that semantic search is simple.
The reality is that it's easy to build, it's easy to get up and running, but it's really hard to get right.
And if you don't have a good setup, it's near impossible to debug.
One of the reasons it's really hard is chunking.
03.01.2025 11:28
How AI Can Start Teaching Itself
Most companies can't afford huge teams labeling AI data.
So, use an AI model to train an AI model.
The big labs like Cohere and OpenAI already use "synthetic data" - AI-generated data that mimics real-world patterns.
The LLMs you use are already trained with it.
youtu.be/thqgKG5lZ8Q
19.12.2024 12:43
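A rough sketch of that loop, with a placeholder model name and prompt (not what the labs actually run): a strong teacher LLM labels raw text, and the resulting pairs train a cheaper student model.

```python
# Sketch: use a strong LLM to produce labels, then train on them.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment;
# the model name and prompt are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

def synthesize_label(text: str) -> str:
    """Ask the teacher model for a sentiment label."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder teacher model
        messages=[{
            "role": "user",
            "content": f"Label the sentiment of this review as positive "
                       f"or negative. Reply with one word.\n\n{text}",
        }],
    )
    return resp.choices[0].message.content.strip().lower()

unlabeled = ["Great battery life.", "Broke after two days."]
training_pairs = [(t, synthesize_label(t)) for t in unlabeled]
# `training_pairs` can now feed a cheap classifier (e.g. logistic
# regression over embeddings) instead of paying humans per label.
```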
A Search System That Learns As You Use It (Agentic RAG)
Youtube: youtu.be/Z9Z820HadIA
Spotify: open.spotify.com/episode/3LAJ...
Apple: podcasts.apple.com/us/podcast/a...
14.12.2024 14:07
A Search System That Learns As You Use It (Agentic RAG)
Want to learn more? Today on @howaiisbuilt.fm, we are talking to Stephen Batifol from Zilliz. Stephen and I discuss agentic RAG and the future of search - where systems decide their own path to find answers.
What's your take on agentic RAG?
youtu.be/Z9Z820HadIA
14.12.2024 14:07
"Instead of being a one-way pipeline, agentic RAG allows you to check, 'Am I actually answering the user's question?'"
Different questions need different approaches.
➡️ Query-based Flexibility:
- Structured data? Use SQL
- Context-rich query? Use vector search
- Date-specific? Apply filters first
14.12.2024 14:07
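A minimal sketch of that routing step, with illustrative keyword rules and tool names; a production agent would typically use an LLM or a trained classifier to pick the tool:

```python
import re

def route(query: str) -> str:
    """Toy router: pick a retrieval strategy per query type.
    A production agent would use an LLM or classifier here."""
    if re.search(r"\b(sum|count|average|total)\b", query, re.I):
        return "sql"              # structured/aggregate question
    if re.search(r"\b(19|20)\d{2}\b", query):
        return "filtered_search"  # date-specific: apply filters first
    return "vector_search"        # default: context-rich semantic query

# Illustrative handlers; each branch would call a real backend.
handlers = {
    "sql": lambda q: f"run SQL for: {q}",
    "filtered_search": lambda q: f"filter by date, then search: {q}",
    "vector_search": lambda q: f"embed and search: {q}",
}

for q in ["total revenue per region", "what changed in 2023?", "why do users churn?"]:
    print(route(q), "->", handlers[route(q)](q))
```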
Rethinking Search Inside Postgres, From Lexemes to BM25
We talk about how they enable BM25 on PostgreSQL, how they integrate into the Postgres query engine, and how you can build search in your database.
open.spotify.com/episode/4CXX...
05.12.2024 13:38
Rethinking Search Inside Postgres, From Lexemes to BM25
Not anymore.
ParadeDB is building an open-source PostgreSQL extension to enable search within your database.
Today on How AI Is Built, I am talking to @philippemnoel.bsky.social, the founder and CEO of @paradedb.bsky.social.
youtu.be/RPjGuOcrTsQ
05.12.2024 13:38
Many companies run Elasticsearch or OpenSearch and use only 10% of its capacity.
On top, they have to build ETL pipelines.
Get data normalized.
Worry about race conditions.
All in all, when you want to do search on top of your existing database, you are forced to build distributed systems.
#ai
05.12.2024 13:38
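For a flavor of search inside the database: the query below is plain PostgreSQL full-text search over lexemes with `ts_rank`; the ParadeDB BM25 variant in the trailing comment is hedged, since pg_search's syntax has changed between versions. Table and connection details are placeholders.

```python
# Sketch: lexeme-based search in vanilla Postgres, no ETL pipeline or
# separate search cluster. Assumes psycopg2 and a docs(id, body) table;
# the connection string is a placeholder.
import psycopg2

conn = psycopg2.connect("dbname=app user=app")
with conn.cursor() as cur:
    cur.execute("""
        SELECT id, ts_rank(to_tsvector('english', body), q) AS score
        FROM docs, plainto_tsquery('english', %s) AS q
        WHERE to_tsvector('english', body) @@ q
        ORDER BY score DESC
        LIMIT 10;
    """, ("postgres bm25 search",))
    print(cur.fetchall())

# With ParadeDB's pg_search extension the same query can be BM25-ranked;
# roughly (check their docs, the syntax has moved between versions):
#   SELECT id FROM docs WHERE body @@@ 'postgres bm25 search' LIMIT 10;
```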
Documentation quality is the silent killer of RAG systems. A single ambiguous sentence might corrupt an entire set of responses. But the hardest part isn't fixing errors - it's finding them.
Check out the episode with Max.
Links to Spotify, Apple in the thread.
21.11.2024 12:07
LLMs hallucinate.
We want to put the blame on them.
But often it's our fault.
Many knowledge bases have:
⚠️ Temporal Inconsistencies
- Multiple versions from different time periods
- Historical information without timeline context
>>
21.11.2024 12:05
With RAG these issues are amplified.
We do not look at full documents anymore, but at bits and pieces.
So we have to be extra careful.
Today on @howaiisbuilt.fm we talk to Max Buckley.
Max works at Google and has built a lot of interesting stuff with LLMs to improve knowledge bases for RAG.
>>
21.11.2024 12:05
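One cheap mitigation for the multiple-versions problem, sketched under the assumption that chunks carry document-version metadata (the field names are illustrative): drop every chunk that is not from the latest version of its source before it reaches the LLM.

```python
from datetime import date

# Sketch: keep only chunks from each document's newest version.
# Assumes chunks carry `doc_id` and `version_date` metadata; a real
# pipeline would attach these at ingestion time.
chunks = [
    {"doc_id": "pricing", "version_date": date(2023, 1, 5), "text": "Plan costs $10"},
    {"doc_id": "pricing", "version_date": date(2024, 8, 1), "text": "Plan costs $15"},
    {"doc_id": "faq",     "version_date": date(2024, 2, 1), "text": "Refunds in 30 days"},
]

latest = {}
for c in chunks:
    best = latest.get(c["doc_id"])
    if best is None or c["version_date"] > best["version_date"]:
        latest[c["doc_id"]] = c

fresh_chunks = list(latest.values())
print(fresh_chunks)  # the stale $10 pricing chunk is gone
```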
Some query types might not work at all.
It is very costly in terms of storage and compute. We have to keep our indexes in memory to achieve a low enough latency for search.
What we are talking about today works for everything, works out of domain, and is one of the most efficient approaches.
>>
15.11.2024 12:20
People implementing RAG jump straight into vector search.
But vector search has a lot of downsides.
Vector search is not robust out of domain.
Different types of queries need different embedding models with different vector indices.
>>
15.11.2024 12:20
You probably guessed it, we are talking about the OG ranking function in search: BM25.
Today we are back, continuing our series on search on @howaiisbuilt.fm with @taidesu.bsky.social.
We talk about BM25, how it works, what makes it great, and how you can tailor it to your use case.
15.11.2024 12:20
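For reference, a compact sketch of standard Okapi BM25 with the usual k1/b defaults; the whitespace tokenizer is deliberately naive:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc in `docs` (lists of tokens) against `query` tokens."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                    # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [s.split() for s in ["bm25 ranks by term frequency",
                            "vector search uses embeddings",
                            "bm25 bm25 everywhere"]]
print(bm25_scores("bm25 ranking".split(), docs))
```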
"Sadly, it's a bit off a snake oil. These long context embedding models have tested basically all of them, not really working well. So it's [best length of chunks] something between like 500 and 1,000 tokens."
Text embeddings are far from perfect.
They struggle with long documents.
>>
09.11.2024 16:52
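A minimal sketch of chunking into that 500-1,000 token sweet spot, using tiktoken for counting; the 800-token window and 100-token overlap are just one reasonable choice inside the quoted range:

```python
# Sketch: fixed-size token windows with overlap, landing inside the
# 500-1,000 token range quoted above. Requires the `tiktoken` package.
import tiktoken

def chunk_by_tokens(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = size - overlap
    return [enc.decode(tokens[i:i + size]) for i in range(0, len(tokens), step)]

# In practice you would split on semantic boundaries (sections,
# paragraphs) first and fall back to fixed windows like this.
print(len(chunk_by_tokens("word " * 2000)))
```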
Vector Databases come with their own set of challenges.
The data is too large to be stored on a single node.
We often need to handle 10k to 50k QPS.
Indexes are very slow to build, but we still want to search the fresh data.
>>
07.11.2024 13:37
Catch the episode on:
- Youtube: youtu.be/3PEARAf7HEc (now in 4K :D)
- Spotify: open.spotify.com/episode/5lCl...
- Apple: podcasts.apple.com/us/podcast/v...
07.11.2024 13:39
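A sketch of the fresh-data pattern hinted at above (pure NumPy, my own illustration, not any particular vector database's API): brute-force scan a small buffer of recent, unindexed vectors, query the built index for the rest, and merge the two top-k lists.

```python
import numpy as np

def top_k(query, vectors, k, tag):
    """Exact cosine top-k; stands in for both the ANN index and the scan."""
    sims = vectors @ query / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(query) + 1e-9)
    order = np.argsort(-sims)[:k]
    return [(float(sims[i]), tag, int(i)) for i in order]

rng = np.random.default_rng(0)
indexed = rng.normal(size=(10_000, 64))  # vectors already in the built index
fresh = rng.normal(size=(200, 64))       # recent vectors, not yet indexed
query = rng.normal(size=64)

# Query both segments and merge by score. In a real system the first
# call hits the ANN index; only the small fresh buffer is scanned.
merged = sorted(top_k(query, indexed, 10, "indexed")
                + top_k(query, fresh, 10, "fresh"),
                reverse=True)[:10]
print(merged)
```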
[Thumbnail: Search Systems at Scale: Avoiding Local Maxima and Other Engineering Lessons]
"There is no free lunch."
Every performance optimization comes with tradeoffs in either functionality, flexibility, or cost.
When building search systems, there's a seductive idea that we can optimize everything: fast results, high relevancy, and low costs.
But thatโs not the reality.
05.11.2024 17:22