Glasgow IR Group

@irglasgow.bsky.social

Glasgow Information Retrieval Group at the University of Glasgow

559 Followers  |  61 Following  |  41 Posts  |  Joined: 17.11.2024  |  1.7896

Latest posts by irglasgow.bsky.social on Bluesky


Post image

🎄 PyTerrier Advent 25/25: To wrap up our advent series, we'd like to thank the contributors shown below, and the many others who support the PyTerrier ecosystem! #WorldChangersTogether

25.12.2025 07:56 | 👍 1    🔁 1    💬 0    📌 0
Post image

🎄 PyTerrier Advent 24/25: Removing low-quality docs can boost search quality and cut indexing costs. Our SIGIR'24 paper QT5 trains a T5 model to filter passages at indexing time. It is easy to integrate, and works with dense, PISA, or SPLADE indexes too.

24.12.2025 09:23 | 👍 4    🔁 3    💬 1    📌 0
Post image Post image

🎄 PyTerrier Advent 23/25: You've done retrieval, but the results seem too homogeneous. Use a diversification reranker. Shown below is the implicit MMR diversification approach, instantiated on BM25 or dense retrieval, but even an explicit approach like xQuAD (cf. Rodrygo Santos) is easy to write.

23.12.2025 11:54 | 👍 3    🔁 3    💬 1    📌 0
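
The images attached to this post are not reproduced here. As a rough illustration of what an implicit MMR (maximal marginal relevance) reranker does, here is a minimal plain-Python sketch. It is not the PyTerrier implementation: the function names, the `(docno, score, vector)` candidate format, and the `lam` trade-off parameter are all assumptions made for the example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(candidates, lam=0.5, k=None):
    """Greedy MMR over (docno, relevance_score, vector) tuples (hypothetical format).

    At each step, pick the candidate maximising
        lam * relevance - (1 - lam) * max similarity to already-selected docs.
    Returns the docnos in their new order.
    """
    remaining = list(candidates)
    selected = []
    k = len(remaining) if k is None else min(k, len(remaining))
    while len(selected) < k:
        def mmr_score(cand):
            _, rel, vec = cand
            max_sim = max((cosine(vec, s[2]) for s in selected), default=0.0)
            return lam * rel - (1 - lam) * max_sim

        best = max(remaining, key=mmr_score)
        remaining.remove(best)
        selected.append(best)
    return [docno for docno, _, _ in selected]


# d1 and d2 are near-duplicates; d3 covers a different aspect.
results = [("d1", 1.0, [1.0, 0.0]), ("d2", 0.9, [1.0, 0.0]), ("d3", 0.8, [0.0, 1.0])]
print(mmr_rerank(results, lam=0.5))
```

With `lam=1.0` the ordering falls back to pure relevance; lower values push near-duplicates further down the ranking.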
Post image

🎄 PyTerrier Advent 22/25: Here is a more complex pipeline, the knowledge-graph-enhanced RAG from our EMNLP 2024 paper TRACE. We build a KG over retrieved docs, then use a transformer to reason over triples for better QA. This pipeline instantiation uses a cache (see Advent 20/25) on LLM-based KG extraction.

22.12.2025 12:25 | 👍 4    🔁 4    💬 1    📌 0
Post image

🎄 PyTerrier Advent 21/25: Bounded recall blues got you down? You can use Adaptive Retrieval techniques, like GAR, LADR, and LAFF, to efficiently surface missing relevant documents.

21.12.2025 11:25 | 👍 2    🔁 2    💬 1    📌 0
Post image

🎄 PyTerrier Advent 20/25: You can think of every PyTerrier transformer as a function, mapping from one dataframe type to another. This makes them easily cacheable, courtesy of pyterrier_caching. We have cache objects for retrievers, rerankers, or even indexing-time transformations (e.g. Doc2Query).

20.12.2025 10:54 | 👍 2    🔁 2    💬 1    📌 0
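
To make the "transformer as a function" idea concrete, here is a small plain-Python sketch of why such a component is easy to cache: if the output depends only on the input row, results can be memoised by key. This is not the pyterrier_caching API. The `CachedScorer` class, the in-memory dict, and the row fields (`query`, `docno`, `text`) are assumptions for illustration only.

```python
class CachedScorer:
    """Wraps a row -> score function and memoises results by (query, docno)."""

    def __init__(self, scorer):
        self.scorer = scorer   # any callable taking a row dict and returning a float
        self.cache = {}        # in-memory stand-in for a persistent cache
        self.misses = 0        # how many times the wrapped scorer actually ran

    def __call__(self, rows):
        scored = []
        for row in rows:
            key = (row["query"], row["docno"])
            if key not in self.cache:
                self.misses += 1
                self.cache[key] = self.scorer(row)
            scored.append({**row, "score": self.cache[key]})
        return scored


# A toy "expensive" scorer: here just the length of the document text.
cached = CachedScorer(lambda row: float(len(row["text"])))
rows = [{"query": "q1", "docno": "d1", "text": "dense retrieval"}]
cached(rows)   # computes and stores the score
cached(rows)   # served from the cache; the scorer is not called again
print(cached.misses)
```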
Post image

🎄 PyTerrier Advent 19/25: PyTerrier-RAG brings agentic RAG to your workflows with support for SOTA methods like Search-R1 and R1-Searcher, to combine retrievers and reasoning. You could even swap BM25 out for a dense or LSR retriever.

19.12.2025 10:24 | 👍 0    🔁 1    💬 1    📌 0
Post image Post image

🎄 PyTerrier Advent 18/25: In RAG, the reader runs the LLM, but your pipeline shouldn't depend on the LLM stack.

PyTerrier-RAG separates Reader from Backend, letting you swap vLLM ↔ HF with one line while keeping the same pipeline (and even share a Backend with other stages).

18.12.2025 14:23 | 👍 0    🔁 1    💬 1    📌 0
Post image

🎄 PyTerrier Advent 17/25: FlexIndex simplifies dense retrieval by supporting FAISS, Voyager, FlatNav & more. It auto-builds data structures, reuses vector stores to cut storage costs, and offers one familiar API for many retrievers.

17.12.2025 14:30 | 👍 0    🔁 1    💬 1    📌 0
Post image

🎄 PyTerrier Advent 16/25: Speaking of Learned Sparse Retrieval, PyTerrier has bindings to two backend search engines that provide blazing-fast retrieval over sparse vectors: PISA and BMP.

You can see that we really work to keep the look-and-feel uniform between implementations.

16.12.2025 09:32 | 👍 2    🔁 1    💬 1    📌 0
Post image Post image Post image

🎄 PyTerrier Advent 15/25: A very well-known learned sparse method is SPLADE. Our pyt_splade plugin makes it easy to use SPLADE by formulating Terrier indexing & retrieving pipelines that are composed with a SPLADE encoder, adding extra columns (e.g. query_toks).

Try it 👉 github.com/cmacdonald/p...

15.12.2025 10:51 | 👍 1    🔁 2    💬 1    📌 0
Post image

🎄 PyTerrier Advent 14/25: So we've seen sparse and dense retrieval in PyTerrier. Some folk recommend hybrid retrieval, e.g. reciprocal rank fusion (RRF) of sparse and dense results. We have an easy pipeline component that combines two sets of results by RRF (w/ a pretty schematic by Sean MacAvaney).

14.12.2025 18:31 | 👍 0    🔁 1    💬 1    📌 0
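
For readers unfamiliar with RRF, the fusion rule itself is short: each document's fused score is the sum over input rankings of 1 / (k + rank). The sketch below is plain Python, not the PyTerrier component mentioned in the post. The function name `rrf`, the list-of-docnos input format, and the default `k=60` (a commonly used constant) are assumptions for the example.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Reciprocal rank fusion over several ranked lists of docnos.

    rankings: iterable of lists, each ordered best-first.
    Returns (docno, fused_score) pairs sorted by descending fused score,
    with ties broken by docno for determinism.
    """
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, docno in enumerate(ranking, start=1):
            fused[docno] += 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


sparse_run = ["d1", "d2", "d3"]   # e.g. a BM25 ranking
dense_run = ["d3", "d1", "d4"]    # e.g. a dense retriever's ranking
for docno, score in rrf([sparse_run, dense_run]):
    print(docno, round(score, 5))
```

Documents that appear near the top of both lists (here `d1` and `d3`) rise above those found by only one retriever.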
Post image

🎄 PyTerrier Advent 10/25: Dense retrieval often improves with pseudo-relevance feedback (Rocchio-style).

In PyTerrier_DR it's easy: encode the query, retrieve docs, apply a transformer that mixes the doc vectors with the query vector, and then re-retrieve.
pyterrier.readthedocs.io/en/latest/ex...

10.12.2025 10:17 | 👍 0    🔁 1    💬 1    📌 0
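
The "mix doc vectors with the query vector" step can be sketched as a Rocchio-style update: the new query vector is a weighted sum of the original query vector and the centroid of the feedback document vectors. This is a plain-Python illustration, not the PyTerrier_DR transformer. The function name `rocchio_update` and the weights `alpha` and `beta` are assumptions for the example.

```python
def rocchio_update(query_vec, doc_vecs, alpha=1.0, beta=0.5):
    """Return alpha * query + beta * centroid(feedback docs).

    query_vec: list of floats (the encoded query).
    doc_vecs:  list of same-length vectors for the top-ranked (pseudo-relevant) docs.
    With no feedback docs, the query vector is returned unchanged.
    """
    if not doc_vecs:
        return list(query_vec)
    n_docs = len(doc_vecs)
    centroid = [sum(vec[i] for vec in doc_vecs) / n_docs for i in range(len(query_vec))]
    return [alpha * q + beta * c for q, c in zip(query_vec, centroid)]


query = [1.0, 0.0]
feedback_docs = [[0.0, 1.0], [0.0, 3.0]]   # vectors of the top-2 retrieved docs
print(rocchio_update(query, feedback_docs))  # this vector is then used to re-retrieve
```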
Post image

🎄 PyTerrier Advent 13/25: Doc2query expands docs with generated queries, but can hallucinate. Our ECIR'23 paper Doc2query-- (aka "minus minus") filters generated queries using a cross-encoder before indexing.
PyTerrier pipeline: generate→score→filter→index.
📄 https://arxiv.org/pdf/2301.03266

13.12.2025 11:18 | 👍 3    🔁 3    💬 1    📌 0
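
The generate→score→filter→index flow can be illustrated with a toy example. The sketch below is not the pyterrier_doc2query API and does not use a real cross-encoder. `overlap_score` is a deliberately simple stand-in for a relevance model, and `expand_document` and the `threshold` value are hypothetical names chosen for the example.

```python
def overlap_score(query, text):
    """Stand-in for a cross-encoder: fraction of query terms that occur in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def expand_document(text, generated_queries, score_fn=overlap_score, threshold=0.5):
    """Score each generated query against the document, drop low scorers,
    and append the survivors to the text that would be indexed."""
    kept = [q for q in generated_queries if score_fn(q, text) >= threshold]
    return " ".join([text] + kept), kept


doc = "pyterrier supports dense retrieval"
generated = ["what is dense retrieval", "best pizza in glasgow"]  # second one is off-topic
expanded, kept = expand_document(doc, generated)
print(kept)
print(expanded)
```

Only the on-topic query survives the filter, so the off-topic (hallucinated) text never reaches the index.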
Post image

🎄 PyTerrier Advent 12/25: Beyond dense retrieval, learned sparse methods like Doc2Query expand docs with predicted queries before indexing. Our pyterrier_doc2query plugin makes this easy for any corpus, which is perfectly intuitive as PyTerrier's pipelines can be applied at indexing time too!

12.12.2025 10:18 | 👍 0    🔁 1    💬 1    📌 0
Post image

🎄 PyTerrier Advent 11/25: Want to use external search services with PyTerrier? No problemo! It has integrations with APIs for Semantic Scholar, ChatNoir (thanks to Jan Heinrich Merker!), Pinecone, and others!

11.12.2025 09:53 | 👍 1    🔁 1    💬 1    📌 0

@irglasgow is following 20 prominent accounts