advertisement generation and detection in RAG
Excited to present at #CLEF2025 #Touché Lab (Session 2) shared task "Advertisement in RAG" 🇪🇸!
@webis.de
🗓️ Sept 9 (Tue)
⏲️ 5:20PM (CEST) / 11:20AM (EST)
📍 Florentino Sanz Room
https://arxiv.org/abs/2507.00509
Join us for insights on #RAG + advertising!
09.09.2025 00:02
ALT: An aerial view of Tokyo at night with lots of lights
Some exciting news! After 3 amazing years at TREC, the Tip-of-the-Tongue (ToT) shared task will be a core task at NTCIR-19 in 2026. The new track will focus on tip-of-the-tongue information needs in English and East Asian languages.
More details coming soon. See you all in Tokyo next year!
01.09.2025 16:12
Gentle reminder 📢
All run submissions for the Tip-of-the-Tongue (ToT) Track are due next week Wednesday (Aug 27).
More info: trec-tot.github.io/guidelines
#TREC2025 #TRECToT #TREC2025ToT
19.08.2025 16:45
This year's TREC Tip of the Tongue (ToT) track will be amazing! Based on our rigorous experiments on synthetic ToT query generation presented at #SIGIR2025, we extended the track to open domain ToT queries.
We provide code for baseline systems, and submissions are due by August 27th!
04.08.2025 17:52
Hello TREC-ToTers!
We have released the test queries for the TREC 2025 Tip-of-the-Tongue (TREC-ToT) Track. Please see the guidelines for more information: trec-tot.github.io/guidelines. The run submission deadline will tentatively be in August. #TREC2025 #TRECToT #TREC2025ToT
Please spread the word!
13.07.2025 16:47
❓ How do LLMs respond to fair ranking in RAG?
🤩 See how fair ranking boosts downstream utility while promoting fairer attribution of cited sources.
Catch our oral presentation at #ICTIR2025!
#SIGIR2025 @841io.bsky.social
12.07.2025 13:32
Dory from Finding Nemo with the quote: "I remember it like it was yesterday. Of course, I don't remember yesterday."
Do not forget to participate in the #TREC2025 Tip-of-the-Tongue (ToT) Track :)
The corpus and baselines (with run files) are now available and easily accessible via the ir_datasets API and the HuggingFace Datasets API.
More details are available at: trec-tot.github.io/guidelines
27.06.2025 14:46
An overview of the work “Research Borderlands: Analysing Writing Across Research Cultures” by Shaily Bhatt, Tal August, and Maria Antoniak. The overview states: “We survey and interview interdisciplinary researchers (§3) to develop a framework of writing norms that vary across research cultures (§4) and operationalise them using computational metrics (§5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (§6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (§7).”
Curious how writing differs across (research) cultures?
😩 Tired of “cultural” evals that don't consult people?
We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨ in scientific writing, and show that LLMs flatten them
arxiv.org/abs/2506.00784
[1/11]
09.06.2025 23:29
TREC 2025 Tip-of-the-Tongue (ToT) Track
Tip of the tongue: The phenomenon of failing to retrieve something from memory, combined with partial recall and the feeling that retrieval is imminent.
Hello TREC-ToTers! 👋🏽
Excited to announce the release of the TREC 2025 Tip-of-the-Tongue (TREC-ToT) Track guidelines: trec-tot.github.io/guidelines. We will release test queries in July, and the run submission deadline will be in August. #TREC2025 #TRECToT #TREC2025ToT
Please register to participate:
09.05.2025 21:02
Related paper here!
bsky.app/profile/841i...
29.04.2025 21:29
Ever trusted a metric that works great on average, only for it to fail in your specific use case?
In our #NAACL2025 paper (w/ @841io.bsky.social), we show why global evaluations are not enough and why context matters more than you think.
aclanthology.org/2025.finding...
#NLP #Evaluation
(🧵1/9)
29.04.2025 17:10
Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
Modern language models frequently include retrieval components to improve their outputs, giving rise to a growing number of retrieval-augmented generation (RAG) systems. Yet, most existing work in RAG...
If you're interested in OpenAI including shopping results, you might also be interested in @teknology.bsky.social's paper relating retrieval diversity/fairness and generation by downstream RAG models. This has implications for individuals selling products online.
arxiv.org/abs/2409.11598
28.04.2025 19:34
If you're working on a recall-oriented task or with ranking systems evaluated across varied users, content, or intents, check it out. 5/5
dl.acm.org/doi/10.1145/...
07.04.2025 16:15
A Venn diagram showing that recall and robustness, each of which has many different conceptions, intersect when thinking about recall as "totality" and robustness as "worst-case performance". It is in this intersection that lexicographic recall (lexirecall) lives.
📢 New Paper: "Recall, Robustness, and Lexicographic Evaluation" (ACM TORS)
F Diaz, M Ekstrand (@md.ekstrandom.net), B Mitra (@bmitra.bsky.social)
For IR, NLP, and ML researchers working on ranking systems evaluated for recall and robustness. 🧵 1/5 dl.acm.org/doi/10.1145/...
07.04.2025 16:15
Here's an overview of TREC 2024 TOT track runs with the test queries:
trec.nist.gov/pubs/trec33/...
07.03.2025 16:29
Yes! That's exactly the case of TOT retrieval for academics :)
05.03.2025 22:08
Overview
Tip of the tongue: The phenomenon of failing to retrieve something from memory, combined with partial recall and the feeling that retrieval is imminent.
These approaches powered the TREC 2024 TOT track test queries and will continue into the 2025 track (trec-tot.github.io).
Joyful collaboration with Yifan He, @841io.bsky.social, Jaime Arguello, and @bmitra.bsky.social!
#SIGIR #TREC #TOT
05.03.2025 01:37
Multi-Domain Coverage
Combining both methods allows TOT query evaluation in multiple domains. We tested simulated evaluation in the Movie, Landmark, and Person domains. Moreover, we built a broader, more inclusive TOT test collection.
05.03.2025 01:36
Human TOT query elicitation interface
Solution 2️⃣: Human-Elicitation
We designed an interface with visual prompts to induce a TOT state in human participants. Their queries closely match authentic TOT queries and capture genuine TOT experiences in a controlled setting.
05.03.2025 01:35
System rank correlation as a validation method for synthetic TOT queries.
Solution 1️⃣: LLM-Elicitation
We built a TOT user simulator to produce synthetic queries. Results show high system rank correlation and linguistic similarity compared to real queries. This scalable simulated evaluation method overcomes data scarcity by simulating new queries on demand.
05.03.2025 01:35
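For readers unfamiliar with system rank correlation, here is a minimal sketch of the idea: score the same set of systems on real queries and on synthetic queries, then check how well the two system orderings agree. The post does not say which coefficient was used; Kendall's tau is assumed here because it is a common choice for comparing rankings. The system names and scores are hypothetical, not results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts (assumed metric)."""
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both query sets order this pair the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores for four made-up systems.
real_scores = {"bm25": 0.10, "dense": 0.22, "reranker": 0.31, "llm": 0.35}
synthetic_scores = {"bm25": 0.18, "dense": 0.29, "reranker": 0.27, "llm": 0.44}

tau = kendall_tau(real_scores, synthetic_scores)
print(f"Kendall's tau between real and synthetic rankings: {tau:.3f}")
```

A tau close to 1 would mean the synthetic queries rank systems almost the same way as the real queries do, which is the sense in which rank correlation can validate a simulated evaluation.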
🤔 Why the Problem?
TOT query data collection relies heavily on community question answering websites (e.g., Reddit). This causes data availability issues and domain bias (most TOT queries end up being about movies or books).
05.03.2025 01:33
Tip-of-the-Tongue (TOT) search is a complex form of known-item search, shaped by the expression of partial recall, personal context, and uncertain memories. However, TOT research has long been hindered by the scarcity of high-quality TOT queries.
05.03.2025 01:33
Tip of the Tongue Query Elicitation for Simulated Evaluation
Tip-of-the-tongue (TOT) search occurs when a user struggles to recall a specific identifier, such as a document title. While common, existing search systems often fail to effectively support TOT scena...
🚨 New Breakthrough in Tip-of-the-Tongue (TOT) Retrieval Research!
We address data limitations and offer a fresh evaluation method for these complex queries.
Curious how TREC TOT track test queries are created? Check out this thread 🧵 and our paper: arxiv.org/abs/2502.17776
05.03.2025 01:32
Figure showing that interpretations of gestures vary dramatically across regions and cultures. “Crossing your fingers,” commonly used in the US to wish for good luck, can be deeply offensive to female audiences in parts of Vietnam. Similarly, the 'fig gesture,' a playful 'got your nose' game with children in the US, carries strong sexual connotations in Japan and can be highly offensive.
Did you know? Gestures used to express universal concepts, like wishing for luck, vary DRAMATICALLY across cultures?
🤞 means luck in the US but is deeply offensive in Vietnam
📣 We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal behavior!
arxiv.org/abs/2502.17710
26.02.2025 16:22
Are you planning to come to the AFME workshop? If you are, I would love to talk with you at the venue!
10.12.2024 18:53
Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
Many language models now enhance their responses with retrieval capabilities, leading to the widespread adoption of retrieval-augmented generation (RAG) systems. However, despite retrieval being a cor...
Heading to #NeurIPS2024 to present our “Fair RAG” paper at the #AFME2024 workshop! Let's talk about RAG, Information Retrieval, and Fairness. Honored that our paper was selected as one of the Top 5 Spotlight Papers! Let’s connect and chat!
Paper: arxiv.org/abs/2409.11598
09.12.2024 21:19
Slides are up! I presented on "Presentation & Consumption in the context of REML"
The full deck is here. There are a lot of gems if you're interested in this space!
retrieval-enhanced-ml.github.io/sigir-ap2024...
09.12.2024 07:14
Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP.
nsaphra.net
multi-model @ Β¬β | ex ai safety @LTI, CMU
SIGIR is the Association for Computing Machinery’s Special Interest Group on Information Retrieval. Since 1963, we have promoted research, development and education in the area of search and other information access technologies.
Visit: https://sigir.org/
PhD student at the CIR Group, TH Köln, Germany
Postdoc at the University of Edinburgh | PhD, University of Amsterdam | Former Applied Scientist Intern at Amazon
Official account of the University of Tübingen.
Legal notice: https://uni-tuebingen.de/impressum/
Privacy policy: https://uni-tuebingen.de/impressum/bluesky-hinweise/
Machine learning and information retrieval researcher. | Assistant professor at Radboud University Nijmegen and visiting research scholar at Google DeepMind. | Previously at Google Research, Twitter and University of Amsterdam.
Research in NLP (mostly LM interpretability & explainability).
Assistant prof at UMD CS + CLIP.
Previously @ai2.bsky.social @uwnlp.bsky.social
Views my own.
sarahwie.github.io
Posting about research by, and events and news relevant for, the Amsterdam NLP community. Account maintained by @wzuidema@bsky.social
Researcher • 💻 Developer • 🇪🇺 European
PhD student for health-related information retrieval at @uni-jena.de × @webis.de
Prof at Saarland. NLP and machine learning.
Theory and interpretability of LLMs.
https://www.mhahn.info
#NLP Postdoc at Mila - Quebec AI Institute and McGill University | Former PhD @ University of Copenhagen (CopeNLU)
karstanczak.github.io
Master’s student @ltiatcmu.bsky.social. he/him
a mediocre combination of a mediocre AI scientist, a mediocre physicist, a mediocre chemist, a mediocre manager and a mediocre professor.
see more at https://kyunghyuncho.me/
PhD @CMU LTI
https://eeelisa.github.io/
associate prof at UMD CS researching NLP & LLMs
Grad Student @ CMU | Currently fascinated by problems in ML4Code and AI Alignment | cs.cmu.edu/~anmola