Yu Lu Liu

@liuyulu.bsky.social

PhD student at Johns Hopkins University. Alumna of McGill University & MILA. Working on NLP Evaluation, Responsible AI, Human-AI interaction. she/her πŸ‡¨πŸ‡¦

1,039 Followers  |  892 Following  |  34 Posts  |  Joined: 12.11.2024

Latest posts by liuyulu.bsky.social on Bluesky

This was accepted to #NeurIPS πŸŽ‰πŸŽŠ

TL;DR Impoverished notions of rigor can have a formative impact on AI work. We argue for a broader conception of what rigorous work should entail & go beyond methodological issues to include epistemic, normative, conceptual, reporting & interpretative considerations

29.09.2025 23:13 β€” πŸ‘ 25    πŸ” 8    πŸ’¬ 1    πŸ“Œ 1
Dr. Su Lin Blodgett and Dr. Gagan Bansal will be the keynote speakers of the 2nd HEAL workshop @CHI25

We are excited to kick off the 2nd HEAL workshop tomorrow at #CHI2025. Dr. Su Lin Blodgett and Dr. Gagan Bansal from MSR will be our keynote speakers!

Welcome new and old friends! See you at G221!

All accepted papers: tinyurl.com/bdfpjcr4

25.04.2025 11:28 β€” πŸ‘ 6    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0

πŸ˜…

03.04.2025 13:29 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Bringing together our incredible current and admitted studentsβ€”future leaders, innovators, and changemakers!

07.03.2025 05:15 β€” πŸ‘ 7    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0

πŸ“£ DEADLINE EXTENSION πŸ“£

By popular request, HEAL workshop submission deadline is extended to Feb 24 AOE!

Reminder that we welcome a wide range of submissions: position papers, literature reviews, encores of published work, etc.

Looking forward to your submissions!

13.02.2025 22:05 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
With Retrieval Augmented Generation (RAG), Large Language Models (LLMs) are playing a pivotal role in information search and are being adopted globally. Although the multilingual capability of LLMs of...

Thrilled that our paper Faux Polyglot has been accepted to #NAACL2025 main! πŸš€
We show that multilingual RAG creates language-specific information cocoons and amplifies perspectives and facts in the dominant language, especially when handling knowledge conflicts.
πŸ“œ arxiv.org/abs/2407.05502

31.01.2025 15:19 β€” πŸ‘ 10    πŸ” 3    πŸ’¬ 1    πŸ“Œ 0

The submission deadline is in less than a month! We welcome encore submissions, so consider submitting your work regardless of whether it's been accepted or not #chi2025 πŸ˜‰

22.01.2025 15:32 β€” πŸ‘ 8    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

An awesome team of organizers: @wesleydeng.bsky.social, @mlam.bsky.social, @juhokim.bsky.social, @qveraliao.bsky.social, @cocoweixu.bsky.social, @ziangxiao.bsky.social, Motahhare Eslami, and Jekaterina Novikova!

16.12.2024 22:07 β€” πŸ‘ 7    πŸ” 0    πŸ’¬ 0    πŸ“Œ 1
The image includes a shortened call for participation that reads: 
"We welcome participants who work on topics related to supporting human-centered evaluation and auditing of language models. Topics of interest include, but are not limited to:
- Empirical understanding of stakeholders' needs and goals of LLM evaluation and auditing
- Human-centered evaluation and auditing methods for LLMs
- Tools, processes, and guidelines for LLM evaluation and auditing
- Discussion of regulatory measures and public policies for LLM auditing
- Ethics in LLM evaluation and auditing

Special Theme: Mind the Context. We invite authors to engage with specific contexts in LLM evaluation and auditing. This theme could involve various topics: the usage contexts of LLMs, the context of the evaluation/auditing itself, and more! The term 'context' is purposefully left open for interpretation!"

The image also includes pictures of workshop organizers, who are: Yu Lu Liu, Wesley Hanwen Deng, Michelle S. Lam, Motahhare Eslami, Juho Kim, Q. Vera Liao, Wei Xu, Jekaterina Novikova, and Ziang Xiao.


Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: β€œMind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.

16.12.2024 22:07 β€” πŸ‘ 44    πŸ” 10    πŸ’¬ 2    πŸ“Œ 4

Super excited to announce that @msftresearch.bsky.social's FATE group, Sociotechnical Alignment Center, and friends have several workshop papers at next week's @neuripsconf.bsky.social. A short thread about (some of) these papers below... #NeurIPS2024

02.12.2024 23:01 β€” πŸ‘ 58    πŸ” 13    πŸ’¬ 1    πŸ“Œ 0

πŸ“£ πŸ“£ Interested in an internship on human-centred AI, human agency, AI evaluation & the impacts of AI systems? Our team/FATE MLT (Su Lin Blodgett, @qveraliao.bsky.social & I) is looking for a few summer interns πŸŽ‰ Apply by Jan 10 for full consideration: jobs.careers.microsoft.com/global/en/jo...

05.12.2024 20:11 β€” πŸ‘ 22    πŸ” 10    πŸ’¬ 0    πŸ“Œ 2
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance
Progress in NLP is increasingly measured through benchmarks; hence, contextualizing progress requires understanding when and why practitioners may disagree about the validity of benchmarks. We develop...

Seeing cool works on metrology and measurement modeling for NLP!

So I wanted to port over the thread our ACL 2023 Findings paper (arxiv.org/abs/2305.09022) on conceptualizations of NLP tasks and measurements of performance! Work with Eric Yuan, @haldaume3.bsky.social, and Su Lin Blodgett. (1/n)

04.12.2024 18:37 β€” πŸ‘ 18    πŸ” 7    πŸ’¬ 1    πŸ“Œ 0

I am collecting examples of the most thoughtful writing about generative AI published in 2024. What’s yours? They can be insightful for commentary, smart critique, or just because it shifted the conversation. I’ll post some of mine below as I go through them. #criticalAI

02.12.2024 04:09 β€” πŸ‘ 353    πŸ” 118    πŸ’¬ 98    πŸ“Œ 19

Created a small starter pack including folks whose work I believe contributes to more rigorous and grounded AI research -- I'll grow this slowly and likely move it to a list at some point :) go.bsky.app/P86UbQw

30.11.2024 19:58 β€” πŸ‘ 12    πŸ” 5    πŸ’¬ 1    πŸ“Œ 0

Added!

28.11.2024 15:29 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Hi, so I've spent the past almost-decade studying research uses of public social media data, like e.g. ML researchers using content from Twitter, Reddit, and Mastodon.

Anyway, buckle up this is about to be a VERY long thread with lots of thoughts and links to papers. 🧡

27.11.2024 15:33 β€” πŸ‘ 966    πŸ” 453    πŸ’¬ 59    πŸ“Œ 125

Added!

27.11.2024 01:29 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Human-Centered Eval@EMNLP24

Had a lot of fun teaching a tutorial on Human-Centered Evaluation of Language Technologies at #EMNLP2024, w/ @ziangxiao.bsky.social, Su Lin Blodgett, and Jackie Cheung

We just posted the slides on our tutorial website: human-centered-eval.github.io

26.11.2024 20:55 β€” πŸ‘ 13    πŸ” 3    πŸ’¬ 0    πŸ“Œ 0

🚨 NeurIPS 2024 Spotlight
Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? 🀯 Enter BetterBench–our framework with 46 criteria to assess benchmark quality: betterbench.stanford.edu 1/x

25.11.2024 19:02 β€” πŸ‘ 139    πŸ” 25    πŸ’¬ 5    πŸ“Œ 7

It turns out we had even more papers at EMNLP!

Let's complete the list with three more🧡

24.11.2024 02:17 β€” πŸ‘ 14    πŸ” 4    πŸ’¬ 1    πŸ“Œ 1

Our lab members recently presented 3 papers at @emnlpmeeting.bsky.social in Miami β˜€οΈ πŸ“œ

From interpretability to bias/fairness and cultural understanding -> 🧡

23.11.2024 20:35 β€” πŸ‘ 19    πŸ” 6    πŸ’¬ 1    πŸ“Œ 2

Added!

23.11.2024 22:06 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Added!

23.11.2024 16:27 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Added!

23.11.2024 15:44 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Added!

23.11.2024 13:49 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
tiny owl

my first post, now that I am here with my 500+ closest friends πŸ™‚ -- here is a tiny owl πŸ¦‰ I met some weeks back in the big apple 🍎 (picture by @sbucur.bsky.social)

22.11.2024 23:19 β€” πŸ‘ 11    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Added!

23.11.2024 01:44 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

McGill NLP just landed on this blue planet

bsky.app/profile/mcgi...

22.11.2024 17:17 β€” πŸ‘ 8    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0

The starter pack just surpassed 1/3 of its capacity! Don't be shy to reach out to me if you are a researcher in this area, or if you have suggestions. Thank you πŸ₯°

23.11.2024 01:10 β€” πŸ‘ 6    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
