YouTube video by Women in AI Research WiAIR
Generalization in AI, with Dr. Dieuwke Hupkes
Hear Dr. Hupkes discuss her work on GenBench and how consistency, generalization, and reasoning shape our understanding of LLMs.
YouTube: www.youtube.com/watch?v=CuTW...
Apple Podcasts: podcasts.apple.com/ca/podcast/w...
Spotify: open.spotify.com/show/51RJNlZ...
#WiAIR #NLP #WomenInAI
18.07.2025 16:11
New Episode Out Now!
We're thrilled to announce that the latest episode of the @wiair.bsky.social is live!
This week, we sit down with Dr. Angelica Lim, Ph.D., to talk about "Robots with Empathy".
#AI #EthicalAI #SocialRobotics #HumanCenteredAI #WiAIR
14.05.2025 15:48
Read this if you're new to academic conferences or if you'd just like a bit of helpful advice on how to make friends at conferences (as opposed to formal "networking").
03.05.2025 01:28
It is critical for scientific integrity that we trust our measure of progress.
The @lmarena.bsky.social has become the go-to evaluation for AI progress.
Our release today demonstrates the difficulty in maintaining fair evaluations on the Arena, despite best intentions.
30.04.2025 14:55
SUPER thrilled that our #NAACL2025 paper got the runner-up BEST paper award!
We show that people rely 30% more on LLMs when they use emphatic expressions (e.g., "Sure, happy to help") even though the answer is wrong, and 10% more when the task involves math questions.
arxiv.org/pdf/2407.07950
30.04.2025 15:16
YouTube video by Women in AI Research WiAIR
Responsible AI for Health, with Aparna Balagopalan
Our new episode is LIVE!
In Episode 3, we talk with @aparnabee.bsky.social about:
- Unique challenges of applying AI in medical contexts
- Data quality and bias
- Importance of collaboration with clinicians
Watch and subscribe!
youtu.be/DEdJltlFg4I
#MLforHealth #WiAIR #WomenInAI
23.04.2025 15:40
The latest open artifacts (#9): RLHF book draft, where the open reasoning race is going, and unsung heroes of open LM work
Artifacts Log 9.
The latest happenings in open models
- Eagerly awaiting Qwen 3
- Llama 4 uptake is slow
- Reasoning models seem to be saturating
- Multimodal models are being slept on
- China is still dominating
- Oh yeah, and a reminder that my RLHF book online version0 is done!
Artifacts Log #9.
buff.ly/F6lapGF
21.04.2025 16:43
Glad to share that our publication was recognized as the Top Viewed Article.
Read it here alz-journals.onlinelibrary.wiley.com/doi/full/10....
16.04.2025 08:45
Proud to be a part of this multi-cultural multi-institutional collaborative project
10.04.2025 20:42
Reasoning models don't always say what they think
Research from Anthropic on the faithfulness of AI models' Chain-of-Thought
OpenAI: "Users have told us that understanding how the model reasons ... helps build trust in its answers."
Anthropic: "Do reasoning models accurately verbalize their reasoning? Our new paper shows they don't."
www.anthropic.com/research/rea...
04.04.2025 22:41
Don't miss this episode! It's going to be an interesting discussion about the social and ethical implications of biased AI and how researchers are working to create fair and inclusive systems.
28.03.2025 15:49
Logo of the Women in AI Research WiAIR podcast
Following up on my last post - it's time for the big reveal!
Thrilled to announce that @malikeh97.bsky.social and I are launching a podcast called Women in AI Research! We're excited to bring you inspiring stories from women in AI.
Follow @wiair.bsky.social for all the updates
#womeninai
05.03.2025 16:37
Big announcement coming up! My friend @malikeh97.bsky.social and I have been working on something very special. Can't wait to reveal what we have been up to. Stay tuned for more info!
#WomenInAI
03.03.2025 14:29
Sounds like the way back to the closet...
26.02.2025 03:10
I am not into sports and not a hockey fan. But this time, I am very glad about the outcome of this game. Go Canada! 🇨🇦🇨🇦🇨🇦
21.02.2025 17:53
CohereForAI/include-base-44 · Datasets at Hugging Face
We're on a journey to advance and democratize artificial intelligence through open source and open science.
Results from evaluating 15 models on INCLUDE reveal stark performance variations among languages, emphasizing the need for equitable AI tools.
Public release of INCLUDE encourages further research on fair and inclusive AI.
Dataset: huggingface.co/datasets/Coh...
/4
23.01.2025 16:07
INCLUDE is the largest multilingual benchmark of its kind, containing 197,243 MCQA pairs from 1,926 examinations across 44 languages and 15 scripts coming from 52 countries.
/3
23.01.2025 16:07
LLMs hold immense potential, but performance disparities across languages limit their global impact. INCLUDE is a large multilingual language understanding benchmark that includes regional educational, professional, and practical tests collected by native speakers.
/2
23.01.2025 16:07
Our paper is accepted to ICLR!
INCLUDE: Evaluating Multilingual LLMs with Regional Knowledge (arxiv.org/abs/2411.19799)
A benchmark of ~200k QA pairs across 44 languages, capturing real-world cultural nuances.
A collaborative effort led by @cohereforai.bsky.social, with contributors worldwide.
/1
23.01.2025 16:07
Very interesting paper about unlearning for AI Safety, a subject that deserves more attention.
11.01.2025 15:11
Noteworthy AI Research Papers of 2024 (Part One)
Six influential AI papers from January to June
Happy New Year! To kick off the year, I've finally been able to format and upload the draft of my AI Research Highlights of 2024 article.
It covers a variety of topics, from mixture-of-experts models to new LLM scaling laws for precision:
01.01.2025 14:12
Multimodal LLMs | Notion
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Last month I attended the #NeurIPS2024 conference in Vancouver. Now that I'm home, I'd like to reflect on all the interesting works I encountered at the conference.
Part 1 is about multimodal #LLM, next parts coming soon.
typhoon-mirror-155.notion.site/Multimodal-L...
03.01.2025 21:40
Submission deadline: Feb 17, 2025
Workshop date: April 26-May 1, 2025 (TBD)
Join us in Yokohama, Japan (also hybrid)
Submit your work and help shape the future of LLMs!
03.01.2025 02:07
This year's theme, "Mind the Context", invites participants to explore how LLMs are used and evaluated in specific contexts, such as LLM applications in mental wellness care or translation in high-stakes scenarios.
03.01.2025 02:07
Excited to co-organize the HEAL workshop at @acm_chi 2025!
HEAL addresses the "evaluation crisis" in LLM research and brings HCI and AI experts together to develop human-centered approaches to evaluating and auditing LLMs.
heal-workshop.github.io
#NLProc #LLMeval #LLMsafety
03.01.2025 02:07
Asst Prof at Johns Hopkins Cognitive Science • Director of the Group for Language and Intelligence (GLINT) • Interested in all things language, cognition, and AI
jennhu.github.io
Interested in ML, comp bio, immunology, and just about anything one hop away from either.
Postdoc in AI at the Allen Institute for AI & the University of Washington.
https://valentinapy.github.io
PhD student @MIT. Previously: @Uoft/@VectorInst, @winterlightlabs, and @IITGuwahati
Ph.D. student at University of Washington CSE. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-).
Postdoc @vectorinstitute.ai | organizer @queerinai.com | previously MIT, CMU LTI | rodent enthusiast | she/they
https://ryskina.github.io/
Engineer, roboethicist and pro-feminist. Interested in robots as working models of life, evolution, intelligence and culture. Prof Robot Ethics, Bristol Robotics Lab. Home page: https://people.uwe.ac.uk/Person/AlanWinfield
Waiting on a robot body. All opinions are universal and held by both employers and family.
Literally a professor. Recruiting students to start my lab.
ML/NLP/they/she.
Building personalized Bluesky feeds for academics! Pin Paper Skygest, which serves posts about papers from accounts you're following: https://bsky.app/profile/paper-feed.bsky.social/feed/preprintdigest. By @sjgreenwood.bsky.social and @nkgarg.bsky.social
@Cohere.com's non-profit research lab and open science initiative that seeks to solve complex machine learning problems. Join us in exploring the unknown, together. https://cohere.com/research
Incoming faculty at the Max Planck Institute for Software Systems
Postdoc at UW, working on Natural Language Processing
Recruiting PhD students!
https://lasharavichander.github.io/
Research Scientist at Ai2, PhD in NLP, UofA. Ex GoogleDeepMind, MSFTResearch, MilaQuebec
https://nouhadziri.github.io/
Book: https://thecon.ai
Web: https://faculty.washington.edu/ebender
WiAIR is dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our goal is to empower early career researchers, especially women, to pursue their passion for AI and make an impact in this exciting field.
Machine Learning Research Scientist, MScAC | ML Researcher at Vector Institute | Co-Host of Women in AI Research (WiAIR) Podcast | 5+ Years of Industry Experience | University of Toronto Alumnus
Partner at SAMMAN | President of the European Network for Women In Leadership | Board member of IRIS | Board member of EACC
computers and music are (still) fun
Listening to Plants: Sonic Atlas to (Re)Discover Other Intelligences
#Art #Science #AI as a bridge between plants and humans
@ibmcp.bsky.social + VRAIN + @tallerestampa.bsky.social
Computing Science prof in multimodal embodied AI, emotion, interaction at SFU in Vancouver 🇨🇦 Director of the Rosie Lab www.rosielab.ca Robotics nerd. Previously at SoftBank Robotics FR/JP
Our mission is to raise awareness of queer issues in AI, foster a community of queer researchers and celebrate the work of queer scientists. More about us: queerinai.com