@josephc.bsky.social @mariaa.bsky.social and I are at poster #21
findings from a large-scale survey of 800 researchers on how they use LMs in their research #colm2025
Joint work with Amanpreet Singh @arnaik19.bsky.social @sergeyf.bsky.social @paopow.bsky.social @kylelo.bsky.social @dougdowney.bsky.social Dan Weld and more. Please check it out and give us feedback : )
Can AI really help with literature reviews? 🧐
Meet Ai2 ScholarQA, an experimental solution that allows you to ask questions that require multiple scientific papers to answer. It gives more in-depth and contextual answers with table comparisons and expandable sections 💡
Try it now: scholarqa.allen.ai
🤔Giving complex tasks to AI agents is easy—getting them to do exactly what you want isn’t. How can human-AI collaboration give us more reliable & steerable agents?
🍫Introducing Cocoa, our new interaction paradigm for balancing human & AI agency in complex human-AI workflows. 🧵
How do scholars of diff backgrounds use LLMs as research tools? How do we perceive the risks & benefits of this new practice? Are we willing to disclose to peers and reviewers?
We conducted a large-scale survey of verified authors of different fields, race, gender, seniority to find out - results🧵
I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in #NLP and #CulturalAnalytics.
Boulder is a lovely college town 30 minutes from Denver and 1 hour from Rocky Mountain National Park 😎
Apply by December 15th!
1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚
@uwnlp.bsky.social & Ai2
With open models & 45M-paper datastores, it outperforms proprietary systems & matches human experts.
Try out our demo!
openscholar.allen.ai
Excited to share ✨ Contextualized Evaluations ✨!
Benchmarks like Chatbot Arena contain underspecified queries, which can lead to arbitrary eval judgments. What happens if we provide evaluators with context (e.g., who's the user, what's their intent) when judging LM outputs? 🧵↓
Lit reviews often involve comparing sets of papers using common aspects in tables or spreadsheets -
@benn9.bsky.social & Yoonjoo Lee's #EMNLP paper explored ways to create such tables using LLMs, and how to evaluate them against a large set of lit review tables we extracted from arXiv.
🚨 #CHI24 Paper Alert! 🚨
We introduce #meronymity, a novel design paradigm to mitigate social barriers in public social interactions by revealing aspects of identity to balance credibility & privacy. @axz.bsky.social @jbragg.bsky.social @josephc.bsky.social @karger.bsky.social
Yes, pretty annoying that you can’t do both. I end up using category tabs for my personal and priority for my work email account 🤷‍♂️
You could! It’s called priority inbox in the settings
Larrabee state park sunset
Pretty sure I don’t follow enough people here because my feed is 90% @cats.bsky.social now 😆
Make sure to visit the Samish overlook! It’s a little bit of a detour from the main trail but very worth it. You could also drive there to skip the first 1/3 of the main trail.
Oyster Dome trail! The food is from Taylor Shellfish, they have a farm there!
Oyster Dome sunset 🙂