Thanks to my collaborators Sophia Hager, Adi Asija, Nick Andrews, and @danielkhashabi.bsky.social at @jhuclsp.bsky.social!
arXiv: arxiv.org/abs/2508.11027
Code: github.com/JHU-CLSP/hell-or-high-water
(Data coming soon!)
19.09.2025 14:06
More tools = worse at handling tool failures
When tool schemas are provided in-context, we find that the performance gap between adversarial and non-adversarial settings increases with the number of schemas.
19.09.2025 14:05
LLM agents do not handle tool failures well
With RAG on tool schemas, we observe a substantial performance gap between adversarial and non-adversarial settings.
19.09.2025 14:04
Tools break in the real world all the time, but not much attention has been given to how well LLMs deal with tool failures.
We introduce HOHW, a tool-use benchmark where problems remain solvable even when tools break adversarially.
19.09.2025 14:04
PhD student @jhuclsp. Previously @AIatMeta, @InriaParisNLP, @EM_LCT| #NLProc