Do LLMs need rationales for learning from mistakes?
When LLMs learn from previous incorrect answers, they typically observe corrective feedback in the form of rationales explaining each mistake. In our new preprint, we find these rationales do not help; in fact, they hurt performance!
🧵
13.02.2025 15:38
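Not from the preprint itself, just a minimal sketch of the two feedback conditions the post above contrasts: showing a model only its previous wrong answer versus also giving a rationale for the mistake. The helper name and prompt wording are assumptions for illustration, not the paper's actual setup.

```python
# Minimal sketch (illustrative only): two ways of showing a model its previous
# mistake when asking it to retry a question.

def retry_prompt(question: str, wrong_answer: str, rationale: str | None = None) -> str:
    """Build a retry prompt that optionally includes a rationale for the mistake."""
    lines = [
        f"Question: {question}",
        f"Your previous answer: {wrong_answer}",
        "That answer was incorrect.",
    ]
    if rationale is not None:
        # "Rationale" condition: also explain *why* the answer was wrong.
        lines.append(f"Explanation of the mistake: {rationale}")
    lines.append("Please try again and give only the final answer.")
    return "\n".join(lines)


if __name__ == "__main__":
    q = "What is 17 * 24?"
    print(retry_prompt(q, "398"))  # mistake only
    print()
    print(retry_prompt(q, "398", "17 * 24 = 408; the multiplication was carried out incorrectly."))  # mistake + rationale
```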
Are you guys interested in research interns at this stage as well?
24.01.2025 00:39
Not that you need another thread on Deepseek's R1, but I really enjoy these models, and it's great to see an *open*, MIT-licensed reasoner that's ~as good as OpenAI o1.
A blog post: itcanthink.substack.com/p/deepseek-r...
It's really very good at ARC-AGI for example:
22.01.2025 22:01
LLM360 gets far less recognition than the quality of their fully open releases over the past year+ deserves. They dropped a 60+ page technical report last week and I don't think I saw anyone talking about it. Along with OLMo, it's the other up-to-date open-source LM.
Paper: https://buff.ly/40I6s4d
23.01.2025 02:37
Thank you for posting this work.
We are seeing very similar findings in LLM agent research.
Would anyone be interested in collaborating on reproducibility there?
22.01.2025 10:36
#NLP #LLMAgents Community, I have a question:
I have been running Webshop with older GPTs (e.g. gpt-3.5-turbo-1106 / -0125 / -instruct). On 5 different code repos (ReAct, Reflexion, ADaPT, StateAct) I am getting scores of 0%, whereas previously the scores were at ~15%.
Any thoughts anyone?
21.01.2025 10:36
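One hedged debugging sketch (an assumption, not taken from any of the repos named above): when a score collapses to 0% across several codebases at once, the culprit is often the action parser rejecting every model reply rather than the policy getting worse. Querying the legacy model directly and checking whether its raw output still matches a Webshop-style action format can localize the problem; the prompt and regex below are illustrative only.

```python
# Sanity check: does the legacy model still emit something that parses as a
# Webshop-style action such as search[...] or click[...]?
import re
from openai import OpenAI  # pip install openai (v1 SDK)

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = (
    "You are browsing a shopping site. Respond with exactly one action, "
    "either search[<query>] or click[<button>].\n"
    "Instruction: buy a pack of blue ballpoint pens under $10.\n"
    "Action:"
)

ACTION_RE = re.compile(r"^(search|click)\[.+\]$")

for model in ["gpt-3.5-turbo-1106", "gpt-3.5-turbo-0125"]:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,
    ).choices[0].message.content.strip()
    # If "parses" is False for every reply, the harness's action parsing (or the
    # prompt template it builds) is the first place to look.
    print(model, "->", repr(reply), "| parses:", bool(ACTION_RE.match(reply)))
```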
Hey,
I work on LLM agents, if that qualifies. Please add me as well. Thanks.
28.11.2024 01:57
Posting a call for help: does anyone know of a good way to simultaneously treat both POTS and Ménière's disease? Please contact me if you're either a clinician with experience doing this or a patient who has found a good solution. Context in thread
24.11.2024 16:34
The OLMo 2 models sit at the Pareto frontier of training FLOPs vs model average performance.
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes and more!
26.11.2024 20:51
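For context on the Pareto-frontier claim: a model sits on the FLOPs-vs-performance frontier if no other model is both cheaper to train and better on average. A small sketch with placeholder numbers (not the actual OLMo 2 or baseline results):

```python
# Sketch of a FLOPs-vs-performance Pareto check; the (FLOPs, score) values below
# are placeholders, not real benchmark results.

def pareto_frontier(models: dict[str, tuple[float, float]]) -> list[str]:
    """Return the models not dominated by any other (lower-or-equal FLOPs AND higher-or-equal score)."""
    frontier = []
    for name, (flops, score) in models.items():
        dominated = any(
            o_flops <= flops and o_score >= score and (o_flops, o_score) != (flops, score)
            for o_name, (o_flops, o_score) in models.items()
            if o_name != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


if __name__ == "__main__":
    # (training FLOPs, average benchmark score) -- illustrative values only
    models = {
        "model-A-7B": (1.0e23, 60.0),
        "model-B-8B": (2.0e23, 62.0),
        "model-C-13B": (3.0e23, 61.0),  # dominated by model-B-8B: more FLOPs, lower score
    }
    print(pareto_frontier(models))  # ['model-A-7B', 'model-B-8B']
```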
Hi there,
Please add me as well. I'm a PhD student on LLM agents at Imperial College London
25.11.2024 23:30
That's really interesting, and perhaps it has some roots in the fact that numerals come from Arabic script?
What are your thoughts on that?
24.11.2024 11:37
Hey, I would love to be added too. I work on LLM Agents, and worked on Bayesian Exploration in RL.
24.11.2024 10:21
Hey, thanks for the group. I would love to be added too. I'm a PhD student working on LLM Agents at Imperial College London.
24.11.2024 10:15
I would love to be in that one too :))
23.11.2024 19:50
Pretty cool people are being added to the LLM Agent & LLM Reasoning group. Thanks @lisaalaz.bsky.social for suggesting @jhamrick.bsky.social @gabepsilon.bsky.social and others.
Feel free to mention yourself and others. :)
go.bsky.app/LUrLWXe
#LLMAgents #LLMReasoning
23.11.2024 19:36
;)
23.11.2024 19:33
Definitely, done.
23.11.2024 19:33
Done :)
23.11.2024 19:32
Sure thing.
21.11.2024 19:38
Thanks, done.
21.11.2024 19:38
Done :)
21.11.2024 19:38
#EMNLP2024 was a fun time to reconnect with old friends and meet new ones! Reflecting on the conference program and in-person discussions, I believe we're seeing the "Google Moment" of #IR research play out in #NLProc.
1/n
21.11.2024 13:38
I thought I'd create a Starter Pack for people working on LLM Agents. Please feel free to suggest yourself as well.
go.bsky.app/LUrLWXe
#LLMAgents #LLMReasoning
20.11.2024 14:08
Meta-Reasoning Improves Tool Use in Large Language Models
External tools help large language models (LLMs) succeed at tasks where they would otherwise typically fail. In existing frameworks, LLMs learn tool use either by in-context demonstrations or via full...
Hi Bluesky, I would like to introduce myself!
I am PhD-ing at Imperial College under @marekrei.bsky.social's supervision. I am broadly interested in LLM/LVLM reasoning & planning (here's our latest work: arxiv.org/abs/2411.04535)
Do reach out if you are interested in these (or related) topics!
20.11.2024 11:26
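The linked abstract above contrasts in-context demonstrations with fine-tuning as the two usual routes to tool use. Here is a minimal sketch of the in-context route, with a hypothetical calculator tool and prompt format that are not taken from the paper:

```python
# Minimal sketch of tool use via in-context demonstrations (hypothetical format):
# the model is shown worked examples of emitting a tool call, and its generated
# action is parsed and executed by the external tool.
import re

TOOL_DEMOS = """\
Question: What is 23 * 19?
Action: calculator("23 * 19")
Observation: 437
Answer: 437

Question: What is 144 / 12?
Action: calculator("144 / 12")
Observation: 12.0
Answer: 12.0
"""

def calculator(expression: str) -> str:
    """A toy 'external tool': evaluate a basic arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}))  # toy only; never eval untrusted input

def build_prompt(question: str) -> str:
    """Prepend the demonstrations so the model imitates the Action format."""
    return f"{TOOL_DEMOS}\nQuestion: {question}\nAction:"

def run_tool_call(model_output: str) -> str | None:
    """Parse an emitted calculator(...) call from the model's text and execute it."""
    match = re.search(r'calculator\("([^"]+)"\)', model_output)
    return calculator(match.group(1)) if match else None

# Pretend the LLM continued build_prompt("What is 17 * 24?") with this action:
print(run_tool_call(' calculator("17 * 24")'))  # -> 408
```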
Welcome to Bluesky to more of our NLP researchers at Imperial!! Looking forward to following everyone's work on here.
To follow us all click 'follow all' in the starter pack below
go.bsky.app/Bv5thAb
20.11.2024 08:35
We are a joint partnership of the University of Tübingen and the Max Planck Institute for Intelligent Systems. We aim to develop robust learning systems and societally responsible AI. https://tuebingen.ai/imprint
https://tuebingen.ai/privacy-policy#c1104
Professor, University of Tübingen @unituebingen.bsky.social.
Head of Department of Computer Science.
Faculty, Tübingen AI Center 🇩🇪 @tuebingen-ai.bsky.social.
ELLIS Fellow, Founding Board Member 🇪🇺 @ellis.eu.
CV 📷, ML 🧠, Self-Driving, NLP
ML PhD Student @ Uni. of Edinburgh, working on Multi-Agent Problems. | Organiser @deeplearningindaba.bsky.social @rl-agents-rg.bsky.social | 🇪🇹🇿🇦
kaleabtessera.com
Mathematician at UCLA. My primary social media account is https://mathstodon.xyz/@tao . I also have a blog at https://terrytao.wordpress.com/ and a home page at https://www.math.ucla.edu/~tao/
PhD candidate at NYU
lexipalmer13.github.io/
Assistant Professor of Sociology, NYU. Core Faculty, CSMaP. Research Fellow Oxford Sociology. Computational social science, Methods, Conflict, Communication. Webpage: cjbarrie.com
PhD supervised by Tim Rocktäschel and Ed Grefenstette, part-time at Cohere. Language and LLMs. Spent time at FAIR, Google, and NYU (with Brenden Lake). She/her.
Sentence Transformers, SetFit & NLTK maintainer
Machine Learning Engineer at 🤗 Hugging Face
Ginni Rometty Prof @NorthwesternCS | Fellow @NU_IPR | Uncertainty + decisions | Humans + AI/ML | Blog @statmodeling
Research Scientist at DeepMind. Opinions my own. Inventor of GANs. Lead author of http://www.deeplearningbook.org . Founding chairman of www.publichealthactionnetwork.org
The Thirty-Eighth Annual Conference on Neural Information Processing Systems will be held in Vancouver Convention Center, on Tuesday, Dec 10 through Sunday, Dec 15.
https://neurips.cc/
Building AI Agent Marketplace and Landscape Map
https://aiagentsdirectory.com/landscape
Breakthrough AI to solve the world's biggest problems.
► Join us: http://allenai.org/careers
► Get our newsletter: https://share.hsforms.com/1uJkWs5aDRHWhiky3aHooIg3ioxm
I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma) w/ @kylelo.bsky.social. Open source is fun 🏳️‍🌈. Opinions are sampled from my own stochastic parrot
more at https://soldaini.net
Research Scientist Meta/FAIR, Prof. University of Geneva, co-founder Neural Concept SA. I like reality.
https://fleuret.org
Canadian in Taiwan. Emerging tech writer and analyst with a flagship newsletter, A.I. Supremacy, reaching 115k readers.
Also watching Semis, China, robotics, Quantum, BigTech, open-source AI and Gen AI tools.
https://www.ai-supremacy.com/archive
AI researcher at Google DeepMind. Synthesized views are my own.
SF Bay Area · http://jonbarron.info
This feed is a partial mirror of https://twitter.com/jon_barron
PhD-ing Clip@UMD
https://houyu0930.github.io/
Senior Research Scientist @MBZUAI. Focused on decision making under uncertainty, guided by practical problems in healthcare, reasoning, and biology.
PhD Student at the ILLC / UvA doing work at the intersection of (mechanistic) interpretability and cognitive science. Current Anthropic Fellow.
hannamw.github.io