Read the preprint: agentrxiv.github.io
Try out AgentRxiv: github.com/SamuelSchmid...
Let's explore how agents can accelerate research, together.
🧵8/8
@samuelschmidgall.bsky.social
PhD at Johns Hopkins University and Researcher at Google DeepMind working on LLM agents
🛡️ Research agents and their labs, while promising, are still not at human-level quality. By channeling their work into AgentRxiv, a dedicated hub for autonomous research, we're also safeguarding the quality of human research on arXiv.
🧵7/8
✨ In parallel experiments with 3 independent labs sharing preprints through AgentRxiv, the best method achieved 79.8% accuracy (a 13.7% relative improvement) while reaching key milestones faster than in sequential experiments.
🧵6/8
We also wondered how well the methods our agents discovered perform on out-of-domain benchmarks (MMLU-Pro, GPQA, & MedQA) and with five other language models. We find that the top-performing algorithm, SDA, improves performance across these benchmarks by 3.3% on average.
🧵5/8
We perform experiments where agents are asked to develop new reasoning techniques on MATH-500. We find that when agents are given access to previous research, accuracy improves from 70.2% to 78.2%, an 11.4% relative improvement over the gpt-4o mini baseline and 9.7% over gpt-4o mini with CoT.
🧵4/8
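The relative-improvement figures quoted in the thread can be checked with a few lines of Python. This is only an arithmetic sanity check, not code from the paper: the function name `relative_improvement` is made up for illustration, and applying the 70.2% baseline to the 79.8% parallel-lab result is an assumption (that post does not name its baseline, though the numbers are consistent with it).

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a percentage of `baseline`."""
    return (improved - baseline) / baseline * 100.0


# Sequential setting: 70.2% -> 78.2% accuracy on MATH-500.
sequential_gain = round(relative_improvement(70.2, 78.2), 1)

# Parallel setting: 79.8% accuracy, ASSUMING the same 70.2% baseline.
parallel_gain = round(relative_improvement(70.2, 79.8), 1)

print(sequential_gain)  # 11.4
print(parallel_gain)    # 13.7
```

Note that these are relative gains; the absolute gains are 8.0 and 9.6 percentage points.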
To address this, we introduce AgentRxiv, a framework that lets LLM agent laboratories upload and retrieve reports from a shared preprint server in order to collaborate, share insights, and iteratively build on each other's research.
🧵3/8
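To make the upload-and-retrieve idea concrete, here is a toy sketch of a shared report store. It is not the AgentRxiv implementation or its API: the `Report` and `PreprintServer` classes, their method names, and the naive keyword-overlap ranking are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    lab: str    # which (hypothetical) agent lab wrote the report
    title: str
    body: str


@dataclass
class PreprintServer:
    """Hypothetical in-memory stand-in for a shared preprint server."""
    reports: list[Report] = field(default_factory=list)

    def upload(self, report: Report) -> None:
        # Make the report visible to every lab that shares this server.
        self.reports.append(report)

    def retrieve(self, query: str, limit: int = 3) -> list[Report]:
        # Naive ranking (an assumption): count query words found in title or body.
        words = set(query.lower().split())
        scored = []
        for report in self.reports:
            text = set((report.title + " " + report.body).lower().split())
            score = len(words & text)
            if score:
                scored.append((score, report))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [report for _, report in scored[:limit]]


server = PreprintServer()
server.upload(Report("lab_a", "Stepwise verification for math reasoning", "Checks each step."))
server.upload(Report("lab_b", "Prompt ensembling", "Votes over prompts."))

# A third lab looks up prior work before starting its own experiments.
hits = server.retrieve("math reasoning")
print([r.lab for r in hits])  # ['lab_a']
```

A real system would likely swap the keyword overlap for a stronger search method and persist reports beyond a single process; the point here is only the shared read/write loop between labs.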
There has been a lot of recent excitement around autonomous LLM agents performing research, with several fully autonomous works being accepted into ICLR 2025.
‼️ The problem is that these systems work in isolation without the ability to build on their research.
🧵2/8
Introducing AgentRxiv: a framework where autonomous research agents can upload, retrieve, and build on each other's research.
AgentRxiv takes your research direction and progressively outputs research papers and code repositories, building on its previous work with each new paper!
🧵
👩‍💻 All of the code is completely open-source! Below are links to the website, paper, and GitHub! Check it out.
website: agentlaboratory.github.io
paper: arxiv.org/pdf/2501.04227
github: github.com/SamuelSchmidga…
Agent Laboratory consists of three primary phases that guide the research process: (1) Literature Review, (2) Experimentation, and (3) Report Writing. During each phase, LLM agents collaborate, integrating tools like arXiv, Hugging Face, Python, and LaTeX.
27.02.2025 17:25
Introducing Agent Laboratory: an assistant for automating machine learning research
Agent Laboratory takes your research ideas and outputs a research paper and code repository, allowing you to allocate more effort toward ideation rather than low-level coding and writing [Re-sharing from X]
Really great overview of Agent Laboratory by Two Minute Papers
video: youtu.be/2ky50XT0Nb0?...
agent lab webpage: agentlaboratory.github.io
These aren't totally hypothetical questions. Currently, the US is in the process of trashing its wildly successful science funding system. NIH, which funds tens of billions of dollars of research each year, has been estimated to generate around $2.50 of economic activity for every $1 funded:
23.02.2025 10:16
I'm excited to start as a Student Researcher at Google DeepMind working on medical AI!
27.12.2024 23:07
An LLM that makes decisions that have consequences in an external environment with temporal dependencies (?)
04.12.2024 21:56
Hello! The sky is so blue
25.11.2024 03:47
Lol true
24.11.2024 23:30
Hello. Please add me as well!!
24.11.2024 00:13