Matthew Kenney's Avatar

Matthew Kenney

@baykenney.bsky.social

Founder - Algorithmic Research Group. Previously: Senior ML Engineer at Apple, Asst. Research Prof at Duke University, Duke Data Science | PSU '15 | Cornell '11

1,325 Followers  |  1,216 Following  |  34 Posts  |  Joined: 07.05.2023
Posts Following

Posts by Matthew Kenney (@baykenney.bsky.social)

Preview
Algorithmic Research Group on X: "At ARG, we're laser-focused on understanding recursive self-improvement. We're confident that as models scale, RSI will accelerate the frontier of AI at ever-increasing speeds. Over the past year, we've created benchmarks, agents, and AI systems to measure how this might happen. https://t.co/JyOPFSB8DJ" / X At ARG, we're laser-focused on understanding recursive self-improvement. We're confident that as models scale, RSI will accelerate the frontier of AI at ever-increasing speeds. Over the past year, we've created benchmarks, agents, and AI systems to measure how this might happen. https://t.co/JyOPFSB8DJ

Very excited to launch this little tool that we’ve been building. ScoutML is an API built for AI researchers and agents that includes a ton of metadata on each paper. It’s been super helpful for us as we run our research agents internally.
x.com/algoresearch...

29.07.2025 19:09 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
ProspectML | A research assistant that helps researchers generate insights and code to accelerate their work. A research assistant that helps researchers generate insights and code to accelerate their work.

If you're working on ML and this resonates, I’d love to hear what you'd want it to do. We're opening up a limited beta. Link below: prospectml.com

19.05.2025 13:42 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

It’s built on top of a foundation of parsed metadata from papers, code, and reposβ€”models, metrics, datasets, SOTA claims, GPU counts (and types), ablation studies, citations, etc. It’s already become crucial to our internal research, and we hope it can be helpful to others, too.

19.05.2025 13:42 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

It’s designed to support that murky, nonlinear part of the research process, where you're still figuring out what's interesting.

19.05.2025 13:42 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

You give it a question like β€œHow can we improve generalization in low-resource RL?” and it returns distilled insights, speculative ideas, and experimental code. Not final answers, just something to push the thinking forward.

19.05.2025 13:42 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Most of the time, I end up manually digging through papers, chasing links, and piecing together ideas. It works, but it’s slow, and it doesn’t scale with curiosity. I’ve been trying to fix that with a platform we're building called ProspectML.

19.05.2025 13:42 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

A lot of ML tools help you implement. Not many help you think.

When I’m exploring a new research direction, I don’t want another search engine or citation graph. I want something that’s actually read the literature, can suggest promising directions, and helps me reason through tradeoffs.

19.05.2025 13:42 β€” πŸ‘ 4    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

hello world!

11.01.2025 22:53 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

ARG is on Bluesky! Please follow here: @algoresearch.bsky.social

11.01.2025 22:55 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Recommendations for Technical AI Safety Research Directions

good post on 2025 ai safety research directions:
alignment.anthropic.com/2025/recomme...

11.01.2025 22:48 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Back in Pennsylvania, drinking schuylkill county coal cracker (boilo) and making pierogies

26.12.2024 22:07 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

That’s because it was from 2022

30.11.2024 03:40 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

AI for science could be more impactful than chatbots. It is already helping win Nobel prizes and accelerating drug development and materials discovery.
Today we published an essay about it: why it matters, how it’s happening and its implications. Here is a summary from an econ / social sci lens.

26.11.2024 10:39 β€” πŸ‘ 79    πŸ” 30    πŸ’¬ 2    πŸ“Œ 7

Important point that the open protocol makes extracting data from bluesky easy. Can't have it both ways. I like the protocol and think this site is well designed, but that means anyone can and will analyze these posts (if there is value to them, which I'm honestly less convinced of than some)

28.11.2024 19:06 β€” πŸ‘ 11    πŸ” 2    πŸ’¬ 0    πŸ“Œ 1
28.11.2024 19:04 β€” πŸ‘ 7    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

A dataset of 1 million or 2 million Bluesky posts is completely irrelevant to training large language models.

The primary usecase for the datasets that people are losing their shit over isn't ChatGPT, it's social science research and developing systems that improve Bluesky.

28.11.2024 18:57 β€” πŸ‘ 251    πŸ” 39    πŸ’¬ 8    πŸ“Œ 5

Wait what even is this platform. This is insane

28.11.2024 18:54 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

What! If it works for umap-learn vs umap i’m in.

25.11.2024 10:00 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Preview
AlgorithmicResearchGroup (Algorithmic Research Group) Org profile for Algorithmic Research Group on Hugging Face, the AI community building the future.

huggingface.co/AlgorithmicR...

25.11.2024 09:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I have a community project in Eleuther and open source all of my research:
bsky.app/profile/bayk...

25.11.2024 09:36 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Jk the rest are great. Just a big uncle nearest fan

25.11.2024 03:13 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

5
.
.
.
.
.
.
.
.
3
2
4
1

25.11.2024 03:12 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

We welcome PRs, contributions, additional tasks, and task revisions. Excited to see how agents perform on this benchmark.

24.11.2024 20:02 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image Post image

We develop a baseline agent, with tools for coding, research (via Semantic Scholar), and model training, built on top of Sonnet 3.5 and GPT-4o. Our baseline agent performs well across tasks, but generally fails to move beyond baseline implementations.

24.11.2024 20:02 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

ML Research Bench adapts tasks from ML conference competitions like β€˜NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day’ and β€˜LLM Merging Competition’. We prompt agents to complete these challenging tasks. These tasks move beyond simple ML tasks.

24.11.2024 20:02 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Preview
ML Research Benchmark Artificial intelligence agents are increasingly capable of performing complex tasks across various domains. As these agents advance, there is a growing need to accurately measure and benchmark their c...

(re-posting from X)

Can we get AI to accelerate AI research and development?

I’m excited to release ML Research Benchmark, an agentic benchmark of 7 ML conference competition tasks.

Paper: arxiv.org/abs/2410.22553
Tasks: github.com/AlgorithmicR...
Agent: github.com/AlgorithmicR...

24.11.2024 20:02 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 2    πŸ“Œ 1

Maxo Kream

23.11.2024 14:21 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Sure! Multi-role or multi-module. From a software development perspective I think of them as microservices. Nothing more

23.11.2024 14:01 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

There’s a paper, I’ll try to find it

23.11.2024 06:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Idk about β€˜strengths’ and β€˜perspectives’ but you do want a separation of concerns if your agents have a lot of tools. And you do want specific system prompts to guide their objectives. If you pack too many tools into an api call, the model will only use a handful of them.

23.11.2024 06:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0