
@potamitisn.bsky.social

2 Followers  |  1 Following  |  9 Posts  |  Joined: 14.07.2025

Latest posts by potamitisn.bsky.social on Bluesky


🎓 Paper: openreview.net/pdf?id=yNpYb...
📝 Blog: au-clan.github.io/2025-06-19-f...
💻 Code: github.com/au-clan/FoA

cc: @akhilarora.bsky.social @larshklein.bsky.social @caglarai.bsky.social @icepfl.bsky.social @csaudk.bsky.social

15.07.2025 17:25 | 👍 1    🔁 0    💬 0    📌 0

We systematically disable components of FoA to assess their impact. Removing selection, resampling, backtracking, caching, or batching each leads to lower success or higher cost. Results show that all parts of the system contribute meaningfully to its overall performance.

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0

We evaluate raw Llama3.2 models (11B and 90B) on structured reasoning tasks. On their own, both perform poorly. With FoA, performance improves by a factor of 5 to 6. Notably, Llama3.2-11B with FoA outperforms the much larger 90B model. FoA allows smaller models to rival or beat larger ones, bridging the reasoning gap.

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0

We compare FoA to top methods under a fixed $5 budget, allowing multiple trials per method. FoA consistently outperforms others across all budget points. While some baselines may improve with more spending, FoA offers the best cost-quality trade-off in resource-constrained settings.

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0

In the selection phase, the fleet evaluates all agent states using a value function and resamples the population with replacement. FoA also supports backtracking: instead of sampling only from current states, it can revisit earlier ones from its history, with older states gradually discounted.

15.07.2025 17:25 | 👍 2    🔁 0    💬 1    📌 0
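
A minimal Python sketch of the selection step described in the post above, for illustration only. The names `select`, `history`, `value`, and `gamma` are hypothetical and not taken from the FoA codebase. Weighting each candidate by its value times `gamma ** age` is one assumed way to realize "older states gradually discounted"; the paper may use a different resampling rule.

```python
import random
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def select(
    history: Sequence[Sequence[S]],
    value: Callable[[S], float],
    n: int,
    gamma: float,
    rng: random.Random,
) -> List[S]:
    """Resample n states with replacement from current and earlier generations.

    history is ordered oldest first; the last entry holds the current states.
    Assumption: weight = value(state) * gamma ** age, so older states count less.
    """
    candidates: List[S] = []
    weights: List[float] = []
    newest = len(history) - 1
    for t, generation in enumerate(history):
        age = newest - t  # 0 for the current generation
        for state in generation:
            candidates.append(state)
            # Clamp negative values so every weight is non-negative.
            weights.append(max(value(state), 0.0) * gamma ** age)
    if not candidates:
        return []
    if sum(weights) <= 0:
        # No candidate has positive weight: fall back to uniform sampling.
        return [rng.choice(candidates) for _ in range(n)]
    return rng.choices(candidates, weights=weights, k=n)
```

With `gamma = 1.0` earlier states compete on equal footing with current ones, and with `gamma = 0.0` only the current generation can be drawn, which turns backtracking off in this sketch.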

In the mutation phase, each agent in the fleet independently explores the search space by simulating a sequence of transitions. Any agent that enters an invalid or terminal state is immediately removed and replaced by a randomly selected copy of a surviving agent.

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0
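
A minimal Python sketch of the mutation step described in the post above, under stated assumptions. The names `mutate`, `step`, `is_removed`, and `k` are hypothetical and not the FoA API. In this sketch `step` stands in for one transition (for example, one LLM call), and `is_removed` flags states that are invalid or terminal.

```python
import random
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def mutate(
    states: Sequence[S],
    step: Callable[[S, random.Random], S],
    is_removed: Callable[[S], bool],
    k: int,
    rng: random.Random,
) -> List[S]:
    """Advance every agent k transitions, keeping the fleet size constant.

    After each transition, any agent whose state is flagged by is_removed
    is replaced by a copy of a randomly chosen surviving agent.
    Returns an empty list if no agent survives a transition.
    """
    fleet = list(states)
    for _ in range(k):
        # Each agent takes one transition independently of the others.
        fleet = [step(s, rng) for s in fleet]
        survivors = [s for s in fleet if not is_removed(s)]
        if not survivors:
            return []
        # Replace removed agents with copies of random survivors.
        fleet = [rng.choice(survivors) if is_removed(s) else s for s in fleet]
    return fleet
```

Because replacement happens after every transition, the fleet never carries a dead agent into the next step, so the number of `step` calls stays at `k` times the fleet size in this sketch.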

✅ Model-agnostic gains: FoA outperforms top reasoning frameworks across tasks and models

💸 Better cost-quality trade-off: Up to 70% improvement at a fraction of the compute

⚙️ Plug-and-play: Works with any prompting strategy

📏 Tunable & predictable: Control compute precisely

😁 More details 👇

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0

In Fleet of Agents (FoA), each agent explores the problem space independently by generating thoughts, taking actions, or making moves. After a round of exploration, agents are scored, and the best-performing ones are resampled to continue, while weak or invalid ones are eliminated and replaced.

15.07.2025 17:25 | 👍 1    🔁 0    💬 1    📌 0

Excited to be in Vancouver this week for #ICML2025! 🌲☕🛰️

We’ll be presenting our paper “Fleet of Agents (FoA)”, a lightweight, open-source framework that boosts LLM reasoning by orchestrating multiple LLM agents as a collective intelligence using a genetic-style dynamic tree search.

🧵 (1/n)

15.07.2025 17:25 | 👍 3    🔁 1    💬 1    📌 0

@potamitisn is following 1 prominent account