Colin White (@crwhite-ml) — Bluesky Profile

1 year ago

The LiveBench leaderboard showing llama-3.3-70b-instruct-turbo in the leading position by average instruction following performance

Shiny! The newly released Llama 3.3 LLM leads the LiveBench ranking for instruction following¹, beating Claude 3.5, GPT-4o, and OpenAI o1, and you can run it on your local² machine.

> ollama run llama3.3

livebench.ai#/?IF=as


1 year ago

It might have been because they implemented this rule for the first time this year: "All authors who are on 3 or more papers must serve as a reviewer for at least 6 papers."
I'm a fan of that rule!
