EvalEval Coalition

@eval-eval.bsky.social

We are a researcher community developing scientifically grounded research outputs and robust deployment infrastructure for broader impact evaluations. https://evalevalai.com/

40 Followers  |  8 Following  |  7 Posts  |  Joined: 12.06.2025

Latest posts by eval-eval.bsky.social on Bluesky

The AI Evaluation Chart Crisis. Charts used to showcase performance demonstrate broader issues in the AI evaluation ecosystem: a lack of balance between competitive benchmarking and statistical rigor.

🚨 New blog: The AI Evaluation Chart Crisis 📝

From misleading bar heights to missing error bars, recent model launches have sparked debate on AI evals. In our new blog post, we dig into what’s broken, why it matters, and how results should be presented 👇

evalevalai.com/documentatio...

11.08.2025 19:20  |  👍 2  |  🔁 2  |  💬 0  |  📌 0
This kickoff post lays out: 1) 🔍 Why we need a science of evaluation; 2) 🤝 Our goals for the community; 3) 🛠️ How you can get involved (2/2)

Interested in joining? Check out evalevalai.com

16.07.2025 17:17  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
The Science of Evaluations: Workstream Kickoff Post. Announcing the launch of a research-driven initiative among a community of researchers to strengthen the science of AI evaluations.

🚨 AI Evals Crisis: Officially kicking off the Eval Science Workstream 🚨

We’re building a shared scientific foundation for evaluating AI systems, one that’s rigorous, open, and grounded in real-world & cross-disciplinary best practices 👇 (1/2)

Read our new blog post: tinyurl.com/evalevalai

16.07.2025 17:17  |  👍 2  |  🔁 1  |  💬 1  |  📌 0

Join us for the Eval Eval Coalition Social at @facct.bsky.social tomorrow, Tuesday, June 24th, from 4 to 4:30 pm during the coffee break! We would love to have you join us and we look forward to seeing you there!! #FAccT2025 #EvalEval

23.06.2025 14:41  |  👍 4  |  🔁 2  |  💬 0  |  📌 1
We would love to have you join us!! Check out evaleval.github.io for more info and stay tuned for future updates!! #EvalEval #AIEvaluations (3/3)

22.06.2025 19:34  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

Our coalition is focused on producing scientifically grounded research outputs, building robust deployment infrastructure for broader impact evaluations, and fostering a community of researchers passionate about developing better evaluations 🌎🌍🌏 (2/3)

22.06.2025 19:34  |  👍 1  |  🔁 0  |  💬 1  |  📌 0

Introducing the Eval Eval Coalition! ✨
We are a community of researchers dedicated to designing, developing, and deploying better evaluations (1/3)

22.06.2025 19:34  |  👍 3  |  🔁 1  |  💬 1  |  📌 1

@eval-eval is following 8 prominent accounts