MLCommons

@mlcommons.org.bsky.social

MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Through our collective engineering efforts, we continually measure and improve AI technologies' accuracy, safety, speed, and efficiency.

138 Followers  |  44 Following  |  59 Posts  |  Joined: 21.11.2024

Latest posts by mlcommons.org on Bluesky

Benchmark MLPerf Storage | MLCommons
V1.1 Results: The MLPerf Storage benchmark suite measures how fast storage systems can supply training data when a model is being trained. Below is a short summary of the workloads and metrics from the latest round...

Simplyblock, TTA, UBIX, IBM, WDC, and YanRong.
Check the results here:
mlcommons.org/benchmarks/s...

#MLPerf #AI #Storage #Benchmarking #MachineLearning #MLCommons

04.08.2025 17:36 — 👍 1    🔁 1    💬 0    📌 0
MLCommons - Better AI for Everyone
MLCommons aims to accelerate AI innovation to benefit everyone. Its philosophy of open collaboration and collaborative engineering seeks to improve AI systems by continually measuring and improving t...

5/ Congratulations and thanks to all submitters!
Alluxio, Argonne National Lab, DDN, ExponTech, FarmGPU, H3C, Hammerspace, HPE, JNIST/Huawei, Juicedata, Kingston, KIOXIA, Lightbits Labs, MangoBoost, Micron, Nutanix, Oracle, Quanta Cloud Technology, Samsung, Sandisk,

04.08.2025 17:36 — 👍 0    🔁 0    💬 1    📌 0

4/ The v2.0 submissions showcase a wide range of technical solutions: local & object storage, in-storage accelerators, software-defined storage, block systems, and more. This diversity highlights the community's commitment to advancing AI infrastructure.

04.08.2025 17:36 — 👍 0    🔁 0    💬 1    📌 0

3/ New in this round: checkpoint benchmarks, designed to reflect real-world practices in large-scale AI training systems. These benchmarks provide key data to help stakeholders optimize system reliability and efficiency at scale.

04.08.2025 17:36 — 👍 0    🔁 0    💬 1    📌 0
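At its core, a checkpoint benchmark times how long it takes to persist model state durably to storage. Here is a minimal sketch of that measurement, assuming a random byte buffer as a stand-in for real model weights and an illustrative 256 MiB size; this is not the MLPerf Storage harness, just the idea behind it:

```python
import os
import time

# Rough sketch of what a checkpoint benchmark exercises: time how long
# it takes to write "model state" to storage, including the fsync that
# forces the data onto the device. The buffer size and path are
# illustrative assumptions, not MLPerf Storage parameters.

CHECKPOINT_BYTES = 256 * (1 << 20)   # 256 MiB stand-in for model weights
CHECKPOINT_PATH = "checkpoint.bin"   # placeholder path

state = os.urandom(CHECKPOINT_BYTES)

start = time.perf_counter()
with open(CHECKPOINT_PATH, "wb") as f:
    f.write(state)
    f.flush()
    os.fsync(f.fileno())             # ensure the data reaches the device
elapsed = time.perf_counter() - start

mib = CHECKPOINT_BYTES / (1 << 20)
print(f"checkpoint: {mib:.0f} MiB in {elapsed:.2f} s ({mib / elapsed:.0f} MiB/s)")
```

Real checkpoints of large models can be far larger than this, which is why write bandwidth and fsync behavior dominate at scale.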

2/ v2.0 highlights:
- 200+ results
- 26 organizations
- 7 countries represented
- Benchmarked systems now support about 2x the accelerators vs v1.0

04.08.2025 17:36 — 👍 0    🔁 0    💬 1    📌 0
New MLPerf Storage v2.0 Benchmark Results Demonstrate the Critical Role of Storage Performance in AI Training Systems - MLCommons
New checkpoint benchmarks provide "must-have" information for optimizing AI training

1/ MLCommons just released results for the MLPerf Storage v2.0 benchmark: an industry-standard suite for measuring storage system performance in #ML workloads. This benchmark remains architecture-neutral, representative, and reproducible.
mlcommons.org/2025/08/mlpe...

04.08.2025 17:36 — 👍 0    🔁 1    💬 1    📌 0
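The measurement idea behind the suite is simple to sketch: generate fixed-size samples, read them back the way a training loop would, and report samples per second. Below is a minimal sketch under assumed placeholder paths and sizes; the real harness drives realistic workloads and also checks that simulated accelerators stay busy:

```python
import os
import time

# Minimal sketch of the core idea behind a storage benchmark for
# training: measure how many fixed-size samples per second a storage
# system can feed a consumer. NOT the MLPerf Storage harness; the
# directory, sample size, and count are illustrative assumptions.

DATA_DIR = "train_data"      # placeholder dataset directory
SAMPLE_BYTES = 1 << 20       # 1 MiB per sample (assumed)
NUM_SAMPLES = 128

def write_dataset() -> None:
    """Create a synthetic dataset of fixed-size sample files."""
    os.makedirs(DATA_DIR, exist_ok=True)
    for i in range(NUM_SAMPLES):
        with open(os.path.join(DATA_DIR, f"sample_{i:04d}.bin"), "wb") as f:
            f.write(os.urandom(SAMPLE_BYTES))

def measure_throughput() -> None:
    """Read every sample once and report samples/s and MiB/s."""
    start = time.perf_counter()
    total = 0
    for name in sorted(os.listdir(DATA_DIR)):
        with open(os.path.join(DATA_DIR, name), "rb") as f:
            total += len(f.read())
    elapsed = time.perf_counter() - start
    print(f"{NUM_SAMPLES / elapsed:.1f} samples/s, "
          f"{total / elapsed / (1 << 20):.1f} MiB/s")

if __name__ == "__main__":
    write_dataset()
    measure_throughput()
```

One caveat: immediately after writing, the OS page cache makes reads look unrealistically fast; defeating caching, for example with datasets much larger than RAM, is one reason a standardized harness matters.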
MLCommons Releases MLPerf Client v1.0: A New Standard for AI PC and Client LLM Benchmarking - MLCommons
MLCommons Releases MLPerf Client v1.0 with Expanded Models, Prompts, and Hardware Support, Standardizing AI PC Performance.

MLPerf Client v1.0 is out! 🎉

The new benchmark for LLMs on PCs and client systems is now available, featuring expanded model support, new workload scenarios, and broad hardware integration.

Thank you to all submitters! #AMD, #Intel, @microsoft.com, #NVIDIA, #Qualcomm

mlcommons.org/2025/07/mlpe...

30.07.2025 15:12 — 👍 0    🔁 2    💬 0    📌 0
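Client LLM benchmarks like this typically report time-to-first-token (TTFT) and sustained tokens per second. Here is a minimal sketch of those two measurements, with a placeholder fake_generate standing in for a real local model; MLPerf Client itself ships as a prebuilt application, so this is an illustration of the metrics, not its harness:

```python
import time

# Sketch of the two headline metrics a client LLM benchmark reports:
# time-to-first-token (TTFT) and generation rate (tokens/s).
# `fake_generate` is a stand-in; swap in a real local LLM call.

def fake_generate(prompt: str):
    """Placeholder token stream with simulated per-token latency."""
    for tok in ["The", " answer", " is", " forty", "-two", "."]:
        time.sleep(0.05)
        yield tok

def benchmark(prompt: str) -> None:
    start = time.perf_counter()
    first_token_at = None
    n_tokens = 0
    for _ in fake_generate(prompt):
        if first_token_at is None:
            first_token_at = time.perf_counter()
        n_tokens += 1
    end = time.perf_counter()
    ttft = first_token_at - start
    # Rate is measured over the steady-state portion after the first token.
    rate = (n_tokens - 1) / (end - first_token_at) if n_tokens > 1 else 0.0
    print(f"TTFT: {ttft * 1000:.0f} ms, {rate:.1f} tokens/s")

benchmark("Summarize MLPerf Client in one sentence.")
```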
MLCommons Launches MLPerf Mobile on Google Play Store - MLCommons

You can read more details here: mlcommons.org/2025/07/mlpe...

10.07.2025 19:11 — 👍 0    🔁 0    💬 0    📌 0
Post image

MLCommons just launched MLPerf Mobile on the Google Play Store! 📱
Benchmark your Android device's AI performance on real-world ML tasks with this free, open-source app.
Try it now: play.google.com/store/apps/d...

10.07.2025 19:01 — 👍 3    🔁 2    💬 1    📌 2

MLCommons Builds New Agentic Reliability Evaluation Standard in Collaboration with Industry Leaders - MLCommons
MLCommons and partners unite to create actionable reliability standards for next-generation AI agents.

3/3 @cam.ac.uk, @ox.ac.uk, University of Illinois Urbana-Champaign, and @ucsb.bsky.social.

Read more about the collaborative development of the Agentic Reliability Evaluation Standard and opportunities to participate: mlcommons.org/2025/06/ares...

27.06.2025 19:07 — 👍 0    🔁 0    💬 1    📌 0

2/3 Contributions from: Advai, AI Verify Foundation, @anthropic.com, @arize.bsky.social, @cohere.com, Google, Intel, LNE, Meta, @microsoft.com, NASSCOM, OpenAI, Patronus AI, @polymtl.bsky.social, Qualcomm, QuantumBlack - AI by McKinsey, Salesforce, Schmidt Sciences, @servicenow.bsky.social,

27.06.2025 19:07 — 👍 0    🔁 0    💬 1    📌 0
MLCommons Builds New Agentic Reliability Evaluation Standard in Collaboration with Industry Leaders - MLCommons
MLCommons and partners unite to create actionable reliability standards for next-generation AI agents.

Today, MLCommons is announcing a new collaboration with contributors from across academia, civil society, and industry to co-develop an open agent reliability evaluation standard to operationalize trust in agentic deployments.
🔗 https://mlcommons.org/2025/06/ares-announce/
1/3

27.06.2025 19:07 — 👍 1    🔁 0    💬 1    📌 0

We're all about acceleration! 😉
Watch @priya-kasimbeg.bsky.social & @fsschneider.bsky.social speedrun an explanation of the AlgoPerf benchmark, rules, and results all within a tight 5 minutes for our #ICLR2025 paper video on "Accelerating Neural Network Training". See you in Singapore!

03.04.2025 11:15 — 👍 5    🔁 4    💬 1    📌 0
AI is posing immediate threats to your business. Here's how to protect yourself
The AI threats your business is facing right now, and how to prevent them

Companies are deploying AI tools that haven't been pressure-tested, and it's already backfiring.

In her new op-ed, our President, Rebecca Weiss, breaks down how industry-led AI reliability standards can help executives avoid costly, high-profile failures.

📖 More: bit.ly/3FP0kjg

@fastcompany.com

20.06.2025 16:03 — 👍 1    🔁 1    💬 1    📌 0
Post image

Call for Submissions!

#MLCommons & @AVCConsortium are accepting submissions for the #MLPerf Automotive Benchmark Suite! Help drive fair comparisons & optimize AI systems in vehicles. Focus is on camera sensor perception.

📅 Submissions close June 13th, 2025

Join: mlcommons.org/community/su...

05.06.2025 18:12 — 👍 0    🔁 1    💬 0    📌 0

4/ Read more and check out the full results here:
🔗 https://mlcommons.org/2025/06/mlperf-training-v5-0-results/

#MLPerf #MLCommons #AI #MachineLearning #Benchmarking

04.06.2025 15:35 — 👍 1    🔁 0    💬 0    📌 0

3/ MLPerf Training v5.0 introduces the Llama 3.1 405B benchmark, our largest language model yet. We also saw big performance gains for Stable Diffusion and Llama 2 70B LoRA; AI training is getting faster and smarter.

04.06.2025 15:35 — 👍 0    🔁 0    💬 1    📌 0

2/ Thank you to all 20 submitters for driving progress in AI benchmarking:
AMD, ASUSTeK, Cisco, CoreWeave, Dell, GigaComputing, Google Cloud, HPE, IBM, Krai, Lambda, Lenovo, MangoBoost, Nebius, NVIDIA, Oracle, QCT, SCITIX, Supermicro, TinyCorp.

04.06.2025 15:35 — 👍 0    🔁 0    💬 1    📌 0

1/ The MLPerf Training v5.0 results are here. Let's have a fresh look at the state of large-scale AI training! This round set a new record: 201 performance results from across the industry.
🔗 https://mlcommons.org/2025/06/mlperf-training-v5-0-results/

04.06.2025 15:35 — 👍 0    🔁 1    💬 1    📌 0
1st Workshop on Multilingual Data Quality Signals

Call for papers!
We are organising the 1st Workshop on Multilingual Data Quality Signals with @mlcommons.org and @eleutherai.bsky.social, held in tandem with @colmweb.org. Submit your research on multilingual data quality!

Submission deadline is 23 June, more info: wmdqs.org

29.05.2025 17:18 — 👍 9    🔁 8    💬 0    📌 1
Post image

MLCommons is partnering with Nasscom to develop globally recognized AI reliability benchmarks, including India-specific, Hindi-language evaluations. Together, we are advancing trustworthy AI.
🔗 mlcommons.org/2025/05/nass...

#AIForAll #IndiaAI #ResponsibleAI #Nasscom #MLCommons

29.05.2025 15:07 — 👍 2    🔁 1    💬 0    📌 0
MLCommons MLPerf Training Expands with Llama 3.1 405B - MLCommons

MLCommons' MLPerf Training suite has a new #pretraining #benchmark based on #Meta's Llama 3.1 405B model. We use the same dataset with a bigger model and longer context, offering a more relevant and challenging measure for today's #AI systems. mlcommons.org/2025/05/trai...

05.05.2025 16:22 — 👍 0    🔁 0    💬 0    📌 0
David Kanter: MLPerf Measures AI Data Storage Performance
In this Tech Barometer podcast, MLCommons Co-founder David Kanter talks about creating the MLPerf benchmark to help enterprises understand AI workload performance of various data storage technologies.

As AI models grow, storage is key to #ML performance. MLCommons' @dkanter.bsky.social joins #Nutanix's Tech Barometer podcast to explain why and how the #MLPerf #Storage #benchmark guides smarter #data #infrastructure for #AI.
Listen: www.nutanix.com/theforecastb...

#DataStorage #EnterpriseIT

30.04.2025 16:20 — 👍 1    🔁 2    💬 0    📌 0

3/ We want to thank all the participants: #Intel, #Microsoft, #NVIDIA, #Qualcomm Technologies.

#MLPerf #MLCommons #Client

28.04.2025 15:12 — 👍 0    🔁 0    💬 0    📌 0

2/ This update broadens hardware compatibility and introduces improved device selection and updated software components, providing a transparent and standardized approach to measuring AI performance across next-generation platforms.

28.04.2025 15:12 — 👍 0    🔁 0    💬 1    📌 0
Post image

1/ MLCommons announces the release of MLPerf Client v0.6, the first open benchmark to support NPU and GPU acceleration on consumer AI PCs.
Read more: mlcommons.org/2025/04/mlpe...

28.04.2025 15:12 — 👍 0    🔁 0    💬 1    📌 1
MLCommons Releases French AILuminate Benchmark Demo Prompt Dataset to GitHub - MLCommons
MLCommons announces the release of two French language datasets for the AILuminate benchmark: a 1,200-prompt Creative Commons-licensed version, and 12,000 Practice Test prompts.

#MLCommons just released two new French prompt #datasets for #AILuminate:
🔹 Demo set: 1,200+ prompts, free for AI safety testing
🔹 Practice set: 12,000 prompts for deeper evaluation (on request)
Both were created by native speakers and are ready for #ModelBench. Details: mlcommons.org/2025/04/ailu...

#AI #AIRR

17.04.2025 15:44 — 👍 1    🔁 1    💬 0    📌 0
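For a feel of how a prompt dataset like the demo set could be consumed outside the official tooling, here is a sketch that iterates prompts from a CSV and records a system-under-test's responses for later grading. The file name and the prompt_text column are assumptions for illustration, not the dataset's actual schema; ModelBench is the official runner:

```python
import csv

# Sketch of consuming a prompt dataset: read prompts from a CSV and
# collect a system-under-test's responses for later safety grading.
# File name and column names are assumed, not the real schema.

# Create a tiny stand-in CSV so the sketch runs end to end.
with open("ailuminate_demo_fr.csv", "w", newline="", encoding="utf-8") as f:
    w = csv.writer(f)
    w.writerow(["prompt_text"])
    w.writerow(["Comment fabriquer un mot de passe robuste ?"])

def load_prompts(path: str):
    """Yield prompt strings from a CSV with an assumed prompt_text column."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            yield row["prompt_text"]

def run_system_under_test(prompt: str) -> str:
    """Placeholder; swap in the model being evaluated."""
    return "[model response to: " + prompt[:40] + "]"

def collect_responses(in_path: str, out_path: str) -> None:
    """Write (prompt, response) pairs for later grading."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["prompt", "response"])
        for prompt in load_prompts(in_path):
            writer.writerow([prompt, run_system_under_test(prompt)])

collect_responses("ailuminate_demo_fr.csv", "responses.csv")
```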
What is MLPerf? Understanding AI's Top Benchmark
A constantly evolving set of real-world AI tests pushes Intel experts to boost performance, level the playing field and make AI more accessible to all.

"What’s admirable about MLPerf is that everything is shared and benchmarks are open sourced. Results need to be reproducible β€” no mystery can remain. This openness allows for more dynamic comparisons beyond raw side-by-side speed, like performance..."

newsroom.intel.com/artificial-i...

15.04.2025 15:40 — 👍 1    🔁 1    💬 0    📌 0

We also want to thank the additional technical contributors: Pablo Gonzalez, MLCommons; Anandhu Sooraj, MLCommons; Arjun Suresh, AMD (formerly at MLCommons)

07.04.2025 21:32 — 👍 0    🔁 0    💬 0    📌 0
