I'm lucky to be a part of this wonderful collaboration to improve the transparency and use of AI benchmarks. research.ibm.com/blog/documen...
18.12.2025 01:33
@michaelhind.bsky.social
IBM Distinguished RSM, working on AI transparency, governance, explainability, and fairness. Proud husband & dad, Soccer lover. Posts are my own.
An interesting backstory of a common test photo sparked another photo (of @krvarshney.bsky.social) in another dataset. research.ibm.com/blog/kush-va...
21.07.2025 18:17
I'm excited to be a part of this great collaboration with colleagues at IBM Research and Notre Dame. lucyinstitute.nd.edu/news-events/...
17.07.2025 13:27
Are you wondering how you can evaluate some of the risks of a foundation model before you deploy it? Read on... www.ibm.com/new/announce...
15.04.2025 16:14
I'm on the IBM Mixture of Experts podcast wearing a safety vest. We talk about all the new things in AI this week. I also connect to older work by IBM Fellows Irene Greif, Bob Dennard, Rolf Landauer, and Charlie Bennett and to Mauro Martino's new AI-generated film. www.youtube.com/watch?v=CgqH...
28.03.2025 13:10
Happy to see Granite Guardian models atop the GuardBench leaderboard, including in non-English languages.
This benchmark was just released. Read about it here: www.linkedin.com/posts/eliasb....
A summary of decolonial AI alignment in the Human-Centered AI publication on Medium. Thanks to @jweisz3.bsky.social for asking me to write it, and for editing the piece. medium.com/human-center...
08.04.2025 15:12
I'm happy to see my former IBM colleague raise this important issue regarding Agentic systems. www.linkedin.com/posts/thomas...
09.04.2025 15:43
From Erik Miehling (www.linkedin.com/posts/erik-m...)
"AI development is currently overly focused on individual model capabilities, often ignoring broader emergent behavior, leading to a significant underestimation of the true capabilities and associated risks of agentic AI."
Four exciting things to share about watsonx.governance and Granite Guardian. Fun times in AI safety! See thread for the details.
28.02.2025 21:29
"... We'd love your feedback! Try the code, explore the Hugging Face space, and join us in building a stronger governance framework for AI."
www.linkedin.com/posts/elizab...
From Elizabeth Daly: "This week we are releasing, Risk Atlas Nexus, github.com/IBM/risk-atl..., an open source project that provides tooling to help bring together disparate resources related to governance of foundation models. ... "
28.02.2025 20:32
"While techniques such as the ones used by R1 can degrade model safety, our preview release shows that reasoning and safety don't have to be a trade-off."
www.ibm.com/new/announce...
It was a pleasure to join the panel discussion on Humanitarian AI Today podcast below, moderated by Brent Phillips: podcasts.apple.com/us/podcast/t...
23.12.2024 17:03
"IBM has equipped the Granite Guardian 3.1 models with the ability to detect hallucinations in AI agent workflows. This feature provides oversight of an AI agent completing a task, monitoring for fabricated information or incorrect function calls." technologymagazine.com/articles/the...
20.12.2024 21:27
Reminder: The #FAccT2025 submission deadlines are roughly one month away! Abstracts are due January 15th and full papers on January 22nd. See the full CfP here: facctconference.org/2025/cfp
17.12.2024 20:24
I showed this cool demo last week at @neuripsconf.bsky.social. Now we have a public version on Hugging Face that you can play with to see the "judge" model in action. huggingface.co/spaces/ibm-g...
Enjoy!
Open source repo & benchmarks: github.com/ibm-granite/...
Now posted at the under-construction booth: our demo lineup for Tuesday. Looking forward to connecting with you at the IBM booth @neuripsconf.bsky.social
09.12.2024 23:23
IBM Researchers setting up a booth at a convention center. Some are wearing safety gear and some are not.
It is @neuripsconf.bsky.social booth setup day! Among Ambrish Rawat, @bhoov.bsky.social, and @wernergeyer.bsky.social, who do you think is *not* an author of the Granite Guardian technical report we released today? (Hint: Granite Guardian helps make any LLM safer.)
Link: github.com/ibm-granite/...
If you're headed to NeurIPS 2024, and want to learn about IBM Research Human-Centered Trustworthy AI, there are many many opportunities to do so.
1. Start with the official NeurIPS explorer by @henstr.bsky.social and @benhoover.bsky.social. It is infoviz par excellence. neurips2024.vizhub.ai
What are the desirable properties of AI metrics for such tests? What about summarizing these metrics for non-technical stakeholders?
07.12.2024 02:34
... or when a physician tries to diagnose the health of a new patient by performing various diagnostic medical tests (blood tests, x-rays, etc).
What happens when one applies these ideas to AI models? How can it be helpful? How can it be misleading? What role could this play in regulations?
The work explores the challenges of testing for AI risks without having any information about how the model was developed, such as when one purchases a model from a 3rd party or obtains it from open source. This is similar to how a home inspector is asked to inspect a home without knowing its construction history.
07.12.2024 02:29
I'm happy to announce a significant revision of our paper describing opportunities and challenges of quantitative AI risk assessments, also known as automated red-teaming: arxiv.org/abs/2209.06317
07.12.2024 02:25
Overview of paper browser. A cluster for reinforcement learning is selected.
Paper Browser: only papers assigned to "physical models - physics" are shown.
Paper Browser: Filtered by author "Hoover" and detail is shown
Paper Browser: ZOOOOM in
Here comes the official 2024 NeurIPS paper browser:
- browse all NeurIPS papers in a visual way
- select clusters of interest and get cluster summary
- ZOOOOM in
- filter by human assigned keywords
- filter by substring (authors, titles)
neurips2024.vizhub.ai
#neurips by IBM Research Cambridge
I enjoyed my recent interview on the AI Risk Reward podcast with host Alec Crawford.
You can hear it here: podcasts.apple.com/us/podcast/t...