
Kush Varshney कुश वार्ष्णेय

@krvarshney.bsky.social

I wrote a book. Free pdf: http://trustworthymachinelearning.com Paperback: http://amazon.com/dp/B09SL5GPCD Posts are my own and don't necessarily represent IBM.

187 Followers  |  188 Following  |  134 Posts  |  Joined: 20.11.2024

Latest posts by krvarshney.bsky.social on Bluesky

We couldn't have done this without amazing authors. Shai Satran, Will Kidder, Jason D'Cruz, @krvarshney.bsky.social, Sean Laurent, Sooyun Iris Chung, Ariel Goldstein, @gabistanovsky.bsky.social Austin Beattie, @andyhigh.bsky.social @mohammadatari.bsky.social @firatseker.bsky.social Aliah Zewail 5/n

06.01.2026 15:56 — 👍 2    🔁 1    💬 1    📌 0
IBM Granite is ranked world’s most transparent model The Stanford University Foundation Model Transparency Index has ranked IBM Granite number one this year — with the highest score in the history of the index.

The latest Stanford University Foundation Model Transparency Index was released today, and IBM took the top spot!

In a year when other major AI players retreated from transparency, we doubled down and received the highest score in the Index’s history:
research.ibm.com/blog/ibm-gra...

09.12.2025 18:11 — 👍 2    🔁 1    💬 0    📌 0
The Perfect Emptiness of AI We’ve built a technology that speaks like a sage but thinks like a spreadsheet.

"When language no longer requires belief, AI’s fluency becomes a kind of anesthesia. And we are the ones it sedates. I’m reminded of T. S. Eliot’s ghostly image of a “patient etherized upon a table,” alive yet emptied of agency." www.psychologytoday.com/us/blog/the-...

30.10.2025 11:47 — 👍 0    🔁 0    💬 0    📌 0
The bar chart is titled **“Retrieval Augmented Generation (RAG)”** and shows **MTRAG mean accuracy** on the y-axis (0–80 scale).

### Results by model:

* **Granite-4.0-H-Small**: **73** (blue bar, highest)
* **Granite-4.0-Micro**: **72** (blue bar, nearly tied with H-Small)
* **GPT-OSS-20B**: **68** (green bar)
* **Mistral-Small-3.2-Instruct**: **48** (green bar, lowest score)
* **Llama-3.2-Instruct**: **53** (green bar)
* **Llama-3.3-70B-Instruct**: **61** (green bar)
* **Qwen3-8B**: **55** (green bar)

### Key takeaway:

The **Granite-4.0 models (H-Small and Micro)** outperform all others, scoring 73 and 72, with GPT-OSS-20B in third at 68. The weakest performance is from **Mistral-Small-3.2-Instruct (48)**.


Granite-4.0-H-Small: a 32B-A9B MoE Mamba for high efficiency

Damn! IBM is on the map. The American Qwen? I barely even knew IBM made LLMs, this is solid

www.ibm.com/new/announce...

02.10.2025 15:21 — 👍 31    🔁 2    💬 6    📌 1
Why do AI models need to be safe? YouTube video by IBM Research

Recently got to have a super interesting conversation with the infinitely fascinating @krvarshney.bsky.social about why we need to make AI safe, and the very nature of ethics in a disaggregated digital world. Have a watch!
www.youtube.com/watch?v=g2A7...

26.09.2025 15:31 — 👍 3    🔁 1    💬 0    📌 0
Debugging LLMs to improve their credibility New tools from IBM Research can help LLM users check AI-generated content for accuracy and relevance and defend against jailbreak attacks.

Check out IBM's latest open source tools for trustworthy AI on GitHub:

In-Context Explainability 360

FactReasoner

Contextual Privacy

Links from here: research.ibm.com/blog/debuggi...

04.08.2025 15:45 — 👍 0    🔁 0    💬 0    📌 0
Opinion | A.I. Is Shedding Enlightenment Values

"In my own interactions with ChatGPT, it has often responded, with patently insincere flattery: “That’s a great question.” It has never responded: “That’s the wrong question.” It has never challenged my moral convictions or asked me to justify myself."
www.nytimes.com/2025/08/02/o...

03.08.2025 18:41 — 👍 2    🔁 0    💬 0    📌 0
AI thrives where education has been devalued | The Observer A culture that views knowledge as a means to an end invites the misuse of new technology

"Until we recognise that the debate about AI is not just about what machines can do but also about how humans should value education and knowledge, it will remain mired in confusion." observer.co.uk/news/opinion...

03.08.2025 18:04 — 👍 0    🔁 0    💬 0    📌 0
AI is not Africa’s savior: Avoiding technosolutionism in digital development | Brookings Chinasa T. Okolo discusses how Africa can ensure AI progress serves the continent's broader goals of social and economic empowerment.

"The true measure of progress in AI lies not in the sophistication of algorithms but in whether they genuinely serve the people and communities they seek to empower. Without grounding in human dignity and local contexts, AI risks creating technological subjugation."
www.brookings.edu/articles/ai-...

03.08.2025 02:56 — 👍 2    🔁 0    💬 0    📌 0
How IBM’s Kush Varshney became an iconic ’test’ photo The IBM Fellow reflects on copyright law, generative AI, and how he became the face of the modern camera man

What do authorship, copyright, and creativity mean in the age of AI? @krvarshney.bsky.social talks to us about it:
research.ibm.com/blog/kush-va...

21.07.2025 15:45 — 👍 2    🔁 1    💬 0    📌 0
Selective Thinking Is The Skill Every Leader Needs When you observe your mind without being swept away, you take back control from unconscious, emotional thinking—the kind that fuels rash decisions and poor leadership.

"Training yourself to observe and challenge these automatic thoughts—what psychologists call metacognition—is strikingly similar to the Buddhist concept of yoniso manasikāra, or wise attention." www.forbes.com/councils/for...

07.07.2025 02:03 — 👍 0    🔁 0    💬 0    📌 0
AI: Rewriting the future of finance and financial inclusion A new AI-driven framework that is grounded in the distinct needs of the underserved is creating a blueprint for the future of finance around the world.

"The next decade will be shaped by innovators using AI to solve real problems in real communities. The future won’t be written in Silicon Valley, but in Lagos, Jakarta, Cairo and Dubai. AI-powered solutions fused with local knowledge will unlock this future." www.weforum.org/stories/2025...

02.07.2025 03:06 — 👍 2    🔁 1    💬 0    📌 0

Weike Zhao, Chaoyi Wu, Yanjie Fan, Xiaoman Zhang, Pengcheng Qiu, Yuze Sun, Xiao Zhou, Yanfeng Wang, Ya Zhang, Yongguo Yu, Kun Sun, Weidi Xie
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
https://arxiv.org/abs/2506.20430

26.06.2025 05:43 — 👍 0    🔁 1    💬 0    📌 0
LLM-as-a-Judge Without the Headaches: EvalAssist Brings Structure and Simplicity to the Chaos of LLM Output Review | AI Alliance Evaluating AI model outputs at scale is a major challenge for teams using LLMs, especially when assessing nuanced qualities like politeness, fairness, and tone that traditional benchmarks miss. IBM Re...

📣 Today we open-sourced EvalAssist, a web-based tool that makes it super easy to develop criteria for LLM judges. You can run this now locally and then scale up with notebooks using Unitxt. Check out the AI Alliance article to get the scoop:
thealliance.ai/blog/llm-as-...

16.06.2025 15:38 — 👍 5    🔁 3    💬 1    📌 1
EvalAssist EvalAssist simplifies LLM-as-a-Judge by supporting users in iteratively refining evaluation criteria in a web-based user experience.

LLM-as-a-Judge Simplified — Start Small, Refine Fast, Scale Smart ibm.github.io/eval-assist/

16.06.2025 15:18 — 👍 0    🔁 0    💬 0    📌 0

🚨 Announcing our #keynote speakers for the 3rd Trustworthy AI #Workshop @deeplearningindaba.bsky.social ! We are excited to welcome thought leaders pushing the boundaries of #ResponsibleAI

@krvarshney.bsky.social is a Fellow at IBM Research

11.06.2025 09:36 — 👍 3    🔁 2    💬 1    📌 0

Djallel Bouneffouf, Matthew Riemer, Kush Varshney: The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships? https://arxiv.org/abs/2506.01813 https://arxiv.org/pdf/2506.01813 https://arxiv.org/html/2506.01813

04.06.2025 06:11 — 👍 1    🔁 1    💬 1    📌 0

Announcing our keynote speakers for #FAccT2025! 🎉

Suresh Venkatasubramanian (Brown)
Nathalie Smuha (KU Leuven)
Kristian Lum (Google DeepMind)
Molly Crockett (Princeton)

And the plenary panel will be on “Pathways of Change and the Future of Responsible AI”

16.05.2025 10:43 — 👍 25    🔁 8    💬 0    📌 0

Frying gulab jamuns helps you understand the phenomenon of tidal locking between moons and planets.

06.05.2025 14:07 — 👍 0    🔁 0    💬 0    📌 0
I tried getting LLMs to work together using ACP (Agent Communication Protocol) YouTube video by Nicholas Renotte

🔗 Want to connect your agents together wherever they are🌎?

See what's possible with ACP! This video will show:
🎁 How to wrap an agent with the SDK
🔈 Calling out with a standardized client
⛓️Chaining ACP calls to different agents
📲 Prototype of ACPCallingAgent

👉 www.youtube.com/watch?v=Nzaq...

06.05.2025 14:01 — 👍 2    🔁 2    💬 0    📌 0
The Strange Physics That Gave Birth to AI | Quanta Magazine Modern thinking machines owe their existence to insights from the physics of complex materials.

Happy to see @bhoov.bsky.social recognized in this article about spin glasses and associative memory.
www.quantamagazine.org/the-strange-...

03.05.2025 19:51 — 👍 2    🔁 0    💬 0    📌 0
AI Attribution Toolkit An attribution statement identifies not only the presence of AI involvement, but also how AI was used. This approach makes important distinctions between different types and amounts of AI…

🤖 ✏️ There is a better way to explain how you used AI in your {research paper, college essay, blog posts, …}. Check out our new AI Attribution Toolkit and look for us at #CHI2025!

aiattribution.github.io
dl.acm.org/doi/full/10....

29.04.2025 00:01 — 👍 5    🔁 2    💬 0    📌 1
How AI governance wouldn’t exist without our maritime past IBM’s Kush Varshney explains the origins of the phrase ’AI governance’ and how IBM is adapting its Trust 360 toolkits for the age of generative AI.

I appreciated the framing in terms of governors (research.ibm.com/blog/AI-gove...) and the discussion of many strategies for pursuing safety (doi.org/10.1089/big....). Now that we're moving to agentic AI, I think systems theories will be even more important for control (arxiv.org/abs/2503.00237).

25.04.2025 12:50 — 👍 2    🔁 0    💬 1    📌 0
Training LLMs to self-detoxify their language A new method called self-disciplined autoregressive sampling (SASA) allows large language models to detoxify their own outputs, without sacrificing fluency.

“If we think about how human beings in the world, we do see bad things, so it’s not about allowing the language model to see only the good things. It’s about understanding the full spectrum — both good and bad,” says Ko, “and choosing to uphold our values when we speak.”
news.mit.edu/2025/trainin...

15.04.2025 22:34 — 👍 0    🔁 0    💬 0    📌 0
Toward a Systems Theory for Human-Centered Trustworthy Agentic AI Come join us as we explore creating AI systems that prioritize human trust and agency in a fun and interactive event!

See you at the University of Sydney in less than two hours. www.eventbrite.com.au/e/toward-a-s...

13.04.2025 23:13 — 👍 1    🔁 0    💬 0    📌 0
Granite Guardian tops third-party AI benchmark IBM’s collection of LLM guardrail models take six of the top 10 spots on the new GuardBench leaderboard.

Granite Guardian tops a new benchmark! research.ibm.com/blog/granite...

09.04.2025 19:51 — 👍 3    🔁 2    💬 0    📌 0
Decolonial AI Alignment by Kush Varshney (IBM Research, US)

LLMs need not engage in a coloniality of knowledge by treating one culture's ethics or moral philosophy as universally correct. Instead, open LLMs should be aligned to value systems from different epistemologies and not assume universal values. 🌍🤖 #ai #hcai #alignment

08.04.2025 15:19 — 👍 6    🔁 2    💬 0    📌 1
