Yacine Jernite

@yjernite.bsky.social

Head of ML & Society at Hugging Face 🤗

448 Followers  |  216 Following  |  45 Posts  |  Joined: 22.11.2024  |  2.1694

Latest posts by yjernite.bsky.social on Bluesky

Preview
How Your Utility Bills Are Subsidizing Power-Hungry AI | TechPolicy.Press The next few years will be pivotal for determining the future of AI and its impact on energy grids worldwide, write Sasha Luccioni and Yacine Jernite.

Think AI's rising energy demands aren't your problem? Think again! ⚡

In our commentary, @yjernite.bsky.social and I explain how AI & data center expansion is making energy bills rise in the US, what the mechanisms driving this are, and why it matters.

www.techpolicy.press/how-your-uti...

06.08.2025 13:36 - 👍 27    🔁 13    💬 2    📌 2
Preview
How Your Utility Bills Are Subsidizing Power-Hungry AI | TechPolicy.Press The next few years will be pivotal for determining the future of AI and its impact on energy grids worldwide, write Sasha Luccioni and Yacine Jernite.

As tech firms keep adding the largest and most compute-intensive AI models into more and more aspects of our digital lives, they are increasingly dependent on a growing share of existing energy and natural resources, leading to rising costs for everyone else, write Sasha Luccioni and Yacine Jernite.

06.08.2025 13:33 - 👍 35    🔁 17    💬 1    📌 3
Preview
AI Companionship: Why We Need to Evaluate How AI Systems Handle Emotional Bonds A Blog post by Giada Pistilli on Hugging Face

From Replika to everyday chatbots, people form emotional bonds with AI. But what happens when an AI tells you "I understand how you feel" and you actually believe it?

With @frimelle.bsky.social and @yjernite.bsky.social, we dug into something: how AI systems handle our emotional lives.

29.07.2025 14:03 - 👍 4    🔁 3    💬 1    📌 0

Friends!
Does anyone know of any model distillation runs with public logs (W&B or other)?
I'm trying to figure out the energy tradeoffs between model training and distillation.

17.07.2025 10:20 - 👍 7    🔁 5    💬 0    📌 0

New blog post alert! 🚨 "What is the Hugging Face Community Building?", with @yjernite.bsky.social and Irene Solaiman

The AI narrative focuses on big players, but the real story is happening in the open source AI ecosystem across 1.8M models, 450K datasets, and 560K apps, on @hf.co.

15.07.2025 14:31 - 👍 13    🔁 3    💬 1    📌 0
Preview
How Much Energy Does AI Use? The People Who Know Aren’t Saying A growing body of research attempts to put a number on energy use and AI, even as the companies behind the most popular models keep their carbon emissions a secret.

One of the biggest frustrations I have is the lack of transparency around AI's energy use and environmental impacts. I know the numbers are out there... but somehow we're not seeing them 🫠

Thank you @wired.com for covering this topic in such depth and detail!

www.wired.com/story/ai-car...

19.06.2025 15:37 - 👍 58    🔁 28    💬 1    📌 1
Screenshot of the header of the article with text:

AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums

“AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums” – interesting piece via @404media.co

Not a perfect fix, but making ML-ready datasets from collections can help.

If you want help getting your data on @hf.co, I'd be happy to help.

17.06.2025 10:43 - 👍 14    🔁 4    💬 0    📌 1
Preview
institutional/institutional-books-1.0 · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Institutional Books: Massive Historical Text Corpus

- 983K books, 242B tokens, 386M pages
- 19th-20th century texts in 254 languages
- Refined OCR with quality scores & metadata
- Noncommercial early-access release

huggingface.co/datasets/ins...

16.06.2025 09:22 - 👍 36    🔁 15    💬 0    📌 0
Preview
Open Source AI: A Cornerstone of Digital Sovereignty A Blog post by Lucie-Aimée Kaffee on Hugging Face

Great blog post on *Digital Sovereignty and OS AI* led by the fantastic @frimelle.bsky.social!

Digital sovereignty for AI needs to properly account for:
📚 data
🧑‍🔬 technology
💽 infrastructure
⚖️ regulation

Open/transparent AI contributes to all, read for some concrete examples!
hf.co/blog/frimell...

11.06.2025 16:45 - 👍 2    🔁 1    💬 0    📌 0

❗️New policy blogpost!
The EU is speaking a lot about sovereignty. A cornerstone of digital sovereignty is and has to be open source.
As AI becomes more central, the ability to govern, adapt, and understand these systems is no longer optional.

11.06.2025 15:13 - 👍 5    🔁 1    💬 1    📌 0
A tweet from Bernie Sanders. It reads:
The CEO of Anthropic (a powerful AI company) predicts that AI could wipe out HALF of entry-level white collar jobs in the next 5 years. 

We must demand that increased worker productivity from AI benefits working people, not just wealthy stockholders on Wall St. AI IS A BIG DEAL.

Off-mark from @sanders.senate.gov.

We need progressive legislators to not buy into the clown show from AI CEOs. Labor replacement is not real, labor displacement is.

We need regulation to protect workers and anticipate the kind of worker speedups that employers buying into the hype will cause.

06.06.2025 15:14 - 👍 141    🔁 27    💬 4    📌 3
Preview
Bigger isn't always better: how to choose the most efficient model for context-specific tasks 🌱🧑🏼‍💻 A Blog post by Sasha Luccioni on Hugging Face

How can we make informed choices based on performance AND energy when using AI in real-life tasks like question answering? By evaluating them and picking the models that optimize both factors!
Check out my new blog post on the subject:
huggingface.co/blog/sasha/e...

28.05.2025 13:25 - 👍 9    🔁 3    💬 1    📌 0
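
One common way to read "picking the models that optimize both factors" is as a Pareto-front selection: keep every model that no other model beats on both accuracy and energy at once. The sketch below assumes that reading; it is not necessarily the method used in the blog post. The `Candidate` type, the model names, and all numbers are made-up placeholders, not measurements.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical evaluation record for one model on one task."""
    name: str
    accuracy: float   # higher is better
    energy_wh: float  # lower is better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated when another one is at least as accurate,
    uses at most as much energy, and is strictly better on one of the two.
    """
    front = []
    for c in cands:
        dominated = any(
            o.accuracy >= c.accuracy
            and o.energy_wh <= c.energy_wh
            and (o.accuracy > c.accuracy or o.energy_wh < c.energy_wh)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return front


# Placeholder values for illustration only; these are not real benchmark results.
candidates = [
    Candidate("model-a", accuracy=0.70, energy_wh=1.0),
    Candidate("model-b", accuracy=0.80, energy_wh=5.0),
    Candidate("model-c", accuracy=0.75, energy_wh=6.0),
]

if __name__ == "__main__":
    for c in pareto_front(candidates):
        print(c.name, c.accuracy, c.energy_wh)
```

The front only narrows the field; choosing among the remaining models still depends on how much accuracy a given task actually needs.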
Preview
AI Personas: The Impact of Design Choices A Blog post by Giada Pistilli on Hugging Face

I'm consistently impressed by @giadapistilli.com's extensive insights into AI technology 🤗

Her latest blog on design factors of AI "companions" shows that those go way beyond model performance, and gives some nice hands-on tools to do your own analysis - must read!

huggingface.co/blog/giadap/...

07.05.2025 15:59 - 👍 2    🔁 0    💬 0    📌 0
Preview
AI Personas: The Impact of Design Choices A Blog post by Giada Pistilli on Hugging Face

Ever notice how some AI assistants feel like tools while others feel like companions? Turns out, it's not always about fancy tech upgrades, because sometimes it's just clever design.

huggingface.co/blog/giadap/...

07.05.2025 12:27 - 👍 10    🔁 5    💬 1    📌 0
Post image

We just integrated the new Qwen3-8B into Chat UI Energy and asked it to do a simple multiplication problem.
What's the energy cost?
→ Without reasoning: wrong (😅), but low energy use
→ With reasoning: correct (!!)… but using 42x more energy!

29.04.2025 19:53 - 👍 16    🔁 3    💬 1    📌 1
Preview
Consent by Design: Approaches to User Data in Open AI Ecosystems A Blog post by Giada Pistilli on Hugging Face

🤗 New from us! Just published a blog post exploring how we're rethinking consent in the AI ecosystem.

Here's what we're seeing in the @hf.co Hub that differs from traditional closed systems...

17.04.2025 13:04 - 👍 7    🔁 2    💬 1    📌 0

128k context windows + code specific were the deciding factors, and 32B caught a lot more than 7B!

17.04.2025 12:50 - 👍 1    🔁 0    💬 0    📌 0
Space Privacy - a Hugging Face Space by yjernite Analysing privacy concerns in deployed Spaces

The app comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: hf.co/spaces/yjern...
- reach out to collab: hf.co/spaces/yjern...

4/4 🧵

16.04.2025 20:34 - 👍 0    🔁 0    💬 1    📌 0
A sample from the detailed report with references to specific code snippets from the app

A sample from the summary describing the Space functionality, AI services, and data journeys

The app works in four stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs

3/4 🧵

16.04.2025 20:34 - 👍 0    🔁 0    💬 1    📌 0
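
The stages listed in the post above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the Space's actual code: `review_space`, `PrivacyReview`, the prompt wording, and the `fake_lm` stand-in are all hypothetical, the download stage is assumed to have already produced a path-to-source mapping, and the code LM (Qwen2.5-Coder in the thread) is replaced by any injected callable that maps a prompt to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interface: any callable mapping a prompt to generated text.
# In the thread this role is played by a code LM; here it is injected.
Generate = Callable[[str], str]


@dataclass
class PrivacyReview:
    detailed_report: str
    summary: str
    tldr: str


def review_space(files: Dict[str, str], generate: Generate) -> PrivacyReview:
    """Run the report, summary, and TLDR stages over downloaded code files.

    `files` maps file path to source text, i.e. the output of the download stage.
    """
    # Concatenate the files with path markers so the report can cite them.
    code = "\n\n".join(
        f"# FILE: {path}\n{text}" for path, text in sorted(files.items())
    )
    # Detailed report pointing to code where data is transferred or processed.
    detailed = generate(
        "List every place where data is transferred or AI-processed, "
        "citing the relevant code:\n" + code
    )
    # Summary of the app's main functionality and data journeys.
    summary = generate(
        "Summarize the app's main functionality and data journeys:\n" + code
    )
    # Privacy TLDR built from the two previous outputs.
    tldr = generate(
        "Write a short privacy TLDR from these inputs:\n"
        + detailed + "\n" + summary
    )
    return PrivacyReview(detailed, summary, tldr)


def fake_lm(prompt: str) -> str:
    """Stand-in generator so the sketch runs without a model."""
    return f"[generated from a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    review = review_space({"app.py": "print('hello')"}, fake_lm)
    print(review.tldr)
```

Passing the generator in as a parameter keeps the staging logic testable without a model and makes it easy to swap in a different code LM.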
Space Privacy - a Hugging Face Space by yjernite Analysing privacy concerns in deployed Spaces

That requires actually reading the code though, which isn't always easy or quick!
Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen2.5-Coder to generate reports and it works pretty OK, have a look 👇
hf.co/spaces/yjern...

2/4 🧵

16.04.2025 20:34 - 👍 0    🔁 0    💬 1    📌 0
Interface of Space Privacy Analyzer app, describing how it reviews Hugging Face Spaces for data privacy concerns, with a pre-loaded example for the Hugging Face demo app for SmolVLM2

A TLDR report generated by the Spaces Privacy app outlining the different types of data used and where they go when using the app

Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on @hf.co 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

1/4 🧵

16.04.2025 20:34 - 👍 8    🔁 4    💬 1    📌 0
Post image Post image

Thrilled to share that our paper:
"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services - has been accepted at @facct.bsky.social 2025! - with @shiramichel.bsky.social , Sufi Kaur, Sarah Gilespie, Jeffrey Gleason and Dr. Christo Wilson.

15.04.2025 13:44 - 👍 15    🔁 7    💬 2    📌 0
Preview
Empowering Public Organizations: Preparing Your Data for the AI Era A Blog post by Avijit Ghosh on Hugging Face

New blog post led by @evijit.io on wrangling public data for AI - and helping public orgs have more control over how AI systems serve their mission by shaping how their data's used 📚
Have a read especially if your org's being asked to do more AI (common theme these days 🤗)

hf.co/blog/evijit/...

10.04.2025 15:35 - 👍 2    🔁 0    💬 0    📌 0
Post image Post image

🚨 New Article: Empowering Public Organizations: Preparing Your Data for the AI Era, with @yjernite.bsky.social

Let’s discuss how public organizations can unlock the full potential of their data in the age of AI.

10.04.2025 15:05 - 👍 3    🔁 1    💬 1    📌 0

This is *extremely* cool

I'm increasingly excited about using the OLMo based apps for daily use - I find the playground genuinely better than the commercial apps whenever I need some originality, and the transparency/privacy guarantees are just so much stronger

09.04.2025 16:43 - 👍 4    🔁 1    💬 0    📌 0

We wrote about this in thecon.ai: with enterprise deals, employees will be initially highly encouraged to use synthetic text generation machines. But at some point, it'll become required to justify the cost.

07.04.2025 18:04 - 👍 27    🔁 9    💬 2    📌 0

AB 566 has officially passed out of the committee in the CA Assembly! 🎉 This bill ensures that web browsers & mobile OS vendors provide a simple, automated way for users to send opt-out signals, closing a loophole in CA privacy law. A huge step toward meaningful data rights!

03.04.2025 18:44 - 👍 10    🔁 3    💬 1    📌 0
Preview
Reasoning models don't always say what they think Research from Anthropic on the faithfulness of AI models' Chain-of-Thought

One of the reasons it's so hard to easily debunk the most outlandish claims on "loss of control" risks is the lack of self-consistency - e.g. same labs showing model outputs are NOT "human thinking" and still doubling down on calling the phenomenon "deception" 🙃

www.anthropic.com/research/rea...

03.04.2025 18:49 - 👍 1    🔁 0    💬 0    📌 0
Preview
Meaningful Opt-Out Rights Require Companies to Do Their Part. State Governments Might Have to Make Them. Update: CDT submitted a letter on March 7, 2025 to the California State Assembly’s Committee on Consumer Protection in support of AB 566, which requires vendors of web browsers and mobile operating systems to include a setting that enables users to send an automated signal to businesses with which they interact through their browser or […]

📢 Today, the California Assembly will hold a hearing on AB 566, a bill requiring web browser & mobile operating system vendors to incl. a setting that enables users to send automated signals indicating that they wish to opt out of sales of their personal data. cdt.org/insights/mea...

01.04.2025 17:30 - 👍 18    🔁 7    💬 1    📌 2
Boxes showing the headshots of the new affiliates in black and white.

We’re thrilled to announce nine new Data & Society affiliates! Welcome, Sareeta Amrute, Omer Bilgin, Minsu Longiaru, John Edgar Lopez, Sanjay Pinto, Lana Swartz, Zoë West, @davidthewid.bsky.social, and Sara Ziff! datasociety.net/announcement...

27.03.2025 15:38 - 👍 13    🔁 3    💬 0    📌 0

@yjernite is following 20 prominent accounts