Bruce (Zhi) Wen @zhi-bruce-wen

A screenshot of Catherine Yeh's website: https://catherinesyeh.github.io/ The website looks clean, colorful, and modern.

A screenshot of four of the websites included in my Are.na board. They mostly show academic websites in different styles. Some are more text-heavy, some feature more colors and images.

If you're a student in need of a personal website (and if you're doing research, yes, you need a website!), I keep a list of nice examples here, most of which are reusable: www.are.na/maria-antoni...

For example, I just spotted this beautiful website by Catherine Yeh: github.com/catherinesye...

03.11.2025 20:10 — 👍 77 🔁 18 💬 8 📌 0

Language Models are Injective and Hence Invertible
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the in...

I did not expect that: Large Language Models are invertible:

arxiv.org/abs/2510.15511

01.11.2025 03:27 — 👍 77 🔁 16 💬 4 📌 13

Burning out
The international AI industry's collective risk.

A new essay on the crazy, all-or-nothing approach to work happening in AI today, the looming human costs, and the lack of a finish line.

I wouldn't say it's okay, but I'm not sure how to fix it.
www.interconnects.ai/p/burning-out

25.10.2025 14:35 — 👍 61 🔁 14 💬 4 📌 5

LawZero: About LawZero
LawZero is a non-profit organization committed to advancing research and creating technical solutions that enable safe-by-design AI systems. Its ...

LawZero is growing fast, and we're always looking for dedicated people to join our team.
If you're interested in working on technical safeguards to create safe-by-design AI systems, check out the openings on our website and don't hesitate to reach out to our team!
job-boards.greenhouse.io/lawzero

24.10.2025 14:58 — 👍 7 🔁 3 💬 0 📌 0

Search Jobs | Microsoft Careers

Are you a PhD student interested in ML and biology or health? Come do an internship with me, @avapamini.bsky.social, Alex Lu, @lcrawford.bsky.social, or Kristen Severson at MSRNE!

Applications are due Dec 1: make sure you include a research statement!

jobs.careers.microsoft.com/global/en/jo...

21.10.2025 19:32 — 👍 18 🔁 9 💬 0 📌 2

entire article

20.10.2025 23:00 — 👍 676 🔁 61 💬 15 📌 13

Every Language Model Has a Forgery-Resistant Signature
The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying...

We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍

📄 arxiv.org/abs/2510.14086 1/

17.10.2025 17:59 — 👍 88 🔁 24 💬 4 📌 6

I am recruiting PhD students to start in 2026! If you are interested in robustness, training dynamics, interpretability for scientific understanding, or the science of LLM analysis you should apply. BU is building a huge LLM analysis/interp group and you’ll be joining at the ground floor.

16.10.2025 15:45 — 👍 58 🔁 19 💬 1 📌 1

Reasoning with Sampling: Your Base Model is Smarter Than You Think
Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, desp...

Clever sampling from base model > GRPO post-training.

One of the coolest papers I've read recently (in addition to QAlign/QUEST which has similar approaches).

arxiv.org/abs/2510.14901

17.10.2025 15:59 — 👍 1 🔁 0 💬 0 📌 0

This is so cool. When you look at representational geometry, it seems intuitive that models are combining convex regions of "concepts", but I wouldn't have expected that this is PROVABLY true for attention or that there was such a rich theory for this kind of geometry.

16.10.2025 18:33 — 👍 33 🔁 5 💬 2 📌 1

Keynote at #COLM2025: Nicholas Carlini from Anthropic

"Are language models worth it?"

Explains that the prior decade of his work on adversarial images, while it taught us a lot, isn't very applied; it's unlikely anyone is actually altering images of cats in scary ways.

09.10.2025 13:12 — 👍 80 🔁 22 💬 2 📌 2

When I said "data poisoning" instead of "food poisoning" in a completely non-ML context, I knew I probably needed a break.

06.10.2025 23:19 — 👍 0 🔁 0 💬 0 📌 0

Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!

06.10.2025 20:26 — 👍 26 🔁 17 💬 2 📌 1

Applied Research Scientist - Mila - Quebec Artificial Intelligence Institute
Mila, recognized for its exceptional academic training and cutting-edge research in artificial intelligence, has a mission to contribute to the economic development of Quebec and Canada...

You get to do cutting-edge ML research with clients in aviation, healthcare, geophysics, entertainment etc., and see your work making a real-world, tangible impact.

Apply below if that sounds interesting to you, or hit me up at #COLM2025 to know more!

apply.workable.com/mila-2/j/1C3...

06.10.2025 12:41 — 👍 0 🔁 0 💬 0 📌 0

On my way (back) to Montreal for #COLM2025 🥯.

Looking forward to seeing what people are thinking about controllable/safe generation, eval, diffusion LMs, interpretability, etc.

And we’re still hiring applied research scientists at Mila! ⬇️

06.10.2025 12:41 — 👍 1 🔁 0 💬 1 📌 0

[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.

07.04.2025 13:54 — 👍 77 🔁 20 💬 1 📌 6

This is a great distinction to make and the characterization is very accurate.

19.09.2025 19:39 — 👍 1 🔁 0 💬 0 📌 0

If you're an undergrad and want to intern with me, this is where you need to apply!

13.09.2025 10:42 — 👍 6 🔁 5 💬 0 📌 1

amen

those with savior and superiority complex, obsessed with sci fi, blinded by dollar signs, devoid of empathy, and severely gullible.

28.08.2025 15:50 — 👍 9 🔁 1 💬 0 📌 0

I believe if you have a kitten you’re legally required to take 100+ photos and send them to all your friends every day. That’s what I had to do anyway.

27.08.2025 03:25 — 👍 1 🔁 0 💬 0 📌 0

We were delighted to welcome the Honorable @mark-carney.bsky.social and Evan Solomon to Mila today for a rich discussion on AI's potential to drive innovation, social progress, and economic resilience in the country, alongside key players from our ecosystem.

20.08.2025 22:07 — 👍 13 🔁 2 💬 0 📌 1

**Please repost** If you're enjoying Paper Skygest -- our personalized feed of academic content on Bluesky -- we'd appreciate you reposting this! We’ve found that the most effective way for us to reach new users and communities is through users sharing it with their network

19.08.2025 17:15 — 👍 39 🔁 41 💬 1 📌 5

Why I took a 4.5 hr $90 flixbus to Ottawa instead of a 4.5 hr $500 train when Air Canada messed up ⬇️

18.08.2025 02:01 — 👍 1 🔁 0 💬 0 📌 0

With fresh support of $75M from NSF and $77M from NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡

14.08.2025 12:16 — 👍 45 🔁 7 💬 1 📌 6

I’m surprised! Maybe the Reddit communities aren’t good comparisons (they have their own norms)? Or the random seed words are more random than prior work? Or it just really helps to prompt with the start of a story?

12.08.2025 16:19 — 👍 13 🔁 4 💬 4 📌 0

As co-organizer, Mila is excited to invite you to the AIMS Hackathon 2025! This global online event will unite developers, researchers, and human rights advocates to create innovative AI solutions to combat modern slavery.
📅 Registration open until August 20 fundacionpasoslibres.org/aimshackathon/

11.08.2025 17:59 — 👍 7 🔁 2 💬 0 📌 0

My least popular (and most correct) view is that cars should be automatically limited to the local speed limit. Put the pedal to the floor and you still can't go over 25mph in a residential area.

(15 in Manhattan btw)

09.08.2025 16:47 — 👍 7849 🔁 931 💬 484 📌 476

I pray to god this project will pan out without too much delay or being watered down, and I’m not even religious. Meanwhile I’ll keep contemplating moving to Europe.

05.08.2025 11:59 — 👍 2 🔁 0 💬 0 📌 0

I feel the same way traveling between Toronto and Montreal too. Worst part is the existing train is slower AND more expensive than flying (and has the same luggage restriction).

05.08.2025 10:48 — 👍 2 🔁 0 💬 1 📌 1

EPFL NLP Postdoctoral Scholar Posting - Swiss AI LLMs
The EPFL Natural Language Processing (NLP) lab is looking to hire a postdoctoral researcher candidate in the area of multilingual LLM design, training, and evaluation. This postdoctoral position is as...

The EPFL NLP lab is looking to hire a postdoctoral researcher on the topic of designing, training, and evaluating multilingual LLMs:

docs.google.com/document/d/1...

Come join our dynamic group in beautiful Lausanne!

04.08.2025 15:54 — 👍 21 🔁 12 💬 0 📌 1

Latest posts by zhi-bruce-wen.bsky.social on Bluesky

@zhi-bruce-wen is following 20 prominent accounts