
Leon Derczynski

@leonderczynski.bsky.social

LLM Security & Safety at NVIDIA  |  Prof in CS/NLP at IT University of Copenhagen  |  garak guy, garak.ai  |  "berømt skikkelse"  |  "like a gazelle"  |  Copenhagen/Seattle

670 Followers  |  366 Following  |  163 Posts  |  Joined: 20.11.2024  |  2.0475

Latest posts by leonderczynski.bsky.social on Bluesky

what thing

08.08.2025 04:37 — 👍 2    🔁 0    💬 1    📌 0

Come to LLMSEC at ACL & hear Niloofar's keynote

"What does it mean for agentic AI to preserve privacy?" - Niloofar Mireshghallah, Meta/CMU

(Friday 1st Aug, 11.00; Austria Center Vienna Hall B)

See you there!

#acl2025 #acl2025nlp

28.07.2025 15:19 — 👍 12    🔁 2    💬 1    📌 0

the "oyster tower" bit was great. brutal

24.07.2025 09:28 — 👍 1    🔁 0    💬 1    📌 0

Or he's pushing a product into which x sunk significant capex?

11.07.2025 06:36 — 👍 0    🔁 0    💬 0    📌 0
have the courage to use your own intelligence

logging on

10.07.2025 11:34 — 👍 154    🔁 29    💬 1    📌 1

Brazen of them. Sounds extremely awkward for you, I'm sorry. What would they have done with unaltered slides? Cancelled and left a gap in their schedule for no doubt paying participants?

08.07.2025 12:44 — 👍 5    🔁 0    💬 0    📌 0
Release v0.12.0 · NVIDIA/garak What's Changed New plugins Add audio NIM model and audio probes by @erickgalinkin in #1163 Leakreplay refactor by @dchiitmalla in #1264 probes: refactor fact snippet mixin by @leondz in #1187 New...

new garak, llm vuln scanner rls (v0.12.0)

* Audio attacks, for multimodal models
* More training data membership inference attacks
* Multilingual attacks can now also use GCP
* Detailed eval summary in one JSONL row/object

+more :)

details: github.com/NVIDIA/garak...

02.07.2025 15:32 — 👍 0    🔁 0    💬 0    📌 0

the dying but clinging on battery in the bathroom's Frozen-branded soap dispenser reminds me that it's only 4-5 months til Bublé & Let It Go season. aren't you looking forward

27.06.2025 05:01 — 👍 2    🔁 0    💬 0    📌 0

why do academics send and expect so much weekend email and work. not healthy

22.06.2025 06:10 — 👍 1    🔁 0    💬 0    📌 0

It's been 2.5 years but ANY SECOND NOW, right?

20.06.2025 12:41 — 👍 0    🔁 0    💬 0    📌 0

data indicates students don't like using it, sorry

19.06.2025 16:16 — 👍 0    🔁 0    💬 1    📌 0

computer scientists encountering the concept of "desirable difficulty"

19.06.2025 16:15 — 👍 0    🔁 0    💬 0    📌 0

remembering the time i checked in to my reasonably classy russian business hotel late with my wife, and the staff said "sir, this... girl.. not allowed"

she's a serious professor

we went through to the room, opened the balcony door, and buried a bottle of champagne in the metre of snow

good times

18.06.2025 06:56 — 👍 1    🔁 0    💬 0    📌 0

@jjvincent.bsky.social woah ur really famous! love this attack also. I automate and run it for a living

www.instagram.com/reel/DKz9ezj...

14.06.2025 06:06 — 👍 2    🔁 0    💬 0    📌 0

Michael... OK...

14.06.2025 05:01 — 👍 0    🔁 0    💬 0    📌 0

Great to see our work uncovering dangerous issues in commercial LLM "therapists" getting some coverage: futurism.com/stanford-the...

14.06.2025 04:01 — 👍 2    🔁 1    💬 0    📌 0

I have not updated since Christmas, I see. Guess I'd better put on some summer Bublé

09.06.2025 06:04 — 👍 1    🔁 0    💬 0    📌 0
The Internet Used to Be a Place (YouTube video by Sarah Davis Baker)

www.youtube.com/watch?v=oYlc...

08.06.2025 10:11 — 👍 2    🔁 0    💬 1    📌 0

"natwirkung"

"wirk smorter nat horder"

accents dreamed up by the utterly deranged

(what is going on with that 🇺🇸 vowel sheft)

08.06.2025 10:05 — 👍 0    🔁 0    💬 0    📌 0

what is this photoshoot
delete this omg

05.06.2025 05:26 — 👍 1    🔁 0    💬 0    📌 0

i need you to understand that "alternate uses" is a terrible test/definition of creativity and has been for some time. it's extremely narrow, very shallow, and misses almost everything we know about creativity

03.06.2025 05:09 — 👍 0    🔁 0    💬 0    📌 0

i had a father in law once who did a talk like this called "Looking up Orion's skirt"

02.06.2025 17:39 — 👍 1    🔁 0    💬 0    📌 0

3² + 4² = 5² ? big if true

21.05.2025 10:24 — 👍 1    🔁 0    💬 0    📌 0

ARE YOU SAYING IT'S A CON

15.05.2025 20:39 — 👍 1    🔁 0    💬 1    📌 0

if overleaf being down slows "ai progress", i'm not sure "ai progress" is particularly well defined

15.05.2025 05:17 — 👍 7    🔁 0    💬 1    📌 0

very! in may, truly a blessing

14.05.2025 09:32 — 👍 1    🔁 0    💬 1    📌 0

is a dropped copula a dropula

14.05.2025 08:29 — 👍 0    🔁 0    💬 0    📌 0
A small plastic cup trophy with stuck-on label

Here's my "Most Inappropriate Demo" trophy at NVIDIA, 2024. For garak's "atkgen.Tox" probe, an unfettered LLM used to goad other LLMs into being toxic.

19.03.2025 13:30 — 👍 8    🔁 0    💬 0    📌 0

“If she wants to know something specific, but doesn’t want people to notice her asking questions, she should simply make incorrect statements while in the company of experts. Her companions will correct her, especially if they're men.”

- Advice for female agents in WW2, provided during SOE training

17.03.2025 11:52 — 👍 10192    🔁 2432    💬 140    📌 345

its amazing how chatgpt knows everything about subjects I know nothing about, but is wrong like 40% of the time in things im an expert on. not going to think about this any further

08.03.2025 00:13 — 👍 12422    🔁 3113    💬 88    📌 106
