@moohax - Bluesky Profile

Latest posts by moohax.bsky.social on Bluesky

What's your take on the growing dominance of automated attacks and the implications for AI red teams? Here's ours, based on our analysis of 30 LLM challenges, attempted by 1,674 unique Crucible users, across 214,271 attack attempts: arxiv.org/abs/2504.19855

29.04.2025 16:14 — 👍 4 🔁 5 💬 0 📌 1

Red-Teaming in the Public Interest

This report offers a vision for red-teaming in the public interest: a process that goes beyond system-centric testing of already built systems to consider the full range of ways the public can be invo...

@datasociety.bsky.social and the AI Risk and Vulnerability Alliance just released “Red Teaming in the Public Interest,” a report examining how red teaming methods are being adapted to evaluate genAI.

Read the report, featuring commentary from @moohax.bsky.social: datasociety.net/library/red-...

13.02.2025 18:50 — 👍 5 🔁 3 💬 0 📌 0

Sniped. Fell down the rabbit hole, found some code exec 😬

10.02.2025 14:12 — 👍 1 🔁 0 💬 0 📌 0

NEW Crucible Challenge: DeepTweak, an exploration of reasoning model behavior. Cause enough confusion 😵‍💫, retrieve the flag.

Think fast: the first three users to solve DeepTweak will be announced Friday!

➡️ https://crucible.dreadnode.io/challenges/deeptweak?utm_source=social&utm_medium=social&u…

04.02.2025 17:36 — 👍 4 🔁 3 💬 0 📌 1

New to Rigging:

🔥 Tracing
🛠️ API Tools
💻 HTTP Generator
🐍 Prompts as Tools

→ github.com/dreadnode/ri...

06.02.2025 19:09 — 👍 7 🔁 4 💬 0 📌 0

Stanford CRFM

First distillation/extraction attack for OAI was the Stanford Alpaca research. It was after this that OAI changed its ToS to disallow training on outputs. It can happen to all the model providers.

crfm.stanford.edu/2023/03/13/a...

29.01.2025 23:15 — 👍 2 🔁 2 💬 0 📌 0

People learning what alignment means by asking DeepSeek about Taiwan.

29.01.2025 23:14 — 👍 7 🔁 0 💬 0 📌 0

Writing Malware With ChatGPT

There are a lot of articles floating around about how ChatGPT can or can't write malware, and I tend to avoid them.

Did some early work here. moohax.substack.com/p/writing-ma...

Working on something better @dreadnode.bsky.social, can’t wait to show folks what we’ve been working on... soon.

27.11.2024 20:47 — 👍 3 🔁 1 💬 1 📌 0

@moohax is following 20 prominent accounts

Xoreax
@xoreax

Graduate Student and Windows Kernel enjoyer

@comathematician

Flo 🔶
@faz.ms

Interested in #Infosec and #ProductiveDisagreement | #StayAtHomeDad, worked in #Biotech, co-built tiny companies in renewables and structural engineering sectors, ex #HumanRights observer | likes #ElixirLang, #Boardgames |🔸 #10PercentPledge

Ian C {ohCoz}
@cioaonk

Researcher at <…> doing Cyber R&D. Likes all things CTI, OSINT, Adversarial Emulation, Detection Engineering. Recently: World team @ CPTC 10 Currently: #100ishDaysOfYara

Greg Wells
@gregwells

Head of Product @ Dreadnode

BlaiseBits
@blaisebits

Hacker streamer dude with a side of shenanigans. https://twitch.tv/blaisebits

Rad Ads
@radads

Mikel Bober-Irizar
@mikel.ai

23 // Kaggle Competitions Grandmaster & ML/AI Researcher. Building video games @ Iconic, machine reasoning @ Cambridge, bioscience @ ForecomAI. https://mxbi.net / tw: @mikb0b

Steve
@sckain

Coffee-n-chromoly. Speed Regime. Awkward Bit Flips. Occasional reality TV hot takes. Views posted are mine and mine only. RT != necessarily endorsement.

Amanda M
@nmspinach

Microsoft AI Red Team Former Tweep

Ashley
@khyperia

fractal witch, space dork 🔭 ~ she/her, anxiety (AvPD, GAD), 🖤🩶🤍💜 ~ makes stuff at @landfall.se

Jeff Dean
@jeffdean

Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...

Chris Thompson
@retbandit

Head of Red team @ IBM X-Force. Black Hat Review Board. Founder and co-organizer of Offensive AI Con. Co-Founder of RemoteThreat. inveni et usurpa

Gabriel
@morecoffeeplz

AI Research scientist. Former OpenAI, Apple infosec. “Professor” at Johns Hopkins SAIS Alperovitch Institute. Great deceiver of hike length and difficulty.

Ethan Mollick
@emollick

Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence. Book: https://a.co/d/bC2kSj1 Substack: https://www.oneusefulthing.org/ Web: https://mgmt.wharton.upenn.edu/profile/emollick

Roman Lutz
@romanlutz

Responsible AI | AI Red Teaming at Microsoft

Joe Lucas
@hackthis.ai

AI Security @ NVIDIA OSS Security @ Project Jupyter and NumFOCUS https://developer.nvidia.com/blog/author/jolucas/

Martin Wendiggensen
@machinavelli.com

PhD candidate @ JHU Alperovitch Institute; AI Research Scientist @ Dreadnode

Justin Elze
@hackinglz.hackpwn.net

CTO @TrustedSec.com | Former Optiv/SecureWorks/Accuvant Labs/Redspin | Race cars

ChillD
@0actionspire