Gabriel's Avatar

Gabriel

@morecoffeeplz.bsky.social

AI research scientist. Former OpenAI, Apple infosec. “Professor” at Johns Hopkins SAIS Alperovitch Institute. Great deceiver of hike length and difficulty.

1,049 Followers  |  461 Following  |  373 Posts  |  Joined: 25.05.2023

Latest posts by morecoffeeplz.bsky.social on Bluesky

“What if you could fuck the singularity?” is the apotheosis of technofuturism (2025)

14.10.2025 21:22 — 👍 8    🔁 3    💬 0    📌 0

BREAKING: Friday night massacre underway at CDC. Dozens of "disease detectives," high-level scientists, entire Washington staff and editors of the MMWR (Morbidity and Mortality Weekly Report) have all been RIFed and received the following notice:

11.10.2025 02:10 — 👍 15391    🔁 8478    💬 859    📌 1123

👋

05.10.2025 16:05 — 👍 0    🔁 0    💬 0    📌 0

Some research from my team!

01.10.2025 18:33 — 👍 2    🔁 0    💬 0    📌 0

@sentinelone.com social team I am also on bluesky 😂

01.10.2025 18:32 — 👍 0    🔁 0    💬 0    📌 0

james comey (2025)

26.09.2025 00:35 — 👍 4    🔁 2    💬 1    📌 0

Not the BPO report we need, but definitely the one we deserve.

24.09.2025 20:55 — 👍 1    🔁 0    💬 0    📌 0

3. What additional constraints do LLMs produce for adversaries? Hunting with the constraints of our adversaries was our initial premise. We've been doing it for years; LLMs simply present a new dimension for us to explore. If you'd like to work with us on this please let us know!

22.09.2025 21:52 — 👍 0    🔁 0    💬 0    📌 0

Malware that can run simple instructions, identify the target device and important files, and provide summaries back to a C2 would eliminate or streamline a significant amount of adversary workload.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

2. LLM-enabled malware is interesting and (we believe) important to study, but it is unclear exactly what the operational advances are. Assuming we get to the point of LLMs running natively on endpoints, malware that could hijack that process may be extremely useful.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

Ok some questions that this research posed for us:

1. Hunting for prompts and API keys works, but it is a brittle detection. Eventually adversaries will move to proxy services that provide some level of obfuscation. What do we do then?

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

If we want to understand LLM risks, we should align expectations with risks we can observe and measure, not hype.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

Understanding how capable LLMs are wrt hacking is important work, but setting that aside for the moment: in a year of analysis, we did not observe the capabilities that labs are concerned about being deployed by malicious actors in the wild.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

We noted that the capabilities we observed in LLM-enabled malware were operational, that is, they helped adversaries with specific tasks.

That aligns with current LLM capabilities in software development and how they’re deployed.

22.09.2025 21:52 — 👍 1    🔁 0    💬 1    📌 0

Traditionally, malware analysis starts at a disadvantage: you work backward from assumptions about how the code was developed.

With prompts, intent is immediately visible. No need to second-guess the adversary’s aim.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

This was by far our most successful technique.

We quickly identified prompts for agentic computer-network exploitation, vulnerability injectors, shellcode generators, WormGPT copycats, apps designed to control Android screens, and red-teaming tools for LLM-agent benchmarking.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

In our case, we knew the prompts were hardcoded and that they followed certain formats, structures, or keywords.

This isn’t so different from hunting code patterns, but instead we’re hunting strings and patterns in hardcoded prompts.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
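
A minimal sketch of what hunting hardcoded prompts might look like; the keyword and structure patterns below are illustrative assumptions, not the team's actual rules:

```python
import re
from pathlib import Path

# Illustrative prompt-shaped patterns (assumptions, not the published hunting rules):
PROMPT_PATTERNS = [
    re.compile(rb"you are (a|an) [a-z ]{3,40}", re.IGNORECASE),       # role-style system prompts
    re.compile(rb"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(rb'"role"\s*:\s*"(system|user|assistant)"'),            # chat-message JSON structure
    re.compile(rb"respond only (in|with) json", re.IGNORECASE),
]

def hunt_prompts(sample_path: str) -> list[bytes]:
    """Return byte snippets surrounding any prompt-like pattern found in a sample."""
    data = Path(sample_path).read_bytes()
    hits = []
    for pattern in PROMPT_PATTERNS:
        for m in pattern.finditer(data):
            start, end = max(m.start() - 40, 0), min(m.end() + 40, len(data))
            hits.append(data[start:end])
    return hits
```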

There’s a lot to improve here, but we were excited the classification method worked at scale. Still, we wanted to find more samples, which led us to our final method.

In traditional malware, we hunt code; in LLM-enabled malware, we hunt prompts.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

For Java files, we built a custom solution: extract potential prompts via regex, identify and sanitize API keys, and send file content to an LLM to summarize class behavior and assess it as suspicious, malicious, or benign. As a lightweight classifier it was efficient, but it flagged only one file as malicious.

22.09.2025 21:52 — 👍 2    🔁 0    💬 1    📌 0
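
A rough sketch of that kind of lightweight classifier, assuming the current OpenAI Python SDK; the regexes, model name, and prompt wording are placeholders rather than the actual pipeline:

```python
import re
from openai import OpenAI

API_KEY_RE = re.compile(r"sk-[A-Za-z0-9_\-]{20,}")          # rough OpenAI-style key shape (assumption)
PROMPT_RE = re.compile(r'"((?:You are|Act as)[^"]{20,})"')   # crude hardcoded-prompt heuristic

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_java_source(source: str) -> tuple[list[str], str]:
    """Extract candidate prompts, redact embedded API keys, and ask an LLM to
    summarize the class and label it suspicious / malicious / benign."""
    prompts = PROMPT_RE.findall(source)
    sanitized = API_KEY_RE.sub("<REDACTED_API_KEY>", source)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model, not the one used in the research
        messages=[
            {"role": "system",
             "content": "Summarize what this decompiled Java class does and "
                        "classify it as suspicious, malicious, or benign."},
            {"role": "user", "content": sanitized[:20000]},  # stay within context limits
        ],
    )
    return prompts, resp.choices[0].message.content
```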

We used regex to spot LLM provider indicators (API keys, domains, prompt syntax), then decompiled and identified classes referencing LLM artifacts or behaviors.

Java files and .so binaries were the most promising since, at this stage, code is decompiled and readable.

22.09.2025 21:52 — 👍 1    🔁 0    💬 1    📌 0
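
For illustration, provider indicators of this sort can be expressed as a handful of regexes; the key prefixes and domains below are rough assumptions, not a vetted indicator list:

```python
import re

# Rough LLM-provider indicators (assumptions, not a vetted list):
LLM_INDICATORS = {
    "openai_key":       re.compile(rb"sk-[A-Za-z0-9_\-]{20,}"),
    "anthropic_key":    re.compile(rb"sk-ant-[A-Za-z0-9_\-]{20,}"),
    "openai_domain":    re.compile(rb"api\.openai\.com"),
    "anthropic_domain": re.compile(rb"api\.anthropic\.com"),
    "chat_syntax":      re.compile(rb'"role"\s*:\s*"(system|user|assistant)"'),
}

def scan_for_llm_artifacts(blob: bytes) -> dict[str, int]:
    """Count hits for each LLM indicator in a decompiled file or its extracted strings."""
    return {name: len(rx.findall(blob)) for name, rx in LLM_INDICATORS.items()}
```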

Back to our original catch: of nearly 7k samples, ~4k were Android. @alex.leetnoob.com built a GCP pipeline to upload APK/DEX to a bucket, then processed them with a script that decompiled each APK.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
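
A condensed sketch of that kind of pipeline, assuming the google-cloud-storage client and a local jadx install; the bucket and path names are placeholders:

```python
import subprocess
from pathlib import Path
from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "apk-samples"            # placeholder bucket name
WORKDIR = Path("/tmp/apk-work")

def decompile_bucket_samples() -> None:
    """Pull APK/DEX samples from a GCS bucket and decompile each one with jadx."""
    client = storage.Client()
    WORKDIR.mkdir(parents=True, exist_ok=True)
    for blob in client.list_blobs(BUCKET):
        if not blob.name.endswith((".apk", ".dex")):
            continue
        local = WORKDIR / Path(blob.name).name
        blob.download_to_filename(str(local))
        outdir = WORKDIR / (local.stem + "_src")
        # jadx decompiles both APK and DEX into readable Java sources
        subprocess.run(["jadx", "-d", str(outdir), str(local)], check=False)

if __name__ == "__main__":
    decompile_bucket_samples()
```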

So, hunting by API keys and simple clustering works... but it isn’t the most effective.

There were many false positives, and sorting required significant manual review.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

Then something interesting: TestMal3, ironically, is a defensive tool (“FalconShield”).

This is a brittle scanner checking for “import openai” + “exec(” patterns in a target Python file. It asks GPT to judge if code is malicious, writes a “malware analysis” report, and (claims to) hand off to VT.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
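
Roughly what such a brittle check amounts to, sketched with the current OpenAI SDK; the exact patterns, prompt, and report format are guesses from the description above:

```python
import re
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def falconshield_style_scan(py_file: str) -> str:
    """Brittle pattern-plus-GPT check reconstructed from the description; details are assumptions."""
    code = Path(py_file).read_text(errors="ignore")
    # Only fires if both hardcoded patterns appear in the target script
    if not (re.search(r"^import openai", code, re.MULTILINE) and "exec(" in code):
        return "no LLM/exec pattern found"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user",
                   "content": "Is the following Python code malicious? "
                              "Answer and explain briefly.\n\n" + code[:10000]}],
    )
    verdict = resp.choices[0].message.content
    Path(py_file + ".analysis.txt").write_text(verdict)  # the "malware analysis" report
    return verdict
```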

TestMal2 has a more structured builder/loader and more nuanced ransomware/reverse-shell menu options with IP/port; it cleans the LLM response, writes it to a separate script, then executes it.

Some obfuscation attempts are present, but still a straight malware-generation pipeline.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

So MalTerminal.exe is the compiled malware.

testAPI (1) & (2) are functionally identical Python loaders offering Ransomware or Reverse Shell options.

TestMal2 is a more advanced version of testAPI, likely an early Python version of the compiled MalTerminal.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

You could instrument and intercept the generated code. However... the context and intent of the returned code are visible in the prompt itself.

And with such precise requests, it’s obvious what the results will be.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
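
One hypothetical way to instrument that hand-off on the defensive side: log the prompt and the returned code and refuse automatic execution so the intent stays reviewable. Function and log names here are made up for illustration:

```python
import json
import time

def logging_exec_gate(prompt: str, generated_code: str,
                      log_path: str = "llm_codegen.log") -> bool:
    """Illustrative interception point: record the prompt (where intent is visible)
    and the code the LLM returned, and never auto-execute it."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({
            "ts": time.time(),
            "prompt": prompt,                 # intent is readable right here
            "generated_code": generated_code,
        }) + "\n")
    return False  # execution deferred to an analyst or sandbox
```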

Like Lamehug and PromptLock, MalTerminal generates and executes LLM-generated code at runtime, requesting specific Python functions from the LLM and executing them as returned.

Risky without proper input sanitization, but it limits traditional detection of the generated functions.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

This suggests the malware was written before the update. Digging deeper, we found 15 samples associated with the cluster and significant code similarities in a public GitHub project.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0

This cluster used OpenAI GPT-4 to generate ransomware or reverse-shell code dynamically, specifically via an outdated call to the OpenAI chat-completions API (deprecated in early November 2023).

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
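
As one hedged illustration of how an outdated call can date a sample: the openai Python SDK's 1.0 release in November 2023 replaced openai.ChatCompletion.create with client.chat.completions.create, so a legacy-only pattern hints at earlier authorship. Whether that is the exact deprecated call in this cluster is an assumption:

```python
import re
from pathlib import Path

# Legacy pre-1.0 openai SDK usage vs. the current interface; assumes the deprecated
# call in question maps to the old ChatCompletion pattern.
LEGACY_CALL = re.compile(rb"openai\.ChatCompletion\.create\s*\(")
MODERN_CALL = re.compile(rb"chat\.completions\.create\s*\(")

def date_hint(sample_path: str) -> str:
    """Crude dating heuristic: legacy-only API usage suggests authorship before the change."""
    data = Path(sample_path).read_bytes()
    if LEGACY_CALL.search(data) and not MODERN_CALL.search(data):
        return "legacy OpenAI call only: likely authored before Nov 2023"
    if MODERN_CALL.search(data):
        return "modern OpenAI call present"
    return "no OpenAI chat call found"
```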

And what I mean by that is sometimes you get lucky and a malware author names their executable “malware.exe.” And then you pivot to finding additional Python scripts using the same API key.

22.09.2025 21:52 — 👍 0    🔁 0    💬 1    📌 0
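
The pivot itself can be as simple as searching a local corpus for the recovered key; a toy sketch, where the corpus layout and key are placeholders:

```python
from pathlib import Path

def pivot_on_api_key(corpus_dir: str, api_key: bytes) -> list[Path]:
    """Find every file in a local sample corpus embedding the same hardcoded key."""
    return [p for p in Path(corpus_dir).rglob("*")
            if p.is_file() and api_key in p.read_bytes()]

# e.g. pivot_on_api_key("./samples", b"sk-PLACEHOLDER-KEY")
```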
