Lukas Aichberger

@aichberger.bsky.social

Machine Learning ELLIS PhD at Johannes Kepler University Linz and University of Oxford

207 Followers  |  172 Following  |  9 Posts  |  Joined: 19.11.2024

Latest posts by aichberger.bsky.social on Bluesky

Hot take: I think we just demonstrated the first AI agent computer worm 🤔

When an agent sees a trigger image it's instructed to execute malicious code and then share the image on social media to trigger other users' agents

This is a chance to talk about agent security 👇

20.03.2025 14:28 | 👍 8    🔁 2    💬 0    📌 0
Attacking Multimodal OS Agents with Malicious Image Patches
Recent advances in operating system (OS) agents enable vision-language models to interact directly with the graphical user interface of an OS. These multimodal OS agents autonomously perform computer-...

🏛️ This work was made possible with OATML and TVG at the University of Oxford (@ox.ac.uk). Special thanks to @yaringal.bsky.social, @adelbibi.bsky.social, @philiptorr.bsky.social, and @alasdair-p.bsky.social for their contributions.

📖 Read the paper: www.arxiv.org/abs/2503.10809

18.03.2025 18:25 | 👍 1    🔁 2    💬 0    📌 0

💀 Harmful actions could include engaging with the malicious social media post to amplify its spread, navigating to a malicious website, or causing a memory overflow to crash your computer. Preventing such harmful actions remains an open challenge. [6/6]

18.03.2025 18:25 | 👍 0    🔁 0    💬 1    📌 0

🎯 Once an OS agent – among those the MIP was optimised for – encounters the MIP during the execution of everyday tasks, empirical results indicate harmful actions are triggered in at least 9 out of 10 cases, regardless of the original task or screenshot layout. [5/6]

18.03.2025 18:25 | 👍 0    🔁 0    💬 1    📌 0

🚨 The real danger? Attackers can simply embed MIPs in social media posts, wallpapers, or ads and spread them across the internet. Unlike text-based attacks, MIPs are hard to detect, allowing them to spread unnoticed. [4/6]

18.03.2025 18:25 | 👍 0    🔁 0    💬 1    📌 0

🔓 Our work reveals that OS agents are not ready for safe integration into everyday life. Attackers can craft Malicious Image Patches (MIPs), subtle modifications to an image on the screen that, once encountered by an OS agent, deceive it into carrying out harmful actions. [3/6]

18.03.2025 18:25 | 👍 0    🔁 1    💬 1    📌 0

💻 AI assistants, known as OS agents, autonomously control computers just like humans do. They navigate by analysing the screen and take actions via mouse and keyboard. OS agents could soon take over everyday tasks, saving users time and effort. [2/6]

18.03.2025 18:25 | 👍 0    🔁 0    💬 1    📌 0

⚠️ Beware: Your AI assistant could be hijacked just by encountering a malicious image online!

Our latest research exposes critical security risks in AI assistants. An attacker can hijack them by simply posting an image on social media and waiting for it to be captured. [1/6] 🧵

18.03.2025 18:25 | 👍 8    🔁 8    💬 1    📌 3

LLMs often hallucinate because of semantic uncertainty due to missing factual training data. We propose a method to detect such uncertainties using only one generated output sequence. A super efficient method to detect hallucinations in LLMs.

20.12.2024 12:52 | 👍 15    🔁 3    💬 0    📌 2
Rethinking Uncertainty Estimation in Natural Language Generation
Large Language Models (LLMs) are increasingly employed in real-world applications, driving the need to evaluate the trustworthiness of their generated text. To this end, reliable uncertainty estimatio...

New Paper Alert: Rethinking Uncertainty Estimation in Natural Language Generation 🌟

Introducing G-NLL, a theoretically grounded and highly efficient uncertainty estimate, perfect for scalable LLM applications 🚀

Dive into the paper: arxiv.org/abs/2412.15176 👇

20.12.2024 11:44 | 👍 9    🔁 5    💬 0    📌 1

🙋‍♂️

19.11.2024 11:45 | 👍 2    🔁 0    💬 1    📌 0