In addition to norms, there is some work on technical tools to *try* to reduce risks: detecting data (arxiv.org/abs/2507.20708), model (arxiv.org/abs/2505.04796), or explanation (openreview.net/pdf?id=3vmKQ...) manipulations (non-exhaustive).
Nice to read about those risks from another point of view!
24.10.2025 08:56
A screenshot of the new landing page, showing the heading "A new foundation for documents" and a few drawings representing document types with a thick specification binder highlighted in color.
In the past two years, Typst has become the foundation for document writing for so many people. Building on the lessons from their experience, we are launching our new website today.
20.08.2025 14:15
Milos and I will be presenting the paper tomorrow at 11am, come chat at our poster!
11am, East Exhibition Hall A-B #E-1911
16.07.2025 17:18
A new example of a discrepancy between what is shown to auditors and what really happens on the system (and this time it's not Meta). Thank you @aiforensics.org!
Can't wait for the "sorry it was an intern's mistake, we fired them" answer.
13.06.2025 13:11
TikTok Research API - Availability Dashboard
🧵 We just exposed how TikTok's "Research API" is systematically hiding content from researchers
Despite promises of transparency under EU law, the platform is missing 1 in 8 videos from its own research tools:
aiforensics.org/work/tk-api
13.06.2025 09:12
🧵6/6 If you want to read more about this, I encourage you to read the paper, but not only that!
There is a lot of exciting work on robust audits; here are a few papers I enjoyed:
arxiv.org/abs/2402.02675
arxiv.org/abs/2504.00874
arxiv.org/abs/2502.03773
arxiv.org/abs/2305.13883
arxiv.org/abs/2410.02777
09.05.2025 15:38
Figure showing how much a prior can help reduce the unfairness a platform can hide by manipulating its answers.
🧵5/6 In this paper, we formalize the second approach as a search for efficient "audit priors".
We instantiate our framework with a simple idea: just look at the accuracy of the platform's answers.
Our experiments show that this can help reduce the amount of unfairness a platform could hide.
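A rough sketch of what this looks like in practice, assuming the auditor holds ground-truth labels for its own questions (the 0.85 accuracy floor is an illustrative prior, not a number from the paper):

```python
# Sketch of the accuracy-based prior: manipulating answers to look fair
# usually costs accuracy, so implausibly low accuracy on queries with known
# labels is evidence of manipulation. The 0.85 floor is an assumed prior.

def accuracy(answers, labels):
    return sum(a == y for a, y in zip(answers, labels)) / len(labels)

def plausible(answers, labels, min_honest_accuracy=0.85):
    """Reject answer sets that no honest model would plausibly produce."""
    return accuracy(answers, labels) >= min_honest_accuracy
```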
09.05.2025 15:38
🧵4/6 There are two main approaches to avoid manipulations.
Crypto guarantees: the model provider is forced to commit to their model and sign every answer (toy sketch below).
Clever ML tricks: the auditor uses information about the model (training data, model structure, ...) to understand what a "good answer" is.
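For intuition, a minimal sketch of the crypto route, with HMAC standing in for a real signature scheme and all names invented for the example:

```python
import hashlib
import hmac

# The platform commits to its model before the audit, then binds every
# answer to that commitment, so it cannot silently swap models mid-audit.
SIGNING_KEY = b"platform-key"  # stand-in for a real signing key

def commit(model_bytes):
    """Published before the audit starts."""
    return hashlib.sha256(model_bytes).hexdigest()

def sign_answer(commitment, query, answer):
    """Bind each answer to the query and the committed model."""
    msg = f"{commitment}|{query}|{answer}".encode()
    return hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()

def verify_answer(commitment, query, answer, tag):
    return hmac.compare_digest(sign_answer(commitment, query, answer), tag)

c = commit(b"...serialized weights...")
tag = sign_answer(c, "score applicant 42", "approved")
assert verify_answer(c, "score applicant 42", "approved", tag)
```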
09.05.2025 15:38
Screenshot of the equation describing how to implement an audit manipulation in practice.
🧵3/6 You know the metric, you know the questions, and I don't have access to your model.
Thus, nothing prevents you from manipulating the answers of your model to pass the audit.
And this is very easy! In fact, any fairness mitigation method can be transformed into an audit manipulation attack.
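To make this concrete, a hypothetical toy sketch, assuming the platform knows (or can guess) the audit query set:

```python
import random

# Dieselgate-style manipulation: behave "fairly" only on the audit queries.
# Everything here is a toy; audit_queries stands for whatever the platform
# knows or guesses about the audit set.
audit_queries = {"q1", "q7", "q13"}

def deployed_model(query):
    return hash(query) % 2           # the real, possibly unfair model

def mitigated_model(query):
    return random.randint(0, 1)      # any off-the-shelf fair predictor

def manipulated_model(query):
    if query in audit_queries:       # test bench detected...
        return mitigated_model(query)  # ...switch to the fair model
    return deployed_model(query)     # business as usual otherwise
```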
09.05.2025 15:38
🧵2/6 link: arxiv.org/pdf/2505.04796
An audit is pretty straightforward.
1/ I, the auditor 🕵️, come up with questions to ask your model.
2/ You, the platform, answer my questions.
3/ I compute a series of aggregate metrics on your answers and decide whether your system abides by the law (toy version after this list).
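A toy version of the whole loop, with demographic parity as the (illustrative) aggregate metric:

```python
# Toy version of the three audit steps: ask questions, collect answers,
# aggregate. The metric (demographic parity gap) and the 0.1 tolerance
# are illustrative choices, not the paper's.

def audit(platform, queries, groups, tol=0.1):
    answers = [platform(q) for q in queries]            # steps 1 and 2

    def positive_rate(g):
        in_g = [a for a, grp in zip(answers, groups) if grp == g]
        return sum(in_g) / len(in_g)

    gap = abs(positive_rate("A") - positive_rate("B"))  # step 3
    return gap <= tol                                   # pass / fail
```

Here `platform` is any black-box answering function returning 0/1 decisions, and `groups[i]` is the protected attribute behind query `i`.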
09.05.2025 15:38
Title and abstract of the "Robust ML Auditing using Prior Knowledge" paper. link: https://arxiv.org/abs/2505.04796
🎯 ICML25 spotlight 🎯 How to detect and prevent audit manipulations?
Do you remember 🚗 Dieselgate 🚨? The car's computer would detect when it was on a test bench and switch to a low-emissions mode to fake environmental compliance. Well, this can happen in AI too.
How can we avoid this? 🧵1/6
09.05.2025 15:38
I will be presenting this work at #AAAI on Saturday (board 39).
Come and chat about model comparison, auditing and manipulations!
27.02.2025 20:20
📣 Queries, Representation, Detection: the next 100 model fingerprinting schemes
I made some 🍋 Lemon QuRD, I hope you like it!
💻 code: github.com/grodino/QuRD/
📄 paper: arxiv.org/abs/2412.13021
TL;DR: we show that a simple baseline meets or beats existing model fingerprints and investigate why.
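For intuition, a minimal sketch of the kind of simple baseline meant here, where the fingerprint is nothing more than the model's labels on a fixed query set (illustrative only, not the repo's actual code):

```python
import numpy as np

# Queries: a fixed random query set shared across models.
rng = np.random.default_rng(0)
queries = rng.normal(size=(100, 32))

def fingerprint(model):
    """Representation: the vector of predicted labels on the query set."""
    return np.array([model(x) for x in queries])

def is_copy(fp_a, fp_b, threshold=0.9):
    """Detection: flag a copy when label agreement is high enough."""
    return (fp_a == fp_b).mean() >= threshold
```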
27.02.2025 20:17
If you work at the intersection of security, privacy, and machine learning, or more broadly on how to trust ML, SaTML is a small-scale conference with highly relevant work where you'll be able to have high-quality conversations with colleagues working in your area.
14.01.2025 16:05
Privacy in AI @inria rennes
PhD - R&D Engineer at the French Mapping Agency @ignfrance.bsky.social working on Deep Learning and Computer Vision for Earth observation
AI Policy Fellow @ Princeton | PhD Carnegie Mellon | privacy, accountability, & algorithmic systems
Datan is a website that helps you better understand the activity and political positions of the members of the French National Assembly. #directAN
CS researcher in uncertainty reasoning (whenever it appears: risk analysis, AI, philosophy, ...), mostly mixing sets and probabilities. Posts mostly on this topic (French and English), and a bit about others. Personal account and opinions.
https://louisboucherie.com/
Associate Professor in Machine Learning, Aalto University. ELLIS Scholar.
http://arno.solin.fi
Ph.D. student in Machine Learning and Domain Adaptation for Neuroscience at Inria Saclay/ Mind.
Website: https://tgnassou.github.io/
Skada: https://scikit-adaptation.github.io/
PhD student @ Cornell info sci | Sociotechnical fairness & algorithm auditing | Previously Stanford RegLab, MSR FATE, Penn | https://emmaharv.github.io/
go bears!!!
jessicad.ai
kernelmag.io
Assistant Professor at the Department of Computer Science, University of Liverpool.
https://lutzoe.github.io/
Journalist covering science, technology, and the associated policy.
https://www.lemonde.fr/blog/huet/
member of https://ajspi.com/
https://linktr.ee/leclimatenquestions
Research Scientist at the University of Washington based in Brooklyn. Also: SFI External Applied Fellow, Harvard BKC affiliate. Collective Behavior, Statistics, etc.
Welcome to a calmer internet!
#ZenBrowser
https://github.com/zen-browser
https://zen-browser.app
♥️ https://ko-fi.com/zen_browser
💬 https://discord.gg/zen-browser
(they/them) Researcher in Algorithmic Auditing, Social Simulation & Evidence-Based Policymaking | paulbouchaud.com | working with aiforensics.org
We pioneered algorithm analysis and deploy expert skills to probe the digital space and its actors.
https://checkfirst.network
Professor of Technology and Regulation, Oxford Internet Institute, University of Oxford, research on legal & ethical implications of AI, Big Data, & robotics as well as Internet & platform regulation https://www.oii.ox.ac.uk/people/profiles/sandra-wachter/
#ESR #velotaf
Doctoral studies coordinator at the DGESIP-DGRI
Professor at ENS Paris-Saclay
Researcher at the LMPS
Econ prof at Oxford.
Machine learning, politics, econometrics, inequality, random reading recs.
maxkasy.github.io/home/
Parent, spouse, Australian, Professor of Machine Learning in Oxford. Long Covid, trans rights, music, reggae on Fridays, AI must be good for humans, https://www.robots.ox.ac.uk/~mosb