Denied a loan, an interview, or an insurance claim by a machine learning model? You may be entitled to a list of reasons.
In our latest work with @anniewernerfelt.bsky.social, @berkustun.bsky.social, and @friedler.net, we show how existing explanation frameworks fail and present an alternative for recourse.
24.04.2025 06:19
We'll be at #ICLR2025, Poster Session 1, Poster #516!
Come chat if you're interested in learning more!
This is work done with wonderful collaborators: Yang Liu, @fcalmon.bsky.social, and @berkustun.bsky.social.
19.04.2025 23:04
Our algorithm can improve safety and performance by flagging regretful predictions for abstention or data cleaning.
For example, we demonstrate that, by abstaining from prediction using our algorithm, we can reduce mistakes compared to standard approaches.
19.04.2025 23:04
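As a rough illustration of the abstention step (the function name and threshold here are my own, not from the paper): given a per-instance score estimating how likely a prediction is to be regretful, withhold predictions above a threshold and evaluate error only on what remains.

```python
import numpy as np

def selective_error(y_pred, y_true, regret_score, threshold=0.5):
    """Abstain where estimated regret is high; report error and
    coverage on the retained predictions. Illustrative only."""
    keep = regret_score < threshold
    coverage = keep.mean()
    error = (y_pred[keep] != y_true[keep]).mean() if keep.any() else 0.0
    return error, coverage
```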
We develop a method that trains models over plausible clean datasets to anticipate regretful predictions, helping us spot when a model is unreliable at the individual level.
19.04.2025 23:04
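A minimal sketch of that recipe (my paraphrase under an assumed known uniform flip rate, not the paper's implementation): sample several plausible clean versions of the labels, refit a model on each, and score each instance by how often the refits disagree with the model trained on the noisy labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def anticipated_regret(X, y_noisy, flip_prob=0.2, n_draws=20, seed=0):
    """Score each instance by disagreement between the noisy-label
    model and models refit on plausible clean label draws.
    Re-flipping each label with the corruption rate is a crude
    stand-in for sampling from the posterior over clean labels."""
    rng = np.random.default_rng(seed)
    base = LogisticRegression().fit(X, y_noisy).predict(X)
    disagree = np.zeros(len(X))
    for _ in range(n_draws):
        flips = rng.random(len(y_noisy)) < flip_prob
        y_plausible = np.where(flips, 1 - y_noisy, y_noisy)
        h = LogisticRegression().fit(X, y_plausible)
        disagree += h.predict(X) != base
    return disagree / n_draws  # in [0, 1]; high = likely regretful
```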
We capture this effect with a simple measure: regret.
Regret is inevitable with label noise, but it can tell us where models silently fail and how we can guide safer predictions.
19.04.2025 23:04
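To make the measure concrete (a toy construction, not the paper's exact formalism): regret can be read as the event that the model trained on the noisy labels predicts differently than the model we would have trained on the clean ones.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic setup: true labels, plus observed labels flipped
# uniformly at random with probability 0.2 (assumed for illustration).
n, d = 1000, 5
X = rng.normal(size=(n, d))
y_true = (X @ rng.normal(size=d) > 0).astype(int)
y_noisy = np.where(rng.random(n) < 0.2, 1 - y_true, y_true)

h_noisy = LogisticRegression().fit(X, y_noisy)  # model we can train
h_clean = LogisticRegression().fit(X, y_true)   # counterfactual model

# Per-instance regret: the two models disagree on this individual.
regret = h_noisy.predict(X) != h_clean.predict(X)
print(f"fraction of regretful predictions: {regret.mean():.3f}")
```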
This lottery breaks modern ML:
If we can't tell which predictions are wrong, we can't improve models, we can't debug, and we can't trust them in high-stakes tasks like healthcare.
19.04.2025 23:04
We can frame this problem as learning from noisy labels.
Plenty of algorithms have been designed to handle label noise by predicting well on average, but we show how they still fail on specific individuals.
19.04.2025 23:04
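To see why "predicting well on average" is not enough, here is a toy check: two predictors with identical average accuracy can still disagree on roughly a third of individuals, so which individuals get a mistake is effectively a lottery.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 10_000)

def corrupt(y, err, rng):
    # Flip each prediction independently with probability err.
    return np.where(rng.random(len(y)) < err, 1 - y, y)

pred_a = corrupt(y, 0.2, rng)  # model A: ~80% accurate on average
pred_b = corrupt(y, 0.2, rng)  # model B: also ~80% accurate

print((pred_a == y).mean(), (pred_b == y).mean())  # both ~0.80
print((pred_a != pred_b).mean())                   # ~0.32 disagreement
```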
Many ML models predict labels that don't reflect what we care about, e.g.:
• Diagnoses from unreliable tests
• Outcomes from noisy electronic health records
In a new paper w/ @berkustun.bsky.social, we study how this subjects individuals to a lottery of mistakes.
Paper: bit.ly/3Y673uZ
19.04.2025 23:04
Key takeaway: Label noise isn't static, especially in time series.
Come chat with me at #ICLR2025, Poster Session 2!
Shoutout to my amazing colleagues behind this work:
@tomhartvigsen.bsky.social
@berkustun.bsky.social
13.04.2025 17:40
Real-world demo:
We applied our method to stress detection from smartwatches, where we have noisy self-reported labels alongside clean physiological measures.
Our model tracks the true time-varying label noise, reducing test error over baselines.
13.04.2025 17:40
We propose methods to learn this function directly from noisy data.
Results on 4 real-world time series tasks:
✅ Temporal methods beat static baselines
✅ Our methods better approximate the true noise function
✅ They work when the noise function is unknown!
13.04.2025 17:40
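One plausible way to learn such a function from noisy data alone (a generic recipe sketched here, not necessarily the paper's estimator): parameterize the time-varying flip rates with a small network and fit them jointly with the classifier by maximizing the likelihood of the observed noisy labels. PyTorch, binary labels:

```python
import torch
import torch.nn as nn

class TemporalNoise(nn.Module):
    """Illustrative time-varying 2x2 flip matrix Q(t):
    Q(t)[i, j] = P(observed label = j | true label = i, time = t)."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2))

    def forward(self, t):  # t: (batch, 1), e.g. normalized to [0, 1]
        flip = 0.5 * torch.sigmoid(self.net(t))  # keep flip rates < 0.5
        q0 = torch.stack([1 - flip[:, 0], flip[:, 0]], dim=1)  # true = 0
        q1 = torch.stack([flip[:, 1], 1 - flip[:, 1]], dim=1)  # true = 1
        return torch.stack([q0, q1], dim=1)  # (batch, 2, 2)
```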
We formalize this setting:
A temporal label noise function defines how likely each true label is to be flipped, as a function of time.
Using this function, we propose a new time series loss function that is provably robust to label noise.
13.04.2025 17:40
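A standard way to realize such a loss is forward correction: push the model's clean-label probabilities through Q(t) to get probabilities over the noisy labels, then take cross-entropy against what was actually observed. A sketch under those assumptions (binary case; a Q(t) like the `TemporalNoise` sketch above):

```python
import torch

def forward_corrected_nll(clean_probs, Q_t, y_obs):
    """clean_probs: (batch, 2) model's P(true label | x, t)
    Q_t: (batch, 2, 2) with Q_t[i, j] = P(obs = j | true = i)
    y_obs: (batch,) observed noisy labels."""
    # P(obs = j | x, t) = sum_i P(true = i | x, t) * Q_t[i, j]
    noisy_probs = torch.einsum('bi,bij->bj', clean_probs, Q_t)
    nll = -torch.log(noisy_probs.gather(1, y_obs[:, None]) + 1e-8)
    return nll.mean()
```

When Q(t) is known and invertible, minimizing this loss recovers the clean-label model; that is the usual forward-correction guarantee, here applied per time step.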
What is temporal label noise?
In many real-world time series (e.g., wearables, EHRs), label quality fluctuates over time:
➡️ Participants fatigue
➡️ Clinicians miss more during busy shifts
➡️ Self-reports drift seasonally
Existing methods assume static noise, so they fail here.
13.04.2025 17:40
Would be great to be added :)
22.12.2024 02:57
Psychiatrist, Cognitive-behavioural therapist, Clinical epidemiologist, who does systematic reviews, runs randomized clinical trials, performs meta-epidemiological studies and develops smartphone apps.
Director of the Duke University Center for Computational and Digital Health Innovation. Alfred Winborne Mordecai and Victoria Stover Mordecai Associate Professor of Biomedical Engineering. Research in HPC and digital twins for health.
Faculty at the University of Pennsylvania. Lifelong machine learning and AI for robotics and precision medicine: continual learning, transfer & multi-task learning, deep RL, multimodal ML, and human-AI collaboration. seas.upenn.edu/~eeaton
Manchester Centre for AI FUNdamentals | UoM | Alumn UCL, DeepMind, U Alberta, PUCP | Deep Thinker | Posts/reposts might be non-deep | Carpe espresso ☕
a mediocre combination of a mediocre AI scientist, a mediocre physicist, a mediocre chemist, a mediocre manager and a mediocre professor.
see more at https://kyunghyuncho.me/
Stanford Professor | Computational Health Economics & Outcomes | Fair Machine Learning | Causality | Statistics | Health Policy | Health Equity
drsherrirose.org
Lab manual: stanfordhpds.github.io/lab_manual
Personal account
Assistant professor of biomedical data science and dermatology at Stanford. AI for healthcare. Associate editor at NEJM AI and the Journal of Investigative Dermatology. Mother of a sassy girl and a baby boy.
Faculty at UC San Diego. Chief Health AI Officer at UC San Diego Health. #rstats. Creator of Tidier.jl #julialang. #GoBlue. Views own.
ML for healthcare and health equity. Assistant Professor at UC Berkeley and UCSF.
https://irenechen.net/
Assistant Professor at Stanford. Trustworthy, deployable ML/NLP for healthcare.
Assistant Professor at UC Berkeley and UCSF.
Machine Learning and AI for Healthcare. https://alaalab.berkeley.edu/
Ph.D. Student @uwstat; Research fellowship @Netflix; visiting researcher @UCJointCPH; M.A. @UCBStatistics - machine learning; calibration; semiparametrics; causal inference.
https://larsvanderlaan.github.io
Director of Responsible AI @GSK | Interested in Responsible AI, AI for healthcare, fairness and causality | Prev. Google DeepMind, Google Health, UCL, Stanford, ULiege | WiML board/volunteer. She/her.
Assistant Professor at Carnegie Mellon. Machine Learning and social impact. https://bryanwilder.github.io/
Health and AI Lab at Stevens Institute of Technology. Causal inference, precision nutrition, diabetes, decision making, and health informatics. http://www.healthailab.org
Medical AI / AI ethics. DE, previously DK. Father of twins, a dog, and a cat.
Research scientist at Google. Previously Stanford Biomedical Informatics. Researching #fairness #equity #robustness #transparency #causality #healthcare
Assistant Prof. of CS at Johns Hopkins
Visiting Scientist at Abridge AI
Causality & Machine Learning in Healthcare
Prev: PhD at MIT, Postdoc at CMU
AI accountability, audits & eval. Keen on participation & practical outcomes. CS PhDing @UCBerkeley.
Assistant prof at UMN CS. Human-centered AI, online communities, risky mental health behaviors. Mom, lifter, nerd, haver of opinions.