The best part is: you can start using our methods today by pip installing our open source package probmetrics.
📖 Read the paper: arxiv.org/abs/2511.03685
👩💻 Calibrate your models with probmetrics: github.com/dholzmueller...
Still using temperature scaling?
With @dholzmueller.bsky.social, Michael I. Jordan, and @bachfrancis.bsky.social, we argue that with well-designed regularization, more expressive models like matrix scaling can outperform simpler ones across calibration set sizes, data dimensions, and applications.
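To make the idea concrete, here is a minimal sketch of regularized matrix scaling (my own illustration, not the probmetrics API): fit z ↦ softmax(Wz + b) by cross-entropy on a calibration set, with an L2 penalty pulling W toward the identity and b toward zero. The function name and the `reg` strength are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import log_softmax

def fit_matrix_scaling(logits, labels, reg=1.0):
    """Fit z -> softmax(W z + b) by regularized cross-entropy.

    `reg` is a hypothetical penalty strength (in practice it would be
    tuned); the penalty shrinks toward plain identity scaling.
    """
    n, k = logits.shape

    def objective(params):
        W = params[:k * k].reshape(k, k)
        b = params[k * k:]
        # Negative log-likelihood of the rescaled logits.
        logp = log_softmax(logits @ W.T + b, axis=1)
        nll = -logp[np.arange(n), labels].mean()
        # L2 penalty toward the identity map (no-op calibration).
        penalty = reg * (np.sum((W - np.eye(k)) ** 2) + np.sum(b ** 2))
        return nll + penalty

    # Start from the identity map, i.e. unchanged logits.
    x0 = np.concatenate([np.eye(k).ravel(), np.zeros(k)])
    res = minimize(objective, x0, method="L-BFGS-B")
    return res.x[:k * k].reshape(k, k), res.x[k * k:]
```

With `reg` large this collapses toward the identity (close to doing nothing); with `reg = 0` it is plain matrix scaling, which can overfit small calibration sets.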
COLT Workshop on Predictions and Uncertainty was a banger!
I was lucky to present our paper "Minimum Volume Conformal Sets for Multivariate Regression", alongside my colleague @eberta.bsky.social and his awesome work on calibration.
Big thanks to the organizers!
#ConformalPrediction #MarcoPolo
What if we have been doing early stopping wrong all along?
When you break the validation loss into two terms, calibration and refinement,
you can turn the simplest (and efficient) trick for stopping training into a smarter stopping rule.
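Concretely (a minimal sketch under my reading of the idea, not our exact implementation): at each epoch, fit temperature scaling on the validation logits; the post-TS loss estimates the refinement term, and the gap to the raw validation loss is the calibration term. Stop on refinement, then recalibrate the chosen checkpoint.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax

def nll(logits, labels):
    """Mean cross-entropy (negative log-likelihood) of the logits."""
    logp = log_softmax(logits, axis=1)
    return -logp[np.arange(len(labels)), labels].mean()

def refinement_and_calibration(val_logits, val_labels):
    """Split validation NLL into refinement (post-TS loss) and
    calibration (the part temperature scaling removes)."""
    # Fit a single temperature by minimizing the validation NLL;
    # parametrized as exp(log_t) to keep it positive.
    res = minimize_scalar(
        lambda log_t: nll(val_logits * np.exp(log_t), val_labels),
        bounds=(-5.0, 5.0), method="bounded",
    )
    refinement = res.fun
    calibration = nll(val_logits, val_labels) - refinement
    return refinement, calibration

# Hypothetical usage: track refinement across epochs, keep the
# checkpoint that minimizes it, then temperature-scale that checkpoint.
```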
This suggests a clear link with the ROC curve in the binary case, but when you write it down formally, the relationship between the two is a bit ugly…
05.02.2025 10:38
Isotonic regression minimizes the risk of any "Bregman loss function" (including cross-entropy, see Section 2.1 below) up to monotonic relabeling, which looks a lot like our "refinement as a minimizer" formulation. It also finds the ROC convex hull.
proceedings.mlr.press/v238/berta24...
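A toy illustration of that property (my own sketch, not code from the paper): the same isotonic fit on held-out binary scores is the monotone relabeling that is optimal for squared error, cross-entropy, and every other Bregman loss.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import log_loss

# Synthetic miscalibrated binary scores: true P(y=1 | s) = s**2.
rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
labels = rng.binomial(1, scores ** 2)

# Monotone recalibration; "clip" handles scores outside the fit range.
iso = IsotonicRegression(out_of_bounds="clip")
calibrated = iso.fit_transform(scores, labels)

# Cross-entropy improves even though isotonic regression is fit
# by least squares: Bregman losses share the same monotone minimizer.
eps = 1e-6
print(log_loss(labels, np.clip(scores, eps, 1 - eps)))
print(log_loss(labels, np.clip(calibrated, eps, 1 - eps)))
```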
However, for calibration of the final model, adding an intercept or doing matrix scaling might work even better in certain scenarios (imbalanced, non-centered). We've experimented with existing implementations with limited success so far; maybe we should look at that in more detail…
05.02.2025 10:23
Not yet! Vector/matrix scaling has more parameters (K²+K for matrix scaling with K classes vs. a single temperature), so it is more prone to overfitting the validation set, and simple TS seems to calibrate well empirically, which is why we stuck with it to estimate the refinement error for early stopping.
05.02.2025 10:20
I've observed refinement being minimized before calibration for small (probably under-fitted) neural nets. In many cases, the refinement curve also starts "overfitting" at some point.
04.02.2025 19:33
We've not tried what you're suggesting, but if the training cost is small this might indeed be a good option!
04.02.2025 19:32
Indeed, regularisation seems very important. It can have a large impact on how the calibration error behaves. Combined with learning-rate schedulers, this can have surprising effects, like the calibration error starting to go down again at some point.
04.02.2025 19:28
Thanks! We have experimented with many models, observing various behaviours. The "calibration going up while refinement goes down" pattern seems typical in deep learning from what I've seen. With smaller models other things can appear, as suggested by our logistic regression analysis (Section 6).
04.02.2025 19:24
📖 Read the full paper: arxiv.org/abs/2501.19195
💻 Check out our code: github.com/dholzmueller...
Early stopping on validation loss? This leads to suboptimal calibration and refinement errors—but you can do better!
With @dholzmueller.bsky.social, Michael I. Jordan, and @bachfrancis.bsky.social, we propose a method that integrates with any model and boosts classification performance across tasks.