
Nicola Branchini

@nicolabranchini.bsky.social

🇮🇹 Stats PhD @ University of Edinburgh 🏴󠁧󠁢󠁳󠁣󠁴󠁿 @ellis.eu PhD - visiting @avehtari.bsky.social 🇫🇮 🤔💭 Monte Carlo, UQ. Interested in many things relating to UQ, keen to learn applications in climate/science. https://www.branchini.fun/about

1,403 Followers  |  814 Following  |  117 Posts  |  Joined: 12.11.2024

Latest posts by nicolabranchini.bsky.social on Bluesky

MCM 2025 - Program | MCM 2025 Chicago

Flying today towards Chicago 🌆 for MCM 2025

fjhickernell.github.io/mcm2025/prog...

Will give a talk on our recent/ongoing work on self-normalized importance sampling, including learning a proposal with MCMC and ratio diagnostics.

www.branchini.fun/pubs

24.07.2025 09:06 — 👍 0    🔁 0    💬 0    📌 0

Really cool work : ) @alexxthiery.bsky.social

www.tandfonline.com/doi/full/10....

16.07.2025 08:57 — 👍 1    🔁 0    💬 0    📌 0

agree; you should check out @yfelekis.bsky.social 's work on this line 😄

08.07.2025 11:03 — 👍 2    🔁 0    💬 0    📌 0

Just don't see that the PPD_q of the original post leads somewhere useful.
Anyway, thanks for engaging @alexlew.bsky.social : )

06.07.2025 11:03 — 👍 1    🔁 0    💬 0    📌 0

I agree, except I think it can be ok to shift the criterion of "good q" to instead some well-defined measure of predictive performance (under no model misspecification, let's say). Ofc Bayesian LOO-CV is one. We could discuss using other quantities, and how to estimate them, ofc.

06.07.2025 11:03 — 👍 1    🔁 0    💬 1    📌 0

Genuine question: what is the estimated value used for, then?

06.07.2025 10:46 — 👍 0    🔁 0    💬 1    📌 0

(computed with the inconsistent method)

06.07.2025 10:38 — 👍 0    🔁 0    💬 1    📌 0

Well, re: [choose q1 or q2 based on whether P_q1 > P_q2]
I seem to understand that many VI papers say: here's a new VI method, it produces q1; old VI method gives q2. q1 is better than q2 because test log-PPD is higher!

06.07.2025 10:36 — 👍 1    🔁 0    💬 1    📌 0

Not entirely obvious to me, but I see the intuition!

05.07.2025 13:31 — 👍 1    🔁 0    💬 1    📌 0

Am definitely at least *trying* to think carefully about the evaluation here 😅😇

05.07.2025 13:30 — 👍 1    🔁 0    💬 0    📌 0

Right! Definitely not sure it's necessary, but I like to think there would be value / it would be interesting if we wanted to somehow speak formally about generalizing over unseen test points.

05.07.2025 13:28 — 👍 1    🔁 0    💬 0    📌 0

It still seems "dangerous" to use the numerical value of (an estimate of) ∫ p(y|θ) q(θ) dθ to decide which approximate q is better.

(Of course, you may argue we maybe shouldn't use even any MC estimates of the original ∫ p(y|θ) p(θ|D) dθ with q as proposal, but the above is even less justified)

05.07.2025 13:25 — 👍 0    🔁 0    💬 1    📌 0

I don't see that it needs to get that philosophical?
It is totally possible to formally estimate the pdf itself, since we have some 'test' samples of y, and to consider MISE-type errors, even if in this case pointwise evaluations of the pdfs involve the intractable integral.

05.07.2025 13:20 — 👍 1    🔁 0    💬 1    📌 0

that's why I like to just say posterior predictive integral(s) instead

05.07.2025 13:13 — 👍 1    🔁 0    💬 1    📌 0

This is an aside btw, but nobody in practice actually estimates the marginal PPD density itself, right?
Just the pointwise density evaluations at some specific y's (whose randomness is not taken into account)

05.07.2025 13:12 — 👍 1    🔁 0    💬 2    📌 0

(I mean it is obvious in the unnormalized-weights case, less so when self-normalizing, which is what I was referring to here)

05.07.2025 12:54 — 👍 1    🔁 0    💬 0    📌 0

(also, with the approach using q as proposal, it is theoretically possible to choose a q that is better than iid sampling from the true posterior ;) )

05.07.2025 12:39 — 👍 1    🔁 0    💬 1    📌 0

(not sure where that link came from)

I could even be somewhat ok with an inconsistent estimator, but we can't then make a decision comparing values with different proposals in any mathematically sound way?

05.07.2025 11:54 — 👍 0    🔁 0    💬 2    📌 0

but then you can just tune the underlying P_q to make it high, by changing / optimizing q!
It doesn't make any sense

05.07.2025 11:53 — 👍 0    🔁 0    💬 1    📌 0
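A toy sketch of this point (my own example, not from the thread): with a Gaussian likelihood p(y|θ) = N(y; θ, 1) and q = N(m, s²), the quantity P^q(y) = ∫ p(y|θ) q(θ) dθ has the closed form N(y; m, 1 + s²), so centering q on the observed y and shrinking s inflates P^q towards the likelihood maximum, with no reference to any posterior at all.

```python
import numpy as np
from scipy.stats import norm

# P^q(y) = ∫ N(y; theta, 1) N(theta; m, s^2) dtheta = N(y; m, 1 + s^2).
# Fix m at the observed y and shrink s: P^q(y) only goes up.
y = 0.0
vals = [norm.pdf(y, 0.0, np.sqrt(1.0 + s**2)) for s in (1.0, 0.3, 0.01)]
print(vals)  # strictly increasing as q concentrates at theta = y
```

So a higher P^q need not mean a better q; it can just mean a q tuned to this quantity.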

That's my point about 2.
If you don't use it as proposal, you have an inconsistent estimator of P*. With the inconsistent approach, you're changing the true underlying quantity!
(They argue it's of interest by itself)
This is bad as I see it, since people choose q1 or q2 based on whether P_q1 > P_q2

05.07.2025 11:51 — 👍 3    🔁 0    💬 2    📌 0
Understanding the difficulties of posterior predictive estimation Predictive posterior densities (PPDs) are essential in approximate inference for quantifying predictive uncertainty and comparing inference methods. Typically, PPDs are estimated by simple Monte...

This new ICML paper (openreview.net/forum?id=Tzf...) reinforces and insists on the notion that one would want to replace the posterior predictive P* := ∫ p(y|θ) p(θ|D) dθ
with P^{q} := ∫ p(y|θ) q(θ) dθ, with q(θ) ≈ p(θ|D), then estimate _that_ with MC.
You know me. I don't get it.
What do I miss?

05.07.2025 11:03 — 👍 8    🔁 0    💬 2    📌 0
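To make the distinction concrete, here is a minimal sketch on a conjugate Gaussian toy model of my own choosing (not from the paper), where P* is available in closed form. Drawing θ from q and averaging the likelihood is consistent for P^q; self-normalized importance sampling with the same draws, q used only as a proposal, targets P* itself.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy conjugate model: theta ~ N(0, 1), y_i | theta ~ N(theta, 1), so the
# posterior predictive P* = ∫ p(y|theta) p(theta|D) dtheta is closed-form.
data = rng.normal(1.0, 1.0, size=20)
m = len(data)
post_var = 1.0 / (1.0 + m)
post_mean = post_var * data.sum()

y_star = 0.5
p_star = norm.pdf(y_star, post_mean, np.sqrt(post_var + 1.0))  # exact P*

# A deliberately crude approximation q(theta) ≈ p(theta|D).
mu_q, sd_q = post_mean + 0.3, 2.0 * np.sqrt(post_var)
theta = rng.normal(mu_q, sd_q, size=200_000)
lik = norm.pdf(y_star, theta, 1.0)

# Plug-in: consistent for P^q = ∫ p(y|theta) q(theta) dtheta, NOT for P*.
p_hat_plugin = lik.mean()

# SNIS estimate of P* itself, using q only as a proposal.
log_w = (norm.logpdf(theta, 0.0, 1.0)
         + norm.logpdf(data[:, None], theta[None, :], 1.0).sum(axis=0)
         - norm.logpdf(theta, mu_q, sd_q))
w = np.exp(log_w - log_w.max())
w /= w.sum()
p_hat_snis = (w * lik).sum()

print(p_star, p_hat_snis, p_hat_plugin)
```

With enough samples the SNIS estimate sits on top of the exact P*, while the plug-in keeps its q-dependent bias, which is the gap the thread is about.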

It will seem very silly, but every single time I have a poster, I have at least one person being very surprised that you can do better Monte Carlo than "perfect" i.i.d. sampling from p, for E_p[f(θ)] 😃

17.06.2025 20:17 — 👍 7    🔁 0    💬 0    📌 0
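One textbook way to see this (a standard rare-event example, not from the post): for µ = E_p[f] with f an indicator of a rare event under p = N(0, 1), importance sampling from a proposal shifted into the rare region beats iid sampling from p by a wide margin at the same sample size.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Target mu = P(theta > 3) under p = N(0, 1), i.e. E_p[f] with f an indicator.
mu = norm.sf(3.0)
n, reps = 10_000, 200

def iid_estimate():
    # "Perfect" iid sampling from p.
    return (rng.normal(size=n) > 3.0).mean()

def is_estimate():
    # Proposal shifted into the rare region; unbiased via the weights p/q.
    theta = rng.normal(3.0, 1.0, size=n)
    w = np.exp(norm.logpdf(theta) - norm.logpdf(theta, 3.0, 1.0))
    return (w * (theta > 3.0)).mean()

iid_err = np.std([iid_estimate() for _ in range(reps)])
is_err = np.std([is_estimate() for _ in range(reps)])
print(iid_err / mu, is_err / mu)  # relative errors: IS is far smaller
```

Both estimators are unbiased for µ; only the variance differs, which is exactly the point that surprises people at the poster.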

Here's how the gradient flow for minimizing KL(pi, target) looks under the Fisher-Rao metric. I thought some probability mass would be disappearing on the left and appearing on the right (i.e. teleportation), like a geodesic under the same metric, but I was very wrong... What's the right intuition?

13.06.2025 16:29 — 👍 23    🔁 6    💬 4    📌 0
When Does Monte Carlo Depend Polynomially on the Number of Variables? We study the classical Monte Carlo algorithm for weighted multivariate integration. It is well known that if Monte Carlo uses n randomized sample points for a function of d variables then it has error...

link.springer.com/chapter/10.1...

13.06.2025 15:17 — 👍 1    🔁 0    💬 0    📌 0
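The headline rate in miniature (a toy of mine, far simpler than the weighted-space setting the chapter actually studies): for a fixed easy integrand, plain Monte Carlo error shrinks like n^{-1/2} whether d = 1 or d = 50, so 100x more samples buys roughly a 10x smaller error at every dimension.

```python
import numpy as np

rng = np.random.default_rng(2)

def mc_error(d, n, reps=100):
    # Average |MC estimate - truth| for ∫_[0,1]^d f(x) dx with
    # f(x) = mean of the coordinates, whose true integral is 1/2 for any d.
    errs = [abs(rng.random((n, d)).mean() - 0.5) for _ in range(reps)]
    return float(np.mean(errs))

# Same n^{-1/2} decay at every d for this (easy) integrand:
for d in (1, 10, 50):
    print(d, mc_error(d, 100), mc_error(d, 10_000))
```

The chapter's question is when the *constant* in front of n^{-1/2} stays polynomial in d for whole weighted function classes; this sketch only shows the rate itself is dimension-free.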

Wish I had found this when I got started : P (or, maybe not)

13.06.2025 15:16 — 👍 4    🔁 0    💬 1    📌 0

it is to mimic the name of their coffee place, which is "cafetoria"

04.06.2025 12:52 — 👍 2    🔁 0    💬 0    📌 0

As Edinburgh's appointed minister of Gelato by @nolovedeeplearning.bsky.social, I look forward to trying 😄

04.06.2025 12:48 — 👍 2    🔁 0    💬 1    📌 0

I have cleaned up the notebooks for my course on Optimal Transport for Machine Learners and added links to the slides and lecture notes. github.com/gpeyre/ot4ml

25.05.2025 09:12 — 👍 58    🔁 9    💬 1    📌 0
Towards Adaptive Self-Normalized Importance Samplers The self-normalized importance sampling (SNIS) estimator is a Monte Carlo estimator widely used to approximate expectations in statistical signal processing and machine learning. The efficiency of S...

🚨 New paper: "Towards Adaptive Self-Normalized IS", @ IEEE Statistical Signal Processing Workshop.

TL;DR:
To estimate µ = E_p[f(θ)] with SNIS, instead of doing MCMC on p(θ) or learning a parametric q(θ), we try MCMC directly on p(θ)|f(θ) − µ| (the variance-minimizing proposal).

arxiv.org/abs/2505.00372

02.05.2025 13:29 — 👍 31    🔁 11    💬 1    📌 0
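A 1-D toy sketch of why p(θ)|f(θ) − µ| is the right target (my own illustration, not the paper's algorithm: there the draws come from MCMC and µ is unknown, which is the whole adaptive difficulty). With p = N(0, 1) and f an indicator, we can sample the variance-minimizing proposal q* ∝ p|f − µ| by brute-force inverse CDF on a grid and compare SNIS under q* against iid sampling from p.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Target mu = E_p[f] with p = N(0, 1) and f = indicator(theta > 2).
mu = norm.sf(2.0)

def snis(theta, log_p, log_q):
    # Self-normalized importance sampling estimate of E_p[f].
    lw = log_p - log_q
    w = np.exp(lw - lw.max())
    w /= w.sum()
    return (w * (theta > 2.0)).sum()

# Grid-based inverse-CDF sampler for q*(theta) ∝ p(theta)|f(theta) - mu|
# (feasible only in 1-D; the paper replaces this with MCMC).
grid = np.linspace(-6.0, 6.0, 20_001)
dens = norm.pdf(grid) * np.abs((grid > 2.0) - mu)
dens_norm = dens / (dens.sum() * (grid[1] - grid[0]))
cdf = np.cumsum(dens)
cdf /= cdf[-1]

n, reps = 2_000, 300
plain, opt = [], []
for _ in range(reps):
    plain.append((rng.normal(size=n) > 2.0).mean())      # iid from p
    th = np.interp(rng.random(n), cdf, grid)             # draws from q*
    opt.append(snis(th, norm.logpdf(th), np.log(np.interp(th, grid, dens_norm))))

print(np.std(plain) / mu, np.std(opt) / mu)  # SNIS with q* has lower error
```

This is the "better than iid from p" phenomenon from the poster conversations, with q* chosen exactly as in the paper's title quantity.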

If one of the two distributions is an isotropic Gaussian, then flow matching is equivalent to a diffusion model. This is known as Tweedie's formula. In particular, the vector field is a gradient vector, as in optimal transport. speakerdeck.com/gpeyre/compu...

31.05.2025 10:16 — 👍 45    🔁 4    💬 1    📌 0
