
Konstantin Mishchenko

@konstmish.bsky.social

Research Scientist at Meta Paris. Code generation, math, optimization.

296 Followers  |  113 Following  |  5 Posts  |  Joined: 21.11.2024

Latest posts by konstmish.bsky.social on Bluesky

I think it's much more important we get a better scoring system for matching reviewers and papers. High affinity scores on OpenReview are often misleading. A lot of reviewers complained to me they get random papers from TMLR, and they don't enjoy reviewing as a consequence.

13.12.2024 16:15  |  👍 2  |  🔁 0  |  💬 1  |  📌 0

Cool new result: random arcsine stepsize schedule accelerates gradient descent (no momentum!) on separable problems. The separable class is clearly very limited, and it remains unclear if acceleration using stepsizes is possible on general convex problems.
arxiv.org/abs/2412.05790

10.12.2024 13:04  |  👍 3  |  🔁 0  |  💬 0  |  📌 0
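To make the claim in the post above concrete, here is a minimal sketch on a toy separable quadratic. It rests on assumptions not stated in the post: each stepsize is taken to be 1/x with x drawn i.i.d. from the arcsine distribution on [mu, L] (my reading of the schedule's name), and the function names, the problem, and the constants mu, L, T are all made up for illustration. It compares plain gradient descent with the constant stepsize 1/L against the same loop with random stepsizes.

```python
import numpy as np


def arcsine_stepsizes(mu, L, T, rng):
    """Draw T stepsizes 1/x, with x i.i.d. arcsine-distributed on [mu, L].

    Assumption: this is an illustrative reading of a "random arcsine
    stepsize schedule", not code from the linked paper.
    """
    # cos(theta) with theta ~ Uniform[0, pi] is arcsine-distributed on [-1, 1];
    # shift and scale it to the interval [mu, L].
    theta = rng.uniform(0.0, np.pi, size=T)
    x = 0.5 * (L + mu) + 0.5 * (L - mu) * np.cos(theta)
    return 1.0 / x


def gradient_descent(eigs, x0, stepsizes):
    """Run GD on the separable quadratic f(x) = 0.5 * sum(eigs * x**2)."""
    x = x0.copy()
    for h in stepsizes:
        x = x - h * (eigs * x)  # the gradient of f at x is eigs * x
    return x


# Hypothetical toy problem: 10 coordinates, curvatures spread over [mu, L].
mu, L, T = 1.0, 25.0, 1000
eigs = np.linspace(mu, L, 10)
x0 = np.ones_like(eigs)
rng = np.random.default_rng(0)

x_const = gradient_descent(eigs, x0, np.full(T, 1.0 / L))
x_arcsine = gradient_descent(eigs, x0, arcsine_stepsizes(mu, L, T, rng))

err_const = np.linalg.norm(x_const)      # distance to the minimizer x* = 0
err_arcsine = np.linalg.norm(x_arcsine)
print(f"constant 1/L stepsize:    {err_const:.3e}")
print(f"random arcsine stepsizes: {err_arcsine:.3e}")
```

Individual random steps can be far larger than 1/L and temporarily increase the error in some coordinates; any improvement only shows up in the product over many steps, and the output varies with the seed.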

The idea that one needs to know a lot of advanced math to start doing research in ML seems so wrong to me. Instead of reading books for weeks and forgetting most of them a year later, I think it's much better to try to do things, see what knowledge gaps prevent you from doing them, and only then read.

06.12.2024 14:26  |  👍 9  |  🔁 2  |  💬 4  |  📌 0

It's a bit hard to say because results of this kind are still quite new, but one of the most recent papers on the topic, arxiv.org/abs/2410.16249, mentions a conjecture on the optimality of its 1/n^{log₂(1+√2)} rate (not for the last iterate though).

27.11.2024 23:01  |  👍 1  |  🔁 0  |  💬 0  |  📌 0
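For a sense of scale, the exponent in the rate quoted above can be evaluated numerically. This is only a quick arithmetic check; the variable and function names are made up for illustration.

```python
import math

# Exponent of the rate 1/n^{log2(1 + sqrt(2))} quoted in the post.
exponent = math.log2(1 + math.sqrt(2))


def rate(n):
    """Value of 1/n^exponent, ignoring any constant factors."""
    return n ** (-exponent)


print(f"exponent = {exponent:.4f}")
print(f"n = 100: 1/n = {1 / 100:.2e}, 1/n^exponent = {rate(100):.2e}")
```

The exponent is roughly 1.27, so the quoted rate decays strictly faster than 1/n.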

Gradient Descent with large stepsizes converges faster than O(1/T) but it was only shown for the *best* iterate before. Cool to see new results showing we can also get an improvement for the last iterate:
arxiv.org/abs/2411.17668
I am still waiting to see a version with adaptive stepsizes though 👀

27.11.2024 15:02  |  👍 10  |  🔁 0  |  💬 1  |  📌 0
