@uuujf.bsky.social
Postdoc at Simons at UC Berkeley; alumnus of Johns Hopkins & Peking University; deep learning theory. https://uuujf.github.io
slides: uuujf.github.io/postdoc/wu20...
26.09.2025 03:49
GD dominates ridge
Sharing a new paper w/ Peter Bartlett, @jasondeanlee.bsky.social, @shamkakade.bsky.social, and Bin Yu.
People talk about implicit regularization, but how good is it? We show it's surprisingly effective: GD dominates ridge for linear regression, with more cool stuff on GD vs SGD.
arxiv.org/abs/2509.17251
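Not from the paper, but a minimal numpy sketch of the kind of comparison in question: run GD on unregularized least squares and track the test risk along its path, then compare against ridge regression over a grid of regularization strengths. The data, dimensions, and stepsize below are arbitrary choices of mine, not the paper's setup.

```python
# Toy sketch (my own illustration, not the paper's experiment): compare the test
# risk of GD iterates on unregularized least squares against ridge regression over
# a grid of regularization strengths, on synthetic Gaussian data.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                      # overparameterized linear regression
w_star = rng.normal(size=d) / np.sqrt(d)
X, Xte = rng.normal(size=(n, d)), rng.normal(size=(1000, d))
y = X @ w_star + 0.1 * rng.normal(size=n)
yte = Xte @ w_star

test_risk = lambda w: np.mean((Xte @ w - yte) ** 2)

# GD on the unregularized least-squares objective, tracking risk along the path.
L = np.linalg.norm(X, 2) ** 2 / n   # smoothness constant of the empirical loss
w, eta = np.zeros(d), 1.0 / L
gd_risks = []
for t in range(2000):
    w -= eta * X.T @ (X @ w - y) / n
    gd_risks.append(test_risk(w))

# Ridge solutions over a grid of regularization strengths.
ridge_risks = []
for lam in np.logspace(-4, 2, 50):
    w_ridge = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    ridge_risks.append(test_risk(w_ridge))

print(f"best GD iterate risk: {min(gd_risks):.4f}")
print(f"best ridge risk:      {min(ridge_risks):.4f}")
```

The snippet only illustrates the two estimators being compared; the paper's "GD dominates ridge" claim is the precise version of this comparison.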
Lassen in August
01.09.2025 21:03
I wrote an op-ed on the world-class STEM research ecosystem in the United States, and how this ecosystem is now under attack on multiple fronts by the current administration: newsletter.ofthebrave.org/p/im-an-awar...
18.08.2025 15:45
Congratulations to our colleague and friend, former Simons Institute Associate Director Peter Bartlett, who will be delivering one of the plenary lectures for the 2026 International Congress of Mathematicians.
www.simonsfoundation.org/2025/07/11/m...
My thoughts on the crucial importance of methodology on self-reported AI performance on mathematics competitions, and my policy on commenting on such reports going forward: mathstodon.xyz/@tao/1148814...
19.07.2025 22:37
Join us at COLT 2025 in Lyon for a community event!
When: Mon, June 30 | 16:00 CET
What: Fireside chat w/ Peter Bartlett & Vitaly Feldman on communicating a research agenda, followed by a mentorship roundtable to practice elevator pitches & mingle with the COLT community!
let-all.com/colt25.html
2/2 For regularized logistic regression (strongly convex and smooth) with separable data, we show that GD, simply with a large stepsize, can match Nesterov's acceleration, among other cool results.
arxiv.org/abs/2506.02336
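As a rough illustration only (not the algorithm or stepsize schedule analyzed in the paper), here is a toy comparison on synthetic separable data between GD with the classical 1/L stepsize, GD with a much larger stepsize, and Nesterov's accelerated method for strongly convex objectives. The 50/L stepsize and all problem parameters are hypothetical choices of mine.

```python
# Toy illustration (my own sketch, not the paper's algorithm): l2-regularized
# logistic regression on linearly separable data, comparing vanilla GD with a
# small (1/L) stepsize, GD with a much larger stepsize, and Nesterov's
# accelerated method for strongly convex objectives.
import numpy as np

rng = np.random.default_rng(1)
n, d, mu = 200, 20, 1e-3            # mu: strength of the l2 regularizer
w_star = rng.normal(size=d); w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star); y[y == 0] = 1.0   # separable labels

def loss(w):
    return np.mean(np.logaddexp(0.0, -y * (X @ w))) + 0.5 * mu * w @ w

def grad(w):
    m = y * (X @ w)
    p = 0.5 * (1.0 - np.tanh(0.5 * m))      # numerically stable sigmoid(-m)
    return -(X.T @ (y * p)) / n + mu * w

L = 0.25 * np.linalg.norm(X, 2) ** 2 / n + mu   # smoothness of the objective

def gd(eta, T=500):
    w = np.zeros(d)
    for _ in range(T):
        w = w - eta * grad(w)
    return loss(w)

def nesterov(T=500):
    kappa = L / mu
    beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
    w = v = np.zeros(d)
    for _ in range(T):
        w_new = v - grad(v) / L
        v = w_new + beta * (w_new - w)
        w = w_new
    return loss(w)

print("GD, stepsize 1/L :", gd(1.0 / L))
print("GD, stepsize 50/L:", gd(50.0 / L))   # hypothetical "large" stepsize
print("Nesterov         :", nesterov())
```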
1/2 For the task of finding a linear separator of a separable dataset with margin gamma, 1/gamma^2 steps suffice for adaptive GD with large stepsizes (applied to the logistic loss). This is minimax optimal for first-order methods, and is impossible for GD with small stepsizes.
arxiv.org/abs/2504.04105
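Again a toy stand-in rather than the paper's adaptive scheme: the snippet builds a dataset with margin gamma, then counts how many GD steps on the logistic loss are needed before the iterate separates the data, once with the classical 1/L stepsize and once with normalized-gradient steps as a crude proxy for "large adaptive stepsizes." All constants are my own illustrative choices.

```python
# Illustration only (a stand-in, not the adaptive scheme from the paper): count GD
# steps on the logistic loss until the iterate linearly separates a margin-gamma
# dataset, comparing a small 1/L stepsize with normalized-gradient steps.
import numpy as np

rng = np.random.default_rng(2)
n, d, gamma = 200, 50, 0.05
u = rng.normal(size=d); u /= np.linalg.norm(u)
X = rng.normal(size=(n, d)); X /= np.linalg.norm(X, axis=1, keepdims=True)
y = np.sign(X @ u); y[y == 0] = 1.0
X += gamma * y[:, None] * u          # push every point to margin >= ~gamma along u
X /= np.linalg.norm(X, axis=1, keepdims=True)

def grad(w):
    m = y * (X @ w)
    p = 0.5 * (1.0 - np.tanh(0.5 * m))   # numerically stable sigmoid(-m)
    return -(X.T @ (y * p)) / n

def steps_to_separate(stepsize_rule, T=20000):
    w = np.zeros(d)
    for t in range(1, T + 1):
        g = grad(w)
        w = w - stepsize_rule(g) * g
        if np.all(y * (X @ w) > 0):
            return t
    return None   # not separated within T steps

L = 0.25  # the mean logistic loss with unit-norm features is at most (1/4)-smooth
print("small stepsize 1/L       :", steps_to_separate(lambda g: 1.0 / L))
print("normalized (large) steps :", steps_to_separate(lambda g: 1.0 / (np.linalg.norm(g) + 1e-12)))
print("1/gamma^2 =", int(1.0 / gamma ** 2))
```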
effects of stepsize for GD
Sharing two new papers on accelerating GD via large stepsizes!
Classical GD analysis assumes small stepsizes for stability. In practice, however, GD is often used with large stepsizes, which lead to instability; a toy illustration of this stability threshold follows below.
See my slides for more details on this topic: uuujf.github.io/postdoc/wu20...
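The sketch below (my own toy example, not from the papers) shows the classical stability threshold: on a quadratic with smoothness L, GD contracts only when the stepsize is below 2/L, and oscillates or diverges above it. The papers show that on logistic-type losses, large stepsizes can nonetheless accelerate convergence.

```python
# Minimal sketch of the classical stability threshold: for f(w) = (L/2) * w**2,
# GD contracts iff the stepsize is below 2/L; above that, iterates oscillate and
# blow up.
import numpy as np

L = 1.0                      # smoothness constant of f(w) = (L/2) * w**2
grad = lambda w: L * w

def run_gd(eta, w0=1.0, T=10):
    w, path = w0, [w0]
    for _ in range(T):
        w -= eta * grad(w)
        path.append(w)
    return np.array(path)

print("stepsize 0.5/L (stable)  :", np.round(run_gd(0.5 / L), 3))
print("stepsize 1.9/L (edge)    :", np.round(run_gd(1.9 / L), 3))
print("stepsize 2.5/L (unstable):", np.round(run_gd(2.5 / L), 3))
```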
Jingfeng Wu, Pierre Marion, Peter Bartlett
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
https://arxiv.org/abs/2506.02336
Rocky Mountain in May
19.05.2025 17:24
Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!
Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!
Deadline: May 19, 2025
We were very lucky to have Peter Bartlett visit @uwcheritoncs.bsky.social and give a Distinguished Lecture on "Gradient Optimization Methods: The Benefits of a Large Step-size." Very interesting and surprising results.
(Recording will be available eventually)
I wrote a post on how to connect with people (i.e., make friends) at CS conferences. These events can be intimidating, so here are some suggestions on how to navigate them.
I'm late for #ICLR2025 #NAACL2025, but in time for #AISTATS2025 #ICML2025! 1/3
kamathematics.wordpress.com/2025/05/01/t...
Yosemite in April
28.04.2025 17:36
Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
https://arxiv.org/abs/2504.04105
Join us for a week of talks on The Future of Language Models and Transformers at the Simons Institute. Talks by @profsanjeevarora.bsky.social, Azalia Mirhoseini, Kilian Weinberger and others. Mon, March 31 - Fri, April 4.
simons.berkeley.edu/workshops/future-language-models-transformers