Anh Ta

@anhta24.bsky.social

Mathematician by training. Geometry and Combinatorics. Machine Learning and Cryptography now. https://scholar.google.com/citations?user=1y0vv1wAAAAJ&hl=en

91 Followers  |  1,066 Following  |  14 Posts  |  Joined: 18.11.2024

Latest posts by anhta24.bsky.social on Bluesky

Fabian Falck, Teodora Pandeva, Kiarash Zahirnia, Rachel Lawrence, Richard Turner, Edward Meeds, Javier Zazo, Sushrut Karmalkar
A Fourier Space Perspective on Diffusion Models
https://arxiv.org/abs/2505.11278

19.05.2025 05:03 — 👍 1    🔁 1    💬 0    📌 0

We now have a whole YouTube video explaining our MINDcraft paper, check it out!
youtu.be/MeEcxh9St24

10.05.2025 20:08 — 👍 9    🔁 3    💬 1    📌 0

We define a new cryptographic system that allows a user to prove they hold a valid certificate from a public set of authorities, while hiding the message, the signature, and the identity of the issuing authority.

30.04.2025 01:06 — 👍 0    🔁 0    💬 0    📌 0

When using a digital certificate, one usually obtains a signature from some authority, then shows the message and signature for verification.

30.04.2025 01:06 — 👍 0    🔁 0    💬 1    📌 0
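The post above describes the standard certificate flow: an authority signs, and the user later presents the message and signature for verification. A minimal sketch of that flow with textbook RSA (toy, insecure parameters, for illustration only; the proposed scheme additionally hides the message, signature, and authority identity, which this sketch does not attempt):

```python
import hashlib

# Toy authority keypair: textbook RSA with tiny primes.
# INSECURE -- for illustrating the sign/verify roles only.
P, Q = 1000003, 1000033
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def h(message: bytes) -> int:
    """Hash the message into Z_N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def authority_sign(message: bytes) -> int:
    """The authority issues a signature on the user's message."""
    return pow(h(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public key (N, E) can check the pair."""
    return pow(signature, E, N) == h(message)

cert = b"user: anhta24, issued by authority A"
sig = authority_sign(cert)
assert verify(cert, sig)                  # genuine pair passes
assert not verify(b"user: mallory", sig)  # mismatched message fails
```

Note that verification here necessarily reveals the message, the signature, and which authority's key was used; hiding all three is exactly what the proposed system adds.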
A visualization of compressed column evaluation in sparse autodiff. Here, columns 1, 2 and 5 of the matrix (in yellow) have no overlap in their sparsity patterns. Thus, they can be evaluated together by multiplication with a sum of basis vectors (in purple).

Wanna learn about autodiff and sparsity? Check out our #ICLR2025 blog post with @adrhill.bsky.social and Alexis Montoison. It has everything you need: matrices with lots of zeros, weird compiler tricks, graph coloring techniques, and a bunch of pretty pics!
iclr-blogposts.github.io/2025/blog/sp...

28.04.2025 17:07 — 👍 50    🔁 12    💬 1    📌 0
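The compression trick in the figure can be sketched in NumPy on a hypothetical 5x5 sparse matrix: because the grouped columns touch disjoint rows, a single product with the sum of their basis vectors evaluates all of them at once (in autodiff, one JVP instead of three):

```python
import numpy as np

# Hypothetical sparse matrix; columns 0, 1 and 4 (the post's columns
# 1, 2 and 5 in 1-based numbering) have disjoint row sparsity patterns.
J = np.array([
    [1., 0., 2., 0., 0.],
    [0., 3., 0., 1., 0.],
    [0., 0., 4., 0., 5.],
    [6., 0., 0., 2., 0.],
    [0., 7., 0., 0., 0.],
])

group = [0, 1, 4]              # one "color": mutually disjoint columns
seed = np.zeros(5)
seed[group] = 1.0              # sum of basis vectors e1 + e2 + e5

# One matrix-vector product evaluates all three columns together.
compressed = J @ seed

# Decompression: scatter entries back using each column's known
# sparsity pattern; no two columns in the group share a row.
recovered = np.zeros_like(J)
for j in group:
    rows = np.nonzero(J[:, j])[0]
    recovered[rows, j] = compressed[rows]

assert np.allclose(recovered[:, group], J[:, group])
```

In real sparse autodiff the sparsity pattern comes from a detection step and the groups from graph coloring; here both are hard-coded for illustration.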
The DeepSeek Series: A Technical Overview An overview of the papers describing the evolution of DeepSeek

Recently, my colleague Shayan Mohanty published a technical overview of the papers describing DeepSeek. He has since revised the article, adding more explanation to make it more digestible for those of us without a background in this field.

martinfowler.com/articles/dee...

21.04.2025 13:20 — 👍 62    🔁 14    💬 2    📌 1
Huawei's Dream 7B (Diffusion reasoning model), the most powerful open diffusion large language model to date.

Blog: hkunlp.github.io/blog/2025/dr...

02.04.2025 14:50 — 👍 24    🔁 4    💬 1    📌 2
A Tour of Reinforcement Learning: The View from Continuous Control This manuscript surveys reinforcement learning from the perspective of optimization and control with a focus on continuous control applications. It surveys the general formulation, terminology, and ty...

This week's #PaperILike is "A Tour of Reinforcement Learning: The View from Continuous Control" (Recht 2018).

Pairs well with the PaperILiked last week -- another good bridge between RL and control theory.

PDF: arxiv.org/abs/1806.09460

09.03.2025 15:32 — 👍 7    🔁 1    💬 0    📌 0
A screenshot of the course description

I taught a grad course on AI Agents at UCSD CSE this past quarter. All lecture slides, homeworks & course projects are now open sourced!

I provide a grounding going from Classical Planning & Simulations -> RL Control -> LLMs and how to put it all together
pearls-lab.github.io/ai-agents-co...

04.03.2025 16:37 — 👍 38    🔁 9    💬 3    📌 1
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming In this paper we describe a new conceptual framework that connects approximate Dynamic Programming (DP), Model Predictive Control (MPC), and Reinforcement Learning (RL). This framework centers around ...

This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024).

If you know 1 of {RL, controls} and want to understand the other, this is a good starting point.

PDF: arxiv.org/abs/2406.00592

02.03.2025 16:19 — 👍 43    🔁 8    💬 0    📌 0
David Picard

I updated my ML lecture material: davidpicard.github.io/teaching/
I show many (boomer) ML algorithms with working implementation to prevent the black box effect.
Everything is done in notebooks so that students can play with the algorithms.
Book-ish pdf export: davidpicard.github.io/pdf/poly.pdf

27.02.2025 19:09 — 👍 37    🔁 6    💬 0    📌 0
Our beginner-oriented, accessible introduction to modern deep RL is now published in Foundations and Trends in Optimization. It is a great entry point to the field if you want to jumpstart into RL!
@bernhard-jaeger.bsky.social
www.nowpublishers.com/article/Deta...
arxiv.org/abs/2312.08365

22.02.2025 19:32 — 👍 62    🔁 14    💬 2    📌 0

KS studies the Matrix Multiplication Verification Problem (MMV), in which you get three n x n matrices A, B, C (say, with poly(n)-bounded integer entries) and want to decide whether AB = C. This is trivial to solve in MM time O(n^omega) deterministically: compute AB and compare it with C. 2/

21.02.2025 04:50 — 👍 1    🔁 1    💬 1    📌 0
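For contrast with the deterministic O(n^omega) route above, the standard randomized verifier for MMV is Freivalds' check, which runs in O(n^2) per trial with one-sided error. A NumPy sketch (a classical algorithm, not the thread's paper):

```python
import numpy as np

def freivalds(A, B, C, trials=20, seed=None):
    """Randomized check of A @ B == C, O(n^2) per trial.

    If A @ B == C the check always passes; otherwise each trial
    detects the mismatch with probability >= 1/2, so the overall
    error probability is at most 2**-trials.
    """
    rng = np.random.default_rng(seed)
    n = C.shape[1]
    for _ in range(trials):
        x = rng.integers(0, 2, size=n)          # random 0/1 vector
        # Two matrix-vector products instead of a full matmul.
        if not np.array_equal(A @ (B @ x), C @ x):
            return False
    return True

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(50, 50))
B = rng.integers(-5, 5, size=(50, 50))
C = A @ B
assert freivalds(A, B, C, seed=0)   # a correct product always passes
C[3, 7] += 1                        # corrupt a single entry
assert not freivalds(A, B, C, seed=0)
```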
Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels.

sakana.ai/ai-cuda-engi...

The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch.

Examples:

20.02.2025 01:50 — 👍 89    🔁 17    💬 3    📌 4

Why on earth did somebody think of doing this in the first place?

17.02.2025 11:56 — 👍 0    🔁 0    💬 0    📌 0

Lorenzo Pastori, Arthur Grundner, Veronika Eyring, Mierk Schwabe
Quantum Neural Networks for Cloud Cover Parameterizations in Climate Models
https://arxiv.org/abs/2502.10131

17.02.2025 05:35 — 👍 1    🔁 1    💬 0    📌 0

Przemysław Pawlitko, Natalia Moćko, Marcin Niemiec, Piotr Chołda
Implementation and Analysis of Regev's Quantum Factorization Algorithm
https://arxiv.org/abs/2502.09772

17.02.2025 07:19 — 👍 1    🔁 1    💬 0    📌 0
Digital illustration of a school of red fish with simple geometric features on a purple background. Some fish are connected by curved dashed and solid lines, suggesting interactions or relationships between them.

Enjoyed sharing our work on electric fish with @dryohanjohn.bsky.social ⚡🐟 Their electric "conversations" help us build models to discover neural mechanisms of social cognition. Work led by Sonja Johnson-Yu & @satpreetsingh.bsky.social with Nate Sawtell

kempnerinstitute.harvard.edu/news/what-el...

14.02.2025 21:16 — 👍 43    🔁 9    💬 1    📌 1
Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker).
We performed the largest-ever comparison of these algorithms.
We find that they do not outperform generic policy gradient methods, such as PPO.
arxiv.org/abs/2502.08938
1/N

14.02.2025 18:41 — 👍 93    🔁 20    💬 3    📌 4
🔥 Want to train large neural networks WITHOUT Adam while using less memory and getting better results? ⚡
Check out SCION: a new optimizer that adapts to the geometry of your problem using norm-constrained linear minimization oracles (LMOs): 🧵👇

13.02.2025 16:51 — 👍 17    🔁 6    💬 3    📌 1
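Not SCION itself, but the building block the post names: a norm-constrained LMO returns the point in a norm ball that minimizes the linear model <g, s>, and the choice of norm is where the problem's geometry enters. A minimal sketch for two common norms (toy gradient, illustrative only):

```python
import numpy as np

def lmo_l2(g, radius):
    """argmin of <g, s> over the L2 ball: scaled negative gradient."""
    return -radius * g / np.linalg.norm(g)

def lmo_linf(g, radius):
    """argmin of <g, s> over the Linf ball: scaled sign vector."""
    return -radius * np.sign(g)

g = np.array([3.0, -4.0])          # toy gradient
s2 = lmo_l2(g, radius=1.0)
si = lmo_linf(g, radius=1.0)

# Each oracle attains minus the dual norm of g over its ball:
assert np.isclose(g @ s2, -5.0)    # -||g||_2
assert np.isclose(g @ si, -7.0)    # -||g||_1
```

An LMO-based update then steps along s rather than the raw gradient, which is how a norm constraint shapes the step without per-parameter adaptive scaling.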
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models Consistency models (CMs) are a powerful class of diffusion-based generative models optimized for fast sampling. Most existing CMs are trained using discretized timesteps, which introduce additional hy...

this paper is a pretty impressive tour de force in neural network training: arxiv.org/abs/2410.11081

pretty inspiring to me -- network isn't converging? rigorously monitor every term in your loss to identify where in the architecture something is going wrong!

13.02.2025 12:52 — 👍 9    🔁 1    💬 0    📌 0
Towards Integrating Personal Knowledge into Test-Time Predictions Machine learning (ML) models can make decisions based on large amounts of data, but they can be missing personal knowledge available to human users about whom predictions are made. For example, a mode...

Obsessed with the work coming out of Finale Doshi-Velez's group; they don't just take the real-world limits of ML deployment seriously, they turn them into new algorithmic ideas
arxiv.org/abs/2406.08636

13.02.2025 04:13 — 👍 63    🔁 9    💬 1    📌 1

Our new paper with @chrismlangdon is just out in @natureneuro.bsky.social! We show that high-dimensional RNNs use low-dimensional circuit mechanisms for cognitive tasks and identify a latent inhibitory mechanism for context-dependent decisions in PFC data.
www.nature.com/articles/s41...

12.02.2025 18:19 — 👍 71    🔁 24    💬 0    📌 1

I just checked the data on accepted papers at ICLR '25. The author with the most submissions had 21 accepted out of 42 submitted. Oh well!

10.02.2025 20:50 — 👍 0    🔁 0    💬 0    📌 0
Also note that, instead of adding KL penalty in the reward, GRPO regularizes by directly adding the KL divergence between the trained policy and the reference policy to the loss, avoiding complicating the calculation of the advantage.

@xtimv.bsky.social and I were just discussing this interesting comment in the DeepSeek paper introducing GRPO: a different way of setting up the KL loss.

It's a little hard to reason about what this does to the objective. 1/

10.02.2025 04:32 — 👍 50    🔁 10    💬 3    📌 0
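A minimal sketch of the placement under discussion, using the nonnegative per-token KL estimator described in the GRPO paper; the probabilities, advantages, and beta value here are toy stand-ins:

```python
import numpy as np

def kl_term(logp, logp_ref):
    """Per-token estimator pi_ref/pi - log(pi_ref/pi) - 1.

    Unbiased for KL(pi || pi_ref) under samples from pi, and
    pointwise nonnegative, so it can sit directly in the loss.
    """
    ratio = np.exp(logp_ref - logp)
    return ratio - (logp_ref - logp) - 1.0

# Toy per-token log-probs under the trained and reference policies.
logp = np.log(np.array([0.5, 0.2, 0.3]))
logp_ref = np.log(np.array([0.4, 0.3, 0.3]))

kl = kl_term(logp, logp_ref)
assert np.all(kl >= 0.0)         # pointwise nonnegative
assert np.isclose(kl[2], 0.0)    # identical probs contribute zero

# Loss-term placement: the advantages stay KL-free; the penalty is
# added to the objective rather than folded into the reward.
advantages = np.array([1.0, -0.5, 0.2])   # toy group-normalized advantages
beta = 0.04
loss = -(advantages * logp).mean() + beta * kl.mean()
```

Because the KL never enters the reward, the group-relative advantage computation is untouched, which is the simplification the post is pointing at.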

Restarting an old routine "Daily Dose of Good Papers" together w @vaibhavadlakha.bsky.social

Sharing my notes and thoughts here 🧡

23.11.2024 00:04 — 👍 61    🔁 8    💬 5    📌 3
It's finally out!

Visual experience orthogonalizes visual cortical responses

Training in a visual task changes V1 tuning curves in odd ways. This effect is explained by a simple convex transformation. It orthogonalizes the population, making it easier to decode.

10.1016/j.celrep.2025.115235

02.02.2025 09:59 — 👍 153    🔁 44    💬 5    📌 2
group relative policy optimization (GRPO)

A friendly intro to GRPO. The algorithm is quite simple and elegant when you compare it to PPO, TRPO etc - and it's remarkable how well that worked out for deepseek R1.

superb-makemake-3a4.notion.site/group-relati...

01.02.2025 07:54 — 👍 33    🔁 4    💬 0    📌 1

Eugen Coroi, Changhun Oh
Exponential advantage in continuous-variable quantum state learning
https://arxiv.org/abs/2501.17633

30.01.2025 06:51 — 👍 1    🔁 1    💬 0    📌 0
Can Transformers Do Enumerative Geometry? How can Transformers model and learn enumerative geometry? What is a robust procedure for using Transformers in abductive knowledge discovery within a mathematician-machine collaboration? In this work...

I am extremely happy to announce that our paper
Can Transformers Do Enumerative Geometry? (arxiv.org/abs/2408.14915) has been accepted to the
@iclr-conf.bsky.social!!
Congrats to my collaborators Alessandro Giacchetto at ETH Zürich and Roderic G. Corominas at Harvard.
#ICLR2025 #AI4Math #ORIGINS

23.01.2025 10:17 — 👍 12    🔁 3    💬 1    📌 2
