30.11.2025 21:53
At NeurIPS this week? DM if you want to meet!

30.11.2025 21:53
Congrats Eugene!

02.10.2025 17:29
Happy to announce that I've joined Periodic Labs as a member of technical staff. We're a mission-driven startup aimed at accelerating scientific discovery using AI, with a strong focus on materials science (discovery of new materials such as superconductors). We're hiring: periodic.com

02.10.2025 17:28
Follow me for more CS cuisine!
19.03.2025 14:57
The fact that LLM libraries don't all share the same data format is as surprising as the fact that there is more than one sign-language dialect.
06.03.2025 03:42
Ray is an excellent way of testing whether all your `__repr__` methods are implemented properly (but it shouldn't be).
06.03.2025 02:26
Just stumbled upon RouteRL: a multi-agent RL framework to facilitate the testing and development of efficient route choice strategies.
coexistence-project.github.io/RouteRL/
Looks pretty cool!
What is GGUF, Safetensors, PyTorch, ONNX?
In this blog post, let's discover common formats for storing an AI model.
huggingface.co/blog/ngxson/...
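Along those lines, most of these formats can be told apart from their leading bytes: GGUF files start with the ASCII magic "GGUF", PyTorch checkpoints saved with modern torch.save() are zip archives, and safetensors files open with an 8-byte little-endian header length followed by a JSON header. A small stdlib-only sniffer, as a sketch (the function name is mine; ONNX, being plain protobuf without an obvious magic, is left out):

```python
import json
import os
import struct
import tempfile

def sniff_model_format(path):
    """Best-effort guess of a model file's format from its leading bytes."""
    with open(path, "rb") as f:
        head = f.read(16)
    if head[:4] == b"GGUF":
        return "gguf"            # GGUF files start with the ASCII magic "GGUF"
    if head[:2] == b"PK":
        return "pytorch-zip"     # modern torch.save() checkpoints are zip archives
    if len(head) >= 9:
        (header_len,) = struct.unpack("<Q", head[:8])
        # safetensors: 8-byte little-endian header size, then a JSON header
        if 0 < header_len < 100_000_000 and head[8:9] == b"{":
            return "safetensors"
    return "unknown"

# Self-check on synthetic files:
tmpdir = tempfile.mkdtemp()
gguf_path = os.path.join(tmpdir, "demo.gguf")
with open(gguf_path, "wb") as f:
    f.write(b"GGUF" + b"\x00" * 12)

st_path = os.path.join(tmpdir, "demo.safetensors")
header = json.dumps({"w": {"dtype": "F32"}}).encode()
with open(st_path, "wb") as f:
    f.write(struct.pack("<Q", len(header)) + header)

gguf_guess = sniff_model_format(gguf_path)
st_guess = sniff_model_format(st_path)
```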
MLGym makes it super easy to set up complex tasks to be solved by LLMs. Honestly one of the most intuitive APIs I have ever seen in that space!
21.02.2025 16:44
After that, your LLM reads these instructions and outputs commands along with some thoughts. The commands are executed in the container's bash, and the result is returned to the agent.
21.02.2025 16:44
Today we're open-sourcing MLGym, an API for AI research agents.
MLGym relies on a gym environment that wraps a Docker image. Each env has a task specified as a YAML file, describing in plain English what you want your LLM to achieve.
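To make the idea concrete, here is a toy Gym-style environment where the "action" is a shell command and the "observation" is its output. This is NOT MLGym's actual API: it runs commands on the host rather than inside a Docker container, and the class and its methods are hypothetical, just a sketch of the loop described above.

```python
import subprocess

class ToyShellEnv:
    """Toy Gym-style env: action = shell command, observation = its output.
    MLGym wraps a Docker container; this runs on the host for illustration."""

    def __init__(self, task_description):
        self.task = task_description

    def reset(self):
        # The initial observation is the task, stated in plain English.
        return self.task

    def step(self, command):
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=30
        )
        obs = result.stdout + result.stderr
        done = False  # a real env would check for task completion here
        return obs, done

env = ToyShellEnv("Print a greeting.")
initial_obs = env.reset()
obs, done = env.step("echo hello")
```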
Good old cProfile with snakeviz is pretty cool too: jiffyclub.github.io/snakeviz/
Again, not for CUDA ops, and not as fine-grained as line_profiler, but quite useful for macro-tracking of compute time.
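The cProfile workflow is all stdlib; only the visualization step needs snakeviz. A minimal sketch (the function being profiled is a stand-in):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Stand-in for real work worth profiling.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Text report of where time went (snakeviz visualizes this same data).
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()

# To browse interactively instead:
#   profiler.dump_stats("profile.prof")
#   then run `snakeviz profile.prof` in a terminal.
```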
torch.utils.benchmark.Timer is amazing for assessing the runtime of a whole isolated piece of code, but be mindful that the way it handles global variables isn't always obvious and may differ from time.time() on occasion.
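A minimal sketch of that Timer, assuming torch is installed. Note how the statement runs in its own namespace: anything it references must be passed in explicitly through `globals`, which is exactly where the surprises relative to time.time() tend to come from.

```python
import torch
from torch.utils import benchmark

x = torch.randn(256, 256)

# The stmt is compiled in an isolated namespace: everything it needs
# must be handed over via `globals`.
timer = benchmark.Timer(
    stmt="x @ x",
    globals={"x": x},
)
measurement = timer.timeit(100)  # runs the statement 100 times
median_s = measurement.median    # seconds per run
```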
19.02.2025 17:13
I use line_profiler to check the code line by line (careful: CUDA ops are async, do not trust it for these!). Very useful to check CPU overhead. pypi.org/project/line...
19.02.2025 17:13
The profilers I use: the PyTorch profiler to view the time spent doing the various ops of my code. It can reliably show you what's going on for a single iteration of your function. pytorch.org/tutorials/re...
19.02.2025 17:13
In general, in-place operations are not preferable to regular ones (you won't gain much memory improvement or speed-up). Don't load your code with ReLU(inplace=True), mul_, add_ unless absolutely necessary.
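For readers less familiar with the distinction, a quick sketch of the semantics (assuming torch is installed): out-of-place ops allocate a fresh tensor, while in-place ops mutate their input and return the very same object. The point of the post is that in eager mode this rarely buys you much, so prefer the clearer out-of-place form.

```python
import torch

x = torch.zeros(3)

# Out-of-place: allocates a new tensor; x is untouched by this call.
y = x.add(1)

# In-place: mutates x and returns the very same tensor object.
z = x.add_(1)
```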
19.02.2025 17:13
Using hydra or similar fancy config objects: avoid calling cfg.attribute often in the code. Instead, cache the arg values in your script as global workspace variables.
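The pattern, sketched with a plain SimpleNamespace standing in for a hydra/OmegaConf config (real OmegaConf attribute access is far costlier than this, which is why the caching pays off):

```python
from types import SimpleNamespace

cfg = SimpleNamespace(scale=2.0, offset=1.0)  # stand-in for a hydra cfg

def transform_slow(values):
    # Attribute lookups on cfg happen on every single iteration.
    return [v * cfg.scale + cfg.offset for v in values]

def transform_fast(values):
    # Cache the attribute values once, outside the hot loop.
    scale, offset = cfg.scale, cfg.offset
    return [v * scale + offset for v in values]
```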
19.02.2025 17:13
If you have a tiny model (robotics, RL) that is CPU-overhead bound, avoid frequent calls to eval() or train() in eager mode, or to model.parameters() or anything that traverses your model. Prefer cached versions of these calls.
19.02.2025 17:13
Avoid calling tensor.item() in between CUDA operations. This triggers a CUDA synchronization and blocks your code. Do the logging after all the work (forward / backward / optim step) has completed. See how to find sync points here.
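The deferred-logging pattern, sketched on CPU tensors so it runs anywhere torch is installed (on CUDA, each mid-loop .item() would force a device sync; on CPU it is merely redundant work):

```python
import torch

losses = []
for step in range(5):
    loss = (torch.randn(8) ** 2).mean()  # stand-in for one training step
    losses.append(loss.detach())         # keep it as a tensor: no sync here

# A single round of transfers at the end, after all compute has been queued.
logged = [l.item() for l in losses]
```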
19.02.2025 17:13
Avoid pinning memory in your code unless you have thoroughly tested that it accelerates runtime (see this tutorial for more info). As an aside, pin_memory is also less safe! pytorch.org/tutorials/in...
19.02.2025 17:13
Don't send tensors to a device using to(device) if you can instantiate them directly there. For instance, prefer randn((), device=device) to randn(()).to(device).
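Side by side, assuming torch is installed (using "cpu" here so the sketch runs anywhere; the saving matters on "cuda", where the first form does an extra allocation plus a host-to-device copy):

```python
import torch

device = torch.device("cpu")  # stand-in; the point matters most for "cuda"

# Two steps: allocate on the default device, then copy to the target.
a = torch.randn(()).to(device)

# One step: allocate directly where the tensor is needed.
b = torch.randn((), device=device)
```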
19.02.2025 17:13
A few tips I share when I talk about perf with PyTorch in eager mode (with a focus on small models):
19.02.2025 17:13
I guess my point was that a proper name + definition is necessary to write good code. When I see "policy", "critic", "replay buffer", or "env", I know exactly what does and doesn't belong to them. With "agent" it's systematically a "hm, yeah, why not" - then you end up with ill-defined monster classes.
11.02.2025 09:14
If your agent is a policy, call it a policy; if it's a trainer, call it a trainer! If it's just a big undefined collection of methods, consider refactoring it...
11.02.2025 08:52
Every time I meet with people and someone talks about agents, there's at least one person who asks "what do you mean by agent?" or "you should not call that an agent".
11.02.2025 08:52
I stand by my point that the word "agent" should be avoided at all costs.
At least in RL, anytime I see an "Agent" class it's meant to be a "whatever doesn't fit in any other bucket in my codebase".
hard to tell, let's try :D
06.02.2025 15:42
Everyone's like "hey, I just coded and trained a SOTA LLM in my garage last week, also wrote a blog post about it and open-sourced the repo", and the only thing I did in the meantime was fix a CI and configure a remote interpreter on a server...
06.02.2025 15:37
Side note: we saw some nice adoption from DeepSeek-R1 reproduction repos, which is humbling, not to say thrilling!
github.com/Jiayi-Pan/Ti...