
Janu Verma

@januverma.bsky.social

Principal Applied Scientist, Microsoft. Interested in AI, RecSys, Maths. Trains and fine-tunes models. januverma.substack.com

32 Followers  |  132 Following  |  37 Posts  |  Joined: 19.11.2024

Latest posts by januverma.bsky.social on Bluesky


Been thinking about the trends at the intersection of AI and RecSys. Where are we heading? Based on my own work, extensive research, and a lot of analysis and thinking, I have put together my thoughts in a detailed article on Substack.
Link: open.substack.com/pub/januverm...

29.01.2026 10:39 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

When we try to fight Cold Shyness with willpower, we usually lose. The brain is too good at bargaining for comfort. The thing that fixes it is low-stakes repetition.
I wrote about why I stopped trying to "Win January" and started treating it as a training block for a February 1st "official" start.

05.01.2026 18:13 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

There is a specific kind of friction that hits on January 1st. I call it "Cold Shyness." It’s that reluctance to be out thereβ€”whether physically in the cold or metaphorically in a new skillβ€”exposed and uncomfortable.

05.01.2026 18:13 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Multi-Turn Tool Use with RL: Think → Code → Check → Answer

New work: multi-turn tool use with RL
Link: open.substack.com/pub/januverm...

04.11.2025 15:06 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
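The Think → Code → Check → Answer loop named in the post's title could be sketched roughly as below. This is a hypothetical illustration, not the article's implementation: `call_model` and `run_sandbox` are stand-in stubs for an RL-fine-tuned LLM and a code-execution tool.

```python
# Hypothetical Think -> Code -> Check -> Answer loop (illustrative only).

def call_model(prompt: str) -> str:
    # Stub: a real system would query an RL-fine-tuned LLM here.
    return "print(2 + 2)"

def run_sandbox(code: str) -> str:
    # Stub: execute model-written code and capture its stdout.
    import io, contextlib
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def solve(question: str, max_turns: int = 3) -> str:
    transcript = question
    for _ in range(max_turns):
        code = call_model(transcript)       # Think: model proposes code
        result = run_sandbox(code)          # Code: tool executes it
        transcript += f"\nObservation: {result}"
        if result:                          # Check: stop once output exists
            return result                   # Answer
    return "no answer"

print(solve("What is 2 + 2?"))  # -> 4
```

In an RL setup, the reward would typically be attached to the final answer, with the intermediate tool observations appended to the transcript exactly as above.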

Whether through multi-task learning, auxiliary objectives, or simply smarter input design, giving models context unlocks generalization, robustness, and sometimes surprising insights.

It’s a good reminder: the best models don’t just predict, they understand.

11.07.2025 12:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Auxiliary Tasks: When training for sentiment analysis, add an auxiliary task like predicting part-of-speech tags. A better understanding of grammar leads to a better understanding of sentiment.

11.07.2025 12:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
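A minimal sketch of what this auxiliary-task setup could look like: one shared encoder with a sentiment head and a POS head, trained on a weighted sum of the two losses. All names, sizes, and the 0.3 auxiliary weight are illustrative assumptions, not from the post.

```python
import numpy as np

# Toy multi-task model: shared encoder + two task heads.
rng = np.random.default_rng(0)
W_shared = rng.normal(size=(8, 4))   # shared text encoder (toy linear layer)
W_sent = rng.normal(size=(4, 2))     # sentiment head: 2 classes
W_pos = rng.normal(size=(4, 5))      # auxiliary POS head: 5 tag classes

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, label):
    return -np.log(probs[label])

x = rng.normal(size=8)               # toy input features
h = np.tanh(x @ W_shared)            # shared representation

loss_main = cross_entropy(softmax(h @ W_sent), label=1)   # sentiment
loss_aux = cross_entropy(softmax(h @ W_pos), label=3)     # POS tag
loss = loss_main + 0.3 * loss_aux    # auxiliary loss regularizes the encoder
print(round(float(loss), 3))
```

The key design point is that gradients from the POS head flow back through `W_shared`, pushing the shared representation to encode grammatical structure that the sentiment head can then exploit.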

Additional Contextual Data, e.g. search queries in recommendation models: A user's search history is pure gold. A streaming service that sees you're searching for "Oscar-winning movies" can offer far more relevant suggestions than one relying on watch history alone.

11.07.2025 12:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
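One simple way to wire this in, sketched under made-up names: concatenate a search-query embedding onto the watch-history representation before scoring candidate items. A real system would use learned embeddings; the random vectors here just show the shape of the idea.

```python
import numpy as np

# Illustrative: enrich a watch-history representation with search context.
rng = np.random.default_rng(1)
watch_history_emb = rng.normal(size=16)   # derived from past views
search_query_emb = rng.normal(size=16)    # e.g. "Oscar-winning movies"

# Concatenation is the simplest "smarter input design": the ranker now
# sees both signals and can learn to weight recent search intent.
user_vec = np.concatenate([watch_history_emb, search_query_emb])

item_embs = rng.normal(size=(5, 32))      # 5 candidate items
scores = item_embs @ user_vec             # dot-product relevance
print("top item:", int(np.argmax(scores)))
```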

Multi-Objective Training: Don't just predict customer purchase; also predict the likelihood of a return and a positive review. This creates a more holistic and useful e-commerce model.

11.07.2025 12:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
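To make the "holistic" point concrete, here is a toy combination of the three predicted probabilities into one ranking score. The weights and item numbers are invented for illustration; the post does not specify how the objectives are combined.

```python
# Toy multi-objective ranking: combine purchase, return, and
# positive-review likelihoods into a single score.

def score(p_purchase: float, p_return: float, p_review: float) -> float:
    # Reward likely purchases and good reviews; penalize likely returns.
    return p_purchase + 0.5 * p_review - 0.7 * p_return

items = {
    "cheap gadget": (0.9, 0.6, 0.2),      # sells well but often returned
    "solid headphones": (0.6, 0.1, 0.7),  # fewer sales, happier customers
}
best = max(items, key=lambda k: score(*items[k]))
print(best)  # the holistic score prefers the low-return, well-reviewed item
```

A purchase-only model would rank the first item higher; adding the return and review objectives flips the ordering, which is exactly the behavior the post is arguing for.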

Add. Related. Context.

Often, the most significant performance gains come from enriching models with related, contextual info. Models get better by being exposed to auxiliary signals that deepen their understanding of the task.

11.07.2025 12:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
The Protein Folding Problem: Why shape is life’s code

Link: januverma.substack.com/p/the-protei...

09.07.2025 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Covers the significance of Anfinsen’s experiment, the role of the CASP competition, and why protein structure prediction was considered an AI-complete problem. This sets the stage for understanding how AlphaFold-2 achieved its breakthrough.

09.07.2025 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

My latest blog post dives into the protein folding problem - a fundamental question in molecular biology that puzzled scientists for decades, until deep learning models like AlphaFold changed the game. I walk through the biological and computational roots of the problem.

09.07.2025 10:15 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Part III involves using frontier models to generate (synthetic) ‘reasoning’ for user engagement based on past interactions, then using the reasoning-augmented data to SFT a Qwen 1.5B model. Comparable or better results with just 10% of the interaction data.
open.substack.com/pub/januverma/…

12.02.2025 21:58 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Large Language Models for Recommender Systems II - Scaling: Do scaling laws extend to recommendation?

Part II of my explorations with LLMs for recommendation tasks involves experimenting with base models of varying sizes, from 0.5B to 14B params (Qwen 2.5 series), and incorporating user attributes.
januverma.substack.com/p/large-language-models-for-recommender-35c

04.02.2025 14:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Large Language Models for Recommender Systems: Can LLMs reason over user behaviour data to decipher preferences?

The first experiment builds a proof of concept for an LLM recommender by supervised fine-tuning (SFT) of a small-scale LLM (Llama 1B). januverma.substack.com/p/large-lang...

04.02.2025 14:27 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
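The SFT step described in the post presumably turns interaction histories into (prompt, target) pairs. A hypothetical sketch of that data formatting, with a template and item names invented here since the post does not give the exact format:

```python
# Hypothetical SFT example construction for an LLM recommender.

def make_sft_example(history, next_item):
    """Turn a user's rated history into one supervised training pair."""
    watched = "\n".join(f"- {title} ({rating}/5)" for title, rating in history)
    prompt = (
        "A user rated the following movies:\n"
        f"{watched}\n"
        "Which movie should be recommended next?"
    )
    return {"prompt": prompt, "completion": next_item}

ex = make_sft_example(
    history=[("Interstellar", 5), ("Arrival", 4)],
    next_item="Contact",
)
print(ex["prompt"])
print("->", ex["completion"])
```

Pairs like this can then be fed to any standard SFT pipeline, with the model trained to produce the held-out next interaction as the completion.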
Incomplete Distillation | Janu Verma | Substack: a personal research journal containing articles based on my explorations with cutting-edge AI.

As a personal research project, I’m exploring the efficacy of LLMs for recommendation-system tasks. Check out my experiments at januverma.substack.com

04.02.2025 14:27 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Large Language Models for Recommender Systems: Can LLMs reason over user behaviour data to decipher preferences?

Recently, I’ve been exploring the potential of LLMs for recommendation tasks. Sharing the first report of my project, where I experiment with the ability of the Llama 1B model to understand user preferences from past behavior.

open.substack.com/pub/januverm...

24.01.2025 18:03 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Have we swapped “reasoning” for “agentic” as the new shibboleth?

16.01.2025 16:17 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Just came back from a month of no-laptop family time in India. Any tips on how to motivate myself to do any work are highly appreciated 🙏

10.01.2025 11:57 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

The queen of examples and counterexamples!

08.12.2024 01:50 β€” πŸ‘ 44    πŸ” 9    πŸ’¬ 1    πŸ“Œ 0

The FineWeb team is happy to finally release "FineWeb2" πŸ₯‚πŸ₯³

FineWeb 2 extends the data-driven approach to pre-training dataset design introduced in FineWeb 1 to now cover 1,893 languages/scripts

Details: huggingface.co/datasets/Hug...

A detailed open-science tech report is coming soon

08.12.2024 09:08 β€” πŸ‘ 106    πŸ” 13    πŸ’¬ 3    πŸ“Œ 2

Nothing like waking up to see your models training in a nice way. #neuralnets

04.12.2024 07:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Large Language Models as Markov Chains Large language models (LLMs) have proven to be remarkably efficient, both across a wide range of natural language processing tasks and well beyond them. However, a comprehensive theoretical analysis o...

This seems like… what we started with, no? arxiv.org/abs/2410.02724

03.12.2024 12:19 β€” πŸ‘ 167    πŸ” 9    πŸ’¬ 16    πŸ“Œ 1

Or they are too narcissistic to even notice the work/life of others. I feel there could be a coping mechanism that makes their view quite myopic; ignorance is bliss, I guess.

03.12.2024 07:18 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Taxi Driver knew better

02.12.2024 16:04 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Can’t wait for new stuff in the RLHF book. Part of my holidays reading plan.

02.12.2024 16:00 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Jonathan Berant (Tel Aviv University / Google) / Towards Robust Language Model Post-training
YouTube video by Yoav Artzi

I am seriously behind on uploading Learning Machines videos, but I did want to get @jonathanberant.bsky.social's out sooner rather than later. It's not only a great talk; it also gives a remarkably broad overview and contextualization, so it's an excellent way to ramp up on post-training.
youtu.be/2AthqCX3h8U

02.12.2024 03:45 β€” πŸ‘ 53    πŸ” 12    πŸ’¬ 1    πŸ“Œ 0

And a never-ending pit. Where you stop with prompt refinement, and based on what criteria, is surely messy.

01.12.2024 10:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Won’t help with my reputation, but since I worked on social network analysis/regulation: if Bluesky is ever a success, they are extremely likely to retrain AI models (not necessarily LLMs) on user data.

29.11.2024 18:53 β€” πŸ‘ 60    πŸ” 19    πŸ’¬ 6    πŸ“Œ 7

Oh that’s so interesting. I do think given the political landscape, it is hard to have a consensus on what is a lie or who you believe.

30.11.2024 08:59 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
