Akshita Bhagia's Avatar

Akshita Bhagia

@akshitab.bsky.social

Research Engineer at Ai2 https://akshitab.github.io/

2,196 Followers  |  147 Following  |  5 Posts  |  Joined: 05.10.2023

Latest posts by akshitab.bsky.social on Bluesky


Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow: not just the final weights, but the entire training journey.
Best fully open 32B reasoning model & best 32B base model. 🧵

20.11.2025 14:37 · 👍 68  🔁 17  💬 1  📌 2

Such an interesting paper!

31.10.2025 18:41 · 👍 2  🔁 0  💬 0  📌 0

The Cancer AI Alliance (CAIA) is already prototyping Asta DataVoyager in a federated, multi-institution setup for cancer studies, keeping clinical data local and secure.
Read more about CAIA here: buff.ly/ACpxLNT

01.10.2025 13:02 · 👍 3  🔁 1  💬 1  📌 1

Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵

09.07.2025 16:02 · 👍 15  🔁 6  💬 1  📌 2

Announcing OLMo 2 32B: the first fully open model to beat GPT-3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks.

Comparable to the best open-weight models, but at a fraction of the training compute. When you have a good recipe, ✨ magical things happen when you scale it up!

13.03.2025 18:36 · 👍 58  🔁 15  💬 3  📌 3

I caught myself wanting to respond similarly to Claude and then told myself that it would be wasteful inference. But now I also mentally thank it each time, because what if I lose that instinct with humans... I'm already impatient with smart speakers.

12.02.2025 20:00 · 👍 3  🔁 0  💬 0  📌 0

They made me do video 😬 but for a good reason!

We are launching an iOS app – it runs OLMoE locally 📱 We're gonna see more on-device AI in 2025, and we wanted to offer a simple way to prototype with it.

App: apps.apple.com/us/app/ai2-o...
Code: github.com/allenai/OLMo...
Blog: allenai.org/blog/olmoe-app
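For anyone who wants to prototype with OLMoE outside the app, here is a minimal sketch of local inference via Hugging Face transformers; the checkpoint id below is an assumption, not something named in the post.

```python
# Minimal sketch: run an OLMoE checkpoint locally with Hugging Face
# transformers. The id "allenai/OLMoE-1B-7B-0924" is assumed here;
# substitute whichever OLMoE release you want to try.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "allenai/OLMoE-1B-7B-0924"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("On-device language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```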

11.02.2025 14:18 · 👍 49  🔁 13  💬 5  📌 3

kicking off 2025 with our OLMo 2 tech report while paying homage to the sequelest of sequels 🫡

🚗 2 OLMo 2 Furious 🔥 is everything we learned since OLMo 1, with deep dives into:

🚖 stable pretraining recipe
🚔 lr annealing 🤝 data curricula 🤝 soups
🚘 Tülu post-training recipe
🚜 compute infra setup

👇🧵
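As a rough illustration of the "soups" ingredient above, here is a minimal sketch of checkpoint souping, i.e. element-wise averaging of parameters across checkpoints that share an architecture; the file paths are hypothetical and this is not the tech report's exact procedure.

```python
# Minimal sketch of a "model soup": average the parameters of several
# checkpoints of the same architecture. Checkpoint paths are hypothetical.
import torch

checkpoint_paths = ["anneal_run1.pt", "anneal_run2.pt", "anneal_run3.pt"]
state_dicts = [torch.load(p, map_location="cpu") for p in checkpoint_paths]

souped = {
    key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    for key in state_dicts[0]
}
torch.save(souped, "souped_model.pt")
```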

03.01.2025 16:02 · 👍 69  🔁 17  💬 2  📌 1

Want to predict the task performance of LMs before pretraining them?

We develop task scaling laws and model ladders, which predict the accuracy of OLMo 2 7B & 13B models on individual tasks within 2 points of absolute error, at a cost of 1% of the compute used to pretrain them.
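A toy sketch of the general idea (not the paper's exact functional form): fit a saturating curve of task accuracy against training compute on small "ladder" models, then extrapolate to a larger budget. All numbers below are made up for illustration.

```python
# Toy sketch of the model-ladder idea: fit accuracy vs. compute on small
# ladder models and extrapolate. Not the paper's exact functional form;
# all data points are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])    # in units of 1e19 FLOPs
accuracy = np.array([0.31, 0.36, 0.43, 0.49, 0.55])  # made-up task accuracies

def acc_curve(c, a, b, k):
    # Accuracy approaches the asymptote `a` as compute grows.
    return a - b * np.power(c, -k)

params, _ = curve_fit(acc_curve, compute, accuracy, p0=[0.7, 0.4, 0.2])
target = 1000.0  # hypothetical compute budget of the larger model
print("predicted accuracy at target compute:", acc_curve(target, *params))
```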

09.12.2024 17:07 · 👍 33  🔁 14  💬 2  📌 0
The OLMo 2 models sit at the Pareto frontier of training FLOPs vs. model average performance.

Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes, and more 🎁

26.11.2024 20:51 · 👍 151  🔁 36  💬 5  📌 12

I use GoodNotes

26.11.2024 18:53 · 👍 4  🔁 0  💬 0  📌 0

🙋‍♀️

22.11.2024 01:24 · 👍 1  🔁 0  💬 0  📌 0
OLMo: Open Language Model — A State-Of-The-Art, Truly Open LLM and Framework

release day release day 🥳 OLMo 1B + 7B out today, and 65B soon...

OLMo accelerates the study of LMs. We release *everything*, from the data-creation toolkit (Dolma) to training/inference code.

blog blog.allenai.org/olmo-open-la...
olmo paper allenai.org/olmo/olmo-pa...
dolma paper allenai.org/olmo/dolma-p...

01.02.2024 19:33 · 👍 28  🔁 14  💬 1  📌 2

👋

09.01.2024 03:42 · 👍 2  🔁 0  💬 0  📌 0
Perplexity macro averaged over any domains within each of the 18 top-level data sources in Paloma, using baselines with pretraining controls including decontamination. Evaluating on one monolithic corpus, such as C4, does not tell the complete story of model fit. Paloma lets us see when trends differ from one distribution of language to another. For instance, the 3 baselines trained on only Common Crawl data (C4, mC4-en, Falcon-RefinedWeb) exhibit high perplexity, sometimes with non-monotonic scaling over tokens seen, on specific evaluation sources such as The Pile, Dolma, and Dolma-100-Programming-Languages.

LMs are used to process text from many topics, styles, dialects, etc., but how well do they do?

📈 Evaluating perplexity on just one corpus like C4 doesn't tell the whole story 📉

✨📃✨
We introduce Paloma, a benchmark of 585 domains from NY Times to r/depression on Reddit.
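A minimal sketch of the macro-averaging idea behind Paloma-style evaluation: compute perplexity per domain, then take the unweighted mean across domains so small domains count as much as large ones. The domain names and numbers below are hypothetical.

```python
# Macro-averaged perplexity: per-domain perplexity first, then an
# unweighted mean over domains. Inputs are hypothetical
# (domain -> total token negative log-likelihood in nats, token count).
import math

domain_stats = {
    "c4": (2.9e6, 1_000_000),
    "nytimes": (5.8e5, 200_000),
    "r/depression": (1.8e5, 50_000),
}

def perplexity(total_nll, num_tokens):
    return math.exp(total_nll / num_tokens)

per_domain = {d: perplexity(*stats) for d, stats in domain_stats.items()}
macro_avg = sum(per_domain.values()) / len(per_domain)

print(per_domain)
print("macro-averaged perplexity:", round(macro_avg, 2))
```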

20.12.2023 20:28 · 👍 17  🔁 7  💬 1  📌 1
