thx to all the feedback from OSS community!
our olmOCR lead Jake Poznanski shipped a new model fixing lotta issues + some more optimization for better throughput
have fun converting PDFs!
@kylelo.bsky.social
#nlp #ml #hci research scientist @ai2.bsky.social, Co-lead of Data for OLMo w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him, kyleclo.com
OLMoTrace for connecting model generations to training data won Best Paper for System Demonstrations at #ACL2025!
31.07.2025 20:33

People at #ACL2025, come drop by our poster today & chat with me about how context matters for reliable language model evaluations!
Jul 30, 11:00-12:30 at Hall 4X, board 424.
issues w preference LM benchmarks:
- data contains cases where the "bad" response is just as good as the chosen one
- model rankings can feel off (Claude ranks lower than expected)
led by @cmalaviya.bsky.social, we study underspecified queries & their detrimental effect on model evals; accepted to TACL 2025
lol 'what does skibidi mean' was not on my bingo card for the @actinterp.bsky.social workshop #icml2025
19.07.2025 16:39

presenting olmOCR at the poster session (2:15pm, 211 West) for the #codeml workshop at #icml2025!
- fully open source OCR, comparable or better than frontier VLMs
- all weights, data, code free & public
- new benchmark of OCR "unit tests" on diverse PDFs & challenging OCR cases
Presenting two posters at ICML over the next two days:
- Both at 11am - 1:30pm
- Both about how to improve pre-training with domains
- Both at stall # E-2600 in East Exhibition Hall A-B (!)
Tomorrow: WebOrganizer w/ @soldaini.net & @kylelo.bsky.social
Thursday: MeCo by @gaotianyu1350.bsky.social
will be at #icml2025, lemme kno if u wanna chat about OLMo pretraining data curation, evaluation, data mixing, etc!
find us at the poster sess on Wed 7/16 @ 11am to learn about WebOrganizer, distilling web data taxonomies into small models & using them for LM data mixing!
only took a few days to descend into madness
01.07.2025 20:12

shame internet ppl dont kno u as bringer of office cookies

28.06.2025 19:42

it's still working for me? wat r u seein

28.06.2025 19:40

back from copenhagen & berkeley travels, now moving into new @ai2.bsky.social office!

26.06.2025 15:45

thx for summary! im still wondering if it implies what ur saying. the judge is giving a pass because the filters around the model prevent content leakage. couldn't one also interpret this as open model developers may be expected to provide such filters alongside released weights?

26.06.2025 03:54

thx for organizing! great to meet NLP folks & consume fancy bread

21.06.2025 14:32

woahh thx this is clearer than how i presented it

21.06.2025 10:29

we developed the benchmark independently so no dev/test leakage, and even so, results show olmOCR often produces higher quality output than even proprietary OCR tools & is way cheaper + runs locally as well!
our team will be at #ICML2025, come find me, Jake (our OCR lead!) & @soldni.bsky.social there!
the benchmark is based on thousands of "unit tests"
so instead of fuzzy matching a model-generated table against a gold reference table,
we define Pass/Fail tests like "the cell to the left of the cell containing 0.001 should contain 1.96"
excited to release our new benchmark for OCR addressing 3 eval challenges:
- coverage of many types of docs (born digital vs old scans, pages w tiny fonts, etc)
- coverage of many different OCR targets (e.g. equations, tables, etc)
- apples-to-apples comparison across systems
we won honorable mention for Best Paper at #CVPR2025 for Molmo & Pixmo, showing the value of high-quality data for VLMs!
recalling when we released at the same time as Llama 3.2
huge kudos to Matt Deitke, Chris Clark & Ani Kembhavi for their leadership on this project!
@cvprconference.bsky.social
google down, guess ill go smell flowers or sthn 🤷‍♂️
12.06.2025 19:31

excited to see this release of 1M public domain & CC0 books, digitized and OCR'd! big win for open data, congrats to the authors!
arxiv.org/abs/2506.08300
thx for sharing! from your paper:
"the artifacts that they protect, namely model weights and model outputs, are largely not copyrightable, making it unclear whether there is even anything to be licensed."
dumb question but I thought it's possible to license non-copyrightable artifacts?
fave part of this work is "ties": cases with many correct answers
for "Q: name a color of the rainbow", a good reward model should know all answers ("green", "blue", ...) are correct, but w/ no strong color preference
congrats @saumyamalik.bsky.social et al @ai2.bsky.social & friends on RewardBench 2!
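one way to picture the "ties" check (purely illustrative: the scores are made up and the tolerance threshold is my assumption, not RewardBench 2's actual metric):

```python
# Illustrative sketch: a reward model handles a "tie" well if its scores for
# the equally-correct answers are close, i.e. no strong preference among them.

def respects_tie(scores, tolerance=0.05):
    """True if the spread of reward scores across tied-correct answers is small."""
    return max(scores) - min(scores) < tolerance

# made-up reward scores for three correct answers to "name a color of the rainbow"
tied_scores = {"green": 0.81, "blue": 0.80, "red": 0.79}
print(respects_tie(list(tied_scores.values())))  # True
```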
latex dummy here - what problem does this solve?
31.05.2025 18:45

@sarahwiegreffe.bsky.social pls?

31.05.2025 02:44

reviewers asking for comparison w qwen 3, have yall forgotten the model was released a month ago lol

30.05.2025 16:15

i think the highest bar is having Yanai read the draft & say it is good

30.05.2025 00:16

what we dont see is how many submission attempts, each incurring reviewing costs, until one is accepted 🤷‍♂️

30.05.2025 00:15

feels literally xkcd.com/882

30.05.2025 00:14