Kyle Lo

@kylelo.bsky.social

#nlp #ml #hci research scientist @ai2.bsky.social, Co-lead of Data for OLMo w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him, 🧋 kyleclo.com

6,462 Followers  |  577 Following  |  462 Posts  |  Joined: 17.02.2023  |  2.3264

Latest posts by kylelo.bsky.social on Bluesky

thx to all the feedback from OSS community!

our olmOCR lead Jake Poznanski shipped a new model fixing lotta issues + some more optimization for better throughput

have fun converting PDFs!

01.08.2025 18:40 | 👍 2    🔁 0    💬 0    📌 0

olmoTrace for connecting model generations to training data won Best Paper for System Demonstrations at #ACL2025!

31.07.2025 20:33 | 👍 19    🔁 3    💬 1    📌 0

People at #ACL2025, come drop by our poster today & chat with me about how context matters for reliable language model evaluations!

Jul 30, 11:00-12:30 at Hall 4X, board 424.

30.07.2025 06:05 | 👍 4    🔁 1    💬 0    📌 0

issues w preference LM benchmarks:

🐡 data contains cases where the "bad" response is just as good as chosen one
🐟 model rankings can feel off (claude ranks lower than expected)

led by @cmalaviya.bsky.social, we study underspecified queries & detrimental effect on model evals; accepted to TACL 2025

22.07.2025 17:02 | 👍 14    🔁 4    💬 2    📌 0

lol 'what does skibidi mean' was not on my bingo card for @actinterp.bsky.social workshop #icml2025 😆

19.07.2025 16:39 | 👍 6    🔁 0    💬 0    📌 0

presenting olmOCR at the poster session (2:15pm 211 West) for #codeml workshop at #icml2025!
🐟 fully open source OCR, comparable or better than frontier VLMs
🐠 all weights, data, code free & public
🐡 new benchmark of OCR "unit tests" on diverse PDFs & challenging OCR cases

18.07.2025 21:19 | 👍 7    🔁 0    💬 0    📌 0

Presenting two posters at ICML over the next two days:
- Both at 11am - 1:30pm
- Both about how to improve pre-training with domains
- Both at stall # E-2600 in East Exhibition Hall A-B (!)

Tomorrow: WebOrganizer w/ @soldaini.net & @kylelo.bsky.social
Thursday: MeCo by @gaotianyu1350.bsky.social

16.07.2025 05:19 | 👍 4    🔁 1    💬 0    📌 0

will be at #icml2025, lemme kno if wanna chat about OLMo pretraining data curation, evaluation, data mixing, etc! 👋

find us at poster sess on 📅 Wed 7/16 @ 11am ⏲️ to learn about Web Organizer, distilling web data taxonomies into small models & using them for LM data mixing!

14.07.2025 16:54 | 👍 8    🔁 0    💬 0    📌 0

only took few days to descend into madness

01.07.2025 20:12 | 👍 12    🔁 0    💬 0    📌 0

shame internet ppl dont kno u as bringer of office cookies

28.06.2025 19:42 | 👍 2    🔁 0    💬 0    📌 0

it's still working for me? wat r u seein

28.06.2025 19:40 | 👍 1    🔁 0    💬 1    📌 0

back from copenhagen & berkeley travels, now moving into new @ai2.bsky.social office!

26.06.2025 15:45 | 👍 16    🔁 0    💬 0    📌 1

thx for summary! im still wondering if it implies what ur saying. the judge is giving a pass because the filters around the model prevent content leakage. couldn't one also interpret this as open model developers may be expected to provide such filters alongside released weights?

26.06.2025 03:54 | 👍 3    🔁 0    💬 1    📌 0
Paramore - The News [OFFICIAL VIDEO] (YouTube video by Paramore)

🎧

youtu.be/YSFa_wOZPXg?...

26.06.2025 03:19 | 👍 1    🔁 0    💬 0    📌 0

thx for organizing! great to meet NLP folks & consume fancy bread 🥖🍞🥐

21.06.2025 14:32 | 👍 21    🔁 0    💬 0    📌 0

woahh thx this is clearer than how i presented it 😆

21.06.2025 10:29 | 👍 2    🔁 0    💬 0    📌 0

we developed the benchmark independently so no dev/test leakage, and even so, results show olmOCR produces often higher quality output than even proprietary OCR tools & is way cheaper + local as well!

our team will be at #ICML2025, come find me, Jake (our OCR lead!) & @soldni.bsky.social there!

19.06.2025 13:25 | 👍 2    🔁 0    💬 0    📌 0

the benchmark works based on thousands of "unit tests"

so instead of fuzzy matching between a model-generated table with a gold reference table,

we define Pass/Fail tests like "the cell to the left of the cell containing 0.001 should contain 1.96"

19.06.2025 13:25 | 👍 2    🔁 0    💬 1    📌 0
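
A rough sketch of what one such Pass/Fail check could look like, written in Python. The function name `left_neighbor_contains`, the list-of-rows table format, and the sample table are all assumptions made for illustration; none of this is taken from the olmOCR benchmark code.

```python
from typing import List

# Hypothetical table representation: a list of rows, each row a list of cell strings.
Table = List[List[str]]


def left_neighbor_contains(table: Table, anchor: str, expected: str) -> bool:
    """Pass if some cell containing `anchor` has a left neighbor containing `expected`."""
    for row in table:
        # Start at column 1: a cell in column 0 has no left neighbor.
        for col in range(1, len(row)):
            if anchor in row[col] and expected in row[col - 1]:
                return True
    return False


# Made-up OCR output for a two-column table (illustrative values only).
parsed = [
    ["z", "p"],
    ["1.96", "0.001"],
]

# Same cells, but with the columns swapped, as a faulty parse might produce.
swapped = [list(reversed(row)) for row in parsed]

print(left_neighbor_contains(parsed, "0.001", "1.96"))   # True
print(left_neighbor_contains(swapped, "0.001", "1.96"))  # False
```

A check like this passes or fails on one local fact about the table, so it does not depend on how the rest of the output is formatted, which is the contrast with fuzzy matching against a gold reference table.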

excited to release our new benchmark for OCR addressing 3 eval challenges:
🐟 coverage of many types of docs (born digital vs old scans, pages w tiny fonts, etc)
🐡 coverage of many different OCR targets (e.g. equations, tables, etc)
🐠 apples-to-apples comparison across systems

19.06.2025 13:25 | 👍 8    🔁 2    💬 1    📌 0

we won honorable mention for Best Paper at #CVPR2025 🏆 for Molmo & Pixmo, showing the value of high-quality data for VLMs!

recalling when we released same time as Llama 3.2 😆

huge kudos to Matt Deitke, Chris Clark & Ani Kembhavi for their leadership on this project!

@cvprconference.bsky.social

13.06.2025 17:46 | 👍 34    🔁 3    💬 0    📌 1

google down, guess ill go smell flowers or sthn 🤷‍♂️

12.06.2025 19:31 | 👍 4    🔁 0    💬 0    📌 0

excited to see this release of 1M public domain & CC zero books, digitized and OCR'd! 👏 big win for open data, congrats to the authors!

arxiv.org/abs/2506.08300

12.06.2025 00:21 | 👍 37    🔁 5    💬 0    📌 1

thx for sharing! from your paper:

"the artifacts that they protect, namely model weights and model outputs, are largely not copyrightable, making it unclear whether there is even anything to be licensed."

dumb question but I thought it's possible to license non-copyrightable artifacts?

06.06.2025 02:02 | 👍 1    🔁 0    💬 0    📌 0

fave part of this work is "ties" ⚖️ cases with many correct answers

"Q: name a color of rainbow", a good reward model should know all answers "green", "blue", ... are correct but w/ no strong color preference

congrats @saumyamalik.bsky.social et al @ai2.bsky.social & friends on RewardBench 2!

03.06.2025 00:36 | 👍 5    🔁 0    💬 0    📌 0

latex dummy here - what problem does this solve?

31.05.2025 18:45 | 👍 2    🔁 0    💬 1    📌 0

@sarahwiegreffe.bsky.social pls?

31.05.2025 02:44 | 👍 5    🔁 0    💬 0    📌 0

reviewers asking for comparison w qwen 3, have yall forgotten model was released a month ago lol 🤪

30.05.2025 16:15 | 👍 8    🔁 0    💬 1    📌 0

i think highest bar is having yanai read the draft & say it is good 😆

30.05.2025 00:16 | 👍 2    🔁 0    💬 0    📌 0

what we dont see is how many submission attempts, each incurring reviewing costs, until one accepted 🤷‍♂️

30.05.2025 00:15 | 👍 4    🔁 0    💬 1    📌 0
Significant

feels literally xkcd.com/882

30.05.2025 00:14 | 👍 0    🔁 0    💬 0    📌 0