Now, some acknowledgments: this work was made possible thanks to a generous compute grant from Lambda!
And I've got a hosted version of the model that I'll be sharing in a couple days hosted on @modal-labs.bsky.social, which makes it basically free for me to host and scale
24.09.2025 17:51 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
GitHub - jbarrow/commonforms: CommonForms dataset and models
CommonForms dataset and models. Contribute to jbarrow/commonforms development by creating an account on GitHub.
As part of the paper, I'm working on releasing the dataset and FFDNet models on HuggingFace.
Those will be out in the coming days, you can follow along here: github.com/jbarrow/comm...
๐คPaper: huggingface.co/papers/2509....
arXiv: arxiv.org/abs/2509.16506
24.09.2025 17:51 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Now, just because we filtered for the cleanest forms doesn't mean we got _perfect_ forms. There are still a lot of inconsistencies in how people prepare forms! In future work I'll be looking at mitigating data quality issues like these.
24.09.2025 17:51 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
(Note, this doesn't _just_ apply to Acrobat, it's also better than Apple Preview -- neither Acrobat nor Preview even make an attempt at checkboxes, and they're often fooled by any straight, horizontal line. Left: Acrobat, Right: FFDNet)
24.09.2025 17:51 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
If we train object detectors to find the form fields on these pages, we get a much cleaner set of forms than if you used Acrobat to automatically prepare your form. (Left: Acrobat, Right: FFDNet).
24.09.2025 17:51 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Step 1 is to filter out for the cleanest forms possible. We start with 8MM PDFs from Common Crawl, and work our way down to ~60k of the cleanest forms we can find. The results is a ~500k page dataset, called CommonForms.
24.09.2025 17:51 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Paper thread of some work Iโm *incredibly* proud of, my first single author paper!
Converting a PDF to a fillable form is a hard problem, and a lot of solutions donโt work very well! In CommonForms, I show that you can train models that outperform Adobe Acrobat for <$500! ๐งต
24.09.2025 17:51 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Yeah I wonder if that statistic is flipped between the cities (though operated by the same provider โ Lyft โ I assume?)
No way that 99 out of every 100 riders in Boston have visited more than 27 stations?
31.05.2025 05:07 โ ๐ 3 ๐ 0 ๐ฌ 0 ๐ 0
Pretty sure you want that number to be lower. :p (my stats for DC ridership)
31.05.2025 05:04 โ ๐ 2 ๐ 0 ๐ฌ 1 ๐ 0
Would absolutely love that!
11.03.2025 12:28 โ ๐ 2 ๐ 0 ๐ฌ 0 ๐ 0
"AI TOPS our stock price"
- Nvidia, today
07.01.2025 21:35 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
Ah, yes, that ol' familiar unit of measure "AI TOPS"
07.01.2025 20:13 โ ๐ 4 ๐ 0 ๐ฌ 1 ๐ 0
Palma as a FreeWrite
Agree, my ideal would be if you could type into an old, cheap, refurb kindle personally.
Hereโs a video of a person typing into the Palma: www.reddit.com/r/Onyx_Boox/...
My experience (tablet) is that itโs maybe 10s from pickup to writing โ wake up (3s), navigate to apps (2s), open app (3-5s)?
22.12.2024 08:56 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Iโve got one of the older, larger eInk tablets and use it for reading books/papers and taking notes. Battery after several years lasts about a week of average use, longer if I keep WiFi off.
21.12.2024 21:29 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Not necessarily hitting the price point but there are eInk mini tablets (e.g. Boox Palma at around $200) that have Android, no sim (so no phone distractions), and long battery life (thanks to the eInk and being generally underpowered). They accept keyboards, too.
21.12.2024 21:27 โ ๐ 2 ๐ 0 ๐ฌ 1 ๐ 0
Drake meme template.
No to: clear, concise prose
Yes to: negative vspace
18.12.2024 10:37 โ ๐ 3 ๐ 1 ๐ฌ 0 ๐ 0
Drake meme template.
No to: clear, concise prose
Yes to: negative vspace
18.12.2024 10:37 โ ๐ 3 ๐ 1 ๐ฌ 0 ๐ 0
Holy moly that created an extra half page of space!
18.12.2024 10:09 โ ๐ 2 ๐ 0 ๐ฌ 1 ๐ 0
A picture of a teapot to the right of a teacup, both in a flat-bottomed basket. The teacup has eta and flowers in it. There are 2 blue bounding boxes on the image, one labeled "Teapot" and one labeled "Teacup" that are over the teapot and teacup respectively.
Gemini 2.0 Flash is pretty good at localization in images. for an LMM (much better than GPT-4o in my experiments).
18.12.2024 09:26 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
ML history question: is there an earlier reference to pixel-only in-context (i.e. no fine-tuning) DocVQA performance than the GPT-4 announcement from OpenAI?
09.12.2024 09:45 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Aged white tea, the kind that comes in a compressed disk or ball. My favorite kind of tea, imo they taste naturally quite sweet. Yunnan Sourcing has a bunch!
25.11.2024 05:58 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Breakthrough AI to solve the world's biggest problems.
โบ Join us: http://allenai.org/careers
โบ Get our newsletter: https://share.hsforms.com/1uJkWs5aDRHWhiky3aHooIg3ioxm
Minnesota guy.
"This particular activist will not stop." Sen. Chris Murphy
Researcher (OpenAI. Ex: DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian.
Anon feedback: https://admonymous.co/giffmana
๐ Zรผrich, Suisse ๐ http://lucasb.eyer.be
Menswear writer. Editor at Put This On. Words at The New York Times, The Washington Post, The Financial Times, Esquire, and Mr. Porter.
If you have a style question, search:
https://dieworkwear.com/ | https://putthison.com/start-here/
Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence.
Book: https://a.co/d/bC2kSj1
Substack: https://www.oneusefulthing.org/
Web: https://mgmt.wharton.upenn.edu/profile/emollick
Research Scientist at Adobe Research. PhD from Northwestern University on LLMs and all that fun stuff. Opinions my own.
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
AI Engineer @ Google ๐จโ๐ป โ Educator ๐จโ๐ซ โ Traveller โ๏ธ โ Hobby photographer ๐ท โ Foodie ๐ฎ โ Film fan ๐ฟ โ Boardgamer ๐ฒ โ Londoner๐โโ๏ธ
Medium: https://heiko-hotz.medium.com/
Github: https://github.com/heiko-hotz
LI: https://www.linkedin.com/in/heikohotz/
An electrical engineer who likes to make things
https://d-winker.github.io/
Blog: https://argmin.substack.com/
Webpage: https://people.eecs.berkeley.edu/~brecht/
hot takes, linear Algebra, JAX apologist, Raconteur
Building. Affiliations: @JHU, @Penn, @UCSC, @Amazon, @Twitter || Art: #NLProc, Vision, Speech, #DeepLearning || Life: ้ๅ
, improv, running ๐
More good things for everyone. Public sector appreciator. Tax and welfare policy knower. Hyperinflation doubter.
Professor of HCII and LTI at Carnegie Mellon School of Computer Science.
jeffreybigham.com
Searching for the numinous
Australian Canadian, currently living in the US
https://michaelnotebook.com
๐ฅ LLMs together (co-created model merging, BabyLM, textArena.ai)
๐ฅ Spreading science over hype in #ML & #NLP
Proud shareLM๐ฌ Donor
@IBMResearch & @MIT_CSAIL
information science professor (tech ethics + internet stuff)
kind of a content creator (elsewhere also @professorcasey)
though not influencing anyone to do anything except maybe learn things
she/her
more: casey.prof
AI policy researcher, wife guy in training, fan of cute animals and sci-fi. Started a Substack recently: https://milesbrundage.substack.com/