j soma

@dangerscarf.bsky.social

326 Followers  |  663 Following  |  15 Posts  |  Joined: 06.02.2024  |  2.03

Latest posts by dangerscarf.bsky.social on Bluesky

The Automated Newsroom: Build AI Workflows That Work

A six-week, hands-on course teaching journalists how to design, test, and improve AI workflows. Learn evaluation, testing, and product thinking for newsroom automation.

Find out more about the AI newsroom workflow course at its awful sales-y site, and feel free to shoot me any questions you might have!

littlecolumns.com/courses/ai-n...

30.10.2025 20:47 | 👍 0    🔁 0    💬 0    📌 0

The course itself is six weeks long, and while it does cost money (which is crazy strange for me!), there are steep geographic pricing discounts and coupon codes for close readers of the course site.

30.10.2025 20:47 | 👍 0    🔁 0    💬 1    📌 0

It's maybe like 35% a tech course, and a lot of the theory is stuff that seems simple once you've heard it: see what goes wrong, fix it, track it. That's it!

Yes, we'll learn automation tools like n8n/ActivePieces and eval suites like Opik/Arize Phoenix, buuut they're just one part

30.10.2025 20:47 | 👍 0    🔁 0    💬 1    📌 0

This course is going to solve every step of those crises. How do you...

- set up an AI pipeline?
- measure if it's working?
- iterate and improve it?
- make sure you're solving a reader/reporter problem instead of just playing tech games?

It isn't magic! It's easy!!!!

30.10.2025 20:47 | 👍 1    🔁 0    💬 1    📌 0

I'm running a six-week course in November on building and evaluating AI newsroom workflows!

It's targeted at people who don't know where to start, or who build little prototypes and end up stumped about making them production-ready.

littlecolumns.com/courses/ai-n...

30.10.2025 20:44 | 👍 2    🔁 1    💬 1    📌 0
a three-column table with the middle column highlighted

three columns being restructured into a vertical flow

tables being selected irrespective of their columns

the eventual pandas df

Natural PDF v0.1.13 out - a handful of useful changes but my favorite is 🗼 page restructuring support!

Grab sections and "flow" them together vertically or horizontally, making multi-column extraction infinitely easier than 24 hours ago.

Details at jsoma.github.io/natural-pdf/...

05.06.2025 14:02 | 👍 1    🔁 0    💬 0    📌 0
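
The "flow" idea in the post above can be sketched in plain Python. This is not Natural PDF's API: `TextLine`, `flow_columns`, and `column_edges` are hypothetical names made up for illustration, and the coordinates are invented sample data. It only shows the general trick of reading a multi-column page column by column instead of straight across.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """One line of text plus where it starts on the page (hypothetical model)."""
    text: str
    x0: float   # distance from the left edge of the page
    top: float  # distance from the top edge of the page


def flow_columns(lines, column_edges):
    """Return the text in reading order: column by column, top to bottom.

    column_edges holds the left edge of each column in ascending order,
    e.g. [0, 200, 400] for three columns on a 600pt-wide page.
    """
    def column_index(line):
        # Pick the rightmost column whose left edge is at or before the line.
        index = 0
        for i, edge in enumerate(column_edges):
            if line.x0 >= edge:
                index = i
        return index

    ordered = sorted(lines, key=lambda line: (column_index(line), line.top))
    return [line.text for line in ordered]


# Invented sample page: three columns (A, B, C), two lines each, in scrambled order.
page = [
    TextLine("B1", 210, 50), TextLine("A1", 10, 50), TextLine("C1", 410, 50),
    TextLine("A2", 10, 80), TextLine("C2", 410, 80), TextLine("B2", 210, 80),
]
flowed = flow_columns(page, column_edges=[0, 200, 400])
print(flowed)  # ['A1', 'A2', 'B1', 'B2', 'C1', 'C2']
```

Once the lines are in a single vertical order like this, anything that expects a one-column page (such as table extraction into a pandas df) has a much easier job.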

it looks like someone has been going very hard on scans

ONE MORE DAY OF ACCEPTING BAD PDF SUBMISSIONS

16.05.2025 12:36 | 👍 0    🔁 0    💬 0    📌 0

you could have won EVERY CATEGORY

14.05.2025 20:21 | 👍 0    🔁 0    💬 1    📌 0

Woke up to a ton of new non-English BAD PDF CONTEST submissions: 💥 Serbian! Romanian! Chinese! 💥

Mostly not scans, though, so I predict they'll be easy-peasy to extract the info from. I want to have to train a custom OCR model!!! Someone submit a big scanned non-English PDF!!!

12.05.2025 12:56 | 👍 0    🔁 0    💬 1    📌 0

i know you all are hiding worse scans from me

11.05.2025 13:33 | 👍 0    🔁 0    💬 1    📌 0
screenshot of a spreadsheet with very tiny text

i love this giant-pdf-with-tiny-text submission, we need a smallest font size category

08.05.2025 15:24 | 👍 1    🔁 0    💬 1    📌 0
Bad PDF Contest

I'm looking for the most frustrating, painful, real-world PDFs.

I am running a contest. It is about bad pdfs.

It can make you independently wealthy (for immeasurably small measures of independent wealth)

badpdfs.com

07.05.2025 16:19 | 👍 4    🔁 2    💬 3    📌 2

Live colab demo/walkthrough here: colab.research.google.com/github/jsoma...

03.04.2025 16:16 | 👍 0    🔁 0    💬 0    📌 0
a screenshot of natural PDF documentation

New release of Natural PDF

A million and one table extraction/document layout/Q&A/quality of life improvements for all your PDF-processing needs

jsoma.github.io/natural-pdf/

03.04.2025 16:16 | 👍 4    🔁 0    💬 1    📌 0
Columbia Student Hunted by ICE Sues to Prevent Deportation

Yunseo Chung, a legal permanent resident who has lived in the U.S. since she was 7, participated in pro-Palestinian demonstrations. Immigration agents visited residences looking for her.

the law clinic repping this student, CLEAR, is based out of CUNY.....once again the public city university absolutely flounces the ivy league when it comes to having a backbone and standing on actual principles

24.03.2025 23:16 | 👍 1575    🔁 364    💬 15    📌 22

Thank you - if only we could get a fix for the bug that prevents it from working 100%!

14.03.2025 03:16 | 👍 1    🔁 0    💬 1    📌 0

@dangerscarf is following 20 prominent accounts