
@dshkol.bsky.social

32 Followers  |  8 Following  |  15 Posts  |  Joined: 15.11.2024  |  1.5712

Latest posts by dshkol.bsky.social on Bluesky


data, geo, viz, cities, causal inf and ml, generative design and art, llms, anyone doing anything interesting i suppose

i've managed to survive picking through the remnants of the other place while wearing a thick mental hazard suit, but it's getting a bit much now

15.01.2026 06:03 - 👍 0    🔁 0    💬 1    📌 0

who should I follow on this thing

13.01.2026 05:10 - 👍 1    🔁 0    💬 1    📌 0

and 2. I've found that these agents work better in scripts than in notebooks. They iterate better, hallucinate less, troubleshoot faster, and catch issues more readily in an R/py script than they do in a notebook. I suspect the notebook backend itself (esp ipynb) chews through working context.

13.01.2026 05:10 - 👍 0    🔁 0    💬 0    📌 0

I think the single quarto notebook would probably work fine paired with the same set of SKILLS and would reduce the need for the rest of the harness. But:

1. I was interested in a solution for a system more elaborate than a single notebook

13.01.2026 05:10 - 👍 0    🔁 0    💬 1    📌 0

Jens getting claude-pilled... watch out folks! 😅

12.01.2026 20:32 - 👍 4    🔁 0    💬 1    📌 0

So check it out! Would love some feedback.

Disclaimer: I am not in any way affiliated with STC and this is not in any way an official publication. Use with caution!

12.01.2026 20:18 - 👍 2    🔁 0    💬 1    📌 0
Building a lie detector for The D-AI-LY | Dmitry Shkolnik
Building a system using skills and defensive engineering to catch a model that's very good at lying convincingly to you.

And a follow-up post on what turned out to be the most interesting part -- fighting model hallucination and errors. This one gets a bit weedy.

dshkol.com/posts/the-da...

12.01.2026 20:18 - 👍 2    🔁 0    💬 1    📌 0
The D-AI-LY: An Autonomous Statistical Digest | Dmitry Shkolnik
Replicating Statistics Canada's The Daily using Claude Code alongside dedicated tools and a skills-based harness.

I wrote about the process and open-sourced the repos on my site dshkol.com/posts/the-da...

12.01.2026 20:18 - 👍 3    🔁 0    💬 1    📌 0

The idea was to simultaneously layer:

1. SKILL documents to carefully fine-tune instructions, expectations, skepticism and reasoning
2. Access for CC to specialized tools like the cansim R package, with forced reliance on them
3. Over-engineered data provenance tracking to trace each data point

12.01.2026 20:18 - 👍 1    🔁 0    💬 1    📌 0
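
A minimal sketch of what per-data-point provenance tracking could look like, written in Python with the standard library only. The posts do not describe the real implementation, so everything here is an assumption for illustration: the `ProvenanceRecord` fields, the `unsupported_claims` helper, and the toy identifiers are all hypothetical.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class ProvenanceRecord:
    """One retrieved number plus where it came from (hypothetical fields)."""
    table_id: str    # identifier of the source table
    series_id: str   # identifier of the series within that table
    ref_period: str  # reference period the value applies to
    value: float     # the value as retrieved from the source

    def fingerprint(self) -> str:
        """Short stable hash so a write-up can cite exactly this record."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def unsupported_claims(claims, records, tol=1e-9):
    """Return the keys of claimed values that no provenance record backs up.

    `claims` maps (table_id, series_id, ref_period) to the number that a
    draft write-up asserts; `records` is the list of values actually fetched.
    """
    by_key = {(r.table_id, r.series_id, r.ref_period): r for r in records}
    unsupported = []
    for key, claimed_value in claims.items():
        record = by_key.get(key)
        if record is None or abs(record.value - claimed_value) > tol:
            unsupported.append(key)
    return unsupported
```

The design idea is that a number only reaches a release if it can be traced back to a fetched record; anything returned by `unsupported_claims` would be treated as a possible hallucination and blocked or re-checked.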

Nothing in any given release is all that complicated; each one could easily be one-shotted by any current AI tool.

The challenge is consistency of execution against previously unseen data and defending against the kind of hallucination and data mistakes that LLMs are prone to.

12.01.2026 20:18 - 👍 1    🔁 0    💬 1    📌 0

CANSIM has 60k+ tables. StatCan covers a handful at a time in The Daily. What if the cost of coverage was (nearly) zero?

The D-AI-LY (dshkol.com/thedaily/) checks for recently updated and neglected series and writes up releases for them with viz, links to source material, and reproducible code.

12.01.2026 20:18 - 👍 4    🔁 0    💬 2    📌 1
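
A rough sketch of how "recently updated and neglected" tables might be selected, in standard-library Python. The posts do not say how the selection actually works, so the catalogue layout (`table_id`, `last_updated`, `last_covered`), the `pick_candidates` name, and both thresholds are assumptions made up for illustration.

```python
from datetime import date, timedelta


def pick_candidates(tables, today, updated_within_days=7, neglected_after_days=365):
    """Return ids of tables that were updated recently but not covered lately.

    `tables` is a list of dicts with hypothetical keys:
      table_id     - identifier of the table
      last_updated - date the source data last changed
      last_covered - date of the last write-up, or None if never covered
    """
    recent_cutoff = today - timedelta(days=updated_within_days)
    neglect_cutoff = today - timedelta(days=neglected_after_days)
    picks = []
    for table in tables:
        recently_updated = table["last_updated"] >= recent_cutoff
        last_covered = table.get("last_covered")
        neglected = last_covered is None or last_covered < neglect_cutoff
        if recently_updated and neglected:
            picks.append(table["table_id"])
    return picks
```

With a filter like this, the expensive steps (fetching data, drafting, verifying) would only run for tables that have fresh data and no recent coverage.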

run of the mill 😀

05.01.2026 05:44 - 👍 1    🔁 0    💬 1    📌 0

the library has been tested extensively for equivalence with the corresponding R code in cancensus, but hasn't seen much use in the wild yet, so please use with caution

and please leave feedback and issues on the github page

27.10.2025 00:59 - 👍 2    🔁 0    💬 0    📌 0

this relied heavily on agentic cli coding tools (mostly cc + sonnet 4.5). agentic coders work really well when given lots of examples and a clearly defined reward function, which in this case relied on the extensive unit testing @jensvb.bsky.social and I built for cancensus

27.10.2025 00:58 - 👍 1    🔁 0    💬 1    📌 0
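
One way to picture tests acting as a "clearly defined reward function" for a port is an equivalence check between reference output and the port's output. The sketch below is an assumption, not the actual pycancensus test suite: it uses plain Python lists of dict rows (for example, reference rows exported from R beforehand), and the `equivalence_report` name is hypothetical.

```python
import math


def equivalence_report(reference, candidate, rel_tol=1e-9):
    """Compare two tables given as lists of dict rows.

    Returns a list of human-readable mismatches; an empty list means the
    candidate output is equivalent to the reference within `rel_tol`.
    """
    problems = []
    if len(reference) != len(candidate):
        problems.append(f"row count: {len(reference)} != {len(candidate)}")
    for i, (ref_row, cand_row) in enumerate(zip(reference, candidate)):
        if ref_row.keys() != cand_row.keys():
            problems.append(f"row {i}: columns differ")
            continue
        for col, ref_val in ref_row.items():
            cand_val = cand_row[col]
            if isinstance(ref_val, float) and isinstance(cand_val, float):
                # Floats are compared with a tolerance, not exact equality.
                same = math.isclose(ref_val, cand_val, rel_tol=rel_tol)
            else:
                same = ref_val == cand_val
            if not same:
                problems.append(f"row {i}, column {col!r}: {ref_val!r} != {cand_val!r}")
    return problems
```

An agent iterating on the port gets an unambiguous signal from a check like this: the task is done when the report is empty for every reference case.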
GitHub - dshkol/pycancensus: python port of cancensus R package

the cancensus R package has had 81k dls since 2018. It's the best way to interact with StatCan data for R users.

pycancensus is a full python port, with equivalent data access and manipulation grammar, output, and geospatial retrieval.

check it out here: github.com/dshkol/pycan...

27.10.2025 00:41 - 👍 10    🔁 1    💬 1    📌 1
