Robin Linacre

@robinlinacre.bsky.social

Lead developer of Splink. Data scientist at Ministry of Justice. Trustee, GiveDirectly UK. Pledgee, http://givingwhatwecan.org. All views my own.

176 Followers  |  345 Following  |  53 Posts  |  Joined: 23.09.2024

Posts by Robin Linacre (@robinlinacre.bsky.social)

Video thumbnail

We're particularly proud of how easy this is to use. The following 2-minute video demos the end-to-end process of matching 100k council tax records.

This process is more fully documented here:

moj-analytical-services.github.io/uk_address_m...

04.03.2026 12:19 | 👍 0    🔁 0    💬 0    📌 0
uk_address_matcher

Docs: moj-analytical-services.github.io/uk_address_m...
Code: github.com/moj-analytic...
Discussion forum: github.com/moj-analytic...

04.03.2026 11:44 | 👍 0    🔁 0    💬 1    📌 0

The full end-to-end process from raw OS data to 100k matched addresses can be completed in less than a minute if matching to a small geographic area such as a local authority, and about 11 minutes for the whole UK (including one-time setup). Additional 100k records take <1m

04.03.2026 11:44 | 👍 1    🔁 0    💬 1    📌 0

In addition, we have published reproducible accuracy benchmarks using publicly available labelled datasets. This allows it to be compared head-to-head with other approaches.

04.03.2026 11:43 | 👍 2    🔁 0    💬 1    📌 0

Key features:

- Python only. Set up in seconds, runs on a laptop. No separate infrastructure or services needed.
- Fast. Match 100,000 addresses in ~30 seconds.
- We provide an automated build pipeline for users wishing to match to Ordnance Survey data.

04.03.2026 11:43 | 👍 1    🔁 0    💬 1    📌 0
UK Address Matcher logo

We are pleased to release `uk_address_matcher`, a free Python package for address matching and geocoding, developed by Tom Hepworth and me.

The package has several aims: simplicity, speed and accuracy.

04.03.2026 11:43 | 👍 27    🔁 9    💬 1    📌 2
Video thumbnail

📣 NEW! I've just released the BIGGEST and perhaps most creative project I've ever worked on!

"Searching for Birds" searchingforbirds.visualcinnamon.com 🐤

A project, an article, an exploration that dives into the data that connects humans with birds, by looking at how we search for birds.

12.02.2026 10:02 | 👍 472    🔁 175    💬 25    📌 49

In case you missed it: New blog: Respectful use of AI in software development teams
www.robinlinacre.com/respectful_u...

LLMs are increasingly able to write production quality code. But what cognitive work can be delegated to LLMs without damaging the health of the team?

23.01.2026 18:57 | 👍 1    🔁 0    💬 0    📌 0
Robin Linacre's blog Probabilistic record linkage, Data Deduplication, Data Science, Engineering and the Environment

👋 I'm a FOSS dev. My blog is here: www.robinlinacre.com.

22.01.2026 20:34 | 👍 2    🔁 0    💬 1    📌 0

For anyone with FOMO wondering whether to pay for Opus 4.5/Claude Code, my experience is that OpenAI Codex is very similar in performance. i.e. both are excellent, but Claude Code is not a magic unlock

20.01.2026 18:10 | 👍 2    🔁 0    💬 1    📌 0
Post image

Excited to be a part of the initiative to "Move Fast and Fix Things", announced in Chief Secretary to the Prime Minister's speech today. One measure is an expansion of the No10 Innovation Fellowship, for which we've launched a new website!

fellows.ai.gov.uk

Speech: www.gov.uk/government/s...

20.01.2026 15:46 | 👍 2    🔁 1    💬 0    📌 0

I've wondered about this too. Feels like it'd be well suited to a Kaggle type problem where you're just after the most accurate predictive model. Feels like Claude should be able to chug away trying lots of different types of approaches, though prob need to be a bit careful about reward hacking

20.01.2026 13:44 | 👍 1    🔁 0    💬 0    📌 0

New blog: Respectful use of AI in software development teams
robinlinacre.com/respectful_u...

LLMs are increasingly able to write production quality code. But what cognitive work can be delegated to LLMs without damaging the health of the team?

12.01.2026 08:42 | 👍 7    🔁 1    💬 0    📌 2

Blows my mind how long this would have taken 5 years ago. This is where the edtech revolution is IMO, we just need experts in pedagogy to learn how to vibe code. My guess is in <3 years we'll have systems that can almost oneshot entire apps like this, inc all assets

06.12.2025 17:14 | 👍 1    🔁 0    💬 0    📌 0
Post image Post image

Made a simple game to help my daughter learn her alphabet. Took about 1 day for the full code, >200 images, >200 voice files and music

nano-banana-pro is incredible. But it's also so quick to vibe code image and audio processing scripts.
robinlinacre.com/bee_letters/
github.com/RobinL/bee_l...

06.12.2025 17:13 | 👍 1    🔁 0    💬 1    📌 0
Post image

I struggle to find no-nonsense, free and 'fun'(ish) maths games for my son (7yo) so I have been making a few

Here's another one: Maths vs monsters. This is his fav so far
rupertlinacre.com/maths_vs_mon...

Code:
github.com/rupertLinacr...

Other games/maths utilities:
rupertlinacre.com

30.11.2025 17:02 | 👍 1    🔁 0    💬 0    📌 0
Post image

OpenUK Awards 25 Open Data Category sponsored by Open Data Institute, Shortlist is live, congratulations to the shortlisted nominees: Ministry of Justice UK Splink Team (@robinlinacre.bsky.social), OpenActive, and UK Power Networks (Yiu-Shing Pang) 🍾🥂🏆

#openukawards #opensource #opendata

04.11.2025 11:30 | 👍 4    🔁 1    💬 0    📌 0
Screenshot of sample of Islington's Council Tax address data, visualised in Google Earth

More progress on #openaddresses:

Islington Council in London has released its Council Tax address list for re-use as #opendata under the Open Government Licence www.owenboswarva.com/blog/post-ad...

I've made a geocoded version by adding coordinates from ONS

#FOI #localgov #UKhousing #proptech

28.10.2025 08:46 | 👍 3    🔁 3    💬 0    📌 0

No worries - thanks for the report on the repo, we'll take a look

02.10.2025 13:08 | 👍 2    🔁 0    💬 0    📌 0
Preview
uk_address_matcher/examples at main · moj-analytical-services/uk_address_matcher Contribute to moj-analytical-services/uk_address_matcher development by creating an account on GitHub.

(Incidentally, uk_address_matcher should work ok for non-UK addresses, that's just not our focus. See examples here for how to use the package github.com/moj-analytic...)

02.10.2025 06:21 | 👍 0    🔁 0    💬 0    📌 0
Preview
GitHub - moj-analytical-services/uk_address_matcher Contribute to moj-analytical-services/uk_address_matcher development by creating an account on GitHub.

Did you try github.com/moj-analytic...?

The trie is WIP, but the idea is that it will be used as an initial step to skim off the easy ones. The remainder will go through to the main matching phase which already exists in uk_address_matcher, but is more computationally intensive

02.10.2025 06:21 | 👍 2    🔁 0    💬 2    📌 0
Post image

New ✨interactive✨ explainer: Address matching using a fault tolerant trie:

robinlinacre.com/fault_tolera...

Which illustrates a powerful technique for address matching that we're currently working on building into uk_address_matcher (github.com/moj-analytic...)

24.09.2025 07:51 | 👍 1    🔁 0    💬 1    📌 1
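
To make the trie idea in the posts above concrete, here is a minimal sketch of a token trie that tolerates a small number of unexpected tokens in the messy address. It is not the implementation from the explainer or from uk_address_matcher: the names `TrieNode`, `build_trie`, `lookup` and `max_skips` are all hypothetical, and "fault tolerance" here is assumed to mean only "skip extra tokens", which is probably simpler than the real technique.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # token -> TrieNode
        self.address_id = None  # set on the node that ends a canonical address


def tokenise(address):
    """Uppercase, treat commas as spaces, split on whitespace."""
    return address.upper().replace(",", " ").split()


def build_trie(canonical):
    """Build a token trie from an iterable of (address_id, address_string)."""
    root = TrieNode()
    for address_id, address in canonical:
        node = root
        for token in tokenise(address):
            node = node.children.setdefault(token, TrieNode())
        node.address_id = address_id
    return root


def lookup(root, address, max_skips=1):
    """Return the ids of canonical addresses that the messy address reaches
    when at most max_skips of its tokens are ignored (assumed fault model)."""
    tokens = tokenise(address)
    matches = set()

    def walk(node, i, skips):
        if i == len(tokens):
            if node.address_id is not None:
                matches.add(node.address_id)
            return
        child = node.children.get(tokens[i])
        if child is not None:
            walk(child, i + 1, skips)     # token matches: descend
        if skips < max_skips:
            walk(node, i + 1, skips + 1)  # ignore this messy token

    walk(root, 0, 0)
    return matches


canonical = [
    (1, "10 Downing Street London SW1A 2AA"),
    (2, "11 Downing Street London SW1A 2AA"),
]
root = build_trie(canonical)
easy = lookup(root, "10 Downing Street, Westminster, London SW1A 2AA")
```

In a pipeline like the one described, a messy address would only be "skimmed off" when `lookup` returns exactly one id; anything with zero or several candidates would go on to the more expensive matching phase.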

You select the columns you want, and it handles the joins for you.

It's just a rough sketch for now. I feel like it must have been done before, but I couldn't find anything. Feedback welcome!

18.08.2025 06:40 | 👍 0    🔁 0    💬 1    📌 0
Post image

When working with a complex Postgres schema, I find it time-consuming to figure out the joins.

I had an idea: a 'join generator' that traverses the relationship graph for you, and writes the joins.

You give it a dump of the postgres schema, and it gives you a UI.

www.robinlinacre.com/vite_live_pg...

18.08.2025 06:40 | 👍 1    🔁 0    💬 1    📌 0
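
The 'join generator' idea above can be sketched as a shortest-path search over the foreign-key graph. This is an illustration only, not the code behind the linked tool: the `FOREIGN_KEYS` list is a made-up schema standing in for what would be parsed from a Postgres schema dump, and every function name is hypothetical.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, child_column, parent_table, parent_column)
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
]


def build_graph(foreign_keys):
    """Undirected adjacency map: table -> [(neighbouring table, join condition)]."""
    graph = {}
    for child, child_col, parent, parent_col in foreign_keys:
        condition = f"{child}.{child_col} = {parent}.{parent_col}"
        graph.setdefault(child, []).append((parent, condition))
        graph.setdefault(parent, []).append((child, condition))
    return graph


def join_path(graph, start, target):
    """Breadth-first search for the shortest chain of joins from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, joins = queue.popleft()
        if table == target:
            return joins
        for neighbour, condition in graph.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, joins + [f"JOIN {neighbour} ON {condition}"]))
    return None  # the two tables are not connected by any relationship


def generate_sql(columns, start, target, foreign_keys=FOREIGN_KEYS):
    """Write a SELECT over the chosen columns with the joins filled in."""
    joins = join_path(build_graph(foreign_keys), start, target)
    if joins is None:
        raise ValueError(f"No join path from {start} to {target}")
    return "\n".join([f"SELECT {', '.join(columns)}", f"FROM {start}", *joins])


sql = generate_sql(["customers.name", "products.title"], "customers", "products")
print(sql)
```

Breadth-first search is used so the generated query takes the fewest joins; a real schema with several routes between two tables would need a way for the user to pick between them.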
Post image

We're working on a DuckDB community extension called `splink_udfs` to add some record linkage related functions to DuckDB. It's currently very much WIP, but you can already use it wherever you're using DuckDB.
github.com/moj-analytic...

22.07.2025 16:50 | 👍 6    🔁 0    💬 0    📌 0
Preview
Speed enhancement: 'Pushing up' common elements of CASE statements into reused computations by RobinL · Pull Request #2738 · moj-analytical-services/splink This is a clean rewrite of #2630. The rationale is explained further in #2580, but in a nutshell it eliminates repeated computations of potentially expensive functions in some backends, e.g CASE ...

For more details see

github.com/moj-analytic...

15.07.2025 16:46 | 👍 0    🔁 0    💬 0    📌 0
Preview
Speed enhancement: 'Pushing up' common elements of CASE statements into reused computations by RobinL · Pull Request #2738 · moj-analytical-services/splink This is a clean rewrite of #2630. The rationale is explained further in #2580, but in a nutshell it eliminates repeated computations of potentially expensive functions in some backends, e.g CASE ...

If you're using Splink with DuckDB you should see significant speed improvements by updating to DuckDB 1.3.x. You can also add more granularity to your comparison levels statements without an impact on run times. Depending on your model spec, it could be twice as fast or better.

15.07.2025 16:46 | 👍 2    🔁 0    💬 1    📌 0
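
As a rough analogy for why 'pushing up' the common part of a CASE statement helps, the sketch below mimics a two-level comparison in plain Python. It is not Splink's code (the real change rewrites generated SQL): `similarity` is a stand-in built on the standard library's `difflib`, and the 0.9 and 0.7 thresholds are arbitrary assumptions.

```python
from difflib import SequenceMatcher

call_log = []  # records every call to the "expensive" function


def similarity(a, b):
    """Stand-in for an expensive similarity function; logs each invocation."""
    call_log.append((a, b))
    return SequenceMatcher(None, a, b).ratio()


def level_repeated(a, b):
    """Like a CASE statement where every WHEN branch re-evaluates the function."""
    if similarity(a, b) >= 0.9:
        return 2
    if similarity(a, b) >= 0.7:
        return 1
    return 0


def level_reused(a, b):
    """The function is computed once and the result reused by each branch."""
    score = similarity(a, b)
    if score >= 0.9:
        return 2
    if score >= 0.7:
        return 1
    return 0


call_log.clear()
repeated_level = level_repeated("smith", "smyth")
repeated_calls = len(call_log)

call_log.clear()
reused_level = level_reused("smith", "smyth")
reused_calls = len(call_log)
```

Both versions assign the same level, but the reused version calls the function once however many levels are added, which is the intuition behind being able to add more granular comparison levels without a run time penalty.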

Then give output to VS Code copilot in agent mode to implement

11.07.2025 08:33 | 👍 0    🔁 0    💬 0    📌 0

My most commonly used pattern for AI coding: Dump entire source code into Gemini 2.5 pro, write prompt specifying what I want, and then: Give precise instructions for an LLM to follow to implement this feature. Break the solution down into steps where each step is verifiable.

11.07.2025 08:33 | 👍 1    🔁 0    💬 1    📌 0

I think it's more the blocking stage. UK blocking is relatively easy because postcode gets you down to about 50 or fewer addresses. So if your postcodes are accurate, blocking isn't too hard. For addresses outside the UK, you might need to lean more heavily on the signature-based approaches

05.07.2025 21:01 | 👍 1    🔁 0    💬 0    📌 0
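
A minimal sketch of postcode blocking as described above, assuming each record is just an `(id, postcode)` pair. The function names and sample records are hypothetical, and this is plain Python for illustration rather than anything taken from Splink or uk_address_matcher.

```python
from collections import defaultdict


def normalise_postcode(postcode):
    """Uppercase and remove all whitespace so 'n1 2ab' and 'N1  2AB' compare equal."""
    return "".join(postcode.split()).upper()


def candidate_pairs(messy, canonical):
    """Yield (messy_id, canonical_id) pairs that share a postcode.

    Both inputs are iterables of (record_id, postcode) tuples. Only records in
    the same postcode block are ever compared, which keeps the number of
    comparisons small when postcodes are accurate.
    """
    blocks = defaultdict(list)
    for canonical_id, postcode in canonical:
        blocks[normalise_postcode(postcode)].append(canonical_id)
    for messy_id, postcode in messy:
        for canonical_id in blocks.get(normalise_postcode(postcode), []):
            yield messy_id, canonical_id


canonical = [(1, "N1 2AB"), (2, "N1 2AB"), (3, "SW1A 1AA")]
messy = [("a", "n12ab"), ("b", "sw1a1aa"), ("c", "ZZ9 9ZZ")]
pairs = list(candidate_pairs(messy, canonical))
```

Record "c" produces no candidates because its postcode matches nothing, which is the case where a fallback such as a signature-based blocking rule would be needed.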