
bentasker.co.uk Feed

@mastodon.bentasker.co.uk

I am a bot. I exist to provide content to those who want to follow new posts on bentasker.co.uk but don't want @ben's jibber-jabber in their feed […] [bridged from https://mastodon.bentasker.co.uk/@rssbot on the fediverse by https://fed.brid.gy/ ]

4 Followers  |  0 Following  |  54 Posts  |  Joined: 14.11.2024

Latest posts by mastodon.bentasker.co.uk on Bluesky

Automatically Backing Up Ente Photos

About 6 months ago, I imported our photo collection into Ente Photos. Before that, our memories lived in Nextcloud. Although it's great for file syncing, Nextcloud didn't really work all that well for us as a photo manager (in fairness, that's probably largely because I'd pulled them in via a Shared Storage rather than putting any _real_ effort in).

Like Nextcloud, Ente is open source. Originally, I'd intended to use Ente's SaaS offering for a little while and then migrate to self-hosting once I was happy that it was the right solution. That migration never happened (I've been happy enough, and the cost is low enough, that I've not really felt the urge to move).

It does mean, though, that the availability of our (ever growing) photo collection is reliant on Ente's disaster recovery posture. Ente have not given me any reason to doubt them (in fact, their approach to reliability is well documented), but our photos are _utterly irreplaceable_ and using _any hosted offering_ comes with some risk of disappearance (or of being acquired by a company which goes on to ruin it), often without any prior warning. To their credit, this is something that Ente explicitly recognised when first introducing the CLI.

This blog post talks about using the `ente` CLI to automate a periodic incremental backup of the photos that we store in Ente. It's primarily focused on deploying into Kubernetes but also details how to do so using Docker or a native install.

* * *

### Contents

* Contents
* Containerising
* Storage
* Running In Kubernetes
* First Time Setup
* Scheduling
* TL:DR
* Manual Run
* Running Without Kubernetes
* With Docker
* Without Containers
* Caveats
* Conclusion

* * *

### Containerising

There wasn't any _particular_ need for the backup to run within Kubernetes, other than that I've had a cluster that I could run it in. The CLI is actually pretty simple, so setting things up without using Docker or Kubernetes isn't too much different (details of that are below).

To keep things lightweight, I based my container on Wolfi:

    FROM cgr.dev/chainguard/wolfi-base AS builder

Ente is written in Go, so I installed `go` and `git` before cloning Ente's source down and compiling the CLI:

    RUN apk add go git \
        && mkdir /build \
        && cd /build \
        && git clone --depth=1 --branch=$ENTE_RELEASE https://github.com/ente-io/ente.git \
        && cd ente/cli \
        && go build -o "bin/ente" main.go

This produced a standalone binary, so I copied it into a fresh image, created the directories that it needed and configured the container to run as a non-privileged user:

    FROM cgr.dev/chainguard/wolfi-base

    # Copy the built binary over
    # Make sure we also ship the license file
    COPY --from=builder /build/ente/cli/bin/ente /usr/bin
    COPY --from=builder /build/ente/LICENSE /LICENSE

    RUN mkdir /cli-data/ /cli-export/ \
        && chown -R nonroot:nonroot /cli-data/ \
        && chown -R nonroot:nonroot /cli-export/

    USER nonroot
    ENTRYPOINT ["/usr/bin/ente"]

The full Dockerfile can be found in Codeberg and my build of the image can be pulled from `codeberg.org/bentasker/ente-cli-docker`.

* * *

### Storage

The CLI requires a couple of storage volumes:

* `/cli-data`: this is where the CLI will maintain a database of image metadata (along with the creds it uses to talk to Ente).
* `/cli-export`: this is where photos will be exported to

The CLI data path can be overridden via the env variable `ENTE_CLI_CONFIG_DIR`. The export path can be any arbitrary path, but has to be provided when adding an account to the CLI's config.
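If you want to build the image yourself rather than pulling mine, a rough sketch would look like the following - it assumes that `ENTE_RELEASE` is exposed as a build argument naming the upstream tag to clone (check the full Dockerfile on Codeberg for the exact arguments it expects):

```sh
# Build a local copy of the image, pinning which Ente release gets compiled in
docker build --build-arg ENTE_RELEASE=cli-v0.2.3 -t ente-cli:local .

# Quick sanity check: run the CLI with the config dir override and ask for its help output
docker run --rm -e ENTE_CLI_CONFIG_DIR=/cli-data/ ente-cli:local --help
```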
* * * ### Running In Kubernetes #### First Time Setup The CLI isn't of _much_ use until it's linked to an account. Unfortunately, there isn't a programmatic way to pre-configure it, so I needed to spin up a pod so that I could login to the CLI. As it seemed _possible_ that I might need to manually interact with the CLI again in future, rather than manually creating a pod, I defined a deployment but set it to be scaled to 0 pods: apiVersion: v1 kind: Namespace metadata: name: ente-backup --- apiVersion: apps/v1 kind: Deployment metadata: name: ente-backup-cli namespace: ente-backup spec: selector: matchLabels: app: ente-backup replicas: 0 template: metadata: labels: app: ente-backup spec: containers: - name: ente-backup image: codeberg.org/bentasker/ente-cli-docker:v0.1 env: - name: ENTE_CLI_CONFIG_DIR value: "/cli-data/" - name: ENTE_CLI_SECRETS_PATH value: "/cli-data/.secrets" command: [ "/bin/sh", "-c", "while true; do sleep 3600; done" ] resources: requests: cpu: 150m memory: 64Mi volumeMounts: - mountPath: /cli-data name: kubestorage subPath: ente-backup/config - mountPath: /cli-export name: kubestorage subPath: ente-backup/export restartPolicy: Always volumes: - name: kubestorage nfs: server: 192.168.3.233 path: "/volume1/kubernetes_misc_mounts" readOnly: false The important thing here is that the pod needs to use the same storage volumes as our cronjob will. Scaling to 0 means that the necessary configuration will be present in the cluster when I need it, but won't waste resources by running pods unnecessarily. I scaled the deployment up to `1` so that a pod would come online: kubectl -n ente-backup scale --replicas=1 deployment/ente-backup-cli I exec'd into the new pod and triggered the account addition flow: kubectl -n ente-backup exec -it ente-backup-cli-669dff58f4-vzbsv -- /usr/bin/ente account add When prompted, I set the export directory to `/cli-export/` (you can enter whatever you want, but be aware that the path needs to exist - the setup flow won't create it for you if it doesn't). Once the account had been added, I scaled the deployment back down to 0: kubectl -n ente-backup scale --replicas=0 deployment/ente-backup-cli * * * #### Scheduling `ente` was now configured to work with my account. The next step was to configure an automated run, using a `CronJob`. The podspec is, more or less, identical to the spec used for the deployment above. The only _real_ change is the command (which invokes `ente export`): --- apiVersion: batch/v1 kind: CronJob metadata: name: ente-backup namespace: ente-backup spec: schedule: "0 4 * * *" failedJobsHistoryLimit: 5 successfulJobsHistoryLimit: 5 concurrencyPolicy: "Forbid" jobTemplate: spec: template: spec: restartPolicy: Never containers: - name: ente-backup image: codeberg.org/bentasker/ente-cli-docker:v0.1 imagePullPolicy: IfNotPresent env: - name: ENTE_CLI_CONFIG_DIR value: "/cli-data/" - name: ENTE_CLI_SECRETS_PATH value: "/cli-data/.secrets" command: ["/usr/bin/ente", "export"] volumeMounts: - mountPath: /cli-data name: kubestorage subPath: ente-backup/config - mountPath: /cli-export name: kubestorage subPath: ente-backup/export volumes: - name: kubestorage nfs: server: 192.168.3.233 path: "/volume1/kubernetes_misc_mounts" readOnly: false This schedules the job to trigger at 4am each day. * * * #### TL:DR You can grab a copy of the above config from Codeberg. 
You'll need to update the storage volumes so that they are applicable to your cluster, but once that's done, you just need to apply: kubectl apply -f ente-backup.yml Scale up the deployment so that you can login to the CLI: kubectl -n ente-backup scale --replicas=1 deployment/ente-backup-cli kubectl -n ente-backup get pods kubectl -n ente-backup exec -it ente-backup-cli-669dff58f4-vzbsv -- /usr/bin/ente account add Once the flow's completed, scale back down: kubectl -n ente-backup scale --replicas=0 deployment/ente-backup-cli Wait for the cron to trigger (or move onto the next section to trigger it manually). * * * #### Manual Run I didn't want to have to wait for the next day to find out whether the backup had run, so I manually created a job from the `CronJob`: kubectl -n ente-backup create job ente-backup-manual --from=cronjob/ente-backup I then tailed the logs kubectl -n ente-backup logs job/ente-backup-manual It took some time to work through all our photos, but eventually it logged completion: Starting a new job resulted in a quick exit, as there was nothing new to do: * * * ### Running Without Kubernetes #### With Docker For those without a cluster to hand, the container can also be run using Docker. Just as with Kubernetes, the important thing here is that volumes persist between manual invocations and cron'd runs: ENTE_BACKUP_DIR=/path/to/backups # Set up a storage location mkdir -p ${ENTE_BACKUP_DIR}/ente/data ${ENTE_BACKUP_DIR}/ente/config # Do the first time setup docker run --rm \ -it \ -v $ENTE_BACKUP_DIR/ente/data:/cli-export \ -v $ENTE_BACKUP_DIR/ente/config:/cli-data \ codeberg.org/bentasker/ente-cli-docker account add A backup wrapper would then look something like this: #!/bin/bash # # Trigger the export container ENTE_BACKUP_DIR=/path/to/backups cd "$ENTE_BACKUP_DIR" docker run --rm \ -it \ -v $PWD/ente/data:/cli-export \ -v $PWD/ente/config:/cli-data \ codeberg.org/bentasker/ente-cli-docker The backup wrapper then just needs adding to a crontab 0 4 * * * /path/to/wrapper.sh * * * #### Without Containers `ente` is a standalone binary, so can also be run without using containers _at all_. If you want to build it from source, you'll need `go` installed - see the dockerfile steps above for an indicator of how to build it. If you're happy fetching a pre-built binary, though, you can grab one from Github: curl https://github.com/ente-io/ente/releases/download/cli-v0.2.3/ente-cli-v0.2.3-linux-amd64.tar.gz | tar xvz Setup is: ente account add and the command that you need to add to cron is: ente export * * * ### Caveats There are a couple of caveats here: **Exports are per _user_ account**: although Ente allows sharing of albums between users, it's very much set up as an individual user thing1. If you've got multiple Ente users all uploading their own photos (particularly automatically), you'll need to export from each of these (you can run `account add` multiple times to add them). **There's limited overlap protection** : The `CronJob` is configured to try and prevent overlapping runs, however there's nothing to prevent manually triggering a job while another is running. I don't know exactly what the outcome of an overlapping run will be, but it's unlikely to be anything good. * * * ### Conclusion There are additional docs on the CLI available here, but the setup above provides for a scheduled incremental backup of an Ente account. 
Using this with their hosted service provides the low maintenance associated with using SaaS offerings2, but maintains some of the access to data that self-hosting would provide. The export separates images into albums (mirroring the organisation that's performed in-app), so if something _were_ to happen to Ente, the backup of our photos is already sorted for convenient import into something else. * * * 1. This is one of the things that I like _least_ about Ente - shared albums currently give something of a sub-par experience because they appear in a totally different place within the interface. ↩ 2. Well... good ones anyway ↩
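As a footnote to the overlap caveat above: if you're running the Docker wrapper from cron, a minimal `flock`-based guard (a sketch only - the paths are illustrative) stops a manual run colliding with a scheduled one:

```bash
#!/bin/bash
#
# Illustrative wrapper: hold an exclusive lock so that overlapping
# invocations (cron + a manual run) bail out rather than exporting concurrently.
ENTE_BACKUP_DIR=/path/to/backups
LOCKFILE="${ENTE_BACKUP_DIR}/.ente-export.lock"

(
    # Give up immediately if another export already holds the lock
    flock -n 9 || { echo "Another export is already running, exiting"; exit 1; }

    docker run --rm \
        -v "${ENTE_BACKUP_DIR}/ente/data:/cli-export" \
        -v "${ENTE_BACKUP_DIR}/ente/config:/cli-data" \
        codeberg.org/bentasker/ente-cli-docker
) 9>"${LOCKFILE}"
```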

Automatically Backing Up Ente Photos
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/general/automating-backup-of-photos-from-ente.html?utm_source=rss

#backup #containers #ente #kubernetes #photos

02.11.2025 22:00 — 👍 1    🔁 0    💬 0    📌 0
Monitoring a UPS with Telegraf and Grafana

Our power supply is normally pretty reliable, but last week we had an outage. Although we've got solar, we don't (currently) have an islanding switch, so when the grid goes down, so do we.

This power outage only lasted about 45 minutes, but came at a _really_ bad time: I was due to be interviewing someone, so had to try and get signal so that I could _at least_ send an SMS and tell them that we'd need to re-schedule.

I _used_ to have a UPS, but didn't replace it after the battery reached end-of-life - at the time we had a young child in the house, so having something be persistently energised seemed like quite a bad idea. That's no longer a concern though, so I decided that it was time to plug important things (laptop, switch, router etc) into a UPS - partly to protect them from damage, but also so that there's something that I can _do_ during an outage (this week, I couldn't do much more than sit and work my way through a Toblerone).

This post details the process of installing Network UPS Tools (NUT) and configuring Telegraf to collect metrics from it, allowing graphing and alerting in Grafana.

* * *

### The UPS

It doesn't matter _too much_ what model of UPS you have: NUT supports a wide range of kit. Mine has a USB connection, so we're using NUT's `usbhid` support.

My UPS is a Powerwalker VI Series UPS and shows up in `lsusb` like this:

    Bus 006 Device 015: ID 0764:0601 Cyber Power System, Inc. PR1500LCDRT2U UPS

The UPS has 4 mains plug sockets on the back, so I've got a few things plugged in:

* My router/firewall (our fiber ONT is in a different room and has its own battery backup)
* My main switch
* My NAS
* An external HDD array
* The extension lead which runs my desk

Running my desk means that it has to power a couple of monitors **and** a couple of laptops. This isn't _quite_ as bad as it sounds though:

* If I'm not at my desk, the monitors will be off and the laptops will be (relatively) idle
* If _I am_ at my desk, the plan is to unplug the laptops and have them run off battery so that they're not using the UPS's capacity

* * *

### NUT setup

#### Installing

NUT is in the Ubuntu repos, so:

    sudo apt update
    sudo apt install nut nut-client nut-server

Next we confirm that NUT can actually see the UPS:

    sudo nut-scanner -U

If all is well, this'll write out a config block:

    [nutdev1]
        driver = "usbhid-ups"
        port = "auto"
        vendorid = "0764"
        productid = "0601"
        product = "2200"
        serial = "11111111111111111111"
        vendor = "1"
        bus = "006"

We need to write that into NUT's config, so invoke again but redirect:

    sudo nut-scanner -UNq 2>/dev/null | sudo tee -a /etc/nut/ups.conf

The name `nutdev1` isn't _particularly_ informative, though, so we can also hand edit `ups.conf` to change it (and add a `desc` attribute to provide a description of the UPS):

    sudo nano /etc/nut/ups.conf

I set mine like this:

    [deskups]
        desc = "Cyber Power System UPS"
        driver = "usbhid-ups"
        port = "auto"
        vendorid = "0764"
        productid = "0601"
        product = "2200"
        serial = "11111111111111111111"
        vendor = "1"
        bus = "006"

Make a note of the name (the bit in square brackets), we'll need it shortly.
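Before moving on, it's worth an optional quick check that the driver will actually start against the renamed entry - a small sketch, assuming the `deskups` name used above:

```sh
# Stop any running driver instances, then start the driver for the renamed device.
# "deskups" is the name set in ups.conf above - substitute your own.
sudo upsdrvctl stop 2>/dev/null || true
sudo upsdrvctl start deskups
```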
* * * #### Setting Up For Monitoring Next we want to set up credentials for NUT server I used my `gen_passwd` utility to generate a random password, but use whatever method suits you: NUT_PW=`gen_passwd 24 nc` Create the user: echo -e "\n[monitor]\n\tpassword = ${NUT_PW}\n\tupsmon master\n" | sudo tee -a /etc/nut/upsd.users Now provide the credentials to `upsmon`, change the value of `UPS_NAME` to match the one that you set for the UPS in `ups.conf` earlier: # Change to match the name in ups.conf UPS_NAME="deskups" echo -e "\nMONITOR $UPS_NAME@localhost 1 monitor $NUT_PW master\n" | sudo tee -a /etc/nut/upsmon.conf Keep a note of the UPS name and password, we'll need it again when configuring `telegraf`. Configure NUT to run as a netserver (so that Telegraf can talk to it): sudo sed -e 's/MODE=none/MODE=netserver/' -i /etc/nut/nut.conf Restart services: for i in nut-server nut-client nut-driver nut-monitor do sudo systemctl restart $i done Confirm that nutserver is listening: $ sudo netstat -lnp | grep 3493 tcp 0 0 127.0.0.1:3493 0.0.0.0:* LISTEN 3854210/upsd tcp6 0 0 ::1:3493 :::* LISTEN 3854210/upsd Check that we get data back about the UPS: upsc $(upsc -l 2>/dev/null) 2>/dev/null If all is well, we're ready to move onto collecting data. * * * ### Collection and Visualisation With NUT now able to report on the UPS, the next step is to have that data collected so that we can visualise it and (optionally) alert based upon it. * * * #### Telegraf We're going to use the upsd input plugin to talk to NUT. This was introduced in Telegraf v1.24.0 so, if you're using an existing install, make sure that your `telegraf` is recent enough: telegraf version If you don't have Telegraf, there are install instructions here (note: you're also going to want an InfluxDB instance or free cloud account because the Dashboard that we'll use for visualisation uses Flux). The input plugin is pretty simple to configure, append the following to `/etc/telegraf/telegraf.conf`: [[inputs.upsd]] ## A running NUT server to connect to. ## IPv6 addresses must be enclosed in brackets (e.g. "[::1]") server = "127.0.0.1" port = 3493 # The values for these are found in /etc/nut/upsmon.conf username = "deskups@localhost" password = "[redacted]" additional_fields = ["*"] # Map enum values according to given table. ## ## UPS beeper status (enabled, disabled or muted) ## Convert 'enabled' and 'disabled' values back to string from boolean [[processors.enum]] [[processors.enum.mapping]] field = "ups_beeper_status" [processors.enum.mapping.value_mappings] true = "enabled" false = "disabled" After restarting (or reloading) `telegraf`, you should start to see metrics appearing in InfluxDB: * * * #### Visualisation I use Grafana for visualisation and, conveniently, there was already a community dashboard (the source for which can be found on Github). On the community page Click `Download JSON`. Then, in Grafana * `New Dashboard` * `Import JSON` * Drag the JSON file over You'll be presented with a set of options for the Dashboard - choose the relevant InfluxDB datasource to query against: You'll then be taken to the dashboard itself. It's quite likely that the dashboard will be broken - by default it looks for a bucket called `upsd-Telegraf` (I write into a bucket called `telegraf`). 
To fix it * `Settings` * `Variables` * `bucket` Scroll down to find `Values seperated by comma` and change it to contain the name of your bucket Click `Back to Dashboard` and the dashboard should now load: I already track electricity costs, plus we're on a 30 minute tariff, so I also edited the dashboard to remove the cost related row (and then the associated variables). * * * #### Alerting The `upsd` measurement contains a field called `ups_status` which will normally be `OL` (online). If the mains cuts out (or someone unplugs it to test behaviour...) the value will change to report that the UPS is running from battery: Note: The new state `OB DISCHRG` isn't actually a single status, it's reporting two (closely related) status flags. After power is restored, the UPS reports itself back online _but_ also notes that the battery is now charging: This means that creating an alert is **not** as simple as `if r.ups_status != "OL"`. I also only _really_ wanted an email notification to warn me of the following status symbols: * We're running from battery (flag: `OB`) * The UPS is reporting an alarm (flag: `ALARM`) * The UPS is reporting that the battery charge is too low (flag: `LB`) * The UPS is reporting overload (flag: `OVER`) * The UPS requires battery replacement (flag: `RB`) RFC 9271 is quite well designed in that no defined symbol exists as a sub-string of another, so we can safely do something like: for flag in ["OB", "ALARM", "LB", "OVER", "RB"]: if flag in ups.status: alarm() Of course, to do that with Grafana's alerting we need to translate the logic into a Flux query: // Define the regex to use when checking for alertable states alarm_regex = /(OB|LB|OVER|RB|ALARM)/ // Extract reported status from(bucket: "telegraf") |> range(start: v.timeRangeStart, stop: v.timeRangeStop) |> filter(fn: (r) => r["_measurement"] == "upsd") |> filter(fn: (r) => r["_field"] == "ups_status") |> group(columns: ["ups_name", "_field"]) |> keep(columns: ["_time", "_value", "_field", "ups_name"]) |> aggregateWindow(every: 1m, fn: last, createEmpty: false) // Identify whether the status contains any flags of concern // Grafana alerting requires the main column to be numeric // so we need to shuffle things around |> map(fn: (r) => ({ _time: r._time, //flags: r._value, ups_name: r.ups_name, _value: if r._value =~ alarm_regex then 1 else 0 })) |> group(columns: ["ups_name"]) The return values of this query are based on whether any of the problematic flags exist - if they don't, it'll return 0, if they do the value will be 1. This allows use of a simple threshold in the grafana alerting config: With the alert saved, I unplugged the UPS and waited: A minute later, the alert was escalated to Pagerduty: A couple of minutes after plugging the UPS back in, the alert recovered. * * * ### Conclusion Setting up monitoring of the UPS was pretty easy - NUT supports a wide range of devices and exposes status in a standardised way. NUT is well supported by Telegraf and there was _already_ a community dashboard available to visualise UPS status. This means that, in practice, the hardest part of all of this was fishing the relevant power leads out of the rack to plug into the back of the UPS. Now, if the power fails, I _should_ (depending on whether our fiber connection is still lit up) get a page to warn me. Either way, the UPS will provide some coverage for small outages.
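One extra that can help while testing (it isn't part of the setup itself): keep an eye on the raw status that NUT reports while you pull the plug, and compare it with what lands in Grafana - again assuming the `deskups` name from earlier:

```sh
# Poll the reported UPS status every few seconds during an on-battery test
watch -n 5 'upsc deskups@localhost ups.status'
```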

New #Documentation: Monitoring a UPS with Telegraf and Grafana
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/linux/monitoring-a-UPS-with-telegraf-and-grafana.html?utm_source=rss

#alerting #electricity #grafana #monitoring #telegraf #ups

27.10.2025 14:15 — 👍 1    🔁 0    💬 0    📌 0
Setting up Anti Hot Linking Protections for ChatGPT

I'm **not** going to start this post with the words "I don't like LLMs" because, at this point, it'd seem a little redundant. What I _really_ don't like, though, is paying to support the business model of companies like OpenAI. It's bad enough that their business is built off the back of broad scale plagiarism, but, on top of that, their activities _continue_ to cost website operators money and resources.

Although it's understandable that it might come to mind, I'm _not_ referring to the repeated crawling of their scrapers (along with the activity of all the other crawlers trying to get in on this latest gold rush).

ChatGPT's web search mode is able to search the web and then summarise results (not unlike the pseudo-browsing experience that they now promise with Atlas). When the LLM's response includes images, they are hotlinked directly from the original website. Despite their _ridiculous_ valuation, OpenAI have apparently been too cheap to build an image cache into their product (or perhaps, are trying to sidestep copyright concerns).

This means that, every time ChatGPT includes one of my images in its answer, **I** pay for the bandwidth necessary to serve it to ChatGPT's customer (who will very likely never visit my site or read an _accurate_ representation of what I've written). Whether or not we're talking about a small or a large amount of money, this is a **500 billion dollar** AI company freeloading on other people's bills (it _is_ somewhat fitting that a company which acts as a leech on the arse of creativity would _also_ be a bandwidth leech).

I've been feeling somewhat grumpy this weekend anyway, so I decided to mitigate this by catching the requests and redirecting to a smaller (and therefore _cheaper to serve_) image.

* * *

### Identifying Requests

`chatgpt.com` sets a `Referrer-Policy` of `strict-origin-when-cross-origin`:

This tells the user's browser that, for cross-origin requests (i.e. those that go to another domain), it should include a `referer` header specifying scheme and domain only. So, when `ChatGPT.com` embeds one of my images the request headers look like this:

The existence of that `Referer` header means that it's easily possible to identify requests which originated from `chatgpt.com`.

* * *

### The Block Page

Twenty odd years ago, it was quite common for anti-hotlinking protections to serve up a different image to the one that the user's browser had requested1. Although outright dropping the request is possible, doing so can lead to support overhead: well-intentioned people will helpfully tell you that your images aren't working in ChatGPT. To avoid this, I wanted it to be quite clear that the request was blocked - the easiest way to achieve this was to serve an image which indicated the objection.

Bing's AI **really** didn't want to generate the image that I wanted - it seems that Microsoft have configured the filters to try and avoid showing Trump in any kind of satirical or mocking context2, even if what's being asked for is a depiction of something that exists in real life:

Still, this is not my first rodeo, so I eventually got Bing to generate the imagery that I wanted (though I did add the text by hand):

The image is hosted on a dedicated subdomain, which should allow me to more easily see how often it's used.
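If you want to check the `Referrer-Policy` behaviour described above for yourself, it's easy to inspect from a terminal (a quick sketch - the value served may, of course, change over time):

```sh
# Inspect the Referrer-Policy header that chatgpt.com currently serves
curl -sI https://chatgpt.com/ | grep -i 'referrer-policy'
```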
* * *

### The Ruleset

Although ChatGPT referrals make up an insignificant proportion of my real traffic, I didn't want to interfere with the few users who were actually visiting a page from there: the aim was to **only** impact hotlinking. The ruleset therefore needed to consider what was being requested:

    IF
        request is for an image
        AND (referrer contains chatgpt.com OR referrer contains perplexity.ai)
    THEN
        redirect to blockimage

I use BunnyCDN, so the constructed ruleset looks like this:

If these rules match, the CDN serves up a temporary redirect (a HTTP 302) to send the user's browser to the block image.

* * *

##### Including The App

The ruleset above only accounts for people who visit `chatgpt.com` in their browser. Although there are _obviously_ some who do that (otherwise they wouldn't have appeared in my logs in the first place), it's quite likely that they're in the minority. We also need to account for embeds within the app, which (rightfully) doesn't set a `Referer` header. We can, however, identify the app by its user-agent:

> ChatGPT/1.2025.287 (Android 13; FP4; build 2528715)

This is _different_ to the user-agent that ChatGPT uses when fetching something (like a web page) to feed into the LLM for summarisation. A second ruleset catches the app's embeds:

* * *

### Testing

My logs indicate a particular bias towards hotlinking of images included in Vauxhall repair posts (I've no idea why, it's not like they're uncommon cars). So, I went to `chatgpt.com`, toggled the `search` lozenge and asked it to provide me with images showing how to replace the oil pressure sensor on a Corsa D.

The result was even better than I'd expected: I hadn't considered that `chatgpt.com` would crop the image, but the effect is all the better. If the user taps the image, ChatGPT opens a modal displaying the full image:

Because the CDN serves a temporary redirect (a HTTP 302), the correct images **are** displayed if the user actually clicks the link to visit my site (and will continue to display correctly while the images are in their cache).

I couldn't test the mechanism with Perplexity because they actually seem to have stopped hotlinking my images. Although I'm not complaining, it's a little odd: they still hotlink images from other sites _and_ Perplexity is perfectly willing to regurgitate my content. I've no idea whether that's just luck or whether it might be related to my previous anti-hotlink setup for Perplexity.

* * *

### Robustness

Anti-hotlinking protections haven't been _particularly_ robust _for years_. They used to be a "good enough" measure because browsers sent a `referer` header by default and most users wouldn't know how to (or wouldn't bother) changing that.

However, that changed with the introduction of the Referrer-Policy header, which allows sites to instruct their visitors' browsers to send a more limited `referer` header (or not to send one _at all_). This means that `chatgpt.com` _could_ trivially side-step this mechanism by updating their site to set `Referrer-Policy` to `no-referrer`. Of course, it'd be an obvious bad faith move when they could also do what they should have done from the outset: set up a cache so that it's _them_ carrying the bandwidth bill3 rather than the people whose content they're (mis)using.

There are a variety of more robust approaches (including tokenisation), but as long as `referer` headers are available, it's probably not yet worth the additional effort.
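If you want to exercise rules like these from a terminal rather than via chatgpt.com itself, the requests are easy to fake - a rough sketch (the image URL is a placeholder, and the exact response will depend on how your CDN rules are set up):

```sh
# Simulate a browser-side hotlink from chatgpt.com: expect a 302 to the block image
curl -sI -H 'Referer: https://chatgpt.com/' \
    'https://www.example.invalid/images/some-image.png' | head -n 3

# Simulate the app (no Referer, app user-agent) to exercise the second ruleset
curl -sI -A 'ChatGPT/1.2025.287 (Android 13; FP4; build 2528715)' \
    'https://www.example.invalid/images/some-image.png' | head -n 3
```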
* * *

### Conclusion

I appreciate that, for some, it might come across as petty to be complaining about what _should_ be quite small costs. However, they're still costs that I incur _entirely_ for someone else's benefit: if I _wanted_ to support OpenAI, I'd be paying a monthly subscription.

Aside from this being **another** example of AI companies outsourcing what should be their own costs, it's also a matter of freedom. If, as some contend, AI companies are free to consume the _entire public commons_ and regurgitate error-prone facsimiles of it, I am _just_ as free to serve up whatever I see fit in response to requests for my content.

It _is_ true that I _could_ have served a simple "request blocked" JPG but, in a political context where Trump is issuing executive orders that will censor AI, it's **much** more amusing to ensure that the product of one of his ~~minions~~ supporters serves something more pertinent to the situation.

* * *

1. They tended to be quite explicit (or worse, Goatse) ↩
2. Which is quite fitting, really, considering the fact that I wanted the image to show billionaire CEOs as being Trump lackeys ↩
3. This is far from a niche idea and is what Google, whose activities _actually bring my site traffic/benefit_, have done for years. ↩

New #Blog: Setting up Anti Hot Linking Protections for ChatGPT
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/general/blocking-chatgpt-from-hotlinking-to-my-media.html?utm_source=rss

#ai #chatgpt #genai #hotlinking #llm #perplexityai #theinternet

26.10.2025 16:00 — 👍 1    🔁 0    💬 0    📌 0
Marking 20 Years of Blogging Although I had previously played around creating various mini-sites, today marks the 20th Anniversary of the initial posts onto my first _durable_ site (`benscomputer.no-ip.org`), which would eventually evolve into this blog. I've always primarily written about tech, but have also never sought to limit myself to it. As a result, I've written about a range of topics including privacy, dogs, health, legal dealings, solar, vehicle maintenance and even creating furniture. Life comes with highs and lows and, while I haven't always written _about_ the lows, I have attempted to continue to write _through_ most of them. The aim of this post is to take a quick look back over my first twenty years of publishing. By it's very nature, this kind of post can be a _little_ self-indulgent, so I've made a conscious effort to try and keep that to a minimum. Admittedly, I **did** get a little carried away with some of the statistics. * * * ### Contents * Contents * The Early Days * Then & Now * Posting Stats * Post Tagging * Words * Highlights * Conversations * Conclusion * * * ### The Early Days Much like my current site, benscomputer.no-ip.org was self-hosted. The site lived on an old Fujitsu-Siemens Scenic P300 connected to the (_extremely_ modest) broadband connection that I'd arranged for the HMO that I was living out of at the time. The PC wasn't dedicated to hosting - it was also my day-to-day driver (something that didn't change until I got over-excited after misreading a pricing label in Cash Converters. It wasn't until I got home that I noticed I'd paid 10x more for the second hand Thinkpad than I'd meant to. They refused to refund and I spent the rest of the month hungry as a result). Although I was _interested_ in tech, I wasn't yet working in it - I was a sales assistant in a small Threshers branch. I'd created websites by hand before, but I remember wanting my focus to be on _what I was writing about_ and so chose to use OpenOffice to create the first posts. The traces of that are visible in the page metadata: <meta name="GENERATOR" content="OpenOffice.org 1.1.4 (Linux)"> <meta name="AUTHOR" content="Ben Tasker"> <meta name="CREATED" content="20051007;20080000"> Everything was implemented using inline HTML, I didn't start using CSS until nearly a year later (and even then, only after someone on a forum had convinced me of it's potential). On the first day, I actually published **two** posts, with one following directly on from the other (in hindsight, it does feel a bit "_and another thing..._ "). * DRM is evil * DRM has to be stopped I later changed the site from static HTML to a self-written CMS which used Server Side Includes to populate content by calling BASH scripts. One odd consequence of that transition is that those first two posts appear in the wrong order on the archived copy of the site's index (something that, apparently, has taken twenty years for me to notice). * * * #### Then & Now I can't say that those early posts are something that I feel _particularly_ proud of: My early writing style was _very_ different to the one that I've grown into. A contemporary example of that can be seen by comparing my recent post on ID cards to one on the same topic in 2007. They are both quite opinionated and I now write at quite a bit more length, but the newer post (IMO) makes for much more comfortable and convincing reading. 
For example, in the earlier post, I wrote: > Quite frankly I believe Mr Blair has his head buried in one of two places, either the sand or his own, probably best not to go there. This is quite an awkward passage to read. I can _hear_ my younger self saying it, but it still doesn't read that way. The newer post contains significantly more reference to stats and metrics, but also makes a statement of authority: > There is, though, a simple truth which underpins all of this: there are **no** simple solutions to systemic issues. Whether it's illegal immigration, political corruption, terrorism or inequality, anyone who promises that "this one thing" will address it is either mistaken or lying. Read in context, both passages convey that the Government are full of shit, but the more recent one does so in a _much_ more accessible manner. Though they may not be to my (current) taste, my early posts **are** still an important part of what led me to where I am now. Although my previous writing style now feels a little awkward and unfamiliar, the posts _do_ also include details that are still part of my life today: > Secondly completely boycott ANY DRM CDs, non DRM versions are usually availbale [sic], so contact the vendor and ask for a DRM free version. I _still_ look for options without Digital Rights Management and contact publishers/authors to ask if un-encumbered options exist. Admittedly, because of the spread that consumer hostile technologies have achieved, I _do_ now allow myself a few more options than I did before: 1. Find unencumbered option 2. Can I get by without it? 3. Take the encumbered option, strip the DRM and send a grumpy email 4. Go without Of course, my writing style wasn't the _only_ thing to change with time - the site did too and, in June 2011, I launched `bentasker.co.uk`. Since then, the site has been through multiple iterations, most of which are captured in a post that I wrote when migrating to Joomla! 3 (before later moving to Nikola). The sitehistory tag also contains screenshots of each era of the site. * * * ### Posting Stats I don't routinely collect stats about my posting activity. They're are also a bit harder to come by with static sites because there isn't a convenient database to run queries against. So, I hacked together a script to scrape posting information from `www.bentasker.co.uk` and the `benscomputer.no-ip.org` archive. The script was then able to write simple stats into a time series database for querying and visualisation: Note: the overlap between `bentasker.co.uk` and `benscomputer.no-ip.org` exists because (in an effort to ensure that it was preserved) I started republishing some old content. However, I eventually found a _meaningful_ backup of the original site and published the archive instead. Over the last 20 years, there, quite visibly, have been _some_ gaps in posting. Some of those gaps are larger than others, but if we chart out the number of months, per year, with at least one posting we can see that some years _really did_ fare worse than others: 2017 did really poorly, with me neglecting the site for _8 months_ of the year. Most of the years with big misses, though, seem to be because I was focusing on something else. For example, if we factor in posting to `snippets.bentasker.co.uk`, I only actually missed 3 months of 2017. There are still exceptions to this, though, and I've written in the past about sometimes suffering from writers block. 
Breaking my posts down by category leads to a _messy_ chart, but if we look at the proportional makeup of the 10 biggest categories, we can see quite a significant bias towards Linux documentation1.

`documentation:linux` and `blog:general` are both quite broad categories, so their dominance isn't too surprising. What _is_ quite interesting, though, is seeing `blog:house stuff` in third place: I only launched that category in 2021, so it's only taken four years for it to start to break into the top three (our solar probably has a lot to do with that).

If we chart out the top 3 categories per year, we can also start to see how what I've chosen to write about most has shifted over the years:

In many ways, this is a reflection of how _my_ life has changed and, in particular, how I spend my free time. Some of it is borne out of necessity rather than choice, but I now spend a lot more time doing things around the house than I do fiddling with new software.

* * *

#### Post Tagging

Tag based stats only represent more recent posts - the concept of attaching tags to posts didn't really exist when I first started and wasn't available in Joomla! until 2013.

My site's JSONFeed includes details of which tags are applied to each post:

    {
        "id": "https://www.bentasker.co.uk/posts/blog/privacy/id-cards-come-around-again.html",
        "url": "/posts/blog/privacy/id-cards-come-around-again.html",
        "title": "ID Cards Have Come Around Again",
        "date_published": "2025-10-03T21:21:00+00:00",
        "tags": [
            "blog",
            "id cards",
            "identity",
            "privacy",
            "united kingdom"
        ]

This makes calculating stats on tagging frequency quite easy:

    import requests
    from collections import Counter

    tag_counters = {}
    r = requests.get("https://www.bentasker.co.uk/feed.json")
    for entry in r.json()["items"]:
        for t in entry["tags"]:
            if t not in tag_counters:
                tag_counters[t] = 0
            tag_counters[t] += 1

    # Turn into a counter
    c = Counter(tag_counters)

    # Exclude top-level category names
    # unsurprisingly, they dominate
    excludes = [
        "documentation",
        "blog",
        "republished",
        "video"
    ]

    # Maintain a counter - we're going
    # to be excluding entries but still
    # want to show the top 10
    x = 0
    for t in c.most_common(30):
        if t[0] in excludes:
            continue

        print(f"{t[0]}: {t[1]} posts")
        x += 1
        if x >= 10:
            break

This indicates that the top 10 most used tags on `www.bentasker.co.uk` are:

Tag | Uses
---|---
linux | 99
howto | 92
security | 81
privacy | 73
house stuff | 67
release notes | 48
infosec | 45
vehicles | 43
software | 39
opinion | 37

There _is_ a limiting factor here, though: I've used a _huge_ number of tags over the years, which (until recently) left my tags listing looking somewhat ridiculous:

The density of this made it quite difficult for me to skim read existing tags when publishing new posts, so there are likely to be more than a few posts which are missing at least one relevant tag.

* * *

#### Words

Given that I'm talking about writing, it'd be remiss of me not to take a peek at word count.

| Words | Profanities | Profanity Ratio | Avg Words Per Post
---|---|---|---|---
`benscomputer.no-ip.org` | 220,586 | 102 | 1:2163 | 585.1
`www.bentasker.co.uk` | 865,570 | 244 | 1:3548 | 1059.4
**Total** | **1,086,085** | **346** | **1:3139** | **909.7**

The total word count is actually quite a bit lower than I expected. I suspect, too, that the rate of profanity is quite a bit lower than _people who know me_ would expect.
Python's `collections.Counter` allows us to check which words I'm most likely to use:

    c = Counter(profanity_counter)
    for t in c.most_common(20):
        print(f"{t[0]}: {t[1]}")

The top place makes sense, because I used to have a software project called BUGGER. The second place also makes sense because, well, I **do** use that word quite a lot in day to day life.

Which leads me to something quite interesting. Earlier in this post, I observed that my original writing style was quite conversational: I wrote things, more or less, as I would _say_ them. Combining that with `benscomputer.no-ip.org` having a higher proportion of profanity, I would _expect_ that it also included a _wider_ range of that vocabulary. But... that's **not** the case - in fact, it's not even possible to populate a top 10:

    bugger: 74
    fuck: 15
    shit: 5
    cock: 4
    bollocks: 2
    knob: 2

If we assume that every use of `bugger` was in relation to the software project, the rate of profanity on `benscomputer.no-ip.org` drops considerably: once every 7876 words.

I've always sworn a lot, a condition that was exacerbated by working with the military, but the increase can't be driven by that because those years passed while I was still posting to `benscomputer.no-ip.org`. So... when did it start?

There was a fairly low frequency of posts with profanity, but the rate spiked in 2021, recovered and then remained in an elevated position from late 2022 onwards. That increase _broadly_ aligns with me coming off of Twitter: had I still been on Twitter, it's quite possible that some of the swearier posts would have instead been Twitter threads.

Curiously, we don't see the same trend when looking at the _number_ of profanities per month:

So, I've written _more posts_ containing profanity, but the overall level of swearing isn't much higher than it was in 2018. It's not _entirely_ clear why that is.
Distilling those down to a small set of highlights is, inevitably, quite difficult, but the following are the experiences that have stuck in my mind: * Taking a (now defunct) ISP through arbitration * Convincing a local council to make their permit parking system more inclusive * Creating a children's book * Being part of the fight against the introduction of Phorm in the UK * Messing around with Powerline adapters (and later discovering a vulnerability in them) * Writing about health and medication issues and living with chronic pain * Having my research into the "unhackable" Bitfi cryptowallet referenced at Defcon by a researcher _dressed as Pikachu_ * Catching a digital intelligence outfit in the course of making a mistake * Giving my fishtank an IP address * Mistreating Bing's AI Chat, Perplexity and Deepseek (warning: there isn't enough mind bleach _in the world_) * Hyperfocusing on packet captures and the behaviour of Golang's net/http * Spending an afternoon in the Sizewell Nuclear Power Station control room simulator * Creating a TV station * Somehow getting mistaken for a qualified battery storage engineer _by the supplier_ * "Dating" a romance scammer Of course, it's not _all_ been sweetness and honey - I've also had legal threats, abusive emails, nuisance abuse complaints and sometimes quite concerted efforts to compromise the site. But, the positives definitely outweigh the negatives. * * * #### Conversations Blogging _sounds_ like a one-way interaction: I post something and then people (hopefully) read what I've written. But, I've never found that to truly be the case. Over the years, I've had a lot of wonderful conversations with people who've read _something_ that I've written and felt motivated enough to contact me about it (whether via email or social media). Some made contact to provide feedback, others had questions that my post didn't answer and _occasionally_ it was to let me know that motivated them to do or create something (DNSChat seems to have been _particularly_ good at that). I've always written _for me_ , but there's something quite gratifying about finding out that it's helped someone else. * * * ### Conclusion I've never been _particularly_ prone to looking to the future, but I don't think that I'd ever have expected to still be _enjoying_ writing 20 years later. One of the things that has really stood out to me while preparing this post, is just how different my writing style has become. My writing used to be _extremely_ conversational and _very_ opinionated. Although that change is a good thing, it does make me wonder what my writing style might be in another 20 years time. Over the years, I've been extremely fortunate in having the time and opportunity to explore and write about quite a range of things. It's been **a lot** of fun and, hopefully, the next few decades will be too. For me, Microsoft ~~Bing AI Chat~~ CoPilot summed it up best: * * * 1. BTW, I use Arch ↩

New #Blog: Marking 20 Years of Blogging
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/general/marking-20-years-of-blogging.html?utm_source=rss

#sitenews #sitehistory

07.10.2025 12:30 — 👍 0    🔁 0    💬 0    📌 0
Stripping and Cleaning a SMK XS26 Air Pistol I've got an SMK Synergy XS26 Air Pistol. It's a pretty satisfying gun to shoot, but ever since I got it, it's been intermittently dieseling (when lubricant gets from the cylinder into the barrel and ignites under pressure). It dieseled whilst attached to my chronograph and was measured throwing a 15.89 grain pellet out at 430 foot/s. That works out to around 6.53 ft/lbs, which is _over_ the UK legal limit for an air pistol! A louder bang and bit of smoke might be a bit of fun, but getting prosecuted for a firearms offence would _definitely_ not be (however much bad luck it might take to reach that stage). On top of that, dieseling isn't great for the seals or shot consistency, so I decided to strip the pistol down and clean it out. This post details the process of disassembling and re-assembling a Sportsmarketing Synergy XS26 Air Pistol * * * ### Strip Down Before you start: be aware that there are a number of small (and easy to lose) parts. Do this on a clear workbench and move slowly so that nothing can spring loose. Ensure that the breech is empty and that the pistol is not cocked. Turn the safety _off_. There's a screw on either side of the gun, remove both: There's also a screw on the underside: Keeping the gun upside down, grip the barrel with one hand and carefully pull the stock up and away. Just behind the trigger, there should be a small plastic pin, backed by a spring. The pin is loose, so if it's not there, it's either still in the stock or has fallen out onto your workbench. Take them out (you can safely tip the gun to get the spring out) and set safely to one side: Now firmly grip the bit of plastic which carries the rear sight and slide it back off the cylinder: Just ahead of the trigger mechanism, there's a pin holding the trigger assembly onto the cylinder. Knock it through and remove it: Towards the back of the assembly is pin with a much bigger diameter - **DO NOT REMOVE THIS**. It doesn't hold the assembly onto the cylinder - it's the pin that the sear pivots on. You also need to take care not to lose it, mine was extremely loose and fell out. If yours also falls out, be aware that there is also a small spring in there - mine went AWOL so I had to replace it (more on that later). Just behind the sear, there are two plastic hooks which sit inside the cylinder. So, slide the entire trigger assembly backwards and you should then be able to lift it up and away: Next, we're going to remove the barrel. Where it meets the cylinder there's a large flat head screw - loosen and remove this, but take care not to lose the nut from the other side. This should leave a sleeve visible: Drive this through. Unhook the cocking lever from the cylinder Pivot the barrel out and away. You should now be left with just the cylinder The next step is removing the end plug and spring guide. However, the spring is under tension, so we need to use a clamp to take up some of the tension before it's possible to drive the retaining pin through. The cap has a dip in it, so I put a socket against it to help the clamp reach. Note: this is far better and safer with a G cramp (or any other type that screws/unscrews), but mine was too big. With the clamp providing opposing tension, it should now be easy to drive the retaining pin through (if it isn't, you've either over-tensioned the clamp or not provided enough): Now we need to release the tension - **this** is where you _really_ want to have used a screw type clamp. 
I put a heavy blanket over it to stop anything shooting across the workshop and then used a piece of wood to knock the clamp release. Once released, the end plug and spring guide stick a little way out of the cylinder:

There's an important thing to note here. In my photo, there's a big hole in the top - there _should_ be a little silver nut in there, but my poorly controlled spring release led to it flying off. I realised later and fished it out of the blanket, but it's missing in a couple of the photos below.

The piston can now be slid out of the cylinder, leaving you with a dismantled pistol:

* * *

### Cleaning

The reason that the gun had been dieseling so frequently was that it was _absolutely lousy_ with oil, which had obviously been making its way ahead of the piston seal. I sprayed some carb cleaner on to clean the spring, piston and cylinder. After drying everything off, I gave the piston and spring a pretty liberal application of silicone airgun grease (being careful not to apply any ahead of the seal):

Note: it's important to use a grease specifically formulated for airguns - using a random grease could introduce ingredients which are more prone to ignition, giving a _much_ bigger bang than intended.

* * *

### Re-Assembly

Once the piston and spring have been slid back in, we need to compress the spring enough to be able to push the retaining pin back through. First, put the nut back into the hole in the end cap.

I couldn't safely put a socket between the end cap and the clamp this time - it kept trying to bend off to an angle and was pretty clearly at risk of pinging off. So, I put the clamp directly against the end-cap and closed it down as far as it would go. Unfortunately this wasn't _quite_ far enough, but it **was** far enough for me to be able to push a screwdriver through. So, I used a screwdriver to hold the clamp in place before loosening the clamp slightly and sliding a single penny in between.

Tightening the clamp again allowed the hole to line up so that I could easily push the retaining pin through:

I applied a small amount of grease to the sides of the barrel and pushed it back in place, before sliding the sleeve through.

I pushed the bolt through and tightened the nut onto the end of it.

I slid the cocking lever back into place.

Next it was time to reinstall the trigger assembly. Hopefully yours didn't come apart during disassembly. If it **did** then skip to the trigger assembly section below before continuing.

I found that it was easiest to re-install with the sear engaged: If yours isn't engaged, run a flathead backwards along it and it should click into place.

Carefully line the hooks up with the slots at the back of the cylinder and then slide the entire assembly forwards. Push the front retaining pin back into place (note: I found this was one of the most awkward bits of the whole process, you might need to use a screwdriver as a lever to carefully help line the holes up).

Slide the rear sight housing back into place - you may need to carefully lift the sight itself as you do this: the windage adjuster is a screw which pushes against the cylinder to raise the sight up:

Flip the gun upside down and put the small spring and plastic probe back in place (rounded end facing out).

Carefully slide the stock back over before reinstalling the side screws.

Install the screw on the underside of the butt (if you _did_ forget to put the nut back into the end-cap, you're about to find out!)
It's re-assembled: * * * #### Safety Checks However, you're not quite done as you also need to perform some safety checks. Go somewhere safe and then work through the following checks, stopping if any fail: 1. Ensure that the safety catch operates (it should not be possible to pull the trigger with it engaged) 2. Shake the pistol, you should not hear any rattling 3. Ensure that the pistol cocks reliably 4. Ensure that the pistol fires reliably (make sure there's a pellet in there, springers shouldn't be fired empty) 5. Repeat 3 and 4 a few times to ensure that it's consistent 6. Cock the pistol but _do not_ load a pellet. Holding the pistol out and away from you, use your other hand to hit and knock it - ensure that knocks don't lead to an accidental dishcharge. 7. If you've somewhere you can safely do so, drop it onto some grass: again, it should not fire 8. Finally, load a pellet and fire * * * #### Trigger Assembly Rebuild If, like me, your trigger assembly came apart, it'll need re-assembling. It should go without saying, but it's _extremely_ important that the assembly is put together properly and works reliably. If it's re-assembled incorrectly, the best case scenario is that the pistol won't cock, but the worst case is that the gun fires without you touching the trigger! If you've still got your user manual, there should be an exploded parts diagram on page 10. The trigger assembly looks like this **Hopefully** it's only your sear that has come out. The right hook (number 36 in the diagram) sits at the bottom of the assembly. It's the bit visible on the top of this picture: The sear spring sits over the knobble on the end of the right hook so that it can push up to the bottom of the sear. As I mentioned earlier, my sear spring went missing1. After searching the workshop thoroughly, I still hadn't found it, so I took a spring out of a clicky pen and cut it down to the right length. Although it's tempting to put the sear in first and then work on the assembly upside down, doing this doesn't really work - without the right hook in place, the spring pushes it too far round. Once the right hook and sear spring are installed, the sear should go in with its hook at the front and the slanted section at the back: With both installed, you should be able to look into the inspection hole and see the sear engaging onto the corner of the hook It should not be possible to push the sear forward. The idea is that, when it's locked like this, it'll be holding the piston back. When you pull the trigger, though, the right hook should move and the sear should click forwards Using a flat implement, you should be able to reliably click the sear back and then pull the trigger to release it: There should be a video here, you may need to visit in a browser Now that your trigger assembly is back together, go back to installing it. * * * ### Conclusion SMK air rifles and pistols are pretty popular, but they're also built to a cost. That _can_ make working on them a little frustrating as they're made up of quite a few loose little bits. Stripping the XS26 down and giving it a good clean _does_ seem to have resolved the dieseling issues. The pistol is still a little down on power compared to the manufacturer's claimed spec, but has more than enough for my needs. * * * 1. I actually have some doubts about whether it was ever installed in the first place - the gun hasn't always cocked reliably, something that's ceased to be an issue since this rebuild ↩

New #Documentation: Stripping and Cleaning a SMK XS26 Air Pistol
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/shooting/smk-xs26-stripdown.html?utm_source=rss

#shooting

06.10.2025 21:45 — 👍 0    🔁 0    💬 0    📌 0
ID Cards Have Come Around Again

Back in 2006, Tony Blair was _extremely_ keen on the idea of introducing mandatory ID cards. Fortunately, they were scrapped by the coalition government following Labour's 2010 general election drubbing (for the historians, though, it is worth noting the cards were originally a Tory idea, and that Blair originally campaigned _against them_).

19 years later, Tony Blair (via the Tony Blair Institute) continues to hawk the idea of a "super identity card". Unfortunately, the current government have announced plans for a mandatory "BritCard", claiming that the scheme will

> help combat illegal working while making it easier for the vast majority of people to use vital government services. Digital ID will be mandatory for Right to Work checks by the end of the Parliament.

Unfortunately, rather than questioning whether they should be taking advice from a yet-to-be convicted war criminal in the first place, the Government have decided that Digital ID should form a key part of their response to the Reform wave which is currently (and quite regrettably) sweeping the nation.

ID cards, particularly those envisaged by Blair, have significant issues and are very unlikely to do anything to address illegal immigration. The purpose of this post is to explain (some of) why.

* * *

### Preventing Illegal Migration

Let's start with the core of the Government's claim. When announcing the scheme, Keir Starmer said:

> I know working people are worried about the level of illegal migration into this country. A secure border and controlled migration are reasonable demands, and this government is listening and delivering.
>
> Digital ID is an enormous opportunity for the UK. It will make it tougher to work illegally in this country, making our borders more secure.

Other material released by the Government notes that this is because Right To Work checks will be amended to require that the worker/applicant provide their digital ID. The _theory_ is simple: an illegal immigrant _won't_ have digital ID and so won't be able to pass the checks (eventually putting immigrants off coming here _at all_).

However, Right To Work checks are _already_ pretty strict, with only a small set of documents considered suitable proof of ID:

* A passport (showing the holder is a British citizen, or has the right of abode)
* A passport (showing the holder is an Irish citizen, or has the right of abode)
* A passport endorsed to show the holder is allowed to stay in the UK indefinitely
* A document issued by Jersey, Guernsey or the Isle of Man _which has been verified as valid by the UK Home Office_
* A current Immigration Status Document (issued by the Home Office) noting an indefinite right to stay in the UK **accompanied by** an official document providing a National Insurance Number
* A UK, Channel Islands, Isle of Man or Irish birth (or adoption) certificate **accompanied by** an official document providing a National Insurance number
* A certificate of naturalisation as a British Citizen **accompanied by** an official document providing a National Insurance number

Although the Prime Minister's statement made reference to "finding an old utility bill", those aren't actually of any use during Right To Work checks. To summarise the list above, the only forms of ID _currently acceptable_ for Right To Work checks are documents which have been issued or verified by the UK Government.
* * * #### Current Workers It's obviously somewhat difficult to get accurate numbers of those working having arrived here via illegal means. The clue to the reason for that is in the name "undocumented migrant". However, the Government's 2025 report notes that, in the prior year, there were 7130 arrests for illegal working and 2105 civil penalties were issued to employers. We don't know the detection rate, so it isn't possible to extrapolate to a total number of workers, but we also probably don't need to. What really matters (and should surprise no-one) is that there _are_ people working illegally, despite the existence of the system that Digital ID will _have_ to rely on. But, how did that come to be? Maybe _some_ of those workers presented fake ID (the Government certainly suggests that forgeries can be a problem), but it seems far more likely that the majority were working for an employer who _didn't conduct Right To Work checks in the first place_. The number of penalties issued supports that theory too - it seems relatively unlikely that a penalty would have been levied if an employer could show that they'd checked ID in good faith. If Right To Work checks aren't actually being conducted, it doesn't matter what additional ID the Government introduces. In fact, they could introduce _mandatory tap classes_ and have about the same level of impact. * * * #### Identifying Approved Migrants Preventing undocumented workers from working might be quite difficult, but perhaps the aim is to provide legal/documented migrants with the means to prove that they **are** allowed to work? The problem is, **we already do that**. Until the end of last year, migrants were issued with Biometric Residence Permits: The Government stopped issuing these last year and, instead, replaced them with eVisas. eVisas are _very similar_ in function to the Right To Work system that the Government seems to envisage: a smartphone app allowing migrants to prove that they have the right to work in the UK. So, as a country, we **already issue a digital ID to approved migrants**. * * * #### What Is Going To Change? Let's lay out before and after: Group | Before | After ---|---|--- **Undocumented, working illegally** | No Digital ID reqd | No Digital ID reqd **Documented, working legally** | Digital ID reqd | Digital ID reqd **UK National** | No Digital ID reqd | Digital ID reqd The only group who see a meaningful change in requirements, is UK Nationals. It will, effectively, become mandatory1 for UK Nationals to have a digital ID. Approved Migrant workers already have one and black market workers will continue to not have or need one. It _might_ also become a little easier for employers to conduct Right To Work checks (of course, being a Government procured system, it _could_ also get more difficult and unreliable). Unless the Government's position is that the _Home Office are failing_ to properly control or vet official documents, it seems likely that the introduction of digital ID cards will do little to _nothing_ to prevent illegal working. Even that has it's own precedent: in 2005, the minister in charge of the original ID card scheme admitted that: > perhaps in the past the government, in its enthusiasm, oversold the advantages of identity cards [and] did suggest, or at least implied, that they might well be a panacea for identity fraud, for benefit fraud, terrorism, entitlement and access to public services It seems _almost inevitable_ that we're being subjected to a similar level of hyperbole now. 
* * * ### Learning From History Each of us _already_ exists in multiple Government databases (Driving License, HMRC & Passports etc etc). It might seem odd, then, to be concerned about adding one more. The answer to why lies, partly, in the past. Although the details of _this_ scheme have yet to be determined, looking at the _previous_ ID card scheme can help us see what it might eventually look like (especially given Blair's involvement in both). Under Blair, ID cards started out as just another form of ID (and a voluntary one at that). However, the scheme suffered from **significant** scope creep: * The underlying database (The National Identity Registry) could contain 50 different pieces of information about individuals * It was decided that Passports would be tied to the NIR - those applying/renewing would be entered into the database even if they didn't opt into an ID card * In order to reduce costs, the Government shifted from creating a completely new system to bastardising the DWP's Customer Information System, creating concerns around governance and accountability in the process * From November 2008, Identity Cards became compulsory for those applying to stay as students, or based on marriage * In order to address cost concerns, the Government intended to charge _other Government departments_ , driving up the cost that individuals and businesses would need to pay for other services Although they started out as voluntary and limited in scope, that scope grew quite quickly. As a result of limited uptake, the Government sought to find new ways to drive adoption. In January 2008 a leaked document was published in The Times, noting that > Various forms of coercion, such as designation of the application process for identity documents issued by UK ministers (eg, passports) are an option to stimulate applications in a manageable way. The Identity Minister at the time noted that ID cards presented a "convenient way for young people to prove their age when going to bars" (something, incidentally, that the current Government has _also_ referred to when responding to a petition against Britcard). **A month later** , while on a trip to promote ID cards, that same minister had to admit that even she had forgotten to bring hers with her. Ultimately, there was such a pervasively low level of demand/sign-ups that Civil Servants were _even_ encouraged to email friends and relatives in order to try and convince them to sign up. That Government spent _significant_ amounts of taxpayer money on a scheme that the public didn't want, use or even ultimately benefit from. Unfortunately, the current scheme _seems_ to be being driven by some of the same minds. * * * ### Digital Sovereignty Be Damned This is a topic that many non-techies are unlikely to care about... _for now_. The Government's stated intention is that the new Digital ID will "sit on people's phones" (they have similar plans for a digital Driving License, both of which will live in a Digital Wallet). This aspect of the scheme hasn't been well defined yet, but we can look at similar schemes in order to build expectations around how it's likely to work. Earlier this year, the EU's Age Verification App came into focus as a result of a stated intention to rely on the Google Play Integrity API (and, on iOS devices, Apple App Attestation). These APIs help apps to verify the device that they are running on. However, a hard dependency on them _also_ prevents apps from being used on devices which do not run Google or Apple's services. 
Essentially, it means that, not only do you need a smartphone, it needs to be running specific software. Users running things like Lineage and GrapheneOS are, obviously, very much a minority use-case. However, user population sizes are only _one small part_ of this2. The much more crucial aspect of this dependency is that it will tie _every single British citizen_ to the services of these US tech giants. * * * #### International Trends This is happening at a time when the US is still coming to terms with having installed a mad king, whose unpredictability has delivered repeated shocks to global markets and international relationships. Rather than forming part of the resistance, the US tech giants have been bending the knee in various ways. Earlier this year, Microsoft suspended services to a member of the International Criminal Court after Trump sanctioned ICC staff for investigating the role that Israeli politicians played in atrocities perpetrated against Palestinians in Gaza3. That shot across the bows has not gone unnoticed and various Governments are responding by moving away from US providers: * Austria's Armed Forces are ditching Microsoft Office and will use Libre Office instead "to achieve greater digital sovereignty" * Denmark is also on a path towards moving away from Microsoft * A German state (Schleswig-Holstein) had already started its switch * Dutch parliamentarians passed measures to encourage the Government to move * The European Commission is being lobbied to take action (there's even talk of public sector orgs standardising onto an EU OS) These aren't simply cost saving measures: in every one of these examples, Digital Sovereignty has been listed as a motivating factor. It helps that moving to non-US providers is _far_ more viable than it's ever previously been. When I looked earlier this year, I found that there were a wide range of possible offerings. * * * #### Backend Services In the context of all this upheaval and uncertainty, it seems insane that the UK Government could consider a green-field project which ties citizens to providers that other Governments are increasingly seeking to escape from4. Unfortunately, we've _only_ talked about the app that users will need: there will be back-end services too. According to The Mail (ewww, I know), Ministers have been meeting with the Ellison Institute of Technology, suggesting that Oracle might well be in a position to win the contract to build and operate the Right To Work system. For the non-techies out there: Oracle are renowned for squeezing every penny possible out of their ~~victims~~ customers. They are an extremely aggressive company who've a reputation for deceptive licensing terms, contract lock-in and expensive litigation. There are also serious issues of trust: just this year, Oracle were accused of trying to cover up a cybersecurity incident which impacted customer data. Large databases, such as the one that would need to underpin the Digital ID, _do_ tend to make for a juicy target - the very least that taxpayers should be able to expect is providers who can be trusted to do the right thing when they screw up. Oracle's billionaire owner (Larry Ellison) is quite something too, having once said: > Citizens will be on their best behavior[sic] because we are constantly recording and reporting everything that’s going on. Oracle already have a _lot_ of fingers in the public sector pie, but it really is in no-one's interest to give them more opportunity to (contractually) pick our pockets. 
What all of this means is that there is a possible future where, not only will UK citizens need a phone made by a US provider, but their data will be held in a system controlled by another US provider (and potentially in reach of the US Government via the CLOUD act). All funded by our taxes, of course. * * * ### Conclusion The "Britcard"5 isn't much more than a proposal at the moment, and there is still **a lot** to be defined. However, the same was true of the scheme at the turn of the century, which grew from its early proposals into an expensive and privacy-hostile mistake. It should be more than a little concerning _for everyone_ that Tony Blair has apparently had a hand in **both** , especially given his close ties to Oracle's Larry Ellison. A quick skim of history suggests that the current Government is following a very similar playbook to Blair's government: both Governments have claimed that ID cards will save us from the _en vogue_ existential threat. Back then, it was terrorism, today it's illegal immigration. What history makes clear, though, is that ID cards do nothing to address these problems on their own. The only way that they can ever have even a tiny chance of doing so, is by being tied to onerous (and sometimes outright dystopian) systems. The previous Labour government's scheme floundered and their response was to try and require further adoption. This Government's statement makes it clear that they intend coercion from the start, by tying ID cards to the Right To Work scheme (even if also making weasely claims that it won't be illegal not to have one). There's is, though, a simple truth which underpins all of this: there are **no** simple solutions to systemic issues. Whether it's illegal immigration, political corruption, terrorism or inequality, anyone who promises that "this one thing" will address it is either mistaken or lying. If digital ID _is_ introduced, there's a significant possibility that it will tie UK citizens to providers that other Governments are currently working to _escape_. Whatever Trump's intentions for the future of the US might prove to be, we too will be tied to them and, as contracts unfold, taxpayer money will go not to local causes but to filling the coffers of US billionaires. * * * 1. The Government insists that it won't be illegal to not have one. They have had to concede though that it'll be pretty hard to exist without one, given that you won't be able to work (or potentially, rent) without using one to pass ID checks. So, not mandatory, but only as long as you're not opposed to starving out in the cold. ↩ 2. I do, though, have a lot of sympathy for the argument that introducing something mandatory means your solution **must** cover everyone ↩ 3. Trump's not done yet, either - he's reportedly considering further sanctions ↩ 4. Yes, I know... the UK and the US enjoy a "special relationship". What happens, though, when he unilaterally slaps tariffs onto lube? ↩ 5. Another lesson they haven't learnt from the past. The original ID cards were first branded as "entitlement" cards, but the name was dropped after focus groups said the name was weaselly. ↩

New #Blog: ID Cards Have Come Around Again
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/privacy/id-cards-come-around-again.html?utm_source=rss

#idcards #privacy #unitedkingdom

03.10.2025 21:30 — 👍 1    🔁 0    💬 0    📌 0
Preview
Building An Air Rifle Target Back-Stop I've been getting back into shooting recently. There's a fantastic outdoor centre just up the road, boasting a number of woodland ranges. But, although trips there are good fun, it's not exactly the most cost-effective way to shoot an air-rifle. We're pretty fortunate in that we have quite a long and wide garden so I decided to build a small backstop to allow me toshoot safely in the garden. This post describes the process of building a semi-portable backstop for air-rifle shooting. Note: the UK legal power limit for an unlicensed air rifle is 12ft/lbs - if you're elsewhere, with something like a 26ft/lbs rifle you _might_ want to double up a bit. * * * ### Pellet Catcher vs Backstop First, let's be specific about what it is that I'm building. A pellet catcher (or pellet trap) sits behind the target but **only** the target: it'll stop pellets that hit the target card, but not any that are off (mistakes can and do happen). I've got a couple, which look like this: These traps provide a convenient way to hold the target card **and** stop the pellet afterwards. However, they don't provide any security in situations where the pellet has gone wide. Just as importantly, these pellet traps do nothing to protect the target card from rain - despite having a sheltered area to shoot _from_ I can only really use them when the British weather deigns to allow it1. A backstop is wider than the target and can also be used with targets that aren't so easily mounted on a small metal box (like spinners). Although a backstop doesn't **have** to provide protection against the weather, mine is going to. * * * ### The Design I wanted a small wooden unit that targets could sit inside so that, when not in use, we'd have a small unobtrusive wooden cabinet in the garden. After a bit of browsing around, I settled on one of these: I selected this unit based on a number of criteria * Price: I'm likely to put holes in it, so I _really_ didn't want to spend a fortune * I couldn't go _too_ cheap: cheaper items tend to be made of chipboard which doesn't do well in the wet * It's made of Fir, which is pretty environmentally sustainable * It's got a strip of roofing felt on top to help keep the contents dry The last one might sound a bit "duh", but there was another fir unit at around the same price with a flat slatted roof - the reviews were full of people complaining about it not being waterproof. * * * #### Stopping Pellets No cabinet on the market, on it's own, is going to reliably stop air rifle pellets (particularly over time), so the plan was to build a new multi-layered back into the unit. That back would be mounted at an angle behind heavy curtain strips, with the middle shelf left in to create two compartments: The false back is mounted at an angle for a couple of reasons: * it helps to redirect any ricochets downwards * for pellets which do penetrate it, the distance needed to pass all the way through is increased The backboard is made up of 4 layers: The pellet first hits the rubber, before passing in the cork. Although the rubber is tough, both layers compress easily, sapping some of the pellet's energy. If the pellet manages to pass through those, it then reaches the much harder 9mm plywood. A sheet of steel could achieve a similar effect, but the aim of the backboard isn't _just_ to stop the pellet, it's designed to do so as quietly as possible: I have neighbours, so flinging lumps of lead at a noisy backboard is likely to be unwelcome. 
Because the backboard is mounted at an angle, there's space to add more material behind it to ensure that any pellets that make it through the plywood do not exit through the back of the cabinet. * * * ### The Build First things first, I built the cabinet (which came as a flatpack): I left the shelf out in order to see whether it would be feasible to have a single backboard running the height of the cabinet - I ultimately decided that it was a little too shallow for that and re-fitted the shelf. I drilled a couple of holes on either side of the unit and screwed a bit of timber across the top. To help reduce the noise made by vibrations I stuck (and then stapled) some adhesive backed foam across the top and back of it: I ran another piece of foam across the bottom of the compartment too, so that the backboard would only ever make contact with foam. It was time to make the backboard itself. I took a piece of 9mm plywood and cut it down to size (annoyingly, the damp seems to have got to my wood-stock) Having sanded the mould off, I sprayed the surface with spray glue and then stuck down a layer of cork tiles: Then, I added a second layer, shifting the tiles up slightly to help make sure that the tiles met in different places within each layer. Finally, I wrapped the whole thing in rubber flooring The rubber flooring comes on a big roll and is only about 3mm thick. Had it been much thicker, though, I'd have struggled to get it to wrap around the edges of the board. * * * #### First Test I was _fairly_ sure that my design was sound, but I wanted to be certain: I didn't want to waste my time finishing the build only to find that the backboard didn't stop pellets as well as intended. So, I slotted the backboard into place and put out two of the pellets that I commonly use: one spiked, the other a diablo: For the first test, I used my air pistol. The manufacturer _claims_ that it sits just below the UK legal limit for air pistols (6 ft/lb). However, my (admittedly cheap) chronograph suggests that it's delivering more like 4.5 ft/lbs. From 5 metres away, I shot the backboard with the diablo. I didn't penetrate the rubber and instead left a little impression of itself This wasn't necessarily a good thing: although the backstop went undamaged, the pellet had _bounced_ (albeit in this case downwards). Next, I loaded the spiked pellet, which also left a little impression in the rubber. I dispensed two more pellets and set the air-rifle up at the 5 metre mark. This is **much** closer than I'd ever normally shoot, but the aim was to ensure that the backstop did actually trap and stop the pellets. UK law allows air rifles to have more power than pistols. Again, the manufacturer claims that it's just under the limit (12 ft/lbs3), but my chronograph puts it at more like 10.5ft/lbs. Both pellets went through the rubber and stopped inside the cork layer * * * #### Making a Curtain Happy that the backboard was able to effectively trap pellets, the next step was to make a curtain to sit in front of it. I'm sure that this isn't the _actual_ name for it, but I know the style as a butcher's curtain (I guess that's probably where I first saw one): a curtain made up of thin strips that you walk straight through. The curtain serves two main purposes. Firstly, it helps trap rebounds: we want to ensure that bouncing pellets do not find their way out of the cabinet because they may ultimately end up hurting somebody. Secondly, it steals energy from pellets as they come in. 
They're not capable of stopping a pellet on their own, but as the pellet passes through it, the curtain strands will move, converting some of the pellet's inertia into curtain movement. To make the curtain, I ordered a cheap rubber doormat: The fabric attached to it is intended for wiping boots, but helps by adding weight. I turned the mat length-ways and cut strips into it, before stapling it along the piece of wood at the top of the unit. As an idea of scale, the pellet trap in this photo holds a 17x17cm target card: * * * ### Testing It was time to give the unit a proper test and see whether any pellets escaped through the backboard into the gap at the back. Obviously, there's not a **huge** amount of value in testing by shooting at a metal pellet trap, so I stood a wood block with some spinners on the shelf too, and then shot from 20 metres away. I used both the rifle and the pistol and shot _a lot_. When I checked it, the backboard was pretty well peppered The photo shows a number of marks where pellets have bounced rather than penetrating. The pile of spent pellets at the bottom suggests that the curtain did it's job. But... did any pellets make it into the back? One did, although it seemed to have taken an unusual route: There was no visible hole in the backboard's plywood. However, on the right hand side, there's a chip in one of the wood panels which make up the cabinet's side. The pellet in question appears to have managed to squeeze between the backboard and the side. That should be a pretty rare occurrence, though, and the cabinet's walls are thick enough that a pellet's unlikely to make it through them that way. The curtain, for it's part, was riddled with holes but holding together just fine * * * ### Conclusion The backboard won't last forever (nor is it designed to), but I now have a cabinet which can be used to provide an extra layer of safety should a shot veer off course (or, of course, be aimed at a backless target like the spinners). Our garden isn't nearly as tranquil2 and idyllic as the woodland range down the road: But, it _is_ a lot closer and cheaper to use. Currently, only the top of the cabinet has a backstop installed. My intention _is_ to build one into the lower half, but I thought it best to wait and see how I get on with this build before duplicating it: the curtain already sits a little close to the front for the spinners, so it **might** be that I try a different design for the bottom. The only thing that's sure, is that it'll probably involve rubber flooring: I've still got more than half a roll left! * * * 1. In fairness, so far this year the weather seems to have been focusing on trying to melt rather than drown us ↩ 2. OK, maybe tranquil isn't quite the right word: it's not exactly _quiet_ with shotguns going off and bits of clay dropping out of the sky. But it's a _different type_ of background noise and no-one is going to interrupt you. ↩ 3. Air rifles with higher power output can be obtained with a firearms license, but then you end up with all of the drawbacks of shooting rimfire but without the benefit of the cordite smell. ↩

Building An Air Rifle Target Back-Stop
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/building-an-air-rifle-target-backstop.html?utm_source=rss

#building #diy #housestuff #shooting

26.08.2025 21:00 — 👍 0    🔁 0    💬 0    📌 0
Preview
Feedback on The Online Safety Act (an email to my MP) Now that the Online Safety act has come into effect and the _inevitable consequences_ are starting to play out, I decided to email my MP to try and find out what the Government intends to do to unpick the mess. Doing this felt particularly important in light of the divisive comments made by Peter Kyle MP (I feel _extremely_ dirty agreeing with anything that Farage has said, but even a stopped clock is right twice a day). Although I didn't want my email to focus on Kyle, it _is_ a topic which merits some discussion. Kyle pointed to the cases of Nicholas Hawkes and Tyler Webb as examples of the Act's benefit. With an act as broad as the Online Safety Act, there will always be _some_ successes. However, both are **also** examples of cases that were pursued without needing _any_ of the act's more draconian measures: Both Hawkes and Webb were caught because of the courage of their victims, who took screenshots and reported the behaviour to the police. Requiring adults across the country to share sensitive personal information had _absolutely_ no bearing on either case - the only relevance of the Online Safety Act is that it defined some offences which did not previously exist. As a society, we tend to understand that the ends do not always justify the means - some limited measure of success does not (and cannot) automatically outweigh consequences. Even if it does _some_ good, the Online Safety Act is _still_ bad law. * * * #### Email Hi, As one of your constituents impacted by some of the (entirely predictable) consequences of the Online Safety Act, I thought I'd reach out with some concerns and questions. **Background** I'll start by saying this upfront: The Online Safety Act (OSA) does **not** protect children and was never going to be able to. Instead, it pushes them away from regulated spaces towards entirely unregulated ones, whilst also exposing adults to increased levels of harm. This is largely Parliament's fault: Ofcom had to work with the cards that they were dealt, and the lack of _meaningful_ definition within the OSA provided no clarity on **who** is affected by **what**. The inevitable result of pairing a lack of clarity with strong penalties is overcompensation: operators have had no choice but to assume "worst case" scenarios, leading to the chilling effects that are currently playing out across the UK web. As you probably recall, the Online Safety Act was preceded by the Online Harms bill. At the time, the (then Tory) Government's own analysis noted that it's age-restriction measures may well push minors onto using VPNs (and more concerningly) the dark web, removing the benefit of any protections which had previously existed and exposing minors to more harmful content. That original attempt to impose an age verification (AV) regime eventually faltered and failed, largely as a consequence of its "ideals" coming into contact with reality. However, important lessons were apparently not learnt, and the OSA eventually came into being. As well intentioned as the OSA's introduction might have been, it has had exactly the effects that were predicted years before. 
As a few examples: * UK VPN usage has rocketed (ProtonVPN reported over a 1000% increase in UK signups) * Content in grey areas now requires AV - depriving teens of access to sexual health advice and (unbelievably) requiring sexual assault survivors to provide ID to continue access to support groups (you can find an incomplete list of censored Subreddits here) * The OSA was supposed to target "irresponsible" providers carrying harmful content who, being irresponsible, have continued to serve content. With "responsible" providers now requiring AV, the "irresponsible" ones are now the only outlets available to minors (see Ofcom's reporting etc) The consequence that we haven't yet experienced (but inevitably will) is a breach of one or more of the Age Verification Providers' systems. We don't need to look far, though, to see what that will look like because a US dating app (Tea) has just had a breach of data collected for "safety" purposes: https://www.nytimes.com/2025/07/26/us/tea-safety-dating-app-hack.html / https://www.bbc.co.uk/news/articles/c7vl57n74pqo As a blogger and fediverse server operator, I have also been personally affected by the vagaries of the Online Safety Act. I wrote about performing an OSA assessment (as well as some of the questions that the OSA raises). You can see that it was difficult to reach any hard conclusions - I did try contacting Ofcom for advice, but they were unable to help. I'm also aware of the impact on a number of other bloggers. One of those bloggers maintains a popular blog on the topic of their sex life. The stipulations of the OSA have put them in a truly ridiculous position: * Their writing is not covered: the OSA doesn't apply to text-only content * They had previously been providing audio recordings of someone reading their posts. But the OSA _does_ apply to audio. The audio contains _exactly_ the same content, but cannot be provided without introducing (expensive and invasive) age verification. The result is that the Online Safety Act has done nothing to protect children (assuming that you believe this was content that needed protecting from in the first place) and instead made the web less accessible to users with accessibility needs. In fact, without going off on too much of a tangent, the OSA is _weirdly_ prescriptive about formats: Its stipulations around pornographic content call out specific image formats as being covered, leaving open the possibility that a provider may not be covered if they've used a WebP instead of a GIF. I'm sorry, but the introduction of that kind of uncertainty really is just a sign of bad lawmaking. Parliament seems to have taken the view that operators "just" need to add age verification. Unfortunately, this is ignorant of the realities on the ground: * AV solutions present significant privacy challenges for users * AV solutions tend to be extremely expensive for small operators * There's significant potential social harm in getting users accustomed to random third parties requesting sensitive details "for age verification purposes" For many operators, the simplest option has been to no longer consider the UK a viable market - geoblocking is cheap, easy and removes all of the overhead associated with assessing and maintaining OSA compliance. Even I, primarily a technical blogger, have ultimately had to take the decision not to serve some content to UK users. 
You can read the OSA assessment that led to that decision at https://www.bentasker.co.uk/pages/legal/osa/osa-assessment-photosarchive-bentasker-co-uk.html but it came as a direct consequence of Parliament (and Ofcom) being unable to define what is and is not covered. I am **far** from alone. I expect you'll have heard the news that Wikipedia are considering blocking the UK, but here's an (incomplete) list of smaller providers who have shut down or geoblocked the UK, citing the Online Safety Act in the process: * AWSCommunity: A Mastodon instance for discussing AWS technologies * Carlbury debate forum * Dads with Kids: a forum for single and separated Dads * Eastleigh online discussion forum: a forum for locals to discuss local issues * Furry Energy: A server for UK furry and LGBTQA+ communities * Gaming on Linux discussion forum * The Green living forum: had over half a million posts about sustainable living * The Hamster forum: a community about rodent care * Hexus: A gaming and discussion forum * Lemmy.zip: an IT-centric discussion forum * London Fixed Gear and Single Speed: A forum for cycling enthusiasts * Ready to Go: A discussion forum for Sunderland fans * Red Passion: A discussion forum for Wrexham fans * Renault EV Club: A discussion forum about Renault Electric Vehicles * Sinclair QL: A discussion forum for fans of Sinclair QL computers * Urban Dead: An MMORPG based on a zombie apocalypse As you can see, what the Online Safety Act appears to have "protected" the population from is people seeking out others who are passionate about (sometimes important) subjects. Freedom of expression has been taking a bit of a kicking lately. **Questions** The background above will almost certainly be outdated by the time that you get to read (let alone respond to) this, but with it in mind, my questions are: * What does the Government intend to do to address this mess? * At what point will the Government recognise the need to repeal (or partially repeal) the OSA? Are they waiting for generational change, or are we capable of recognising the negative impacts here and now? * Does the Government instead intend to double down and attempt a crackdown on VPNs, or to require (the now mooted) digital ID? * Has the Government yet commissioned any independent analysis into the effects (harmful and otherwise) of the OSA? Do they intend to? * What measures does the Government intend to pursue to ensure that minors are actually protected from harm? * As a small site operator, can I expect that the Government will recognise the harm involved in destroying small communities and driving them onto platforms like Facebook (which, as we know, are also something of a fertile hunting ground for predators)? * As a small site operator, can I expect the introduction of more meaningful definitions (and/or small site exemptions)? * What further reassurances can the Government offer small site operators that they actually have this in hand? * What measures does the Government intend to enact in order to protect and encourage freedom of expression? I am both a parent and _extremely_ technical: I fully recognise the challenges involved in trying to make the web safer for kids. My concern is (and has always been) that the Government seems to lack the technical aptitude necessary to understand the impact of these changes. With the resources available, that lack of aptitude and insight cannot have been anything but a _deliberate choice_, made whilst pursuing idealism over reality. 
The UK population has started to pay the price, and will likely continue to do so. The Ashley Madison breach was a decade ago this month and, yet, the Government has still _actively chosen_ to create new silos which link ordinary people's identities to some of their most sensitive and personal behaviours. In the process, the Government have deprived minors of access to important advice and support on sexuality, sexual health and even current affairs, pushing them towards much more unsavoury sources instead. Whatever the purity of intention behind it, the Online Safety Act is an actively harmful failure, exposing web users (young and old) to increased levels of harm. It should never have been introduced and should now be repealed. Unfortunately, those are not the noises that the Government is currently making - instead they seem more inclined to try and tighten the grip (for example: https://www.theguardian.com/society/2025/aug/03/uk-pornography-taskforce-to-propose-banning-barely-legal-content-after-channel-4-documentary-airs). The OSA really is an embarrassment on the world stage. Thank you for your time.

New #Blog: Feedback on The Online Safety Act (an email to my MP)
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/the-internet/feedback-on-the-online-safety-act.html?utm_source=rss

#internet #onlineharms #onlinesafetyact #safety #theinternet #thoughts #unitedkingom #web

05.08.2025 13:15 — 👍 0    🔁 0    💬 0    📌 0
Preview
Disinformation: When To End A Friendship The events of last week presented quite a challenging question. **Should** you attempt to address the issue of friends consuming and accepting misinformation, or do you simply accept it as a sign that it's time to break contact? Humans are _very_ good at ignoring points of contention, particularly if addressing them might be inconvenient, but there are also things which cannot easily be brushed over. Last week presented one of those: inaction wasn't a (realistic) option and we _had_ to make a conscious choice on direction. This post was originally written in order to help work through my thoughts and arrive at a decision. The intention of this post _is not_ to lambast the friend, so (even though no-one reading this blog would know them), I'm going to be quite deliberately vague at points. * * * ### The Situation We regularly meet up with some other dog owners so that our dogs can have a run-around and play with one another. It's a good energy outlet for them and helps to keep them all well socialised (which is, in no small part, how we've reduced some of Bonnie's anxiety). It does, of course, mean that they're regularly in close contact with each other, even swapping balls (and... _gag_... saliva) as they play. My partner works with dogs and was asked to look after one of the play dogs (Bee1) while the owners are on holiday. **However** , while preparing for this, it came to light that Bee was a few years behind on his vaccination boosters. That was problematic for my partner, who can't work with an unvaccinated dog because of the risk of transferring something onwards, impacting other customers' pets2 and causing unnecessary suffering. Aside from the _severe_ emotional impact that communicating a disease would have on someone who _actively chooses_ to work with dogs, word getting out about such an incident would likely **destroy** the business. On top of all of that, we might _even_ face personal liability: it could be argued (and the insurance would certainly try) that **knowingly** interacting with an unvaccinated dog was an _act of negligence_. Vet bills are expensive, **other people's** vet bills all the more so. However, Bee's owner said that they don't intend to bring the jabs up to date (because older dogs don't need them3) and that they'll make other sitting arrangements instead. * * * ### The Problem The pet-sit was only really the tip of the iceberg that our faces were now pressed flat up against. Bonnie is out playing with Bee regularly and (because Bee's a very cuddly and kissy dog), **we** are in regular contact with him too. Although it's something that we were working on, it's not _particularly_ unusual for Bee to try and mount Bonnie - he's somewhat lacking in aim (ends up thrusting into the air), but if we aren't quick enough to move him, he grabs her by the hips and pulls her back towards him. Between that and the swapping of saliva-covered balls, there's more than a bit of potential for contamination and transmission. The risk would be much the same even if my partner stopped attending these meetups. Even if _I_ avoided physical contact with Bee6, any bacteria or viruses that he communicated onto Bonnie would soon be all over the inside of _our_ house. * * * ### Options There only really seemed to be a couple of realistic options. #### Stop Going? I had more than a life's fill of confrontation when I was young, so I have a strong impulse to avoid it if possible. 
But, it didn't _feel_ right that we should have to stop going because of someone else buying into disinformation. I also wasn't particularly comfortable with the idea of ghosting. Even though they'd be able to work out why, it still felt wrong to just go incommunicado. The _logistics_ of avoiding them were also a little complex: the owner doesn't reliably turn up and sometimes comes out at a different time, so it'd be quite difficult to fully avoid them without some form of prior communication. We also couldn't rely on leaving if we saw them arriving - Bon can still be anxious on a lead around other dogs, so getting her hooked up whilst all the others are still running free could only end in trouble. The only way that unannounced avoidance could work would be for us to forsake the convenience of the field entirely and start routinely walking the dogs somewhere else. * * * #### Say Something? I was not altogether convinced that saying something would achieve much. But, as well as being the _adult_ thing to do, it was the only other viable option. If it went wrong and we ended up parting ways, the reason behind it would at least be clear, saving us from any awkward "oh, haven't seen you in _ages_ " moments in future. * * * ### The Facts I didn't want to rant in my communications with the owner, because it could only be counter-productive and increase the likelihood of them feeling attacked and refusing to act out of stubbornness. But, at the same time, I just **do not** fucking understand how we've all ended up here. It's not _just_ dogs that are running around unvaccinated, there are _kids_ in the community without them too. Somehow we've gone from acting as a semi-cohesive society to one which tolerates parents putting their children (and others) at risk under the guise of personal choice. The result is a record high number of measles cases in the US, with the UK also trending upwards. This is what happens when you delegate medical decisions with a society-level impact to the individuals in a society which includes parents who spike their autistic kids with bleach. Vaccinations don't guarantee that you won't ever _catch_ the disease, but they do tend to make the symptoms less unpleasant and reduce the chances of it being fatal. Getting back to dogs, though, let's lay out exactly what we routinely immunise against: * `Distemper`: a **highly** contagious viral disease with a mortality rate of **50%** in adult dogs. In puppies it's **eighty fucking percent** * `Hepatitis`: A highly contagious disease which damages spleen, kidneys and eyes. Symptoms include bloody vomit, diarrhoea and eye problems. **There is no cure** * `Parvovirus`: So highly contagious, that unvaccinated puppies aren't even allowed to _walk outside_. Incredibly hard on the pup4, even assuming that they manage to survive it * `Leptospirosis`: Bacterial disease that can be mild or life threatening. It's zoonotic and can infect humans (Weil's disease). It's also carried by rodents who tend to leave trails of infected urine or swim through water. Initial symptoms look quite a bit like any other illness, so it's easy to miss * `Kennel Cough`: Not actually a single illness, but a group of them. It's basically bronchitis and some of the bacteria and viruses that cause it also like _human_ bronchi. Despite the name, can be caught and spread anywhere. Basically, we're not talking about minor ailments here. 
We're talking about lucky-if-it-only-fucks-standard-of-life stuff and, worse, we're talking about a human gambling that they won't be inflicted on **a dog**. _Fuck that noise_. * * * ### Crafting a Message After I'd let off a bit of steam by ranting into my psuedo-blog, I started to draft the message that I'd send. It needed to be short enough that they'd actually read it, but long enough to convey the **important** points. They're not necessarily all essential, but the points that I came up with were: * We'll have to stop coming out * Partner works with puppies and dogs with health complications * Partner works with elderly owners, so zoonotic infections like Lepto have an additional concern (apparently Kennel cough can be zoonotic too) * If went wrong, it could fuck the business and our finances * Parvo is infectious before symptoms appear and for weeks after (in fact Cornell note that it's almost impossible to disinfect an area contaminated by an infected animal) * Kennel cough is infectious before symptoms appear I then distilled them down into a SMS: > Hey, have you thought any more about Bee's jabs? > > Ultimately its your choice, but we have to stop coming out otherwise > > [Partner] visits puppies & dogs with health conditions, if he ever did pick somethin up she'd transmit it on. > > Some customers are old too & lepto/kennel cough can infect humans. Lots of it is infectious before symptoms so there'd be 0 warning > > Not only would she be devo'd but it'd destroy the business. It'd double-fuck us financially because insurance would say she knew. > > Even if I could somehow ignore his face & not give him a cuddle, he'd still get stuff on Bon who'd then spread it round the house Ultimately, I decided that it was better not to overthink it and so hit `send`. * * * #### The Response It took nearly 24 hours for the reply to arrive and, unfortunately, it seemed that my message hadn't had the desired effect: > I'll be working, so we won't be out so you can have mornings with the others and I'll stick to evening time I struggled a little to decide how to respond to this. On the one hand, it was a little jarring and felt like I was being told when we could and could not go out. On the other hand, it didn't really seem worth arguing about because there wasn't anything of substance to engage with: no excuses, no attempt at justification, just a blunt suggestion of how we could best avoid one another. I debated replying with a link that suggested that, although immunity to parvo etc _can_ persist in older dogs, the vaccination needs to be replaced with tests and Lepto boosters are still required yearly. But, I realised that, even in the unlikely event that they listened, it wouldn't really help with the concerns about it affecting my partner's work. So, I replied to say that, if that was the only route they saw for moving forward, it's the one that we'd have to take. * * * ### A Clean Break Although not the outcome that I hoped for5, the response did at least result in a clean break. Although I've tried to describe it as neutrally as I can, there's undoubtedly some hurt on both sides - we were hurt to learn that _a friend_ knowingly put us at risk for years and they're (presumably) hurt that we'd walk away over something that they clearly view as being trivial. If the outcome had been different, we could _perhaps_ have convinced ourselves that they hadn't realised the potential impact, but refusing to change course after the impact has been described makes it hard to be quite so generous. 
I imagine, though, that it's a little easier for them to rationalise: they believe that Bee's age gives him immunity (removing the need for a jab), so it follows that there's no real risk to us in the first place. I mentioned personal choice earlier, and that's where this gets quite complicated. Rightly or wrongly, _it is_ their choice to make. However, in not telling us, they took away **our** right to choose (as well as that of others in the group). * * * ### When To End The Relationship This post started with a question: > **Should** you attempt to address the issue of friends consuming and accepting misinformation, or do you simply accept it as a sign that it's time to break contact? Even after this experience, I still believe that the answer is to _attempt_ to address it. However, you need to do so prepared for the possibility that your words may not land, and that the friendship may dissolve (or spontaneously combust!) as a result. The death of the relationship could come about because they over-react, or because they refuse to budge on something that you yourself cannot (or will not) compromise on. Beyond a certain point, though, the _why doesn't matter_. What matters is that the relationship is no longer viable and it's time for both sides to move on. Finally, however puerile and uninformed the view that you're responding to might seem, it's important to remember that we're all adults and to communicate with respect. If nothing else, it deprives them of an opportunity to portray you badly ~~if~~ when they start showing your messages to other people. * * * ### Conclusion It's been a sudden and quite emotionally charged change in routine, but I feel like we made the right decision: we engaged calmly and reasonably. We also tried to ensure that our other friends didn't feel like they were caught in the middle. I'd have preferred a solution which didn't involve burning a friendship to the ground, but the "good" outcome wasn't really ours to deliver. Hopefully, Bee's lack of vaccination will never affect his health, but _if_ it does, we won't unknowingly become a part of others suffering as a result. We were brought together by a common interest (dogs), so it hurt all the more to have it fall apart _because_ of a choice which increases the risk of a dog coming to harm. Once messages had been exchanged, the choice that we faced was binary: walk away, or rely on the hope that nothing ever went wrong. It wasn't even _really_ a choice. I'm glad that I said something, even if the outcome was ultimately the same. * * * 1. Not his real name ↩ 2. This isn't a particularly unusual requirement, I've yet to encounter any doggy daycare or kennels that'll accept unvaccinated pups. ↩ 3. As far as I can make out, their advice came from _the son_ of a (now deceased) vet. ↩ 4. Someone I once worked with bought a puppy only to find it had Parvo. It was a long, drawn out (and expensive) recovery involving multiple points at which he (and the vet) thought the poor thing was going to die. I absolutely do not recommend that experience. ↩ 5. The poor bugger, after all, is still running around unvaccinated ↩ 6. I did try, early on, but it was heart-breaking to have to try and ignore him as he grinned up at me wanting cuddles ↩

New #Blog: Disinformation: When To End A Friendship
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/opinion/friends-and-disinformation.html?utm_source=rss

#dogs #thoughts

01.08.2025 20:00 — 👍 0    🔁 0    💬 0    📌 0
Categorising My Daily Todo List with Deepseek-r1 I've made no _particular_ secret of the fact that I am **not** a fan of large language models: they're overhyped, under-deliver and were built off the back of ripping off others' work at a scale that (for anyone else) would have resulted in jail and/or bankruptcy (even before the more incontrovertible acts of copyright infringement are considered). But, I also work in an industry which is going through a phase of significant AI use (even if a lot of companies do come to regret being over-eager). As a generalist, still decades from retirement, I _have_ to keep my hand in. If nothing else, doing so means that I'm better positioned to explain _why_ and _when_ they seem to be ill-suited. Moving on, I've finally got around to reading the book that I mentioned while procrastinating recently. In one section, the author talks about reviewing how their time is spent: > Every quarter, I spend a few hours categorizing my calendar from the past three months to figure out how I’ve invested my time. This is useful for me to reflect on the major projects I’ve done, and also to get a sense of my general allocation of time. This struck me as a _particularly_ sensible thing to start doing. However, I needed a slightly different approach: although it plays a part, my calendar is not the main organiser of my day because I work from a daily todo list in Obsidian. Although small/quick tasks might sometimes get omitted, they're still far more likely to appear in my TODO list than in my calendar. The trade-off, of course, was that I would only be able to collect the number of tasks completed rather than a measure of time spent. Pivoting back to AI, I decided that categorising TODO list items **should** be a piece of cake for any reasoning model. This post discusses how I developed metrics about my todo list along with where (and why) I found the LLM unreliable. * * * ### Requirements I had a simple (but strict) set of requirements: 1. **The LLM must be local** : I'm a manager and my todo list sometimes contains references to team members. I'm not comfortable sending _any_ of that to one of the hosted providers1 2. **The script should be cron-able** : the whole point is that it should just run without me needing to do anything 3. **It must use my existing format** : Shuffling the TODO list structure around is OK, but changing the way that I write TODOs is likely to be unrealistic - it's supposed to be somewhere to quickly note a need to do something, rather than a document which requires a linter 4. **Tasks should be sorted into a finite set of categories** : LLMs are quite inconsistent in their output, so it'd be unwise to rely on the LLM to also define the categories: I'd end up with tasks spread over "discussions", "email" and "correspondence" 5. **The output should be simple counts** : The aim here wasn't to create a second copy of my TODO list, but instead to express it in terms of simple metrics * * * ### TODO List Structure My list uses a simple markdown template: # Daily Notes: {{date}} ### TODO - [ ] Backlog item ### Today - [ ] Incomplete task - [x] Complete task - [ ] Morning Sync (09:00) **Calls** **Morning Routine** - [ ] Slack - [ ] Email .. etc .. **Follow Up Tomorrow** The `TODO` section contains my backlog which is rolled into the current day's notes by the Daily todos rollover plugin (along with any incomplete items under `Today`). As I work through my morning routine, I populate the `Today` section with calls broken out under their own subheading. 
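Since the whole thing runs from cron (requirement 2), the script also needs a way to find the relevant day's note before it can parse anything. That part isn't shown in the post; a minimal sketch, assuming one file per day named `YYYY-MM-DD.md` inside a `Daily Notes` folder (both the path and the naming are assumptions rather than anything the post specifies), might look like this:

```python
from datetime import date, timedelta
from pathlib import Path

# Assumed vault layout - adjust to wherever your Obsidian daily notes live
VAULT = Path.home() / "obsidian-vault" / "Daily Notes"


def note_for(day: date) -> Path | None:
    """Return the daily note for a given date, if one exists."""
    candidate = VAULT / f"{day.isoformat()}.md"
    return candidate if candidate.exists() else None


if __name__ == "__main__":
    # A cron run shortly after midnight would process the previous day's list
    yesterday = date.today() - timedelta(days=1)
    note = note_for(yesterday)
    if note:
        print(f"Processing {note}")
```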
* * * ### Processing Because they're stored as markdown, the lists are easy to process and the headings make it easy to see which section an item falls under. I've used a similar format in previous roles and previously wrote a simple burndown stats script which built metrics by walking through line by line: for line in fh.readlines(): # Process the line if "### TODO" in line: in_section = "main" print("Started") continue if "**Calls**" in line: in_section = "calls" continue if "**Morning Routine**" in line: in_section = "morning-routine" continue elif in_section and line.startswith('- [ ]'): counters["items"] += 1 elif in_section and line.startswith('- [x]'): counters["items"] += 1 counters["complete"] += 1 For this project, I added a function call to the conditional block for completed items: category = categorise_item(line, in_section, ai, categories) * * * ### Adding AI I spun up an ollama docker container and loaded Deepseek-r1 into it ollama run deepseek-r1:7b Use of the 7-billion-parameter model was a compromise: the solution had to run on a laptop (I actually started with the much smaller `phi3:mini-4k` but it _really_ struggled to give accurate responses). I have a python wrapper that I previously wrote to provide a command-line REPL for interacting with Ollama3, so I imported that. Although there's a bit of class initialisation to do, it makes sending a message simple: response = ai.chat_with_model("Hello, are you there?") Next, I needed to write a prompt to tell the LLM what to do with the messages that would be sent to it: You are a task categorisation specialist and are being employed in order to identify what categories of work each of the user's completed todo list items falls into. The user will provide you with a file in markdown format which consists of a list of checkboxes with todo list items next to them. Using the rules below, you are to categorise any task items that the user passes you. **File Structure** * An item starting with - [ ] indicates a task that has not yet been completed, ignore these. * An item starting with - [x] indicates that the task was completed, you should categorise the item * Items that follow the text Calls are calls - these need further classification (see the calls section below) * Items under Morning Routine are routine items, these should be classified as "daily admin" You may encounter a heading with the text Defunct. If so, ignore everything that follows it. **Calls** There are two types of call classifications: call-meeting and call-11 If the todo item includes the text 1:1, it's a call-11, otherwise it should be classified as call-meeting. **Categories** ALL completed items must be categorised but must ONLY be categorised into one of the following categories * call-meeting * call-11 * daily admin * incident response * PR work * documentation and planning * other To test, I passed it a single item2 - [x] Swear at AI The response and the reasoning seemed in line with what I wanted: > Okay, let's tackle this problem step by step. The user has provided a task list with several items marked as completed (indicated by - [x]). My job is to categorize each of these items into one of the specified categories based on the given rules. > > First, I need to go through each item and determine its category. Let's look at each item: > > 1. "- [x] Swear at AI" This task clearly states a negative action but doesn't provide any specific context like calls or routine tasks. Since it's under "Other" as per the categories list, this should be classified as "other". > > > I don't see any other items provided in the example besides this one. Each item must be categorized without inventing new ones that aren't present in the user's message. > > So, the only category needed here is "other". > > { "category": "other" } I threw it a few more items and all came back correctly. 
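The REPL wrapper itself isn't included in the post. If you don't have something similar to hand, a rough stand-in exposing the same `chat_with_model()` call can be built against Ollama's HTTP chat API; everything below (class name, defaults, the decision to keep a running message history) is an assumption rather than a description of the author's code:

```python
import requests


class OllamaChat:
    """Minimal stand-in for a chat wrapper: keeps a message history and
    sends it to a local Ollama instance via the /api/chat endpoint."""

    def __init__(self, model="deepseek-r1:7b", system_prompt="",
                 url="http://127.0.0.1:11434"):
        self.endpoint = f"{url}/api/chat"
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def chat_with_model(self, text: str) -> str:
        """Send a message (plus prior history) and return the model's reply."""
        self.messages.append({"role": "user", "content": text})
        resp = requests.post(
            self.endpoint,
            json={"model": self.model, "messages": self.messages, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        reply = resp.json()["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Usage would then mirror the snippet above (CATEGORISATION_PROMPT is hypothetical):
# ai = OllamaChat(system_prompt=CATEGORISATION_PROMPT)
# print(ai.chat_with_model("- [x] Swear at AI"))
```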
Since it's under "Other" as per the categories list, this should be classified as "other". > > > I don't see any other items provided in the example besides this one. Each item must be categorized without inventing new ones that aren't present in the user's message. > > So, the only category needed here is "other". > > { "category": "other" } I threw it a few more items and all came back correctly. * * * #### Problems However, issues soon began to creep in. Despite having been passed a list of acceptable categories, the model invented a new one: security. Expanding my prompt to include explicit rules (you MUST NOT return a category not listed above) helped a little, but not much. To improve the chances of getting a successful response, I adjusted my script to add the ability to trigger a reprompt: def get_ai_category(item, ai, section, ffs=False): ''' Ask the AI to categorise We often need to strip the response though, because many reasoning models include their reasoning too ''' msg = [] if ffs: msg.append("Reminder: you MUST stick to the categories provided in the prompt. You're a very naughty LLM, your previous response made changes - please try again but stick specifically to the rules in the prompt") msg.append(item) response = ai.chat_with_model("\n".join(msg)).split("\n") However, it didn't help much: the LLM often replied with the same thing the second time. Taking a closer look at the responses revealed something even more frustrating though: sometimes it selected the correct category but then fucked up the spelling. For example, instead of `documentation and planning` it returned the following * document and planning * documentaton and planning * documenta... * documentations and planning Sometimes the response would contain a valid category name but prefixed by some other text *category*: documentation and planning *final decision*: calls-11 I messed aroumd with making it respond with JSON, but nothing _really_ helped. In the end, I created a function called `unfuck_ai_response` and started adding string processing to account for its common mistakes. As a side note, the AI also **staunchly** refused to allow me to use non-words as category names. I defined a category with the name `managering` but it insisted on returning `managerial`. That behaviour makes sense, because _actual words_ are obviously going to have more weight, but it doesn't help with the feeling that AI is a fun sponge. * * * #### Taking Load Off the AI Even with the new function, there were still far too many items that weren't being categorised correctly. Quite a few of the regular items in my todo list can be categorised by looking for specific strings, so I adjusted the script to start handling those simpler items itself: if in_section == "calls": if "1:1" in item: return "call-11" elif "adhoc" in item: return "call-unplanned" else: return "call-meeting" I adjusted the script so that all todo items categorised at `other` would be written to a file before triggering a full run across all of my notes. Even if it didn't make sense to adjust the script for every `other` item, the list allowed me to spot patterns and build the ruleset out. * * * ### Collecting Metrics With the script now capable of emitting semi-reliable metrics, I needed to collect the data somewhere. 
For this project, I decided to give VictoriaMetrics a try: it accepts a variety of input formats, including Influx line protocol, so making the script generate LP gave me the ability to try VictoriaMetrics whilst retaining the ability to pivot back to InfluxDB if needed. * * * #### Standing VictoriaMetrics up Using the VictoriaMetrics docs as a guide, I was able to come up with a `docker-compose` snippet to stand a container up: victoria_metrics: restart: always image: victoriametrics/victoria-metrics:v1.122.0 container_name: victoria_metrics command: - "-selfScrapeInterval=5s" - "-storageDataPath=victoria-metrics-data" - "-retentionPeriod=100y" - "-disablePerDayIndex" ports: - 8428:8428 volumes: - /home/ben/docker_files/victoria/data:/victoria-metrics-data Writing line protocol in was easy: curl \ -d 'metric,foo=bar field1=3.4' \ http://127.0.0.1:8428/api/v2/write?db=bar So I updated my script: print("Submitting to VictoriaMetrics") res = requests.post( f"{vm_url}/api/v2/write?db=workload_stats", data = '\n'.join(lp_buffer) ) print(res.status_code) print(res.text) Note: VictoriaMetrics doesn't have the concept of separate databases and the name of the DB provided instead gets added as a label. After the next script run, I was able to query data: curl -s 'https://vm.83n.uk/api/v1/series?match[]=todo_list_completed' { "status":"success", "data":[{ "__name__":"todo_list_completed", "db":"workload_stats", "hostname":"flibble" }] } I could also delete that series - something that I used _quite a bit_ whilst continuing to refine the script's output curl -s 'https://vm.83n.uk/api/v1/admin/tsdb/delete_series?match[]=todo_list_completed' * * * #### Visualising I installed the VictoriaMetrics plugin for Grafana and started to build a dashboard Historically, I've written quite a lot of Flux on this blog, but VictoriaMetrics uses MetricsQL. It's a PromQL-alike language, which itself is similar to LogQL, so I didn't experience any real issues writing queries to drive the graphs: todo_list_items_count{db="workload_stats", category!="error", category!="scheduled-for-tomorrow"} The visual query builder is also pretty good. * * * #### Metrics: use of AI The TODO processing script also writes metrics about itself into VictoriaMetrics, so the dashboard includes a section specifically for script run stats. One graph indicates the proportion of TODO items categorised by the script vs categorised by AI: Although I'd been careful to only add general rules (rather than targetting specific items), the ruleset embedded within the script accounted for two thirds of all categorisations. I also wanted to understand how often AI was misbehaving, so I adjusted the script to calculate an AI reliability score: # Calculate a reliability score for the AI # for every retry we had to send the count goes down. if source == "ai": # So the logic here is that we want to generate # a percentage reliability score based on # comparison to the max retry depth. # # We add 1 onto max_retry_depth to account for # the first request (which doesn't get treated # as a retry) # # So, if max_retry_depth is 4, the max becomes 5 # and scores would look like this: # # * we didn't retry at all: 100% # * we retried once: 80% # * we retried 4 times: 20% # * we burned through the retries: 0% # score = (1 - (RETRY_DEPTH / (max_retry_depth + 1))) * 100 For each TODO list run, the script reports mean, min and max reliability scores. 
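These scores, like the category counts, just get appended to `lp_buffer` as line protocol strings before being POSTed. For illustration, the buffer ends up holding entries shaped something like this (the measurement and field names here are examples rather than my script's exact schema - note that the `db` label is added by the `?db=` parameter rather than appearing in the tags):

# Illustrative only: names are examples, not the script's exact schema
lp_buffer = [
    'todo_list_items_count,category=call-meeting,hostname=flibble value=3',
    'todo_list_ai_reliability,hostname=flibble mean=80,min=20,max=100'
]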
It seems that it _generally_ behaves, but when it misbehaves, it does so quite stubbornly: * * * ### Conclusion I now have a script which can _roughly_ categorise my TODO items from each day ready for visualisation in a dashboard that shows where the majority of my efforts are spent. My initial hopes for AI making it quick and easy were quickly, if predictably, dashed: the LLM does a "good enough" job most of the time, but only because substring matching within the script handles ~66% of items. One of the **huge** benefits of having opted for a local model is that I don't have to worry about results suddenly drifting as the result of a provider deploying an update which changes how the model responds. That consistency has its limits, though: the dashboard has looked slightly different every time that I've triggered a full re-run - the LLM is rarely consistent in the categories that it selects. Equally, though, there probably isn't any _reasonable_ way that I could have script logic correctly categorise the 1/3rd that ends up being passed off to the LLM. Although I could either limit the language used or include a category alongside items, the likely result of that added burden is that the list would end up capturing far less of my workday. * * * 1. It doesn't _matter_ what their terms say. Data protection by design and by default, if it can be done without sending to OpenAI/Google/Musk then it should be ↩ 2. It's possible that I've skipped a few iterations in this post and was growing fed up at this point ↩ 3. In case you haven't picked up on this: the privacy benefits mean that local AI will always be > remote AI ↩

New #Blog: Categorising My Daily Todo List with Deepseek-r1
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss

#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics

31.07.2025 07:30 — 👍 0    🔁 1    💬 0    📌 0
Preview
Bonnie's Lap Spay Diary Bonnie came to live with us at the beginning of last year. She joined our family unspayed and so was still having regular seasons. I don't _think_ that her previous owner had tried to breed her but, as he had a breeder in the family, I'm guessing that it was an option that he wanted to keep open. Intact dogs aren't just exposed to the risk of accidental pregnancy: they also have an increased incidence of cancer and Pyometra (on average, 19% of intact females will suffer this before age 10). On top of that, of course, is the disruption to routine that having a season brings: although the dog will still want to play, there's an obvious need to keep them away from males (intact or otherwise: neutered males can still end up fighting over a hormonal bitch). Getting Bonnie spayed was the _responsible_ thing to do. However, Bonnie joined us with some behavioural issues and the early relationship between her and Pepper was sometimes a little shaky. We felt that their developing relationship might be negatively impacted by a surgery recovery period. Over the last year or so, though, Bonnie has made great progress and we decided that the time had come to have her spayed. In order to supervise the dogs while Bonnie recovered, I took some time off work and, frankly, got quite bored and restless, so thought I'd write about her recovery. * * * #### Visiting The Vet When Bonnie first joined us, she suffered quite badly from anxiety, particularly whilst on a lead. Although it's improved considerably, it's very much still a thing. It should come as no surprise, then, that she **does not** enjoy visits to the vet. Bonnie's first visit to her new vet was for a vaccination booster and, although she received the jab, it was clearly quite a stressful experience for her (and everyone else!). After that, the vet provided a small supply of anti-anxiety medication (Gabapentin and Trazodone) for me to give to Bonnie ahead of any visit. The next visit to the vet was unplanned: Bonnie had caught her nail and torn it most of the way off. I gave her some tablets a few hours before the appointment and, once we were in, she _even_ let the vet stroke under her chin. The tablets _definitely_ work, but they're not a panacea: they buy a little bit of time for the vet to do what needs to be done, but Bonnie can still eventually get worked up. * * * #### The Morning Of The Procedure In order to minimise stress for Bonnie, the plan was simple: * Dose her with anti-anxiety meds the night before (and morning of) the procedure * Take her into the vets before anyone else might arrive * Do a quick bit of paperwork and then sedate her Just before the sedative was injected, for the vet's safety, I put a plastic muzzle over Bonnie's snout. This was quite new to her, but she let me put it on and take it off without any fuss - it turned out not to be needed: she didn't even growl when the needle went in. Having injected the sedative, the vet and nurse left the room to allow Bonnie to settle. Within about 15 minutes of arriving, Bonnie was curled up unconscious at my feet: The nurse came back in and, after taking a list of additional requests (trim her nails etc), carefully picked Bonnie up and carried her through. I went home, made a cup of tea and then washed her bedding to make sure that she'd have a clean bed to come home to. * * * #### Laparoscopic Procedure We opted for a laparoscopic spay1.
A traditional spaying (like the one Pepper had) involves a long incision down the abdomen, giving the vet access to their ovaries and uterus. A laparoscopic procedure, instead, involves a couple of small incisions. The vet pushes a camera through one to help them see (hence "keyhole" surgery) whilst operating through the other. The procedures are also quite different in terms of the impact that they have on the dog's body. Lap spays only remove the ovaries (an _ovariectomy_), whilst traditional spays tend to remove the uterus as well (an _ovariohysterectomy_), inevitably disturbing more of the remaining tissue in the process. For those don't mind a bit of inside imagery, there are videos of Lap Spays on Youtube. Key-hole procedures are more expensive2, but bring a real advantage in terms of recovery time: | Traditional | Laparoscopic ---|---|--- **Recovery Time** | 14 days | 3-4 days The crucial difference is that Bonnie was allowed to run and play again after just a few days (although she'd need to be kept from licking the wound for 10 days). This is a **huge** improvement over the two weeks that Pepper spent on lead and wearing a t-shirt (the vets couldn't get her to stay in a cone, even before we arrived): I dropped Bonnie off at the vets at 08:40. By 11:30 they had phoned me to say that everything had gone well and that she was awake enough to be collected. * * * #### Coming Home I wasn't sure how awake "awake" was actually going to be. Bonnie had been sedated during the nail-tear visit and, that time, "awake" had meant an eyes-wide, tongue hanging, spaced out dog who needed to be carried because she couldn't walk more than a few steps. This time, though, Bonnie trotted into the waiting room wearing a snazzy pink recovery suit: The suit proved to be pretty good: rather than having a buttoned up flap that I'd need to undo whenever she needed the toilet, it was elasticated and so stretched out of the way whenever she squatted. * * * #### Day 0: Dopey Dope Dope Although able to walk on her own, Bonnie was quite clearly still feeling the effects of the methadone that she'd been sedated with. She was slow to respond, easily confused and had pupils like spaceships: Although it was one that I had been prepared for, her lack of re-activeness did pose something of a problem. The last time that Bonnie came home doped up, Pepper was _beside herself_. She couldn't seem to understand why her buddy wasn't responding to her and responded by getting really close and barking anxiously at her 3. This time, although Bonnie was mobile, she still wasn't particularly responsive and Pepper started worrying again. To allow Bonnie to recover in peace, we took Pepper up to littlun's room and left her to ~~be entertained~~ watch littlun play on the Xbox for the evening. Around 1930, we brought Pepper back down and, keeping her on a lead, I got both dogs to settle down next to me and go to sleep. A few hours later, I moved them to their crates and said goodnight. * * * #### Day 1: Hooman, PLAY The next morning, I got up and made their breakfast, squirting a pre-prepared dose of Rheumacam onto Bonnies. When I went through to let them out of their crates, it was clear that Bonnie had slept the remainder of the sedation off - if she hadn't been wearing a bright pink reminder, you could _easily_ have forgotten that she'd had surgery at all. After feeding them, I clipped their leads on and took them for a short walk. 
During that walk, Bon had a wee and a poo - both are something that you're supposed to get excited about post-sedation4. In preparation for having a post-operative dog in the house, I had put a step-stool in front of the sofa (and another by our bed). Unfortunately, telling a Border Collie not to jump is a bit like telling a toddler not to eat sweet things. Bonnie ignored the step and jumped straight up... _sigh_. The _real_ challenge, though, lay in keeping the two dogs from playing. _This_ was the reason that I needed to take time off work: both dogs _love_ a game of bitey-face and my primary role was to pour a (metaphorical) jug of cold water on any sign of play. Using a precise combination of exasperation, darkened rooms and the words "settle down", we got through the day without any rough play. * * * #### Day 2: Grrrr Day 2's morning walk didn't end as well as the previous one. Normally, once we get back to the house, the dogs sit and receive a small treat (which is used to reinforce the command "come home" in case they ever do get out of the garden). But, after swallowing hers, Bon nipped jealously at Pepper's muzzle. I've written before about Bonnie food guarding, but it's _very_ rare for it to show up when they're both receiving a treat (particularly one that both dogs can finish quickly). There's no way to be sure, but I _think_ the issue was that Bonnie was in some discomfort and felt less tolerant and reasonable than normal: her painkillers were administered once daily, so it's quite likely that the previous day's dose had worn off and the next hadn't kicked in yet. It's equally possible, though, that it was temperature induced grumpiness. The UK was going through (yet) another heatwave, which wasn't exactly ideal timing for a double-coated collie to be wearing an outer body suit. Whatever the cause, with a quick verbal correction, Bonnie went back to being nice and the day continued. However, Bonnie was also fully back to her old self and jumped5 on every possible opportunity to release some of her immense Collie energy. That included trying to chase pigeons in the garden, running up and down the stairs and, of course, finding ball-shaped things to drop in my lap in an attempt to try and tempt me to throw them: * * * #### Day 3: Check the wound... but how? It was time to check the wound. With a less anxious dog, the vet arranges a post-operative appointment to check that the wound is healing and isn't showing any signs of infection. Those appointments can be _extremely_ stressful for an anxious dog, though, so they asked me to email photos instead. Day 3 was _the_ day, but the question of **how** had been running through my mind _for days_. In order to even _see_ the wound, we'd need to do the following: Bonnie seems to trust us, but it's very much on her terms. The chances of success seemed _incredibly_ remote and brought some associated risks: * If the suit came all the way off of her hindquarters, there was no chance we'd be able to get it back on * If she stressed or wriggled too much, she could end up tearing the stitches and re-opening the wound * If she got too stressed, she might bite (crossing a line that she hasn't yet) This was complicated further by the fact that it would need to be _me_ holding her head: much to my family's annoyance, she has the best relationship with me and so was less likely to overreact or _hold a grudge_. Even with the difficulties, though, the alternative was to leave her health to chance, so we **had** to try. 
* * * ##### Checking The Wound I knelt next to Bonnie, holding her lead short (but not tight). I pressed the flat of my hand against her cheek, sandwiching her head lightly between my palm and my body, effectively holding her in place. The _moment_ that the suit's zip was touched, she growled and tried to flick her head round. Talking soothingly, I tightened my grip a little and nodded to continue. With the zip undone half-way up her back, the back of the suit was pulled gently down her legs. It became clear that there was problem: although a shaved patch of her underside was visible, it wasn't really possible to see her wound and certainly wasn't going to be possible to get a _useful_ photo. I carefully released her head and lent down to have a look myself - there really was just no way that it was going to work: * The suit was elasticated and tight, so we couldn't move it to increase visibility without taking it off her legs (at which point, it'd never go back on) * If we poked a phone into the suit and she wriggled, we'd likely end up rubbing whatever bacteria was on the phone over her wound, potentially introducing an infection that hadn't been present before With my head at an uncomfortable angle, I _thought_ I could occasionally see half a glimpse as she shifted, but she was starting to get quite agitated - we were out of time. I held her head again and we zipped the suit safely back into place before releasing her and rewarding with a treat. * * * ##### Reporting Back I emailed the vet, noted that I hadn't been able to get a picture and provided the evidence that I _had_ been able to collect: * The shaved area was all a normal colour * There was no obvious sign of swelling * There was no malodour * She was mobile and seemed her usual self * There was no evidence of discharge or dampness on the suit itself * Apart from that first day, Pepper had not been sniffing around her Being _a little bit_ of a nerd, I also pointed my thermal imaging camera6 at Bonnie's belly (and, yes, I forgot to set the clock the last time that the batteries died): I'm more than a little out of my lane here but, if there were any infection, I would expect the associated vasodilation to cause a hot/warm spot around the wounds. Although the numerical values can't be taken literally they _were_ about the same as when I pointed it at Pepper's belly (suggesting that Bonnie wasn't feverish). The vet _didn't_ comment on my (potentially) questionable application of science, but agreed that it sounded like things were healing as they should. * * * #### Day 4: Play? Play Day 4 was the point at which Bonnie _could_ be allowed to be more active. However, as we hadn't been able to _physically_ check the wound, I decided that it would be wise to wait a little longer so that we could be reasonably sure that she wouldn't tear it (particularly as there was no way to know whether her earlier pigeon chasing and sofa jumping antics had already done so). Obviously, Bonnie had no way to know that day 4 was _supposed_ to be the day and so also wouldn't _really_ know that the recovery period had been extended. Unfortunately, neither did she know _why_ she wasn't allowed to play or even really _that_ she wasn't supposed to. 
Inevitably, our lunchtime spell outside included more attempts to make me acknowledge the presence of a ball: * * * #### Day 5: Extra Time After five days of me disrupting their play and insisting on calm, the dogs tried a new tactic: one dropped a toy into my lap just before the other launched at my face to kick off a human-dog game of bitey-face. In life, there are moments when it's impossible to be strict and one of those is when your dogs seemingly collaborate to launch a sneak attack on you. A three-way game of bitey-face ensued, before one dog decided that I needed the handicap of playing tug-of-war whilst still playing bitey-face with the other. Nearly a week's worth of pent-up energy was essentially unleashed on me by both dogs. As the day cooled off, I decided to let Bonnie burn a little more energy off and took her out to the field for a short play with the ball. Day 6 had another, slightly longer, trip out to play ball. * * * #### Day 7: Poooooooooooo On the morning of Day 7, I went to get the dogs up, but was hit by a strong smell as soon as I entered the room. As I turned the corner, I saw Bonnie in her pen: had I forgotten to lock the crate last night? I was _sure_ I'd closed it. She was stood, grinning up at me, with the carpet around her covered in shit. Diarrhoea is one of the possible side effects of Rheumocam. Of course, it's _equally_ possible that she found and ate something nasty while in the garden. Whatever the cause was, it made her need to go badly enough that, for the first time ever, she'd escaped her crate, breaking the lock in the process: There wasn't any call for immediate concern though: she seemed fine and happy, so I set about cleaning it all up. The prior day's dose had been her last round of painkillers so there were, thankfully, no repeats of this. * * * #### Day 8: Fun With Friends By day 8, Bonnie had been out for a few runs on her own with no sign of discomfort afterwards. But, as she often does when we're out alone, she'd spent a chunk of the time looking around to see if her mates were coming in through any of the field's entrances. So, on day 8, we went back to meeting up with her doggie pals for a run-around. * * * #### Day 10: Black is the new Pink The day had _finally_ come to remove the pink recovery suit. As usual, we put Bonnie on the lead to go for a run, but then I held her head while the suit was unzipped (prompting the expected growling). The suit was carefully pulled down and off her back legs before we released her so that she could step her front legs out herself. Later that night, Bonnie lay on her back on the sofa and we were _finally_ able to get a photo of her (now healed) wound. It _really_ is tiny compared to Pepper's scar. * * * #### Day 13 Towards the end of day 13, Bonnie's mood shifted a bit - she became _very_ clingy and cuddly. When Pepper got a bit FOMO, Bonnie stopped "speaking" to her for the rest of the day - when Pepper tried to initiate play, Bonnie turned her back on her and went and curled up in her crate. She wasn't showing any signs of pain or discomfort, so our assumption was that this was the result of hormonal changes. Although Pepper never showed any sign of it, apparently it can take 2-4 weeks for their hormones to settle. * * * #### Day 14 The dogs are back to being friends again, games of bitey-face and all. * * * #### Conclusion So, there we have it: Bonnie was able to play again 3-4 days from spaying and was fully recovered within 10.
She's no longer at risk from the range of things that can affect intact females (including males breaking in to find her when she's in season) and no longer has to miss out on a month of playing with friends twice a year. Pepper's recovery period had been _much_ longer and included a period where she wasn't even allowed to go on walks. Bonnie's recovery period, whilst exhausting for me7, was **much** better in comparison. If we ever got a third dog (no!), despite the additional cost, I think we'd _definitely_ go for the laparoscopic spay again. We'd also **definitely** go for the surgery suit over a cone-of-shame or (as Pepper had) a loosely tied t-shirt: * Although the suit did make it harder for us to check the wound, a large part of that difficulty came from Bonnie herself - I think we'd have been able to check it with Pep. * Unlike a cone, the suit acted as a barrier to help keep detritus and crap away from the wound. * It also didn't need constant adjustment like Pepper's t-shirt had Apart from days 3 and 10, we were essentially all able to act as if it wasn't even there. We're now a few days beyond day 10 and the only sign that anything even happened is Bonnie's shaved patch. * * * 1. Hence the procedure being known as a "Lap Spay" ↩ 2. Apparently the associated equipment costs between £20K and £30K and the setup time is quite a bit longer, so it's not like the vet is charging extra just for the sake of it ↩ 3. Awwwwww sweeeeeet ↩ 4. In truth, most dog owners enthuse about any toilet activity that happens outside the house, because that pseudo-excitement helps reinforce the positive behaviour ↩ 5. Sometimes, quite literally ↩ 6. There's a certain irony in having to conduct a _heat_ survey on a dog that's just been spayed. ↩ 7. It would have been worse with Pep, but we only had one dog at the time so there was no need for me to supervise as closely ↩

New #Blog: Bonnie's Lap Spay Diary
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/bonnies-laparoscopic-spay-recovery.html?utm_source=rss

#dogs #housestuff #vets

24.07.2025 07:45 — 👍 0    🔁 0    💬 0    📌 0
Preview
Looking At Changes In The UKs Solar Capacity Yesterday, I read an interesting article about this spring's UK solar output being 42% higher than in the same period last year. The article notes that, although the sun shone a **lot** more, increased sunlight wasn't the only driver of this increase in generation: capacity also expanded. I was curious, though, to see what kind of increase (if any) our solar had achieved over that period. This post looks into our levels of production as well as going down some rabbit holes looking at changes in the UK's solar capacity. * * * ### Performance Of Our Install I track solar generation metrics in InfluxDB, so comparing yields for the two periods was just a case of writing a Flux query to grab the totals: twenty4 = from(bucket: "Systemstats") |> range(start: 2024-01-01T00:00:00Z, stop: 2024-06-01T00:00:00Z) |> filter(fn: (r) => r._measurement == "solar_inverter") |> filter(fn: (r) => r._field == "todayYield") |> aggregateWindow(every: 1d, fn: max) |> sum() |> set(key: "year", value: "2024") twenty5 = from(bucket: "Systemstats") |> range(start: 2025-01-01T00:00:00Z, stop: 2025-06-01T00:00:00Z) |> filter(fn: (r) => r._measurement == "solar_inverter") |> filter(fn: (r) => r._field == "todayYield") |> aggregateWindow(every: 1d, fn: max) |> sum() |> set(key: "year", value: "2025") union(tables: [twenty4, twenty5]) |> keep(columns: ["year", "_value"]) |> group() We **did** see a non-negligible increase, though it was still quite some way away from 42%: Year | kWh | Change ---|---|--- **2024** | 996 | **2025** | 1200 | _+ 20.5%_ Although I'd have _loved_ to to compare the same period in 2023 our panels weren't installed until May that year. As a rough indicator, though, we can compare from the beginning of June to early July: twenty3 = from(bucket: "Systemstats") |> range(start: 2023-06-01T00:00:00Z, stop: 2023-07-12T00:00:00Z) .. etc .. Year | kWh | Change ---|---|--- **2023** | 598 | _-_ **2024** | 550 | _- 8.1%_ **2025** | 605 | _+ 10%_ Six weeks isn't _exactly_ a representative sample, but 24-25 effectively reversed the previous year's loss, which suggests that this spring's year-on-year (YoY) increase _could_ partly be a regression to the mean. Even _if_ that _were_ the case, though, we're still quite a way short of the claimed 42% increase. * * * ### Capacity The original article notes that capacity increases played a role and adding additional panels will lead to an increase in overall output, even if the amount of sunlight remains exactly the same. We can check the extent to which capacity expansion is likely to have contributed, because the Government publishes data on UK solar photovoltaic capacity. The dataset includes a table of capacity and installation sites going back to 2010: The file includes a cover sheet which explains that this data has been pulled from a number of sources: * Ofgem's REGO data set * Microgeneration Certification Scheme * The Central FiTs register Although the cover sheet doesn't _explicitly_ say it, the reliance on these registers implies that the values in the dataset won't account for * Households that haven't applied for a MCS cert (i.e. aren't being paid for exports) * Capacity that has since been disabled * Off-grid capacity But, those factors _probably_ don't change the analysis that we need to do: even if the absolute numbers aren't _quite_ correct, what really matters is the difference between them. 
The sheet contains the following values for the time periods that we're interested in: | Capacity (MW) | Num Installations ---|---|--- **Jan 23** | 15502.4 | 1326180 **Feb 23** | 15618.5 | 1344094 **Mar 23** | 15712.0 | 1364450 **Apr 23** | 15811.9 | 1380711 **May 23** | 15968.1 | 1399067 **--- break ---** | **--- break ---** | **--- break ---** **Jan 24** | 17003.2 | 1519762 **Feb 24** | 17093.0 | 1533829 **Mar 24** | 17281.8 | 1548582 **Apr 24** | 17417.3 | 1564146 **May 24** | 17493.4 | 1580496 **--- break ---** | **--- break ---** | **--- break ---** **Jan 25** | 18574 | 1714838 **Feb 25** | 18654 | 1735259 **Mar 25** | 18747.9 | 1759978 **Apr 25** | 18829.6 | 1780380 **May 25** | 18919 | 1803478 Because the values are effectively cumulative, May's figures are sufficient for us to calculate the level of YoY change: | Capacity (MW) | YoY ---|---|--- **May 23** | 15968.1 | **May 24** | 17493.4 | _+9.6%_ **May 25** | 18919 | _+8.15%_ In percentage terms, installed capacity grew by less in 24-23 than it had in the prior year. Things get interesting, though, when we look at the year-on-year change in the number of installs: | Installs | YoY ---|---|--- **May 23** | 1399067 | **May 24** | 1580496 | +13% **May 25** | 1803478 | +14.1% Despite the lower _capacity_ growth, the number of installations grew at a _higher_ rate than in 23-24. Is this, perhaps, a sign that there were more small-scale (i.e. domestic) installs? * * * #### Domestic Installs The government's data can help here too, because it includes a second set of totals: The figures for _domestic_ capacity and installs are as follows: | Capacity (MW) | Installs ---|---|--- **May 23** | 4372.5 | 1242413 **May 24** | 4976 | 1392110 **May 25** | 5616.5 | 1552266 Using these figures, the year on year change in UK domestic solar is as follows: | Capacity | Installs ---|---|--- **May 23** | - | - **May 24** | +13.80% | +12.04% **May 25** | +12.85% | +11.50% Interestingly, these numbers **don't** seem to suggest that install count bump was driven by domestic users, because growth in 24-25 was slower than in 23-24. * * * ##### Tangent: Proportion of Domestic Generators As a quick aside, the proportion of overall UK capacity provided by domestic installs is quite surprising: | Domestic | Total | %age ---|---|---|--- **Capacity (MW)** | 5616.5 | 18919 | _29.68%_ **Installs** | 1552266 | 1803478 | _86.07%_ Domestic capacity makes up **nearly 30%** of the UK's installed solar capacity! Domestic forming a high proportion of installations is somewhat less surprising - there are a **lot** more rooftops than there are solar farms - but I'm still quite surprised that it's as high as 86%. * * * #### What drove the increase? The numbers above suggest that increase in the number of installs wasn't driven by domestic. It'd probably be quite easy to shrug, say that it must be driven by commercial capacity and walk away. But, we're down the rabbit hole and the Government's data includes additional information that we can use. 
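As a quick sanity check of the year-on-year percentages above, the arithmetic is just a simple delta. A short Python sketch with the May figures lifted from the tables (the helper function is mine, not part of the dataset):

def yoy(prev, curr):
    # Year-on-year change, expressed as a percentage
    return (curr - prev) / prev * 100

print(f"{yoy(15968.1, 17493.4):.1f}%")   # Capacity, May 23 -> May 24: 9.6%
print(f"{yoy(17493.4, 18919):.2f}%")     # Capacity, May 24 -> May 25: 8.15%
print(f"{yoy(1399067, 1580496):.1f}%")   # Installs, May 23 -> May 24: 13.0%
print(f"{yoy(1580496, 1803478):.1f}%")   # Installs, May 24 -> May 25: 14.1%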
So far, we've worked with totals, but the dataset also provides bucketed totals: These group the install count down into capacity bandings, allowing us to see which was most responsible for the YoY increase: | May 23 | May 24 | May 25 ---|---|---|--- **0 to 4kW** | 1236768 | 1360486 | 1522128 **4 to 10kW** | 115489 | 166057 | 219278 **10 to 50kW** | 41645 | 48584 | 56633 **50kW to 5MW** | 4683 | 4860 | 4915 **5 to 25MW** | 428 | 446 | 451 **> 25MW** | 54 | 63 | 73 This translates into the following YoY changes: | May 23 | May 24 delta | May 24 % | May 25 delta | May 25 % ---|---|---|---|---|--- **0 to 4kW** | - | 123718 | +10% | 161642 | +11.88% **4 to 10kW** | - | 50568 | +43.78% | 53221 | +32.09% **10 to 50kW** | - | 6939 | +16.66% | 8049 | +16.56% **50kW to 5MW** | - | 177 | +3.77% | 55 | +1.13% **5 to 25MW** | - | 18 | +4.20% | 5 | +1.12% **> 25MW** | - | 9 | +16.66% | 10 | +15.87% **Total** | | 181429 | | 222982 | These figures point to the answer. There were more micro (`0-4kW`) installs in 24-25, but growth in all other capacity bandings was lower (with growth in the `4-10kW` range quite dramatically so). In percentage terms, the difference is small but the sheer volume of micro installs means that it's still a significant increase in absolute terms. The increase in micro installs was more than sufficient to offset the slowed growth seen in the higher bandings and the combined growth across all bandings was still enough to outpace the previous year. There isn't a corresponding increase in generation capacity because micro installs deliver less bang for their buck: it takes a _lot_ of micro installs to offset even one 50kW install. Although all bandings still show _growth_ , most grew at a lower rate than the year before. The capacity disparity, though, may _already_ be gone as a result of the 373 MW Cleve Hill site coming online. At the smaller end of the scale, there might be a number of possible reasons for the drop in growth: * Increasing market saturation: it might be that most of the households likely to go for a 4-10kW install already have one * Local constraints: Installs over 4kW require additional approval from the DNO, which needs to ensure that the local grid can support the input. The drop in registration rate could be a sign that this approval is getting harder to get * Cost: although the cost of panels has reduced, the cost of living in the UK continues to rise - households might be unwilling to commit to spending the extra for a larger install (or perhaps invest the difference in a battery) * * * ### Where did the 42% come from? The various pages that I could find quoting the 42% figure all referred back to a Carbon Brief analysis: > The figures, revealed in new Carbon Brief analysis, show that the nation’s solar sites have generated a record 7.6 terawatt hours (TWh) of electricity during January to May 2025. > > This is some 42% higher than the 5.4TWh generated in the same period last year Unfortunately, other than opaquely referring to "figures from the National Electricity System Operator", the article doesn't provide much information about the underlying data. However, there is a passage which suggests that _some_ of the difference in my figures might boil down to a misuse of nomenclature: > This article refers to the UK throughout, but the electricity generation data only covers the island of Great Britain. Northern Ireland is part of the all-Ireland electricity system. 
Carbon Brief's numbers are **not** based on the UK figures that are examined above, but on a _subset_ of them. _Naughty_. A quick YoY analysis shows that Great Britain's growth _was_ slightly higher than when Northern Ireland was included: | Capacity (MW) | YoY | Installs | YoY ---|---|---|---|--- **May 23** | 15591.7 | - | 1351065 | - **May 24** | 17108.8 | _+9.73%_ | 1548780 | _+14.63%_ **May 25** | 18562.2 | _+8.49%_ | 1769494 | _+14.25%_ Although it makes a difference, it still isn't enough. But, _it is_ possible to verify and repeat the 42% figure by looking at energydashboard.co.uk: 2024: 2025: There's a total in the table on the right of the screenshots: | Solar Total (GWh) ---|--- **Q1 2024** | 1986 **Q1 2025** | 2823 This represents a ... _drum roll_... 42% YoY increase. Back at the beginning of this (somewhat excessively long) post, I mentioned regression to the mean. So, before I wrap up, let's see whether this year's jump was anomalous or a sign of things to come. The following figures are for the first 90 days of each year, again taken from energydashboard: Year | GWh | YoY ---|---|--- **2015** | 958 | _-_ **2016** | 1442 | _+5.13%_ **2017** | 1516 | _+6.9%_ **2018** | 1621 | _+19.5%_ **2019** | 1938 | _+19.5%_ **2020** | 1909 | _-2.5%_ **2021** | 1745 | _-9.1%_ **2022** | 2169 | _+24.3%_ **2023** | 1887 | _-13.1%_ **2024** | 1986 | _+5.2%_ **2025** | 2823 | _+42.1%_ So, the first quarter of 2025 saw the biggest YoY increase in UK Solar production for **at least** a decade. With Cleve Hill online, Longfield in the works and numerous others going through planning, those figures will probably see another sizeable jump over the next few years.

New #Blog: Looking At Changes In The UKs Solar Capacity
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/looking-at-uk-solar-generation-changes.html?utm_source=rss

#analysis #electricity #solar

14.07.2025 09:30 — 👍 0    🔁 0    💬 0    📌 0
Preview
Configuring Split Horizon DNS with Pi-Hole and Tailscale I've long had some form of VPN for my devices to use when I'm out and about. Although I used to run OpenVPN, I moved to Tailscale a little while back. Tailscale builds a mesh network using the WireGuard protocol and so is able to connect and run quite a bit faster than OpenVPN. Side note: for those wondering, Tailscale is _Canadian_ and can't see the content of connections (although if you're worried about this it's also possible to self-host using Headscale). Although the tailnet has been up for some time, I hadn't got around to setting up split horizon DNS for clients on the tailnet. I was in a bit of a hurry when first setting up and so configured my reverse proxy box to advertise a route to its own LAN IP. This post talks about configuring my Pi-hole to implement a split horizon: returning the tailnet IP to tailnet clients and the LAN IP to LAN clients. * * * #### Splitting my Split Horizon Many of the DNS names that I wanted to do this for _already_ had a split horizon: Clients on both the LAN and the wider internet connect to the same reverse proxy in my DMZ, but LAN clients connect using the proxy's local IP. The reverse proxy fronts multiple services, most of which have authentication built in. However, it also requires that outside connections pass a separate (and valid) set of authentication credentials before it'll pass their connection on. Having to authenticate twice is a little annoying though, and the split horizon makes it easy to disable the additional authentication when LAN clients connect: satisfy any; allow 192.168.3.0/24; deny all; auth_basic "Authenticate you must"; auth_basic_user_file /etc/nginx/wanaccess.htpasswd; This extra authentication means that I'm not exposing any element of the backing service's authentication stack to the outside world. The underlying idea is that it _shouldn't matter_ that there's an auth bypass zero-day in (say) Grafana, because the wider world needs to get past my auth prompt before they can try to detect or exploit it. * * * #### You've Got Access: Why Make The Tailnet Special? Given that there's an ability to access services via the WAN, you might be wondering why it is that I felt that I needed to do something specifically for the tailnet. Unfortunately, the proxy _can't_ enforce additional authentication for some services because those services' clients don't support it. Nextcloud is a great example of this: the Nextcloud Desktop sync client authenticates with Nextcloud, but * It uses the `Authorization` header to present its bearer token, so the reverse proxy will see an unexpected (and, to it, invalid) set of credentials * The client doesn't expose a way to add custom headers to the requests that it makes, so I can't simply send a shared secret and have the proxy check a different header Having the reverse proxy require additional auth breaks off-net Nextcloud clients (and Nextcloud isn't the only service with this issue). * * * ##### Geoblocking Originally, I left the affected services accessible to the world. Unfortunately, I _sometimes_ seem to upset people enough to trigger prolonged attempts at compromising my services. After one such attempt, I decided to reduce attack surface by adding geo-blocking to my reverse proxy, essentially restricting access to areas that I thought we'd be likely to connect from (or _at least_ appear to).
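I haven't reproduced my actual proxy config here but, as a rough sketch of the shape of it, nginx's geoip module can drive that sort of allow-list (the country codes below are purely illustrative, and the module and database I actually use may differ):

# http context: look up the client's country in the MaxMind database
geoip_country /usr/share/GeoIP/GeoIP.dat;

# Map country codes to an allow flag (1 = allowed)
map $geoip_country_code $geo_allowed {
    default 0;
    GB      1;
    IE      1;
}

# server/location context: reject anything outside the allow-list
if ($geo_allowed = 0) {
    return 403;
}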
This, of course, comes at a cost in flexibility, with access failing if any of the following are true: * We connected from an IP that doesn't have a location in the GeoDB (or is mislocated) * The ISP that we're connecting from does funky routing stuff and/or uses CGNAT * We've travelled somewhere that we wouldn't normally Adding split horizon DNS to the tailnet allows me to avoid these scenarios, because the tailnet subnet can be special-cased in _exactly_ the same way that the LAN is. It also increases the likelihood that I can close WAN access off and require that a client be on either the LAN or tailnet. * * * #### The Plan The idea was that a tailnet client would also speak to the Pi-hole, but that names would resolve to a tailnet IP: This is possible because Pi-hole is underpinned by a fork of `dnsmasq` called `pihole-FTL` which has inherited the setting `localise-queries` (in Pi-hole, this is enabled by default). The man page for dnsmasq describes the setting as follows (line breaks mine): > Return answers to DNS queries from /etc/hosts and **--interface-name** and **--dynamic-host** which depend on the interface over which the query was received. > > If a name has more than one address associated with it, and at least one of those addresses is on the same subnet as the interface to which the query was sent, then return only the address(es) on that subnet and return all the available addresses otherwise. > > This allows for a server to have multiple addresses in /etc/hosts corresponding to each of its interfaces, and hosts will get the correct address based on which network they are attached to. > > Currently this facility is limited to IPv4. This means that we can create the following record set in `/etc/pihole/custom.list`: 192.168.3.33 foo.example.com 100.100.3.2 foo.example.com If a query is received over an interface in one of these subnets, only the matching record will be returned (otherwise, both will be returned): Receiving Interface IP | Response ---|--- _192.168.3.13/24_ | 192.168.3.33 _100.100.3.13/24_ | 100.100.3.2 _10.8.0.0/24_ | 192.168.3.33, 100.100.3.2 One small drawback with this is that the records must be in the `hosts` format file - most of my records were in dnsmasq format files, so I had to migrate the ones that I wanted to split. * * * #### Re-Jigging My Docker Container There was, however, a catch. When I first created my pihole container, the docker invocation looked something like this: docker run \ -d \ --name=pihole \ --hostname=pihole \ --restart=unless-stopped \ -e ServerIP=0.0.0.0 \ -e WEBPASSWORD='NotMyRealPass' \ -v $PWD/pihole/conf:/etc/pihole \ -v $PWD/pihole/dnsmasq.d:/etc/dnsmasq.d/ \ -p 53:53 -p 53:53/udp \ -p 8080:80 \ pihole/pihole This meant that the container was using bridged networking, depriving Pi-hole of the means to see which physical interface a query arrived on: it simply saw the other side of a single bridge interface. So, I killed the container and started a new one using host networking: docker run \ -d \ --network=host \ --name=pihole \ --hostname=pihole \ --restart=unless-stopped \ -e ServerIP=0.0.0.0 \ -e WEBPASSWORD='NotMyRealPass' \ -v $PWD/pihole/conf:/etc/pihole \ -v $PWD/pihole/dnsmasq.d:/etc/dnsmasq.d/ \ pihole/pihole However, the container failed to start: Pi-hole's web interface was trying to bind to port 80, which already had something bound to it.
As I'd previously mapped 8080 into the container (`-p 8080:80`), I used the environment variable `WEB_PORT` to tell Pi-hole to bind to that port instead: docker run \ -d \ --network=host \ -e WEB_PORT=8080 \ --name=pihole \ --hostname=pihole \ --restart=unless-stopped \ --env=ServerIP=0.0.0.0 \ --env='WEBPASSWORD=NotMyRealPass' \ -v $PWD/pihole/conf:/etc/pihole \ -v $PWD/pihole/dnsmasq.d:/etc/dnsmasq.d/ \ -p 53:53 -p 53:53/udp \ -p 8080:80 \ pihole/pihole * * * ##### DNS Outage Pi-hole came up, but it wasn't responding to queries. Netstat showed `pihole-FTL` listening and bound to all interfaces: $ sudo netstat -lnp | grep :53 tcp 0 0 0.0.0.0:53 0.0.0.0:* LISTEN 2653543/pihole-FTL tcp6 0 0 :::53 :::* LISTEN 2653543/pihole-FTL udp 0 0 0.0.0.0:53 0.0.0.0:* 2653543/pihole-FTL udp6 0 0 :::53 :::* 2653543/pihole-FTL Packet captures showed that queries were coming in, but no responses were being sent. $ sudo tcpdump -i any port 53 21:54:02.345555 enp0s25 In IP 192.168.3.163.32273 > 192.168.3.5.53: 57965+ A? n-deventry.tplinkcloud.com. (44) 21:54:02.512870 enp0s25 In IP 192.168.3.44.63761 > 192.168.3.5.53: 26967+ AAAA? lycraservice-pa.googleapis.com.home. (53) 21:54:02.524346 enp0s25 In IP 192.168.3.44.1270 > 192.168.3.5.53: 2692+ A? lycraservice-pa.googleapis.com.home. (53) 21:54:02.767189 enp0s25 In IP6 2001:820:aa1a:c443:b9c4:44b:df15:bd8e.36925 > 2001:820:aa1a:c443::2.53: 28460+ A? a.nel.cloudflare.com.home. (43) 21:54:02.767189 enp0s25 In IP6 Queries weren't triggering any activity in Pihole's logs either. To restore service to the LAN, I killed the container and brought it back up with bridged networking - DNS sprang straight back to life. It took me a while to figure out what the issue was, but eventually I spotted this setting in Pi-hole's web interface: Pi-hole was configured to only respond to queries received from interface `eth0`. Resolution stopped because the box that I run pihole on doesn't have an `eth0` (it's a `udev`'y style `enp0s25`). I switched this to `Permit all origins` and restarted the container with host networking. This time, queries were answered. * * * #### Configuring Tailscale The box hosting pihole was already part of the tailnet, but I wanted to remove the previous route advertisement. So I ran sudo tailscale down # Previously this was # --advertise-routes=192.168.3.33/32 sudo tailscale set --advertise-routes= sudo tailscale up Then, from another tailnet client (my laptop), I tried resolving a name via both the LAN and tailnet address: $ dig +short foo.example.com @100.99.55.55 100.100.3.2 $ dig +short foo.example.com @192.168.3.13 192.168.3.33 All that was left was to have tailnet clients actually use Pihole. I logged into Tailscale's web interface and added a Split DNS entry: When bringing tailscale up on my Linux laptop, I had to explicitly pass a flag to allow it to use the advertised server sudo tailscale up --accept-dns The android app has a toggle for this, but it was already on. * * * #### Conclusion My devices now have transparent (and slightly more privileged) access to services when I'm out and about. Because Tailscale acts as a mesh network, I don't need to worry about automatically turning the VPN off when I'm at home - devices in the same segment can direct connect to one another rather than making a round-trip via a remote coordinator. 
As a result of getting this up and running, I've been able to close off WAN access to a number of services (although I still can't do that for any service which hosts something I might try to cast, because Chromecasts ignore local DNS... grrr). It all works well enough that I've been able to write, proof-read and publish this post whilst off net. As an added bonus, Tailscale seem to have partnered with Mullvad, so if I'm ever _travelling_ travelling, I can have my devices route all connections via Mullvad and my tailnet.

Configuring Split Horizon DNS with Pi-Hole and Tailscale
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/general/configuring-pihole-to-serve-different-records-to-different-clients.html?utm_source=rss

#dns #pihole #privacy #tailscale #vpn

05.07.2025 16:15 — 👍 0    🔁 0    💬 0    📌 0
Preview
Syndicating Content Into A Telegram Channel With Python A couple of years ago, as part of adopting a POSSE (Publish On Own Site, Syndicate Elsewhere) approach, I started to automate syndication of my content into other networks. In the time that's passed since, I've added and removed automation for various networks (at one point, managing to get myself banned from Threads in the process). This post is about adding support for syndicating into a Telegram channel, allowing Telegram users to receive notifications when I publish something new. It covers how to create a bot account and a channel before using Python to submit posts. * * * #### Creating a Telegram Bot Account Telegram bot accounts are provisioned by a bot called the BotFather (`@BotFather`), so I used global search to find it and open a conversation. Once everything was ready, the bot provided me with an authentication token for my bot to use: While I was there, I used some other commands to set the bot's name and profile picture: * * * #### Creating a Channel With the bot created, I needed a channel to post new feed items into. A Telegram channel is effectively a one-way group: users can subscribe, but only administrators can post into it. So, to be able to post, the bot needed to be added to the channel as an administrator. In Telegram (for convenience, I used the desktop client), I * Opened the menu * Clicked `Create Channel`. I created a channel for testing, called `bentest`. I then browsed into the channel and clicked the 3-dot menu and selected `Manage Channel`: A little way down the channel management dialog was the option `Administrators`. After clicking that, I hit `Add Administrator`, which allowed me to search for my bot's username: Clicking the name presented a permissions dialog, submission of which added the bot as a channel administrator. * * * #### Scripting I've written in the past about how my POSSE bot walks through my RSS feed, so I won't focus too much on that here, but as a quick overview: * It fetches my RSS feed using `feedparser` * It iterates through each item in the feed As it works through, the script checks whether it's previously successfully submitted each item, so I added a new conditional block to do the same for Telegram: TELEGRAM_SEEN = check_if_link_seen(TRACKING_MODE, linkhash, storedhash, feed, 'telegram') # Only run if we've not published before *and* Telegram # is configured if not TELEGRAM_SEEN and TELEGRAM_CHANNEL and TELEGRAM_TOKEN: # Try and send if create_Telegram_Post(en): # Mark as sent write_hash_to_storage( TRACKING_MODE, linkhash, feed, hashtracker, firsthash, 'telegram', en['link'] ) print(f"Posted to Telegram: {en['link']}") POSTED = True Posting into Telegram is handled by the `python-telegram-bot` module: python -m pip install python-telegram-bot The script constructs a simple message containing the post name, link and a teaser comprising the first 100 words: def create_Telegram_Post(en): ''' Publish into a Telegram channel ''' msg = f"<a href='{en['link']}'>{en['title']}</a>\n" msg += ' '.join(en['description'].split(" ")[0:100]).lstrip() msg += f"... <a href='{en['link']}'>Read More</a>" An important thing to note here is that, although Telegram allows you to submit HTML, it only supports a very limited subset of tags.
If you try and include something that the API doesn't support, you'll receive a message like the following: > Can't parse entities: unsupported start tag "br" The script initialises an instance of the bot library and sends the message: bot = telegram.Bot(token=TELEGRAM_TOKEN) status = bot.send_message( chat_id=TELEGRAM_CHANNEL, text=msg, parse_mode="HTML" ) The call chain is wrapped in a `try`/`except` to provide simple exception handling and ensure that the bot can continue working through the feed if one item results in an exception. * * * #### The Channel ID On my first run, I found that the bot couldn't post into the channel `@bentest`: > Failed to post to Telegram: Chat not found After trying a few things and getting nowhere, I called the API to fetch the list of events waiting for the bot: curl -v "https://api.telegram.org/bot${TELEGRAM_TOKEN}/getUpdates" | jq Entries within the output provided the channel ID and username (which changed when I customised the `t.me` link... doh): "chat": { "id": -1002619163536, "title": "bentest", "username": "testabcd24", "type": "channel" }, So, rather than `@bentest` I needed to use either the ID or `@testabcd24` -e TELEGRAM_CHANNEL="-1002619163536" Messages then started appearing: * * * #### Switching to Prod As the code was working, I created a new public channel: BenTasker.co.uk Feed. I added the bot to it as an administrator and updated my POSSE cron to use the latest version of my script. When the cron next ran, the bot posted my 10 most recent posts into the channel: It's since automatically published my most recent post in too and (by the time you read this) should hopefully have published _this_ post too: * * * #### Conclusion The `python-telegram-bot` library makes posting into a Telegram channel **incredibly** easy - sending a message can be achieved with a couple of function calls. As a result, my POSSE cron-job can now syndicate into My new Telegram channel, allowing Telegram users to choose to receive notifications whenever I publish something new. It wasn't needed here, but the library also exposes everything necessary to create an interactive bot, allowing users to interact with it directly.

New #Blog: Syndicating Content Into A Telegram Channel With Python
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/software-development/monitoring-my-rss-feed-and-posting-into-telegram.html?utm_source=rss

#bot #bots #general #posse #python #rss #socialmedia #syndication #telegram

24.06.2025 07:45 — 👍 0    🔁 3    💬 0    📌 0
Preview
Containerising Calibre-Server To Read And Annotate Ebooks Via Web Browser I'm going to an in-person event later this year and, in preparation, attendees have been asked to read (and make notes on) a specific book. For me, this presents something of a challenge: although I _used to be_ an avid reader, nowadays I really struggle to be able to sit and read. I _think_ it's a combination of a busy life and having two dogs who'll happily find something (usually ball related) for me to do if I seem even _vaguely_ idle for more than a few seconds. I didn't _particularly_ want to arrive at the event with nothing but excuses, so I decided that the solution was probably to find a way to increase the amount of opportunity that I get to dip into the book. The easiest way to do that, seemed to be to make the book (and any annotations that I might make) available across all of my various devices. That way it would always be to hand if I found myself briefly unoccupied. The book in question _is_ available on Kindle but, aside from Kindle not properly supporting the majority of my devices, I didn't want to build a workflow around Amazon given that they've stopped users from keeping backups of their purchases. I needed something different. Although the solution would ideally not be tied to one provider's ecosystem, finding something open and multi-platform is apparently not that easy. After searching around a bit, I decided that my best hope probably lay with Calibre's content server (it helped that I've previously used Calibre to manage an old Kindle). The catch, though, was that Calibre's quite GUI heavy and I wanted to be able to run this headless on a server rather than on my machine. In this post, I talk about the customisations that I made to achieve my needs as well as a few to make the eventual solution more secure. * * * ### Requirements I had a few requirements that I wanted to meet: * The solution must be FOSS (which Calibre is) * Needs to be separate from my "main" library: on the offchance I end up sending someone a link, I don't need/want them having access to my full library * Should run as a server rather than needing to be on my personal laptop * Should come back up automatically after a restart * **Must** allow annotations and highlighting etc * * * ### Calibre Image A quick hunt around revealed that Linuxserver have a Calibre image, meaning that running Calibre in docker should be quite easy. Using someone else's post as a reference, I span up a container to play with it: docker run \ --name calibre-server \ -p 48080:8080 \ -p 48081:8081 \ -e PASSWORD='password' \ -e TZ='Europe/London' \ -v calibre-test-config:/config \ linuxserver/calibre:latest This worked but it quickly became clear that the image was heavier and more complex than I really wanted or needed. By default, the container runs Kasm (a web native VNC server) allowing access to Calibre's GUI: This is, undoubtedly, a clever and cool use of technology but it's not _particularly_ convenient for use on mobile devices. In the top right, though, there's a `Connect/share` button. This reveals a menu which allowed me to start Calibre's content server on port `8081` (which the container then exposed as `48081`) This was _much_ more convenient: The simple UI allowed me to browse, read and annotate books: Importantly, it worked just the same on Firefox Mobile. * * * #### Shortcomings The functionality that the container offered was _almost_ ideal, but there were a couple of shortcomings. 
The Content Server needed to be manually enabled whenever the container restarted, which meant that I'd need to continue to expose the web-to-VNC service. My biggest concern, though, came when I `exec`'d into the container to see what processes were running: ps -x PID TTY STAT TIME COMMAND 1 ? Ss 0:00 /package/admin/s6/command/s6-svscan -d4 -- /run/service 17 ? S 0:00 s6-supervise s6-linux-init-shutdownd 20 ? Ss 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B 36 ? S 0:00 s6-supervise svc-pulseaudio 37 ? S 0:00 s6-supervise s6rc-fdholder 38 ? S 0:00 s6-supervise svc-docker 39 ? S 0:00 s6-supervise svc-nginx 40 ? S 0:00 s6-supervise svc-kasmvnc 41 ? S 0:00 s6-supervise svc-kclient 42 ? S 0:00 s6-supervise s6rc-oneshot-runner 43 ? S 0:00 s6-supervise svc-cron 44 ? S 0:00 s6-supervise svc-de 52 ? Ss 0:00 /package/admin/s6/command/s6-ipcserverd -1 -- /package/admin/s6/command/s6-ipcserver-access -v0 -E -l0 -i data/rules -- /package/ 214 ? Ss 0:00 bash ./run svc-cron 232 ? Ss 0:00 nginx: master process /usr/sbin/nginx -g daemon off; 242 ? S 0:00 sleep infinity 248 ? Ss 0:00 bash ./run svc-docker 373 ? S 0:00 sleep infinity 374 pts/0 Ss 0:00 bash 584 pts/0 R+ 0:00 ps -x The container was running `docker`, `pulseaudio` and various other services. Some of these were even running with `root` privileges. This simply wouldn't do. * * * ### Customising The Container The docker image had _what_ I needed, the problem was that it wasn't running it in the _way_ that I needed. Calibre's UI isn't actually the only way that you can invoke the content server because there's also the `calibre-server` command. So, I decided to experiment with bypassing the container's entrypoint so that it would only run the bit that I needed. Invoking `calibre-server` worked OK within the _existing_ container, but failed when I tried to use it with a fresh/empty volume because it expects that some setup will have occurred first. Although one option was to run the container "normally" on first run, it would have felt a bit defeatist, so I set about figuring out what was needed and wrote a script that could act as the replacement entrypoint: #!/bin/bash # # Custom entrypoint to configure and launch the # Calibre content-server CALIBRE_LIBRARY_PATH=${CALIBRE_LIBRARY_PATH:-"/config/Calibre_Library/"} CALIBRE_USER=${CALIBRE_USER:-"abc"} # Preconfigure user auth if it doesn't exist if [ ! -f /config/.config/calibre/server-users.sqlite ] then calibre-server --userdb /config/.config/calibre/server-users.sqlite --manage-users -- add $CALIBRE_USER $PASSWORD 2> /dev/null fi # Create a library if one doesn't exist if [ ! -d "$CALIBRE_LIBRARY_PATH" ] then # Create the library dir mkdir -p "$CALIBRE_LIBRARY_PATH" # It won't be considered a library by Calibre yet, we need to add a book # so that the DB gets created cat << " EOM" | sed -e 's/^ //' > /tmp/welcome.md # Welcome Welcome to Calibre-Server, preconfigured by Ben Taskers hacky bash script. You should now be able to upload books to your library for reading and annotation. EOM # Add the book calibredb --library-path "$CALIBRE_LIBRARY_PATH" add /tmp/welcome.md fi # Start the server # # We use basic auth mode here because Calibre will # use digest by default. 
We're going to want our SSL # reverse proxy to send it's own creds, which is much # easier to configure with basic auth # calibre-server \ --listen-on 0.0.0.0 \ --port 8081 \ --access-log /dev/stdout \ --disable-allow-socket-preallocation \ --enable-auth \ --disable-use-bonjour \ --enable-use-sendfile \ --userdb /config/.config/calibre/server-users.sqlite \ --auth-mode basic \ "$CALIBRE_LIBRARY_PATH" Note: there's a copy of this script on Codeberg. Having written this script, I made adjustments to my docker invocation: * Port `8080` was no longer required as the web2vnc stuff wasn't being run * I mapped in the new entrypoint script * I overrode the container's entrypoint so that it would use the script The run command looked like this: # Start from a clean slate docker volume rm calibre-test-config docker run \ --name calibre-server \ -p 48081:8081 \ -e CALIBRE_USER="abc" \ -e PASSWORD='1Ch4ng3d7h15R3411y!' \ -e TZ='Europe/London' \ -v calibre-test-config:/config \ -v $PWD/entrypoint.sh:/entrypoint.sh \ --entrypoint="/entrypoint.sh" \ linuxserver/calibre:latest Logs from the first run showed the preconfiguration adding my Welcome book: Added book ids: 1 calibre server listening on 0.0.0.0:8081 Subsequent runs correctly detected that the library existed and so didn't attempt to re-insert the book. `calibre-server` was listening and I was able to upload, read and annotate ebooks using my web browser: Just as importantly, the container was now running far fewer processes: docker exec -it 118affff481b ps -x PID TTY STAT TIME COMMAND 1 ? Ss 0:00 /bin/bash /entrypoint.sh 12 ? Sl 0:00 /opt/calibre/bin/calibre-server --listen-on 0.0.0.0 --port 8081 --access-log /dev/stdout --disable-allow-socket-preallocation --e 24 ? S 0:00 /opt/calibre/bin/calibre-parallel --pipe-worker from calibre.utils.safe_atexit import main; main() 33 pts/0 Rs+ 0:00 ps -x Whilst this was a significant improvement, it wasn't perfect: the other stuff was _still_ in the container and so could potentially still present attack surface. Whilst it was true that an adversary would now need to start by exploiting an issue in `calibre-server`, vulnerability chaining _is_ a thing and the existence of other tooling can sometimes help promote a minor flaw into a bad day. Ultimately, this was _still_ a container running with root privileges with tools like `docker` inside it. I decided to look at having the entrypoint script remove unnecessary packages by doing something like: apt remove pulseaudio nginx docker-ce cron However, when I looked at the list of installed packages, I found that the image was **much** fatter than I'd originally realised: dpkg-query --list | wc -l 729 * * * #### Can we use Wolfi? Rather than spending time figuring out which of those packages could safely be removed, I decided that it'd probably make sense to start over from a more secure base: enter Wolfi. Wofli is an (un)distro intended as a secure _and minimal_ base for container images. Wolfi's packages are regularly rebuilt in order to remediate CVEs but, unfortunately, there wasn't a package for Calibre, so I wouldn't get the full benefit of this. However, there not being a package wasn't a major blocker: it just meant that I needed to manually install Calibre into the image. _Ideally_ I'd have liked to build it from source, but Calibre's build process is pretty complex so I settled for pulling binaries (which is also what the linuxserver image does). 
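For anyone who wants to double check the lack of a Wolfi package for themselves, a quick (and rough) way is to search the apk index from a throwaway `wolfi-base` container. This is only a sketch and assumes the image's default repository configuration; an empty result is what tells you a manual install is needed:

    # Rough check, assuming wolfi-base's default apk repository config:
    # search Wolfi's package index for calibre from a throwaway container
    docker run --rm cgr.dev/chainguard/wolfi-base \
        sh -c 'apk update >/dev/null && apk search calibre'
    # No output here means there's no packaged Calibre, hence installing it manually below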
* * * ##### Calibre Dependencies Calibre depends on `PyQt6` which, in turn, depends on a number of shared libraries, so the list of dependencies was a little longer than I'd like. Unfortunately, I also had to build that list manually because `PyQt6` was installed with `pip` which doesn't pay a _huge_ amount of attention to non Python dependencies. This involved a bit of trial and error because the only clue that something was wrong came when Calibre threw exceptions like this (though only ever one at a time): > ImportError: cannot import name 'QWebEnginePage' from 'qt.webengine' > > ImportError: cannot import name 'QBrush' from 'qt.core' This was odd, because these modules are provided by `PyQt6` which _definitely_ was installed. Running `calibre_postinstall` threw similar exceptions, but it's output also gave a clue about the underlying issue: > Failed to import PyQt module: PyQt6.QtWebEngineCore with error: libxkbfile.so.1: cannot open shared object file: No such file or directory Although `PyQt6` was installed, it couldn't be imported because some of _its_ dependencies weren't. Getting this working was a case of 1. Running `calibre_postinstall` 2. Extracting the missing library name from the output 3. Using `apk search` to identifying the package which provided the library 4. `apk add`ing it (and updating the Dockerfile) 5. `goto 1` until exceptions stop It turned out that I needed 15 additional packages. * * * ##### The `Dockerfile` After working through the dependencies, I arrived at the following Dockerfile: FROM cgr.dev/chainguard/wolfi-base # Install Calibre RUN --mount=type=cache,target=/var/cache/apk \ apk add python3 curl jq py3-pip libxcb libnss \ libpcre2-8-0 libglvnd libfontconfig1 libxkbcommon \ libxcomposite libxdamage libxext libxrandr \ libxtst libdrm alsa-lib mesa-gbm libxkbfile \ && pip install pyqt6 \ && mkdir /opt/calibre /config \ && latest=$(curl -s "https://api.github.com/repos/kovidgoyal/calibre/releases/latest" | jq -r '.assets[] | select(.name | contains("x86_64")) | .browser_download_url' ) \ && curl -L -s -o /tmp/release.txz "$latest" \ && tar xvf /tmp/release.txz -C /opt/calibre \ && rm -f /tmp/release.txz \ && /opt/calibre/calibre_postinstall COPY entrypoint.sh /entrypoint.sh ENTRYPOINT ["/entrypoint.sh"] One build later, I had a Wolfi based container running Calibre Content Server. Despite the dependency list being longer than I'd wanted, the container still had far less in it than the linuxserver image had: apk list -I | wc -l 82 I wasn't done yet, though, as there was still room for improvement: * The application was still running as root * There were a few packages that could be removed post-install I tagged the following onto the end of the existing `RUN` statement: && apk del curl jq py3-pip wolfi-keys wolfi-base apk-tools \ && adduser -u 1000 -D calibre \ && chown calibre /config And then added a line to have the container drop permissions: USER calibre My container image was ready, so I built and published it: docker tag test codeberg.org/bentasker/calibre-content-server-docker:8.5.0 docker push codeberg.org/bentasker/calibre-content-server-docker:8.5.0 As a final indicator of the benefits of this, my image was 44% of the size of the linuxserver one: $ docker image ls | grep calibre codeberg.org/bentasker/calibre-content-server-docker 8.5.0 84b9667dc463 7 hours ago 1.29GB linuxserver/calibre 8.5.0 347f2c1b5fe2 39 hours ago 2.91GB Unfortunately, both Calibre and PyQt6 are quite large so there wasn't much scope to reduce the size further. 
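One caveat with the Dockerfile above: it always fetches whatever GitHub reports as the _latest_ Calibre release, so the version baked into the image can drift away from the tag it's published under (`8.5.0`). If you'd rather pin it, a minimal sketch (assuming GitHub's releases-by-tag API and Calibre's `v`-prefixed release tags; `CALIBRE_RELEASE` is a made-up variable name) might look like:

    # Sketch: resolve a pinned Calibre release rather than "latest"
    # (assumes calibre tags its GitHub releases as vX.Y.Z)
    CALIBRE_RELEASE="v8.5.0"
    release_url=$(curl -s "https://api.github.com/repos/kovidgoyal/calibre/releases/tags/${CALIBRE_RELEASE}" \
        | jq -r '.assets[] | select(.name | contains("x86_64")) | .browser_download_url')
    curl -L -s -o /tmp/release.txz "$release_url"

The same couple of lines could stand in for the `latest=$(curl ...)` part of the `RUN` statement, with `CALIBRE_RELEASE` passed in as a build argument.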
You can see the final Dockerfile on Codeberg. * * * ### Deploying I was ready to actually deploy into production, which meant * Standing a container up on the host box * Acquiring an SSL cert * Configuring my Nginx reverse proxy Note: I could also have dropped it into my Kubernetes cluster, but decided that it was better to keep things simple in case I ended up needing to troubleshoot issues. I don't use named volumes in prod: instead I bind mount from the host filesystem (allowing backups to be performed with `rsync`), so I started by creating the directory structure and setting the owner to the UID used by the container: mkdir -p /home/ben/docker_files/Calibre_Web/data sudo chown 1000 /home/ben/docker_files/Calibre_Web/data Next, I started the container: CALIBRE_PASS="UseSomethingSecure" docker run \ -d \ --restart=always \ --name calibre-server \ -p 48081:8081 \ -e CALIBRE_USER="abc" \ -e PASSWORD="$CALIBRE_PASS" \ -e TZ='Europe/London' \ -v /home/ben/docker_files/Calibre_Web/data:/config \ codeberg.org/bentasker/calibre-content-server-docker:8.5.0 `calibre-server` was up and I was able to hit it directly and browse to my welcome book: * * * #### Fronting with Nginx Note: Because this isn't a fresh deployment of Nginx, some of this may be a little specific to me I created a simple server on port 80: server { listen 80; root /usr/share/nginx/letsencryptbase; index index.php index.html index.htm; server_name calibre.example.com; location / { return 301 https://$host$request_uri; add_header X-Clacks-Overhead "GNU Terry Pratchett"; } location /.well-known/ { try_files $uri 404; } } I won't go into depth here, but next I created DNS records and used `certbot` to acquire an SSL certificate for that subdomain. I was almost ready to configure the HTTPS `server` block. However, first I needed to construct the `authorization` header that the proxy would send to Calibre. I _could_ just have let the proxy pass through the client provided auth header, but I prefer the flexibility of managing auth within Nginx (and I still wanted auth turned on in Calibre so that random devices on the LAN couldn't so easily hit it directly). Basic Authorization is just a base64 encoding of `username:password`, so in a shell I ran echo -n abc:${CALIBRE_PASS} | base64 I took note of the result and started to write the Nginx config: server { listen 443; root /usr/share/nginx/empty; index index.php index.html index.htm; server_name calibre.example.com; ssl on; ssl_certificate /certs/calibre.example.com/fullchain.pem; ssl_certificate_key /certs/live/calibre.example.com/privkey.pem; location / { proxy_pass http://192.168.13.5:48081; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $remote_addr; proxy_set_header Host $http_host; # Set this to the output of the shell call proxy_set_header Authorization "Basic ZGlkIHlvdSB0aGluayBJIGZvcmdvdCB0byByZWRhY3QgdGhpcz8="; # Bump the maximum request size up so that # ebooks can be uploaded client_max_body_size 20m; satisfy any; allow 192.168.13.0/24; deny all; auth_basic "Authentication is a must"; auth_basic_user_file /etc/nginx/htpasswd-files/developers; add_header X-Clacks-Overhead "GNU Terry Pratchett"; } } With this live, I was able to use HTTPS to access Calibre. * * * #### Tangent: Obsidian Integration I use Obsidian for a range of things and one of the aspects that I like about it is its extensibility - there's a vast range of community plugins to make it do new and wonderful things. 
It turned out that there's a Calibre integration plugin for Obsidian, which communicates with the Content Server. So, as well as being able to read books in a web browser, I can also use Obsidian: All of the underlying functionality (annotations etc) works. That's not too surprising as, after all, Obsidian's an electron app and so is really just a glorified web browser. The only minor issue with the plugin is that it doesn't expose a way to provide credentials, so it'll only work from IPs that I've allow-listed (which is fine, because portable devices tend to be on my tailnet). * * * #### Conclusion I now have a web based instance of Calibre which allows me to read and annotate ebooks on any device with a web-browser. Admittedly, highlighting and annotating small passages using a touch screen is a _little_ fiddly but it otherwise seems to work well enough: On Android, using it as a web app seems to work particularly well, with support for swiping to turn the page. Pleasingly, there's also _sort of_ an offline mode (the docs note that it's not as fully featured as it could have been). Hopefully, all of this should enable me to pick the book up and progress a little whenever I've a few minutes spare, something that wouldn't be as possible otherwise. Of course, arguably, the time that I've spent doing and writing about this _could_ instead have been used to, err, _read the book_.

New #Blog: Containerising Calibre-Server To Read And Annotate Ebooks Via Web Browser
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/general/containerising-and-deploying-calibres-content-server-to-allow-web-annotations.html?utm_source=rss

#calibre #containers #docker #ebook #web #wolfi

23.06.2025 07:30 — 👍 1    🔁 0    💬 0    📌 0
Preview
Dealing With Nuisance Abuse Complaints This week, I've been wrong about (at least) two things, believing that: * I was done with the Copytrack stuff: I sent them proof of license and subsequently audited my image use to ensure they'd never have any excuse to darken my virtual doorway again. * I could have a lazy Friday afternoon: I had a day off and intended to doze in front of Four in a bed. Unfortunately, I was wrong on both counts. Not long after I had settled down to watch people complain about the quality of sausages, my phone went "bing", indicating the arrival of an email (soon followed by another). It seems that Copytrack's customer had decided to start filing abuse reports with various service providers, insisting that they either take down my Copytrack blog post or _suspend my service entirely_. I imagine that there's an opportunity here for others to learn from my experiences, so this post takes a look at those complaints. As you might imagine, though, I _am_ going to choose my words a bit more carefully than normal. * * * #### Oh, Do Fuck Off This isn't the first legal threat that I've had to deal with in my time scrawling on the interwebs, so intimidation really wasn't on the cards. I choose my service providers quite carefully, looking to use those that are competent rather than focusing solely on things like price. That's done more with service quality in mind than abuse report handling, but oddly enough, providers that invest in service quality also tend to have good customer facing teams. But still... of all days, this just **had** to happen on what was supposed to be a relatively restful and stress-free day... * * * #### Contents * Oh, Do Fuck Off * Contents * The Complaint * Claim 1: The Content Is Defamatory * Statement Of Fact * Is It Defamatory? * Claim 2: It Violates GDPR * Legal Basis For Processing * Right To Be Forgotten * Rights To Object * Claim 3: It violates the Acceptable Use Policy * Follow Up Claim: Give Me His Info * TL:DR Summary * Received A Complaint? What To Do * Conclusion * * * ### The Complaint I'm not going to publish the originator's details here: inclusion of them would serve no material purpose and I don't _really_ want to give them a second URL to try and complain about. Instead, I'll simply refer to them as the complainant. When an abuse complaint is received, most service providers will send you a copy (for obvious reasons, it's sometimes redacted to strip the originator's details) and ask you to respond within a specific time-frame. Let's walk through the complaint email, which starts by providing an overview: > To the Legal Department, > > I am contacting you in my capacity as the legal representative of REDACTED, a French audiovisual production company headquartered at REDACTED > > This is to formally notify you of defamatory and unlawful content hosted on your infrastructure, accessible at the following URL: https://www.bentasker.co.uk/posts/blog/general/copytrack-sent-me-a-copyright-claim.html > > This article references REDACTED, in the context of a copyright enforcement case managed by COPYTRACK, a company with whom we are legally and professionally affiliated. The email then moves onto giving a more detailed description of the grounds for their complaint (we'll examine each of these in more depth shortly): > 1. **Defamation and Reputational Damage** : The article portrays our company as engaging in “copyright trolling” practices and uses derogatory language that undermines the legitimacy of our copyright enforcement efforts. 
These statements are entirely unfounded and constitute defamation under the UK Defamation Act 2013, resulting in reputational and financial harm to our business. > > 2. **Breach of UK GDPR / Data Protection Law** : The article discloses identifiable information about our company and myself, in a hostile and non-journalistic context, without consent. This violates the UK GDPR, specifically Article 6 (Lawfulness of Processing) and Article 17 (Right to Erasure). > > 3. **Violation of PlutoDNS Acceptable Use Policy** : Your published terms (https://plutodns.com/terms.html) prohibit the transmission or storage of content that is defamatory, illegal, or infringes the rights of third parties. The content in question clearly falls within these prohibited categories. > > Item three contains a pretty major mistake: PlutoDNS are **not** the provider that the complaint was sent to. In fact, I'd never even heard of them until now and they don't appear to be in any way affiliated with any of my service providers. It's quite an odd mistake to make. The best that I can conclude is that either the complainant was raising complaints about multiple people with multiple providers, or they've been using something like ChatGPT and not properly checking the output. Anyway, the mail then moves on to requesting that the service provider complete specific actions: > In accordance with your obligations as a hosting provider under UK law and the retained EU e-Commerce Directive (2000/31/EC), I formally request that you: > > 1. Remove or disable access to the infringing page mentioned above. > > 2. If moderation is not feasible, suspend or terminate the account responsible for hosting the domain bentasker.co.uk, due to repeated violations of your Acceptable Use Policy. > > There's also a small mistake here, in that none of the providers that complaints were sent to provide me with hosting services. This might sound like semantics, but it matters because the obligations (and exemptions) imposed by the E-Commerce directive vary depending on the nature of the provider: * A hosting provider may themselves become liable if they do not remove/disable once relevant content is reported to them. * A caching provider (e.g. a CDN), though, is exempted until they become aware that the upstream hosting provider has removed/disabled (or they receive a court order etc). * Providers acting as a mere conduit (for example, a connectivity provider) will almost never become liable. Obviously, the complainant has no way to know whether a CDN provider might also be acting as a hosting provider, but that's not the case here. It's inclusion in a complaint mail is also, somewhat unnecessary: providers are very aware of their responsibilities and duties and 2000/31/EC is not in-and-of-itself grounds for a takedown. Mentioning it feels more like a threat that you could try and come after them if they don't accede to your demands. The demand to terminate my account on the basis of "repeated violations of your Acceptable Use Policy" is also mildly interesting, in that the complainant is only actually alleging one occurrence. 
The complaint email then closes out with a set of links labelled as references and a request that the provider respond within 7 days: > REFERENCES > > Infringing content: https://www.bentasker.co.uk/posts/blog/general/copytrack-sent-me-a-copyright-claim.html > > PlutoDNS Terms of Service: https://plutodns.com/terms.html > > Our company website: REDACTED > > UK ICO: https://ico.org.uk > > CNIL (France): https://www.cnil.fr > > We kindly request written confirmation of the actions taken within 7 working days. Failure to act may result in escalation to the appropriate legal and regulatory authorities in both the UK and the EU. I'm not _quite_ sure what purpose the links to the homepages of the UK and French data protection authorities are actually supposed to serve - again, this feels a bit GPTesque to me. Anyway, moving on, let's take a closer look at each of the claims made. * * * #### Claim 1: The Content Is Defamatory > 1. **Defamation and Reputational Damage** : The article portrays our company as engaging in “copyright trolling” practices and uses derogatory language that undermines the legitimacy of our copyright enforcement efforts. These statements are entirely unfounded and constitute defamation under the UK Defamation Act 2013, resulting in reputational and financial harm to our business. > ##### Statement Of Fact The post being referenced was originally about a different company: Copytrack. In the original version of the post, the complainant's company was mentioned in passing, noting that * Copytrack claimed to be acting on their behalf * The complainant's company was not listed as the author on stock photo sites * I had been unable to find any evidence suggesting a link between the complainants company and the image that Copytrack had contacted me about Ten days later, an update to the post referenced the complainants company some more. This update was made because Copytrack provided documentation indicating their right to act on behalf of the complainant (who in turn had signed a Copytrack provided document stating that they hold the rights to the image). The complainants name appeared because it featured prominently on both of these documents - documents that, in the unlikely event that the dispute proceeded to court, would also enter the public record. Given the concerns raised, on sites across the web, of Copytrack seemingly being (mis)used by copyright trolls, the post goes on to perform some basic due diligence checks on the complainant's company, noting that: * They do appear to be a real (and long standing) business * Their website states that they work with Copytrack * The name on the document does appear to be a real person, at that company The post goes on to note that the company's Trustpilot page carries an interesting statement (translated from French): > VERY IMPORTANT | LEGAL STATEMENT | Protection of Intellectual Property `COMPANY` would like to remind you that any infringement of intellectual property, including the unauthorized use of our visual, audiovisual, or textual content, constitutes an offense punishable by civil and criminal prosecution, in accordance with applicable national and international laws. > > We reserve the right to take any necessary legal action against individuals or legal entities who have published defamatory or false opinions or who have exploited our works without prior authorization. 
> > Furthermore, any attempt at public defamation or publication of misleading content aimed at harming our reputation may be reported to the relevant platforms as well as to the competent authorities in the author's country of residence. It also notes that the same Trustpilot page carries Copytrack related reviews and quotes a relevant excerpt from one of them. > We have been an Elements member for nearly 5 years and support a lot of small creators, but shady tactics from this company are now making us consider even using stock images going forward. > > They are wasting a lot of our time with aggressive copyright claims for images - and unrealistic threatening demands for hundreds of pounds required in compensation. From there, my post * References a post by another blogger which claims to have made contact with the author of their photo, who denied any knowledge of the complainant or his company * Notes that there are reviews mentioning the complainant on Copytrack's Trustpilot page too, giving an example The final paragraphs of that section note that it's not possible to definitively say that the complainant is acting in bad faith and, in fact, goes so far as to suggest alternative possibilities: > In fairness, there's no way to definitively say whether Copytrack's customer is acting in bad faith - they could as easily be someone who doesn't understand how Copytrack works, or someone who's bought the rights (or believes they have) from some other org. ##### Is It Defamatory? The abuse complaint specifically references the UK Defamation Act 2013. UK defamation law is known to be a bit of a minefield, but the law provides for a number of defences, including Truth and Honest Opinion. **Truth**: if the complainant does not, in fact, own the rights to the image in question, any imputation in my post of them being a copyright troll would, in fact be true and not defamatory. To date, I've not put much effort into tracking down the original author to check with them, but it shouldn't be necessary: Copytrack provided a signed statement from the complainant stating that they had acquired rights to the image via a licensing agreement with the original author. The complainant should therefore be able to clarify the nature (and/or existence) of this agreement (in fact, section 10.2 of Copytrack's enforcement terms require that their customers be able to do so, on demand, something that Copytrack didn't appear to do when I requested proof). **Honest Opinion**: The main thrust of the article is **not** that the complainant is a copyright troll - it's focus is around the idea that Copytrack's mode of operation (and an apparent lack of due diligence) appears to enable (and even encourage) copyright trolling. The complainant's company was **not** randomly selected or presented as an example of this. They appear in the post _because_ they are the originator of the claim that Copytrack sent to me. Where the post does make reference to the rightfulness of this claim, it is quite clearly expressed in an opinionated manner: > Although it was still not clear that the claim is legitimate, I was fed up of having Copytrack in my inbox (with their ever present instructions on how to pay), so emailed over the uncropped screenshot that they'd asked for. 
It seems unlikely that the content of my post could be construed as being particularly defamatory about the complainant and their business: * That Copytrack said they were acting on behalf of the complainant is factual * That the complainant uses Copytrack is factual and documented, both on their own site **and** in the complaints that they sent to my service providers * Although the post suggests that Copytrack's communication style is reminiscent of copyright trolls, any suggestion that the complainant might be a troll is clearly caveated and never stated as fact * * * #### Claim 2: It Violates GDPR The second claim in the complaint email stated that my post violates GDPR: > 1. **Breach of UK GDPR / Data Protection Law** : The article discloses identifiable information about our company and myself, in a hostile and non-journalistic context, without consent. This violates the UK GDPR, specifically Article 6 (Lawfulness of Processing) and Article 17 (Right to Erasure). > Let's walk through this one bit by bit (The UK's implementation of GDPR is in the Data Protection Act 2018, but for consistency's sake I'll refer to it simply as GDPR): > 1. Breach of UK GDPR / Data Protection Law: The article discloses identifiable information about our company > GDPR defines rules relating to the protection of data about "natural persons". A company is **not** a natural person, and so processing of company data is not subject to GDPR (though, of course, processing of data about individuals within the company may be). There's no need to take my word for it, because the European Commision even has a page on it (for the record, the complainant's company does not appear to be a single person company - their website lists 5 people). > and myself, As noted in the Commission link above, GDPR _does_ apply to personal data relating to people in the course of a professional activity. This is not particularly surprising, given that that would otherwise be a **huge** loophole. So, let's look at what information the post shared about the complainant specifically: * Their name: this featured prominently on documents provided by Copytrack and was also referenced in the article body * A link to the complainants public profile on their company website It's worth noting that there was also some redacted personal data: the Copytrack documents included the complainant's business address (which might, of course also prove to be the complainants home). At no point was this information published unredacted (although, for the record, it appears that the same details _are_ published on the complainant's own site). I didn't seek those address details out: Copytrack sent them to me (and in fact, for other reasons, my 2025-05-10 update actually noted that I believed Copytrack's systems might not be safely handling EU user's data). So, the published personal data is a name: one which (demonstrably) appears in a number of locations around the web (though this, of course, is not a defence under GDPR) and is easily discoverable by searching the complainants company name. Given the wide availability of this information, the risk of harm is likely to be very low. > in a hostile and non-journalistic context, The term "hostile" can be quite loaded - some use it as an extremely strong term, others simply mean a less than friendly environment. It's inclusion here, though, doesn't seem _particularly_ pertinent to the application of GDPR. 
On the face of it, the reference to a non-journalistic context might also seem irrelevant, but has likely been included because the Data Protection Act 2018 includes exemptions for * The purposes of journalism * Academic purposes * Artistic purposes * Literary purposes It doesn't seem worth going into too much depth assessing whether any of these might apply here, given that there are other aspects of the claim which may obviate the need to do so. Instead, lets move onto the real bones of the GDPR claim: ##### Legal Basis For Processing > without consent. This violates the UK GDPR, specifically Article 6 (Lawfulness of Processing) This part of the complaint echoes a common misconception that consent is the only lawful basis under which GDPR allows personal data to be processed. In reality, though, consent is one of six: > a) the data subject has given consent to the processing of his or her personal data for one or more specific purposes; > > b) processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract; > > c) processing is necessary for compliance with a legal obligation to which the controller is subject; > > d) processing is necessary in order to protect the vital interests of the data subject or of another natural person; > > e) processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller; > > f) processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child. There's very little processing of personal data performed for the purposes of running my blog, but almost all is performed under the basis of Legitimate Interests. My Privacy Policy also clearly explains that I may sometimes share or publish information: It could, perhaps, also be argued that this processing could occur under the basis of being in the public interest: adding to the communal body of work (i.e. comments and posts) warning about concerns with some of the financial demands being made, by a company (Copytrack) seemingly associated with problematic claims and indiscriminate working methods, in the name of the complainant and others. One noteworthy aspect of processing under Legitimate Interests, though, is that it does give the data subject the right to object to that processing. So the complainant _could_ have contacted me and asked that I (re)assess whether my interests in the inclusion of that data were sufficient to override their rights. Within a month of that request, I would need to perform a new Legitimate Interests Assessment (LIA) and then act upon the result. Instead, they opted to file (quite poorly substantiated) complaints with my service providers. This does not _remove_ their right to object, but it does imply that they may be trying to make a point/example (by getting service suspended) rather than actually trying to exercise their GDPR rights. We're still not quite done with the GDPR claim though, because the email continues. ##### Right To Be Forgotten > and Article 17 (Right to Erasure). Article 17 of the GDPR is sometimes known as the "right to be forgotten". 
It gives data subjects the right to request erasure of data where one of the following applies: > a) The data is no longer required for it's original purpose > > b) the data subject withdraws consent where processing was on the basis of consent **and** no other lawful basis applies > > c) The data subject objects to the processing **and** there are no overriding legitimate interests > > d) The data has been processed unlawfully > > e) The data must be erased to comply with a legal obligation > > f) The data was collected in relation to the offer of information societies in Article 8 The most important thing to understand here, though, is that it places a duty on the **Data Controller** (in this case, me). Outside of a court order, Article 17 is of bugger all relevance to intermediate service providers and is not something which would or should cause someone like a CDN or DNS provider to suspend services. Ignoring that, though, the right to erasure is **not** absolute and Article 17 provides exemptions where processing is necessary: * for exercising the right of freedom and expression * for compliance with a legal obligation * for reasons of public interest in the area of public health * for archiving in the public interest, scientific or historical research (but only if erasure would mean it cannot be achieved) Running a **personal** blog should fall, quite squarely into that first exemption. But, as above, it's all rather moot because Article 17 requires that the complainant make a request to the controller (me) rather than invoking references to it in complaints to random service providers. ##### Rights To Object Instead of attempting to have a third party interrupt my service, the complainant had two valid routes to request that I re-assess the inclusion of their name in my post: * The right to object to processing under Legitimate Interests * The Article 17 right to request erasure Despite specifically referencing GDPR in their complaint email, the complainant has not chosen to exercise either of these rights. Splitting hairs a little, it could also be argued that the phrasing used in the complaint emails is problematic. GDPR provides data subjects with various rights, however the complainant clearly expresses that they are communicating on behalf of their company (which has no GDPR rights): > I am contacting you in my capacity as the legal representative of REDACTED, a French audiovisual production company headquartered at REDACTED None-the-less, having become aware that the complainant has objections, I have proactively completed a Legitimate Interest Assessment to balance their rights against my interests. * * * #### Claim 3: It violates the Acceptable Use Policy Here, I'll be generous and ignore that the complainant is referencing the acceptable use policy of an **entirely unrelated and unassociated company**. > 1. **Violation of PlutoDNS Acceptable Use Policy** : Your published terms (https://plutodns.com/terms.html) prohibit the transmission or storage of content that is defamatory, illegal, or infringes the rights of third parties. The content in question clearly falls within these prohibited categories. > What I have to be less generous about, though, is the apparent mischaracterisation of that company's terms, which don't actually make any reference to defamation or to content which "infringes the rights of third parties". It doesn't seem to be that the author mistakenly inserted the wrong link because searching PlutoDNS's site doesn't reveal any other relevant policies. 
The most likely explanation seems to be that the complainant is lawyering-by-GPT: asking an LLM to summarise the page and not noticing hallucinated portions of the response. If we look, instead, at my CDN provider's AUP we can see that it covers a range of things: * Material protected by copyright, trademark, trade secret or other IP rights (and without proper authorisation) * Material that is obscene, defamatory, constitutes an illegal threat or violates export control laws * Illegal material (CSAM, malware, scams etc) What this tells us is that, in order to compel a takedown, the complainant would need to adequately demonstrate that the post is defamatory. * * * ### Follow Up Claim: Give Me His Info Unfortunately, this was not some fire-and-forget attempt: when providers processed my dispute and closed the claim, the complainant attempted to argue against it. I'm (obviously) not privy to the full communications, but I understand that the complainant tried to insist that the service providers release details about me, requesting * Records relating to any agreements that we may have (even if indirectly) * Confirmation of the types of service provided (DNS, email, hosting etc). In support of this request, they referenced GDPR Article 15 (Right of access by the data subject). This claim is **absolute bollocks** : In the context of data about **me** , the data subject is **me** and not the complainant. I've noted a few times above that I thought the complainant might be using ChatGPT (or some other LLM) and, for me, this rather cements that. I'm just not sure that I can be convinced that anyone could independently believe that Article 15 of the GDPR provides each of us the right to run around demanding information about _other data subjects_. If true, it really would be quite ironic that someone so apparently invested in copyright enforcement is relying on a system who's creators claim is only possible by ignoring copyright. The information that the complainant wanted would only generally be released to a third party in response to a court order. Even then, it's not quite clear what bearing information about things like who provides my email would have. * * * ### TL:DR Summary I've written quite a lot in this post, so to summarise my response for each of the claims: * **Defamation** : The post is unlikely to be considered defamatory, the sections of the post relating to the complainant are either factual or quite clearly stated as opinion. The post even offers viable alternative explanations. Any mention of copyright trolling is in reference to the way in which Copytrack operates or is offered as commentary on reports of similar issues elsewhere on the web. * **GDPR** : The complainant's name **is** used (and covered by GDPR), however this is performed under a valid lawful basis and (given that the complainant's name is publicly documented, including on their own company website) the risk of harm is assessed as low. The remainder of the complainant's GDPR claims are invalid, as they either misunderstand the principles of GDPR (which applies to the personal data of natural persons, not companies) or misunderstand the applicability of parts of it (such as Articles 15 and 17). In any case, complaints would need to be filed with the data controller (me) or a supervisory authority (the ICO) and not an intermediate service provider. 
* **Acceptable Use Policy Violations** : Under my various providers AUP's, the complainant would need to demonstrate that the blog post is defamatory or otherwise illegal. They do not appear to be able to do so. * * * ### Received A Complaint? What To Do This seems like a good point to talk about what you should do if you've received notification of an abuse complaint. First, **breathe** : service providers don't generally take action immediately - you'll often have at least a day to respond. Once you've calmed yourself: 1. If your service provider provides the means to do so, acknowledge that you've received the report and are looking into it 2. Sit and re-read both the content and the complaint 3. Try to identify whether there are any possible grounds to the claim: perhaps you wrote something which can be read in a different light? Are you publishing something questionable that someone else said unnecessarily? 4. If you conclude that it's appropriate, make changes to (or remove) the content, but keep a note of exactly what you've changed (I track changes in git, but if you're running Wordpress, something like the Simple History plugin can do this for you). Once you're ready, respond to your service provider, but keep it brief - they don't need to know minutiae, nor do they need to hear how unfair it is. All that they need is the information necessary to be able to quickly assess whether you (or they) need to do more. For all that I've written here, what I actually sent to most service providers was along the lines of: > The grounds for the complaint are invalid. > > The content being complained about is not defamatory (it's factual) and is not in violation of GDPR (which applies to information about individuals, not publicly accessible details about companies). > > Please close the abuse case - FYI the same individual has made complaints to some of my other providers too. Your provider will also have reviewed the content, and if they need more information, they'll generally reply and ask for it. If you believe that this is part of a campaign and that complaints may also be sent to other service providers, it's worth contacting their abuse teams to give them a heads up. They can't simply ignore complaints and will still need to follow their policies, but getting ahead of the curve will help empower them to resolve the complaints quickly. Finally, don't be like me: you _probably_ shouldn't blog about it. It might be tempting to rant and rave (or wose, gloat) but, if further action is subsequently taken, your lawyer may well observe that you've made your own life harder. * * * ### Conclusion Receiving notification of an abuse complaint can be quite stressful: is a provider going to over-react and turn everything off on the say-so of a somewhat dubious claim? The answer to that is almost always "no". If a provider _does_ turn things off without giving you adequate time to respond, you should probably consider that a good sign of a bad service provider. Nuisance reports, by all accounts, are not uncommon and good providers invest in teams able to handle them. Once you've received a report, you need to move relatively quickly but without panicking: assess the details of the complaint against the content being complained of - can it reasonably be construed in the described way? Be aware that, in some cases (like mine), the complainant may also include claims which don't stand up to scrutiny. 
In a world full of genAI-originated inaccuracies, you should also check that everything the complaint says is actually accurate (a **good** provider should also be checking this, but LLM-drafted complaints can contain mistakes so foundational that they're easy to miss). By extension, **never ever use ChatGPT (or Copilot, Gemini etc) to respond**: LLMs are incredibly unreliable sources, prone to just making shit up, and you **need** to be sure that you'll be able to stand by anything that you've sent in response. Having gathered the facts, send your service provider a brief overview of your findings - if there's no apparent issue, state why. If there was an issue, state how you've addressed it, etc. A good provider will (almost always) want to work with you to resolve things if they feel that you need to do more. Unfortunately, if you're writing critically about things that involve other people, however careful you are, you _will_ sometimes attract complaints and nuisance reports. They're not the end of the world, but you do need to be willing to address them as objectively as possible.

Dealing With Nuisance Abuse Complaints
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/law/handling-abuse-complaints-being-sent-to-your-providers.html?utm_source=rss

#abuse #analysis #copytrack #defamation #gdpr #law

07.06.2025 14:30 — 👍 0    🔁 1    💬 0    📌 0
Preview
Replacing the Valve In a Stuck or Dripping Tap I hate plumbing and, as a result, tend to try and put off any job where a mistake could lead to an inexorable drip-drip that might only become apparent when a ceiling comes down - admittedly, not too likely when repairing a stuck tap. A stuck tap is also harder to ignore (though not impossible, I managed to ignore one for long enough that I ended up with a second one which also needed doing). This short post describes how to replace the valve on a stuck (or dripping) tap - the photos and examples are for a quarter turn tap, but the same process also applies to a twist tap (i.e. one that turns multiple times and uses a compression valve). The entire process (not including getting a replacement valve) takes about ten minutes. * * * #### Valve Sizing Before moving onto the easy practical stuff, let's deal with the more complex bit: understanding sizing. If you're in the UK, there's a _very_ good chance that the valve size that you need has a G 1/2" BSP thread. With that knowledge, in order to avoid ordering the wrong part, you might be tempted to measure your valve to confirm that it is `12.7mm` in diameter. You might even be eyeballing the valve at the moment and thinking "nope, that's got to be 3/4". But, don't be fooled: BSP sizes don't match what you'd measure with a ruler and the diameter of a 1/2" BSP thread is actually more like 20mm. The 1/2 inch measurement relates to the inside diameter of the steel pipe that the thread would originally have been on the outside of. As a rough guide to sizing: BSP size | Rough diameter ---|--- 1/4" | 13mm 1/2" | 20mm 3/4" | 26mm Why do we still use these? Just because... Although not too relevant here, there's also some fun with the letters. You can either play the odds and get a 1/2" BSP valve or wait until the valve is visible and measure it to be sure of the size. * * * ### The Practical Bit The tap in my examples is a standard bathroom basin tap: Before you start, put the plug in so that nothing can get dropped down the drain. To begin, we need to take the lever off the tap - it's probably held on by a screw at the top, but feel around the sides of it in case there's a grub screw (note: if there is, you'll probably need a different valve to those described here). If there's no grub, then the screw is likely to be under a cover, use a thin flathead screwdriver to prise the top off and reveal the screw: Remove the screw and set it aside. Slide the lever up and off the valve - it sits on splines (normally about 20), so may grip a little. You should now be able to see the top of the valve: This is your opportunity to check the size of the valve. Measure the diameter at the widest part and then compare it to the table above - you're looking for a rough rather than exact match (if you want a better match, you'll have to continue to remove the valve). Before proceeding to the next stage, if you haven't already: ISOLATE THE TAP. There may be an isolation valve in the pipe leading to the tap, but if not, you'll need to turn the house supply off. Once done, sit the lever back over the splines and turn the tap on to release any residual pressure. The valve can be unscrewed using a deep socket. You might find that it's extremely tight and that the tap will move instead. You **can** brace against the tap but, if you've got one, it's a lot quicker and easier to use a battery impact gun to buzz it off. 
Either way, get the valve turning and then loosen and remove it: * * * #### Compression Valves: Cleaning If you've got a compression valve that's sticking, you may not need to replace it. If you look you'll see that there are two parts, the main body and the bit with the washer on. If you turn the splined spindle, you should find that the section with the washer advances until it comes off - you can then use wire wool to clean corrosion off it and the inside of the main part (wrap some wire wool over the end of a screwdriver to really get in there). Apply some silicon grease before screwing the washer section back on and you'll likely find that the valve's good to go for a few more years. * * * #### Refitting Refitting is the reverse of removal, take your new valve: Bung it into your tap-hole and screw it down: Push the tap lever back over the splines, but don't screw it down just yet. Turn the water back on and check that the tap operates (you may get some spitting as air clears out). Then, remove the lever and look for any signs of leakage - you're looking for beads of water around the edge of the valve or (if you've been rough) the tap. Once you're happy, put the lever back on and screw it down. Put the cover back on (I've fitted a new tap lever here): You've probably disturbed a fair amount of limescale and generated debris, so clean around the area, but then you're done!

New #Documentation: Replacing the Valve In a Stuck or Dripping Tap
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/replacing-the-valve-in-a-bathroom-tap.html?utm_source=rss

#diy #housestuff #plumbing

05.06.2025 20:15 — 👍 0    🔁 0    💬 0    📌 0
Preview
Deploying Anubis to protect against AI scrapers Much like others on the net, I've seen a marked increase in traffic to my sites. The new requests sport browser-like user-agents and originate from a range of IPs. Between them, though, they browse _much_ more rapidly and systematically than human users ever do. It's pretty clear that these requests are coming from bots and, on the balance of probabilities, are likely the result of others seeking to build and train AI models by crawling the web. Even ignoring the moral issues with the way that the AI industry has trained models, their rapid rate of crawling often translates into high resource usage on the services that they're hitting. Caught up as it is in a goldrush fever, the AI industry is externalising some of its costs onto the very people whose content it's misappropriating (not to mention the environmental cost which is externalised onto _all of us_). The cost of serving a static HTML site like this one isn't huge and so, I've generally been minded to ignore the traffic and content myself with laying tarpits and including prompt injections in pages. But, dynamic sites (like my cross-domain search) are more expensive to serve and high request-rates can cause issues that simply don't exist at more routine traffic levels. I decided that the time had come to do something about it, so I've deployed a reverse proxy designed to exclude (most) automated crawlers. This short post describes the process that I used to deploy anubis to either prevent bot requests reaching the backend or, for those able to solve the challenges, to substantially increase the cost of crawling. * * * #### Running Anubis My edge infra uses a docker-compose file, so I added a service for Anubis: services: anubis: restart: always image: ghcr.io/techarohq/anubis:latest container_name: anubis network_mode: host environment: COOKIE_DOMAIN: "bentasker.co.uk" TARGET: "http://127.0.0.1:8023" # The opengraph passthrough functionality # allows social previews to still work OG_PASSTHROUGH: "true" OG_EXPIRY_TIME: "1h" OG_CACHE_CONSIDER_HOST: "true" In an ideal world, I wouldn't be using `host` networking but it's needed because of the way that I have `openresty` configured. I **do** have pretty strict firewall rules though, so any risk should be fairly minimal. If you're not using containers, anubis supports communication over a unix socket, which is much more convenient. The environment variable `TARGET` defines where successful requests should be proxied onto. 
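For reference, if you're not running compose, the same service can be stood up with a plain `docker run` - this is just a restatement of the compose service above, using the same image and environment variables:

    # docker run equivalent of the compose service above
    docker run -d \
        --name anubis \
        --restart always \
        --network host \
        -e COOKIE_DOMAIN="bentasker.co.uk" \
        -e TARGET="http://127.0.0.1:8023" \
        -e OG_PASSTHROUGH="true" \
        -e OG_EXPIRY_TIME="1h" \
        -e OG_CACHE_CONSIDER_HOST="true" \
        ghcr.io/techarohq/anubis:latest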
* * * #### Enabling Anubis is used as an additional proxy hop: Nginx terminates the SSL connection and proxies onto Anubis, which (after assessing the request) proxies it back into Nginx (if I only had one site it could go direct to origin, but in my case, Nginx is used to route the request appropriately): User ----- Nginx Nginx ---- Origin | | | | ----- Anubis ----- Enabling Anubis for an existing site means doing two things: * Reconfiguring the main server block to proxy onto Anubis * Creating a new server block, bound to `127.0.0.1:8023` and configured to proxy onto origin So, for `filesearch.bentasker.co.uk` I now have the following to terminate SSL and forward onto anubis: # Define the upstream # we use a keep-alive pool so we're not constantly # doing three-ways upstream anubis{ server 127.0.0.1:8923; keepalive 30; } # Server block server { listen 443 ssl http2; listen [::]:443 ssl http2; root /var/www/vhosts/bentasker.co.uk/public_html; server_name filesearch.bentasker.co.uk; ssl_certificate /etc/pki/certstore/filesearch.bentasker.co.uk/fullchain.pem; ssl_certificate_key /etc/pki/certstore/filesearch.bentasker.co.uk/privkey.pem; include /etc/nginx/domains.d/includes/letsencrypt.inc; location / { # Ensure that keepalives are used proxy_http_version 1.1; proxy_set_header Connection ""; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_pass http://anubis; } } Then, to receive requests that have been proxied through by Anubis, I have another server block: server { listen 127.0.0.1:8023; root /var/www/vhosts/bentasker.co.uk/public_html; server_name filesearch.bentasker.co.uk; location / { ... etc ... } } Now, if you visit `filesearch.bentasker.co.uk` you'll periodically see a brief interstitial whilst your browser does some proof of work: * * * #### The Catch There is, of course, a catch. In order to have Anubis "weigh their soul", the visitor's browser needs to have Javascript enabled: deploying doesn't just exclude bots, it excludes users who have chosen to disable javascript (and, by extension, search engine crawlers). That's not a major issue for `filesearch.bentasker.co.uk`, as the interface relies on javascript anyway. But deploying Anubis onto my `www` would mean a much bigger change: although the site _has_ javascript, it hasn't _required_ it for over a decade. I'm not ready to change that just yet so, for now, I'll only be enabling Anubis for sites which already require javascript.
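If you do run into issues, one way to narrow things down is to curl each hop of the chain directly from the edge box. The ports below are the ones used in the config above; what Anubis returns to a non-browser client will depend on the bot policy in play, so treat this as a rough debugging aid rather than a definitive test:

    # Hit each hop directly to see where in the Nginx -> Anubis -> Nginx chain a problem sits
    curl -s -o /dev/null -w 'anubis (8923): %{http_code}\n' \
        -H "Host: filesearch.bentasker.co.uk" http://127.0.0.1:8923/
    curl -s -o /dev/null -w 'origin server block (8023): %{http_code}\n' \
        -H "Host: filesearch.bentasker.co.uk" http://127.0.0.1:8023/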

New #Blog: Deploying Anubis to protect against AI scrapers
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/the-internet/deploying-anubis-to-block-ai-bots.html?utm_source=rss

#ai #bot #bots #llm #theinternet

02.06.2025 22:15 — 👍 0    🔁 0    💬 0    📌 0
Preview
Two years of solar

We've now had a solar install for 2 years (give or take a couple of days). Although last year's summer wasn't as sunny as the previous year, May 2025 has got this year off to a good start (at least, if you ignore the fact that it's raining as I write this). This post is a _very_ quick look over the install's performance so far.

* * *

#### Headline Stats

Since install, the system has generated 5.67 megawatt hours of electricity:

This means that we generated `2.89 MWh` this year, slightly more than in the year before.

Taking into account savings generated by charging from the grid and discharging when prices are higher, the system has generated savings of around £1150:

The (rolling) average weekly saving is fairly stable, around the £12 mark:

* * *

#### Battery

The battery continues to disappoint, generating average savings of £0.67 a day:

Unfortunately, this is 32% _lower_ than the same time last year. That's probably partly because there were fewer savings sessions over winter (and those that we did have had far less attractive pricing than the year before), so we didn't have the benefit of dumping the battery to grid at massively inflated export prices.

We had a short period at the beginning of the year where the battery was out of service although, to be fair, that was my own fault. Thankfully, I managed to resolve it without needing to contact the manufacturer, avoiding a repeat of them sending my details out and claiming that I'm a battery service tech.

* * *

#### Exports

Our total export income remains quite low: we only exported `159kWh` this year (compared to `298kWh` the year before).

This drop in export is primarily the result of us using a hot tub as an energy sink. Although we've earned less export income, the energy has been put to good use.
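As a footnote to the numbers above, the year-on-year split and the rolling weekly figure are easy enough to sanity-check. A minimal sketch - the daily savings list is made-up sample data, included purely to illustrate the rolling window:

    # Year-on-year generation split, using the figures quoted above
    total_generated_mwh = 5.67   # lifetime generation
    this_year_mwh = 2.89         # generation in year two

    last_year_mwh = round(total_generated_mwh - this_year_mwh, 2)
    change_pct = (this_year_mwh - last_year_mwh) / last_year_mwh * 100
    print(f"Year 1: {last_year_mwh} MWh, Year 2: {this_year_mwh} MWh ({change_pct:+.1f}%)")

    # Rolling 7-day window over daily savings (sample data, £ per day)
    daily_savings = [1.80, 1.65, 1.92, 1.71, 1.55, 1.88, 1.60, 1.75, 1.83]
    window = 7
    for i in range(len(daily_savings) - window + 1):
        print(f"Week ending day {i + window}: £{sum(daily_savings[i:i + window]):.2f}")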

New #Blog: Two years of solar
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/second-year-of-solar.html?utm_source=rss

#electrical #housestuff #solar

27.05.2025 19:30 — 👍 0    🔁 0    💬 0    📌 0
Preview
Automating an audit of my image use Earlier this month, I ended up with a known copyright troll in my inbox. As a result of not doing much (if any) due diligence, Copytrack seem to have acquired a customer base who (if reports are to be believed) like to upload other peoples stock images and then try their luck at milking royalties from anyone unfortunate enough to have used that image. It often won't _really_ matter whether you correctly acquired a license or not: many of those reporting these behaviours note that they acquired the image quite a long time ago. Am I likely to remember _where_ I acquired an image half a decade ago? If I can, does the service even still exist? If not, then Copytrack (and others like them) will try to claim that many hundreds of euros are "owed" for use of what was probably a royalty-free stock photo. Although these claims don't have much merit, they're a nuisance to deal with and there are better things that I could be doing with my time than entertaining the likes of Copytrack. So, to reduce the risk of having to deal with them in future, I decided to run an audit of images on my site with a view to pro-actively replacing anything that might enable trolls in future. I started by checking how many images there are on `www.bentasker.co.uk`: find ./images -type f | egrep -i '\.(jpg|jpeg|png|gif|bmp|webp)' | wc -l 3641 Given that the site also hosts an archive of benscomputer.no-ip.org, it could have been _far_ worse (thankfully, though, I very rarely used images back then). Still, there were too many to check manually so I decided to write a script to automatically run a reverse image search of each, flagging any that appeared on other domains. This post (mostly) talks about the process of doing that. * * * #### Finding a Suitable Reverse Image Search Service Google don't offer an API for reverse image searches and I didn't _really_ want to spend time building a one-off scraper to use it, so, I searched around for pre-existing solutions. The first that I found was PicDefense which almost sounded ideal... until I checked their pricing: **Ouch.** Next I found SerpAPI's offering, it was much cheaper but still a quite meaty $75 for 5000 searches. Although both prices are, arguably, a bargain compared to being on the hook for €500 per-image, there was no realistic prospect of me actually being in that position: after all, I _knew_ that the majority of images were ones that I'd taken and the aim was to find any odd exceptions. Next I found Copyseeker: Pricing was a much more reasonable $30 for 5,000 requests. It seemed perfect for my needs... 
at least, until I tested it by searching for an image that I knew that there should only be one of: import os import requests img = "https://www.bentasker.co.uk/images/BlogItems/zafira_dogboot_build/boot.jpg" url = f"https://reverse-image-search-by-copyseeker.p.rapidapi.com/?imageUrl={img}" headers = { "x-rapidapi-key": os.getenv("RAPIDAPI_TOKEN", ""), "x-rapidapi-host": "reverse-image-search-by-copyseeker.p.rapidapi.com" } response = requests.get(url, headers=headers) print(response.text) The result looked like this: { "Source": "https://www.bentasker.co.uk/images/BlogItems/zafira_dogboot_build/boot.jpg", "BestGuessLabel": "car", "Pages": [ { "Url": "https://www.bentasker.co.uk/", "Title": "Ben Tasker", "MatchingImages": [ "https://www.bentasker.co.uk/images/BlogItems/zafira_dogboot_build/boot.jpg" ], "Rank": 3.69 } ], "Entities": [ { "Score": 0.92187047, "Description": "Car" }, { "Score": 0.63688356, "Description": "2010 MINI Cooper" }, { "Score": 0.61004996, "Description": "MINI" }, { "Score": 0.4674, "Description": "used" }, { "Score": 0.4362, "Description": "finished" }, { "Score": 0.3483, "Description": "Battery" }, ], "VisuallySimilar": [ "https://images-na.ssl-images-amazon.com/images/I/813egTIW9sL._AC_UL495_SR435,495_.jpg", "https://m.media-amazon.com/images/S/aplus-media-library-service-media/87f4b44f-1492-4a89-9c9e-829a4b38c91b.__CR161,0,1279,1600_PT0_SX362_V1___.jpg", "https://i.etsystatic.com/24088658/r/il/e8aa83/2721525335/il_fullxfull.2721525335_p3dk.jpg", "https://frankfurt.apollo.olxcdn.com/v1/files/mptqc6ngzynv3-RO/image;s=1020x1360", "https://www.tesmanian.com/cdn/shop/products/model-x-pc01-2.jpg?v=1668795464&width=1000", "https://d3bp3uid63eixr.cloudfront.net/production/catalog/product/3/2/32347053_xc90_cargo_mat.jpg", "https://avatars.mds.yandex.net/get-autoru-vos/1889124/0083ecbf89710619984ae043d8ad05cc/456x342", "https://www.infinitiusa.com/content/dam/Infiniti/US/assets/2025/qx60/accessories/qx60-a38.jpg", "https://%D0%BA%D0%B8%D1%82%D0%B0%D0%B9%D1%81%D0%BA%D0%B8%D0%B5-%D0%B0%D0%B2%D1%82%D0%BE%D0%BC%D0%BE%D0%B1%D0%B8%D0%BB%D0%B8.%D1%80%D1%84/wp-content/uploads/2023/12/tank_500_mishin_5_1000.jpg", "https://carnewschina.com/wp-content/uploads/2024/08/autohomecar__Chtk2WazTVGALrlFAAdPPZ36qUw589-800x1001.jpg", "https://cache.willhaben.at/mmo/4/201/321/0474_428074999.jpg", "https://images.craigslist.org/00H0H_4PW93jz8Suy_0CI0t2_600x450.jpg", "https://www.lincoln.com/accessories/product/images/LC5Z7811600BA_0.jpg", "https://%D0%BB%D0%B0%D0%B4%D0%B0.%D0%BE%D0%BD%D0%BB%D0%B0%D0%B9%D0%BD/uploads/posts/2023-08/thumbs/1693202261.webp", "https://pbs.twimg.com/media/GnYYfc9WsAAwsgq?format=jpg&name=large", "https://www.ford.com/accessories/product/images/VNL1Z7813046A_10.jpg" ] } The results weren't limited to exact matches, but included anything which looked even remotely similar: It turns out that, in my eagerness to find a solution, I'd missed something quite important on Copyseeker's homepage: The API is built on top of AI, with the inevitable result that it's shit. In the hope that exact matches might be reported differently, I tried again using an image that I knew existed elsewhere (the chocolate image that Copytrack had contacted me about). Unfortunately, the results were still classed as "visually similar" with no programmatic way to discern between matches and lookalikes. The only thing that could possibly be worse than having to manually check 3600 images would be having to also sift through an AI's hallucinations for each of them. 
Copyseeker, clearly, wasn't able to do what I needed. * * * #### Ninjas to the rescue Without much hope left, I went back to searching until I stumbled on OpenWeb Ninja: As well as having a free trial, the entry point into the paid tier was a reasonable $25 for 10,000 requests: Their API is also offered via the RapidAPI Hub, so I set up a free trial and tested with the same image as I'd tried with Copyseeker: import os import requests url = "https://reverse-image-search1.p.rapidapi.com/reverse-image-search" querystring = {"url":"https://www.bentasker.co.uk/images/BlogItems/zafira_dogboot_build/boot.jpg"} headers = { "x-rapidapi-key": os.getenv("RAPIDAPI_TOKEN", ""), "x-rapidapi-host": "reverse-image-search1.p.rapidapi.com" } response = requests.get(url, headers=headers, params=querystring) print(response.text) With this API, there were no false positives in the response: { "status": "OK", "request_id": "c74425ab-8e74-42a2-b671-a02389149dab", "parameters": { "url": "https://www.bentasker.co.uk/images/BlogItems/zafira_dogboot_build/boot.jpg", "safe_search": "blur", "limit": 500 }, "data": [ { "title": "Ben Tasker", "link": "https://www.bentasker.co.uk/", "domain": "www.bentasker.co.uk", "logo": "https://encrypted-tbn2.gstatic.com/faviconV2?url=https://www.bentasker.co.uk&client=HORIZON&size=96&type=FAVICON&fallback_opts=TYPE,SIZE,URL&nfrp=2", "date": null, "image": null, "image_width": 600, "image_height": 472 } ] } I searched again for the chocolate image and the API helpfully provided a list of sites that had used it. * * * #### Scripting Now that I had the means to programatically search for images, I needed to chuck together a script to do so. I've got sites running a variety of technologies, so I decided against having the script enumerate images itself - instead it iterates through a list of URLs provided via `stdin`: def getImageCandidates(sess, skiplist): ''' Iterate through the list of images and then return a dict of those which show up elsewhere ''' toCheck = {} failures = [] # Note the start time st_time = time.time() # Read URLs from stdin for line in sys.stdin: img = line.rstrip() r = searchImage(img, sess) # Did the search fail? if not r or "status" not in r or r["status"] != "OK": # The API had some kind of issue # back off and then move on print(f"Failed to check {img} ({r})") failures.append(img) time.sleep(10) continue # Otherwise, continue for res in r["data"]: # Check whether the domain is in the skip list # (the skiplist contains my domains) if res["domain"] in skiplist: continue # Otherwise we've work to do if img not in toCheck: toCheck[img] = [] toCheck[img].append(res) print(f"Checked {img}") # Dump a snapshot of progress to disk every 2 minutes # this allows me to check progress and also means we won't # have to redo everything if something fails at a late stage if (time.time() - st_time) > 120: # Dump toCheck to disk as JSON writeDump(toCheck) # Reset the time counter st_time = time.time() return toCheck, failures My first attempt didn't include the failure block. However, although the API usually returned almost exactly the information that I needed, it seemed to be quite prone to timeouts. 
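Given how prone the API seemed to be to timeouts, another option would have been to retry failed lookups inline rather than collecting them for a second pass. A minimal sketch of that approach - the back-off values are arbitrary, and `searchImage()` is the helper shown in the next section:

    import time

    def searchImageWithRetry(imageUrl, sess, attempts=3, backoff=10):
        ''' Call searchImage(), retrying (with a pause) if the API
            times out or returns something other than status OK
        '''
        for attempt in range(1, attempts + 1):
            r = searchImage(imageUrl, sess)
            if r and r.get("status") == "OK":
                return r
            print(f"Attempt {attempt} failed for {imageUrl}, backing off {backoff}s")
            time.sleep(backoff)
        # Callers already treat a False return as a failure
        return False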
Each URL is passed into `searchImage()`, which is basically just a modified version of the Python that I'd used to test the API:

    def searchImage(imageUrl, sess):
        ''' Call the reverse image search API
        '''
        url = "https://reverse-image-search1.p.rapidapi.com/reverse-image-search"

        querystring = {
            "url": imageUrl,
            "limit": "10",
            "safe_search": "off"
        }

        headers = {
            "x-rapidapi-key": os.getenv("RAPID_API_TOKEN"),
            "x-rapidapi-host": "reverse-image-search1.p.rapidapi.com"
        }

        try:
            response = sess.get(url, headers=headers, params=querystring)
            return response.json()
        except Exception as e:
            print(f"ERROR: Request failed for [{imageUrl}]: {e}")
            return False

Once `getImageCandidates()` has worked through all of the images, the script writes a list of any that failed into a file called `failures.txt`:

    candidates, failures = getImageCandidates(sess, skiplist)
    with open("failures.txt", "w") as fh:
        fh.write("\n".join(failures))

This made re-running failures easy (because of the API's unreliability, I ended up doing this twice).

The script goes on to write markdown to a file, embedding any images which appeared on other domains along with a list of the relevant domains:

    def writeCandidates(candidates):
        ''' Write a formatted file detailing matches etc
        '''
        # Iterate through the candidates and structure a table
        rows = []
        for url in candidates:
            row = {
                "image": f"![]({url})",
                "hits": len(candidates[url]),
                "domains": ", ".join(set([x["domain"] for x in candidates[url]]))
            }
            rows.append(row)

        # Generate a markdown document listing the matches
        #
        # We use replace on the output because the format isn't
        # one that Obsidian recognises
        tbl = markdown_table(rows).set_params(
            row_sep="markdown",
            quote=False
        ).get_markdown().replace("+-", "--").replace("-+", "--")

        md = "# Image Matches\n\n" + tbl + "\n"

        # Write it to a file
        with open("results.md", "w") as fh:
            fh.write(md)

A full copy of the script is available on Codeberg.

* * *

#### Building a list of images

In order to run the script, I needed to generate a list of URLs for it to iterate through. The simplest way to do this was to use `find` to locate appropriate files and then prefix scheme and domain, for example:

    PREFIX="https://www.bentasker.co.uk"
    find images -type f | egrep -i '\.(jpg|jpeg|png|gif|bmp|webp)' | while read -r line
    do
        echo "$PREFIX/$line"
    done > image_list.txt

Crucially, iterating through the filesystem included images that crawling the site might not have found: there _may_ have been images in there that haven't been embedded or linked to (or worse, were previously but no longer are). This matters because the server will still serve those images - the only real difference between them and others is discoverability (which is no protection at all).

* * *

### Results

The reverse search API wasn't particularly fast, so it took the better part of a day for the script to chug through all of the images (still likely faster than I could have managed by hand). It flagged 771 images as needing eyeballs.

I opened the report in Obsidian and added a column (`Problem`) so that I could track progress and decisions:

As I scrolled through, it became clear that I'd been quite a bit more exposed to nuisance claims from copyright trolls than I'd realised. This site used to run on Joomla and had JoomShopping installed. Both come with sample image sets which were still in my images directory:

None of these images were _currently_ embedded into pages, but there was no way for me to _definitively_ say whether they ever had been.
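Given that, a quick way to at least confirm _current_ usage is to check the rendered site for references to each suspect file. A rough sketch - the paths are assumptions about where the rendered HTML and the old sample data live, and it only proves present-day use, not whether a page linked to them in the past:

    import pathlib

    SITE_ROOT = pathlib.Path("public_html")                    # rendered site (assumption)
    SUSPECTS = pathlib.Path("public_html/images/sampledata")   # e.g. old Joomla sample data

    # Concatenate every page so that we can do cheap substring checks
    html = "".join(p.read_text(errors="ignore") for p in SITE_ROOT.rglob("*.html"))

    for image in SUSPECTS.rglob("*"):
        if image.is_file():
            rel = str(image.relative_to(SITE_ROOT))
            state = "still referenced" if rel in html else "not referenced by any page"
            print(f"{rel}: {state}")

Of course, that only tells you about the site as it stands today, not what was embedded years ago.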
There are no licensing issues associated with that sample data sitting there but, if a copyright troll _were_ to pop up, would I recognise any of these images as coming from sample data? Even if I did, they're all years old, so tracking proof down in a revision control system might be more than a little challenging.

* * *

#### Result Breakdown

I worked through the list and found that the images almost all fit into the following set of categories:

Category | Description
---|---
**Fair Dealing** | UK copyright law doesn't actually have a broad "fair use" category; these images fall into one of the exceptions. Most fell under either criticism, review and reporting current events, or parody, caricature and pastiche.
**Replace** | An image that I don't own/didn't create and should look to replace and remove.
**Sampledata** | Images that were part of my prior Joomla! install. They may or may not have been used previously but are no longer referenced by any pages.
**No Issue** | An image created or owned by me.

* * *

##### SexiezPics

The web being the web, the results inevitably included a slightly bizarre one. Assuming that they haven't repressed it (sorry!), some readers might recall this screenshot from my experimentation with Bing's AI Chatbot:

It's been scraped and nicked by `sexiezpics.com`:

The site appears to be an index of adult (and particularly, AI generated) images and is, _quite clearly_ automated, even down to the text that they've put under my image:

> This **Messing Around With Bing S AI Chatbot Possibly NSFW Bentasker Co Uk** pictures is one our favorite collection photo / images.

Why did no-one tell me that I'm such a porn icon? /s

The "Related Pictures" are _definitely_ NSFW. Unfortunately, this is where the site stopped being quite so funny: they appear to have scraped an awful lot of non-consensual celebrity AI slop (aka deepfakes). More amusingly, the cheeky bastards also suggested this as related:

Thanks, because that's just what I need under my name in a world where LLMs imbibe random shit and spit it back out. Not wanting my name on the same page as a bunch of AI celebrity fakes, I sent the site a takedown request.

As a side note: the site is also clearly not compliant with the Online Safety Act: Who could _possibly_ have predicted that the more problematic sites would be the ones not to comply?

* * *

#### Speaking of AI

Getting back to the consequences of Copytrack's business model, it's worth thinking about where this might lead in a world that now has genAI available to it.

Copytrack (claim to) exist in order to protect the rights of artists, photographers and other creatives: finding those who improperly use their images and extracting remuneration from them. But Copytrack happily letting anyone claim any image increases the risk associated with using stock images _at all_ (if only because of the increased admin required to ensure that licensing records are retained, potentially for decades).

The most likely outcome is that users will turn away from stock images and start using generative AI instead (something they're already likely to be drawn towards by price). That's really not an improvement for creators. Not only are they still not getting paid, but the money is instead going to AI companies who built their products by crawling the net and ripping off creators _en masse_. To be clear, that's not hyperbole.
Recently, Nick Clegg warned that attempts to enforce copyright could kill AI: > Nick Clegg, former Meta executive, has expressed strong opinions about the demands artists are making regarding copyright laws and AI training. He argues that requiring tech companies to seek permission before using copyrighted material to train AI is simply unworkable and could end up killing the AI industry altogether. Similarly, OpenAI has noted that even the EU's quite limited opt-out is problematic > The European Union, for one, has created “text and data mining exceptions” with broadly applicable “opt-outs” for any rights holder—meaning access to important AI inputs is less predictable and likely to become more difficult as the EU’s regulations take shape. Unpredictable availability of inputs hinders AI innovation, particularly for smaller, newer entrants with limited budgets. > > The UK government is currently considering changes to its copyright regime. It has indicated that it prefers creating a data mining exception that allows rights holders to “reserve their rights,” creating the same regulatory barriers to AI development that we see in the EU What they seek, instead, is full application of the fair use doctrine: carte blanche to grab everything that society has to give. There _are_ creators out there who have blogged about using services like Copytrack. Although understandable, the problem (IMO) is that it helps to lend legitimacy to an organisation who don't appear to be particularly good at filtering illegitimate claims out. The end result of that, unfortunately, may be that more people are dissuaded from buying & licensing stock photos, instead spending their money with companies who promise (rightly or wrongly) to be able to generate images without any copyright concerns. * * * #### Conclusion It took a little bit of hunting around to locate a suitable service, but once I'd found one, making use of it to build an image auditing script was very little work. As a result, I'd audited image use across all of my domains within the space of a couple of days. The effort didn't turn up any definitive liabilities, though it did identify some avenues by which an illegitimate copyright troll could potentially make a bit of nuisance of themselves. It also identified some (unexpected and odd) misuse of some of my images by an adult image indexer.

New #Blog: Automating an audit of my image use
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/software-development/automatically-reverse-searching-images-to-list-possible-concerns.html?utm_source=rss

#audits #copyright #copytrack #genai #photos #search

26.05.2025 19:30 — 👍 0    🔁 0    💬 0    📌 0
Preview
Examining A Copyright Claim From Copytrack This morning, I awoke to find an email from Copytrack in my inbox. If that name sounds _vaguely_ familiar, it's because they made a bit of a name for themselves as a copyright troll in the early 2020's (including by trying to enforce "rights" on a photo of a postcard from the 1900s and threatening a White house reporter with legal action over his own photos). Although their name has come up less frequently since then, Copytrack didn't stop operating and were, apparently, investigated by the Italian Competition and Market Authority last year. The way that Copytrack works is: * Someone signs up for their services (it's even free!) * That person uploads images that they want Copytrack to "enforce" rights on * Copytrack performs reverse image searches to identify usage * The user marks usage as "legal" or "illegal" * Copytrack contacts those who've used it, seeking proof of license or settlement * Copytrack take a 45% commission You'll notice that there seems to be quite an important step missing: Copytrack do not appear to independently check whether the user's claim is valid (or whether the use might be permissible), outsourcing that effort and stress onto the poor schmucks at the other end. More than a few of those claiming to have been affected over the years had _done the right thing_ by paying for images on stock photos site, only to be targeted years later. Anecdotally, it sounds like they're prone to targeting smaller operations and individuals (i.e. those less likely to have the resources and appetite for a fight). Copytrack's email was quite strongly worded, but looking into it quickly raised concerns about the veracity of their claims. In fact, the more that I looked into Copytrack the shoddier things seemed to be. So, I thought "fuck 'em" and decided to play around to see just how robust their offering actually is. In this post, I'll start by discussing the issues with their claim and then I'll move onto talking about what creating a claim on their system as a purported copyright holder looks like. * * * ### Their Claim The claim email is quite long, so I'll just quote the relevant parts (I've also added line breaks as some of their paragraphs are fairly dense) > We, COPYTRACK, are writing to you on behalf of our client CONCEPT-PRODUCTION, who has assigned us the monitoring and protection of their licenses and image rights. > > On April 25, 2025 we have been informed that recipebook bentasker is likely using an image **without permission** and the client has exclusively commissioned us with the clarification, administration of the image rights **for the territory of the Federal Republic of Germany** and, if necessary, the enforcement of any copyright infringement through our partner lawyers. > > Images are protected by copyright law almost worldwide and infringements are actionable under the respective national law. > > Please see the attachment below for details. It's worth noting here that the email had no attachment - if you've read some of the historic stories about them, you'll have seen that they used to attach a PDF detailing their claim. Presumably, the final sentence is a sign that they've not updated their template since then. They insist that the email's recipient is obliged to do the legwork for them: > Please note that you are obliged to provide the necessary information about the usage of the image to allow our client and us to verify the lawfulness of such usage. 
This, frankly, is bollocks: You are not actually _obliged_ to do anything, however if you don't they're going to keep nagging and _might_ even instruct solicitors (sounds quite a lot like TV Licensing under Crapita). The strong wording seems to be designed to help push people into panicking and blindly complying. They promise that, if proof is provided, they'll close the case off. > Show us proof of your license by uploading it or providing any other legal justification to use these images and if it is valid, we will close your case immediately; They note that, if the usage _isn't_ authorised, there are a couple of ways in which you can settle the complaint: Payments are made by logging into `portal.copytrack.com`. They go on to explain that the amounts requested are not random and instead that > We calculate these fees based on our client's license history, as well as the duration of use and type of rights infringement. This does seem to _mostly_ be true (when submitting a claim, Rightsholders are asked how much they want and then what percentage of that they consider the minimum). To ensure a sense of urgency, a relatively tight deadline is set: > To avoid further action, including legal action, provide proof of a valid license or any other relevant information by **May 19, 2025** , or acquire it by making payment in our online portal. Strong wording, monetary demands and deadlines... no wonder so many people think "scam". * * * #### The Rights Are Held By Who Exactly? The opening paragraph of their email asserts that the rights to the image in question are held by `CONCEPT-PRODUCTION`. A quick bit of searching for them, though, doesn't yield any obvious results (though there was a post warning about both Copytrack and CONCEPT-PRODUCTION). Reverse image searching the image with CONCEPT-PRODUCTION doesn't yield any results at all. I appear to have had the image in question since about 2019, so it wasn't _immediately_ obvious to me where I might have acquired it from. A reverse image search brought me to Dreamstime: The author's name is in the bottom right of the image and is **not** `CONCEPT-PRODUCTION`. I had a little bit of a search around, and although I found the author's profile on a number of (more modern) sites, I couldn't find anything to link that author back to the name of the org that Copytrack claimed to be acting on behalf of. Out of curiosity, I logged into Copytrack's portal and found that it provided some information which hadn't been included in the original email: It's not clear what "Assignment Received On" indicates: at first I assumed that it was when the image was uploaded to Copytrack, but that date falls a couple of weeks after Copytrack claim to have received a complaint. The line that I've highlighted is more interesting though, as it provides the filename of the image that the "rightsholder" uploaded (`Fotolia_7042846.jpg`). This is useful because we can see that it begins with the name of a stock-image site (though Fotolia no longer really exists, having been borged by Adobe). The number at the end of the filename _appears_ to be an ID. If you go to `https://en.fotolia.com/Info/Images` you'll be redirected to Adobe's stock image site (`https://stock.adobe.com/images/`). 
By choosing an image and looking at the URL, we can see that the site constructs URLs using the following format https://stock.adobe.com/images/<meaningless slug>/<id> This means it's possible to construct a URL which led to the original listing: https://stock.adobe.com/images/are-copytrack-taking-the-piss/7042846: So even on (what was) Fotolia, the listed author was not the one that Copytrack claimed to be acting on behalf of. I sent them an email which laid this out and explained that, given the inconsistencies in their claim, I didn't feel comfortable providing them with too much information > Given that your claim appears to be erroneous and, on closer inspection, seems to meet the pattern of behaviour associated with copyright trolling, I do not feel particularly comfortable providing further information. I explained that, if they wished to proceed further, they would need to provide proof that their client actually holds the rights > Given the apparent issues with your claim, I do not feel it would be wise to release any additional details to you until and unless you are able to provide adequate proof that you are in fact representing the true copyright holder and that this is not, instead, the result of a poorly targeted dragnet operation. I _did_ however send them a very tightly cropped screenshot of a license to use the item in question - no point in having them hang around. * * * ### Their Setup #### Beware of Scams With my response sent, there wasn't much for me to do but grumpily toot about it. Unfortunately, in the process of doing so, I nerd-sniped myself. At time of writing, when you visit `www.copytrack.com` you receive a big warning that they are aware of an active scam using a typo-squat domain: Note the addition of a hyphen in `proof@copy-track.com`. You see, it seems that these self professed experts in online rights protection failed to recognise the threat posed by typo-squatters and so did not pre-emptively acquire `copy-track.com`. The result of this oversight is that there are now scammers apeing Copytrack's own scam-esque behaviour. Wunderbar... * * * #### Dodgy Website Links You would, of course, hope that recipients could still tell the difference between the "legitimate" provider and others through the sheer quality of their site. Unfortunately, Copytrack do not seem to have invested very much effort into proof-reading their own site (under Muphrys law, I've just guaranteed myself a typo), to the extent that their **own FAQs** manage to include a link pointing not to a domain, but to an IP (and using plaintext HTTP at that): Worse, the link _even_ works, switching the user to an insecure connection. 
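That lookup is trivial to script if you find yourself doing it more than once. A small sketch of the approach described above - the slug portion of the URL appears to be ignored by the site, so anything will do there:

    import re

    def fotolia_to_adobe(filename):
        ''' Extract the numeric ID from a filename like Fotolia_7042846.jpg
            and build an Adobe Stock URL in the <slug>/<id> format noted above
        '''
        match = re.search(r"(\d+)", filename)
        if not match:
            return None
        return f"https://stock.adobe.com/images/lookup/{match.group(1)}"

    print(fotolia_to_adobe("Fotolia_7042846.jpg"))
    # https://stock.adobe.com/images/lookup/7042846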
What's particularly curious about this is that requests to the root **are** redirected back to HTTPS: GET / HTTP/1.1 Host: 3.72.104.87 User-Agent: curl/7.81.0 Accept: */* HTTP/1.1 301 Moved Permanently Date: Sat, 10 May 2025 17:49:26 GMT Server: Apache X-Redirect-By: WordPress Location: https://www.copytrack.com/ Content-Length: 0 Content-Type: text/html; charset=UTF-8 Whereas requests to non-existent paths redirect to a domain that no longer exists: GET /foo HTTP/1.1 Host: 3.72.104.87 User-Agent: curl/7.81.0 Accept: */* HTTP/1.1 301 Moved Permanently Date: Sat, 10 May 2025 17:50:32 GMT Server: Apache Expires: Wed, 11 Jan 1984 05:00:00 GMT Cache-Control: no-cache, must-revalidate, max-age=0 X-Redirect-By: WordPress Location: https://copytrack-gxzmht2vp9.live-website.com/de Content-Length: 0 Content-Type: text/html; charset=UTF-8 Paths that exist are served up: GET /about-us HTTP/1.1 Host: 3.72.104.87 User-Agent: curl/7.81.0 Accept: */* HTTP/1.1 200 OK Date: Sat, 10 May 2025 17:54:23 GMT Server: Apache Link: <https://www.copytrack.com/wp-json/>; rel="https://api.w.org/", <https://www.copytrack.com/wp-json/wp/v2/pages/10592>; rel="alternate"; title="JSON"; type="application/json", <https://www.copytrack.com/?p=10592>; rel=shortlink Vary: Accept-Encoding Transfer-Encoding: chunked Content-Type: text/html; charset=UTF-8 All in all, this has the smell of an ageing Wordpress install that's not been competently maintained when moving the site between domains. Even their privacy notice contains (mis-written) references to this IP > If you have any questions, please contact us using the form provided on the website (http://http://3.72.104.87/de/kontakt/). Much like their email template, their privacy notice doesn't appear to have received much love over the years, with a note implying that it might not have been reviewed since 2019. To put this into context, email recipients are being asked to part with money based solely on the word of a company who's website is being actively (and predictably) typo-squatted, is poorly maintained and has the sort of typos and mistakes that even modern phishing sites don't feature. * * * #### Signing Up For The Service Each of the site's pages contain a call to action: Given that Copytrack appeared to be trying to assert rights on behalf of an org who _didn't appear_ to own those rights, I decided I'd sign up and see just how careful Copytrack actually are. So, I signed up with (fairly obviously) fake details Arthur Chex, geddit? Once signed up, you're asked to detail what type of user you are: I chose "Photo Enthuisiast" Copytrack asks you to confirm that you are the author and/or that you have exclusive rights: I clicked `Skip` to see if it would ask me again later. I started by uploading a copy of a distinctive images that is very definitely mine: Disappointingly, the platform doesn't immediately start generating matches. However, it turns out that it's also possible to manually submit violations: Upon submitting, the report is validated in the background (Copytrack's website suggests that they use AI). If a report isn't considered valid (for example, the reported image is different or doesn't load), nothing happens (not so much as a notification). If it _does_ validate, the violation shows up in the dashboard and even suggests how much you can extor^H^H^H^H^H claim from the victim To see what the process looks like, I hit `Submit Claim`. It begins by making you provide address and bank details, so that they can pay out your gains. 
Next, you're presented with a checklist requiring that you declare that you hold the rights: The first two questions don't seem to matter, it's only the question on whether you're entitled to claim compensation which is able to block the process (being commission based, Copytrack aren't interested in handling claims where there's no chance of a payout). So, Copytrack **do** _technically_ ask whether you have the rights to the image. All the same, it's not a particularly robust check and feels _very much_ like this: I didn't hit submit on the claim, because it would bring CopyTrack's enforcement terms and conditions into effect and I didn't really fancy exposing my image to some of those terms. What seems clear, though, is that there is very little to stop someone from creating a Copytrack account, uploading someone elses images and using Copytrack to launch a speculative campaign - pocketing whatever is paid by hurried and panicked webmasters. * * * ##### Uploading Other's Work In fairness, it _might_ be quite difficult for Copytrack to assess whether a customer does or does not own the rights to an image (although, IMO, that doesn't absolve them of the responsibility to _try_). What they _should_ be able to do, is to ensure that any image in their index only has one "rightsholder" active at any one time. So, I wanted to see whether Copytrack _at least_ prevent multiple accounts from claiming the same image. Knowing there was a "hit" on the image that they had emailed me about, I grabbed a fresh copy from the Adobe Stock photos and tried uploading that. The platform didn't even bat an eyelid, so I decided to try uploading something a bit more recognisable Despite talking a lot about their image index, Copytrack's platform doesn't appear to be set up to be able to handle even the most basic of abuse. In fact, it turns out that Copytrack don't _even_ prevent matching against their own site and assets I **did** hit Submit claim on that one. * * * #### Information leakage In the images above, you may have noticed that there are clickable links to the "infringing" content. However, Copytrack's interface doesn't make use of `Referrer-policy`, which means that there's potential for a bit of information leakage. The user's browser will send a `referer` header, allowing us to identify requests which occurred as a result of activity in the Copytrack portal. I headed to my access logs and searched for `app.copytrack.com` (the domain of their user portal) # grep app.copytrack.com access.log.2025042* access.log.20250428-0600:156.59.126.78 - - [28/Apr/2025:05:56:19 +0000] "GET /posts/dessert/chocolate-orange-sauce.html HTTP/1.1" 200 5676 "https://app.copytrack.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36" "136.158.x.x" The IP that we care about is in the X-Forwarded-For header at the end (I've redacted the last two octets) A quick `whois` shows that this is in a block operated by an ISP in the Philipines. Given the timing, it seems likely that this was someone reviewing the page on behalf of Copytrack rather than the original customer. However, it's **also** possible to identify when a user is adding a "manual violation", because the portal attempts to hotlink the "infringing" image This, of course results in entries in access logs. But, because it uses `onkeydown` events, it attempts to hotlink on every keystroke, leading to requests for invalid paths [10/May/2025:16:31:09 +0000] "GET /BENTEST. 
HTTP/1.1" 404 958 "https://app.copytrack.com/" "Mozilla/5.0,(X11; Ubuntu; Linux x86_64; rv:138.0),Gecko/20100101,Firefox/138.0" "<MY IP>" [10/May/2025:16:31:10 +0000] "GET /BENTEST.j HTTP/1.1" 404 958 "https://app.copytrack.com/" "Mozilla/5.0,(X11; Ubuntu; Linux x86_64; rv:138.0),Gecko/20100101,Firefox/138.0" "<MY IP>" [10/May/2025:16:31:10 +0000] "GET /BENTEST.jp HTTP/1.1" 404 958 "https://app.copytrack.com/" "Mozilla/5.0,(X11; Ubuntu; Linux x86_64; rv:138.0),Gecko/20100101,Firefox/138.0" "<MY IP>" [10/May/2025:16:31:10 +0000] "GET /BENTEST.jpg HTTP/1.1" 404 958 "https://app.copytrack.com/" "Mozilla/5.0,(X11; Ubuntu; Linux x86_64; rv:138.0),Gecko/20100101,Firefox/138.0" "<MY IP>" The result is that Copytrack's user portal is (unnecessarily) leaking information about it's customers to those that it may then go on to accuse of copyright infringement. * * * #### Image Formats It seems worth saying: Copyright exists (or is supposed to exist) for the public good. So there is an argument that, run responsibly, services like Copytrack could _perhaps_ be beneficial and defend creative rights. The problem, though, is that Copytrack's offering isn't actually all that good anyway. For all their talk of advanced scanning (and using AI because, of course) they only support a handful of formats: Any _theoretical_ benefit brought by Copytrack's offering immediately disappears if an infringer uses any other format. Their system also needs Support intervention to be able to handle scenarios where the embedding page and the image do not use the same domain (so `www.example.com` embedding `cdn.example.com/foo.jpg` wouldn't be caught): > * The domain of the page URL and the domain of the image URL are the same. If this is not the case, you will get an error message and you cannot proceed. You can contact us and send us both the link to the page URL and the image URL. We will check them and connect them within our system to make the manual submission possible. > With the platform having such limited capabilities, it's _probably_ no surprise that most reports of interactions with Copytrack seem to err towards copyright trolling. * * * ### If you receive an email If you've received an email from Copytrack it's probably not wise to ignore it: Copytrack aren't lawyers, but reports online suggest that they do sometimes instruct legal teams (and they certainly post about "wins"). At the same time, though, don't panic and rush to give them the money that they demand. To deal with them, reply to their original email (don't use the portal, it doesn't send you a copy and only allows for limited responses). You should * Insist that they provide proof that their client owns the copyright in question (as we've seen above, they won't have checked in any meaningful way) * Work out where you acquired the image - is the copyright holder the one Copytrack have listed? * Grab proof that you licensed the image (if you can) The odds seem to be in favour of this causing them to quickly close the case - the model relies quite heavily on scaring and bullying people into paying over the odds. If for some reason they don't back down, at the very least, it seems that they'll often offer a reduction from the original demand. Going forward though, make sure you keep meticulous records (along with proof) of where the images that you use are licensed from. 
Personally, though, I've started the process of replacing and removing stock images: their widespread use makes them an attractive target for copyright trolls and I don't fancy spending future money on something that leaves me open to this kind of hassle.

* * *

### Conclusion

There are quite a few posts on the web about Copytrack and their lacklustre record. Some of those posts refer to Copytrack as a scam, but I don't think that's entirely the right label, because they seem to be something _worse_.

Copyright claims come via Copytrack, but they're not generally launched _by_ them. Instead, Copytrack provide a platform which, through lax checks, enables _others_ to easily conduct copyright trolling campaigns. Unless a recipient challenges a claim, there doesn't seem to be _anything_ to stop someone from creating a Copytrack account and laying claim to a wide assortment of other people's images.

In other words: Copytrack's platform facilitates **much** more widespread abuse than that achievable by a simple copyright troll. If copyright trolls are mercenaries, then Copytrack are the arms dealers.

This undesirable reputation isn't really helped by the fact that their website doesn't instil much confidence - their claim emails lean heavily on the side of "trust us because we said so", but they can't even review their Privacy Policy properly.

In comments to the media, Copytrack talk about their social good - they're fighting for the rights of photographers, dontchaknow - but the limitations of their platform do seem to suggest that they're unlikely to be quite as effective as they claim.

New #Blog: Examining A Copyright Claim From Copytrack
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/general/copytrack-sent-me-a-copyright-claim.html?utm_source=rss

#analysis #copyright #copytrack #general

10.05.2025 21:30 — 👍 0    🔁 0    💬 0    📌 0
Preview
Doggy Boot Build The Mini is now gone: It failed one time too many, so we sent it to the great scrapyard in the sky (via, uhhh, the smaller and _very definitely_ land based, local scrapyard). The replacement is a Vauxhall Zafira, chosen, in no small part, because the size of its boot works well for a family with 2 dogs (It's an MPV, so there's room for a sixth and seventh seat in there). However, the fold down rear seats meant that the boot couldn't really be used by the dogs without some changes - there are gaps that a paw could easily slip into and cause injury. We've no actual need for the two additional seats, so I decided to board the boot out so that it's primarily a dog transportation area. Note: this project also reminded me that I'm getting old - the last time I thought about boarding out a boot was to put a sound system in there. * * * #### The Plan The boot looks like this: It's not _brilliantly_ visible in the photo, but there's a gap between the top of the laid down seats and those in front (there's also a gap under the orange line that I've unhelpfully drawn over the middle). The plan was to use some plywood to lay a new floor over the top, covering any gaps (including the one at the back). Originally, I intended to cover the plywood in boot carpet, but I quickly realised that that would be a mistake: if one of the dogs were to have a wee, we'd spend the rest of our lives trying to get the smell back out. So, instead, I decided to stick some rubber flooring to the wood, protecting it against liquids and making it _largely_ wipe clean. I also wanted to make sure that the build wouldn't rub at the plastic surroundings, damaging them and squeaking in the process, so I planned to cut the plywood a little smaller and run foam pipe lagging around the edge to provide a buffer. * * * #### The Build After a bit of umming and ahhing, I went for 5mm plywood: there was a trade-off to be made between rigidity (to cover that gap at the back) and not having too severe a lip along the front edge of the boot. I cut the plywood down to a 91 x 104cm sheet: Using spray adhesive, I stuck a strip of rubber flooring to the top of the wood The rubber flooring was 50cm wide, so the second run overlapped the edge a bit. I realised that having an overhang at the front _could_ prove useful, so I didn't trim it all the way back to the wood. To help ensure the rubber wouldn't lift in the middle, I ran a bit of clear sealant where the two sheets met. I also cut down some of the remainder and stuck it underneath the plywood, creating little grippy feet to help ensure the plywood didn't slide around on top of the boot's existing carpet. Then, I trimmed and ran 9mm foam pipe lagging around the sides and back The lagging also serves a second purpose: it helps to ensure that there'll never be a gap which might lead to pinch injuries or broken nails. Not to mention, of course, that it also helped to hide my somewhat messy trimming of the rubber flooring. * * * #### Fitting Hoping that I hadn't screwed up any measurements, I took the floor to the car to install it. It fit tightly inside (note: I hadn't pushed it all the way down when I took this photo): What I hadn't accounted for (because I hadn't noticed) is that the first ~5cm of the boot floor is slightly sloped, so even after the floor was pushed down there was a slight gap. This was helped, though, by the fact that I'd left the rubber a little longer. 
The length of the rubber overhang also means that it traps, just slightly, under the boot lid when it's closed - so any liquid "spillage" should never touch the carpeted section of the boot. To check that the board safely covered the gap at the back, I put weight on it - it remained solid: the flooring (and glue) had given it some additional rigidity **and** I'd extended the headrests to provide additional support. The next step, then, was to install a mesh boot guard so that the dogs couldn't jump over the top of the seats: Guards are easily available online. MPVs have gone out of fashion and are a bit of a weird shape, so I couldn't find one specifically intended for something like the Zafira, but, one intended for a SUV fit just fine. * * * #### Hiding The Back Seats We were almost ready to go, but there was a problem. The rear seats do not closely touch and there are visible gaps between them: Now, not to put too fine a point on it, but, dogs can be utter idiots - they're basically oversized and particularly toothy toddlers. If there's a gap visible, you can be fairly sure that, at some point, a dog **is** going to try and get through (or _at least_ push some body part into it). They're obviously _not_ going to fit, but the process of trying will involve damage to the seats along with possible injury. I initially toyed with the idea of putting a sheet of plywood across the back, but decided that it'd be quite hard to keep that from squeaking or causing a pinch hazard where it met the floor. Instead, I settled on a **much** simpler solution: a boot liner. The liner is a piece of soft material with straps which go around the bottom of the seat's headrests to hold it up: The manufacturer _claims_ that the material is dog proof (though nothing ever actually is), but the liner's main benefit is that it moves those gaps out of sight (and therefore out of mind). Although they're folded under in this photo, the liner also has flaps to go up the sides as well. I folded them under because I found that they didn't work particularly well with the shape of the Zafira's boot sides (there's a recess to allow room for arms when the now-hidden seats are in use). The rubber floor underneath grips the liner well and ensures that it doesn't slip about. Finally, I used some duct tape to hold the rear seatbelt tongues up and out of the way. I would have liked to tape the entire belt back, but Vauxhall seem to have found and built the boot out of the one material on earth that black nasty won't stick to. The tension on the belts does keep them quite firmly against the side walls, but we'll keep an eye on them to make sure they don't become a tangle hazard. Whilst we _could_ cut them out, it'd destroy any resale value on the car (who's going to buy an MPV where the additional seats can't be used?), so the worst case is probably that I have to build sides to box them in. * * * #### Conclusion Whilst I'm not _exactly_ delighted that we've got another Vauxhall, it's size means that it **is** pretty well suited to our current needs (and I definitely **am** delighted to be shot of the Mini). Boarding the boot out wasn't a particularly massive project and means that we now don't have to worry about paws slipping down any gaps in the floor. The rubber coating means that we don't need to worry (too much) about liquid spillages and if the need ever arises we can, with a little bit of effort, lift the new floor back out (that sloped bit gives access to the towing loop and the bolt which holds the spare wheel on). 
With temperatures starting to rise in the UK, we'll probably also look at putting some curtains over the rear windows to ensure that the floor doesn't warm up too much if the car ends up sat in the sun.

New #Blog: Doggy Boot Build
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/converting-a-zafiras-boot-to-a-dog-transportation-area.html?utm_source=rss

#cars #dogs #housestuff #vehicles

18.04.2025 14:00 — 👍 0    🔁 0    💬 0    📌 0
Preview
Running Imaginary on Kubernetes for Fast Nextcloud Thumbnailing

We have quite a few photos in Nextcloud and, after a big upload, thumbnail generation can be quite demanding. About six years ago, I hacked together a solution to distribute the load and improve response times. However, that custom approach has not actually been necessary for some time: in version 24, Nextcloud introduced support for using Imaginary for preview generation.

Imaginary is a microservice which uses bimg to perform a range of image processing operations on request. I missed support being added to Nextcloud, but have now finally got around to setting it up.

This post describes deploying Imaginary into Kubernetes and then configuring Nextcloud to use it for thumbnailing.

* * *

#### Kubernetes Deployment

So that I could load balance across nodes, I decided to start with 2 pods.

First I defined a namespace:

    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: imaginary

Then I created a `Deployment`. The resources are defined based on the information in the Imaginary Readme. Imaginary is stateless, so there's no need for any volumes. Nextcloud maintain their own tuned images, so we use one of those:

    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: imaginary
      namespace: imaginary
    spec:
      selector:
        matchLabels:
          app: imaginary
      replicas: 2
      template:
        metadata:
          labels:
            app: imaginary
        spec:
          containers:
            - name: imaginary
              args:
                - "-enable-url-source"
                - "-return-size"
              image: nextcloud/aio-imaginary:20250306_093458
              env:
                - name: PORT
                  value: "9000"
              ports:
                - containerPort: 9000
                  protocol: TCP
                  name: imaginary
              resources:
                requests:
                  memory: 512Mi
                limits:
                  memory: 2Gi
          restartPolicy: Always

The deployment is then fronted with a service:

    apiVersion: v1
    kind: Service
    metadata:
      name: imaginary
      namespace: imaginary
    spec:
      ports:
        - port: 9000
          protocol: TCP
          targetPort: imaginary
      selector:
        app: imaginary
      sessionAffinity: None
      type: LoadBalancer

And an ingress (I use the Nginx Ingress Controller), pointing at the service's port:

    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: imaginary
      namespace: imaginary
    spec:
      ingressClassName: nginx
      rules:
        - host: imaginary.svc.lan
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: imaginary
                    port:
                      number: 9000

I've got a wildcard record in my dnsmasq config which sends all `.svc.lan` domains to the ingress IP:

    address=/.svc.lan/10.111.191.35

I applied the config:

    kubectl apply -f imaginary.yml

And then, from another machine, triggered a thumbnailing request:

    curl \
        -v \
        -o/dev/null \
        "http://imaginary.svc.lan/crop?width=500&height=400&url=https://www.bentasker.co.uk/images/BlogItems/battery_died/battery_pack.jpg"

The request was successful:

    HTTP/1.1 200 OK
    Date: Thu, 13 Mar 2025 16:14:40 GMT
    Content-Type: image/jpeg
    Content-Length: 39544
    Connection: keep-alive
    Image-Height: 400
    Image-Width: 500

* * *

#### Configuring Nextcloud

With Imaginary up and running, the next step was to tell Nextcloud to use it.
This requires a simple change to `config.php`, adding the following to the config array:

    'enabledPreviewProviders' => [
        'OC\Preview\PNG',
        'OC\Preview\JPEG',
        'OC\Preview\GIF',
        'OC\Preview\BMP',
        'OC\Preview\XBitmap',
        'OC\Preview\MP3',
        'OC\Preview\TXT',
        'OC\Preview\MarkDown',
        'OC\Preview\OpenDocument',
        'OC\Preview\Krita',
        'OC\Preview\Imaginary',
        'OC\Preview\ImaginaryPDF',
    ],
    'preview_imaginary_url' => 'http://imaginary.svc.lan',

I restarted Nextcloud:

    docker restart nextcloud

* * *

#### Testing

To verify that Nextcloud was actually using Imaginary, I started to tail the pod logs:

    kubectl -n imaginary logs -f deployment/imaginary

Then I downloaded some images from the web and saved them into my laptop's nextcloud directory. When I went to Nextcloud's web interface, it started generating thumbnails for the new images. The Imaginary log tail started showing entries like this:

    192.168.5.22 - - [22/Mar/2025 11:16:43] "POST /pipeline?operations=%5B%7B%22operation%22%3A%22autorotate%22%7D%2C%7B%22operation%22%3A%22fit%22%2C%22params%22%3A%7B%22width%22%3A1080%2C%22height%22%3A1920%2C%22stripmeta%22%3A%22true%22%2C%22type%22%3A%22jpeg%22%2C%22norotation%22%3A%22true%22%2C%22quality%22%3A%2280%22%7D%7D%5D&key= HTTP/1.1" 200 216562 0.1448
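Those `/pipeline` entries are a little opaque because the requested operations are URL-encoded JSON. A small sketch for decoding them when you want to see exactly what Nextcloud asked for - the string below is the one from the log line above:

    import json
    from urllib.parse import unquote

    # The operations= value from the log entry above
    encoded = ("%5B%7B%22operation%22%3A%22autorotate%22%7D%2C%7B%22operation%22%3A%22fit%22%2C"
               "%22params%22%3A%7B%22width%22%3A1080%2C%22height%22%3A1920%2C%22stripmeta%22%3A"
               "%22true%22%2C%22type%22%3A%22jpeg%22%2C%22norotation%22%3A%22true%22%2C"
               "%22quality%22%3A%2280%22%7D%7D%5D")

    for op in json.loads(unquote(encoded)):
        print(op["operation"], op.get("params", {}))
    # autorotate {}
    # fit {'width': 1080, 'height': 1920, 'stripmeta': 'true', ...}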

New #Documentation: Running Imaginary on Kubernetes for Fast Nextcloud Thumbnailing
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/kubernetes/deploying-imaginary-onto-kubernetes-for-thumbnailing.html?utm_source=rss

#kubernetes #nextcloud

22.03.2025 12:00 — 👍 2    🔁 0    💬 0    📌 0
Preview
Isolating our Guest Wireless Network in OPNSense I've changed jobs and, whilst setting my new work laptop up, decided that I'd create a completely isolated LAN segment for it. I once caught a (different) work laptop probing the rest of the LAN, which was decidedly unwelcome. Although that's a fairly extreme example, isolation means that I don't need to worry about EDR getting nosey or a misbehaving/compromised LAN client somehow impacting work. My OPNsense firewall had a NIC spare and so I ran a cable from that directly to the laptop. However, getting that isolated segment set up nudged a memory of something else that was on my TODO list: (re)isolating our guest wifi network. When I first created our guest wifi, we were using a Netgear R7800 running OpenWRT, with the result that creating an isolated wifi network was really straightforward. However, to improve coverage, we eventually moved to a mesh network, provided by a set of TP-Link Deco access points. This post describes the process of creating an isolated guest wifi network by attaching an access point (the trusty r7800) to a VLAN and giving it it's own isolated (IPv4 and IPv6) subnets. * * * #### Why not use the Decos? The Decos explicitly support creation of a guest wifi network, with TP-Link's advertising even calling the functionality out: Being able to use these would have been ideal: * It would avoid adding extra equipment * It would ensure that the guest network coverage was as good as our main networks However, there was a catch. The Deco's guest wifi functionality only works in any **meaningful** sense when the Decos are operating in _router_ mode (i.e. where one of them does DHCP, DNS and NAT). In Access Point mode, they'll still expose the guest network but connected guests have full access to the rest of the LAN (i.e. it's basically just a second wifi SSID). Although TP-Link's documentation notes that the guest network has it's own VLAN ID (591), it turns out that, they don't do VLAN tagging in AP mode. This means that the traffic dumped onto the LAN is indistinguishable from that of the main wifi (side note: TP-Link's response in that thread reminds me a little of the one I got after reporting a vulnerability to them). Switching the Decos over to router mode wouldn't _really_ address this either, at least, unless I was willing to replace my OPNSense firewall with a Deco and tolerate a significant drop in both flexibility and functionality. * * * #### The Plan As the Decos weren't a viable solution, I needed to do _something else_. I decided that the easiest solution was probably to add an access point to the isolated segment that I'd created for my work machine. However, I didn't want to sacrifice the benefits of isolation: simply plugging an AP in would mean that visitors (dirty dirty) phones, tablets and laptops could reach my work laptop (leaving it worse off than if it were connected to the LAN where I _at least_ have control of the systems). Isolation between Wifi clients is _easy_ , OpenWRT supports it out of the box, but isolating them from wired connections is quite a bit more difficult. I _could_ have connected the work laptop to the wifi, but it seemed silly to do so when it was within arms length of a switch. 
After fiddling around with OpenWRT a bit to gauge options, I settled on a rough plan: * Create 2 VLANs on my switch * Work machine connects to one * Wifi box connects to the other * Create a VLAN-based interface on OPNSense * Assign the guest wifi its own IPv4 and IPv6 subnets Then, as a final step, lock it all down with firewall rules. * * * #### Routing IPv6 My ISP (Andrews & Arnold) allocated us a `/48` (in line with RFC 3177) but only routes a single `/64` by default. So, the first thing to do was to log into the control page, browse into our line and click the button to have a `/60` be routed: There are 16 `/64`s within a `/60`, so assuming that I was allocated `2001:db8:0:10::/60` I'd have the following available to me: 2001:db8:0:10::/64 2001:db8:0:11::/64 2001:db8:0:12::/64 2001:db8:0:13::/64 2001:db8:0:14::/64 2001:db8:0:15::/64 2001:db8:0:16::/64 2001:db8:0:17::/64 2001:db8:0:18::/64 2001:db8:0:19::/64 2001:db8:0:1a::/64 2001:db8:0:1b::/64 2001:db8:0:1c::/64 2001:db8:0:1d::/64 2001:db8:0:1e::/64 2001:db8:0:1f::/64 This is **more** than enough. Alternatively, I could have clicked the `Add /64` button to get an extra subnet, but routing the `/60` means that I can easily bring new subnets online if I want to experiment with something. * * * ##### Configuring OPNSense Configuring OPNSense to handle IPv6 on AAISP is already very well documented so I'll just skim over the highlights (though there are examples later in this post): * Set the WAN interface to use DHCP6 * Set the LAN interface to use `Static IPv6` * Configure a static address within one of the `/64`s on that interface (and set the bitmask to `/64`) * Enable the DHCPv6 service on the LAN interface and set a range to allocate IPs from (obviously within the `/64` used on the interface) * Enable the Router Advertisement service on the interface (I use assisted mode so that I can push DNS settings etc) There is, however, an important step which I originally missed: **reboot the firewall**. In fairness, it _is_ noted on the AAISP control page, but I missed it: changes to IPv6 subnet routing won't take effect until the router reconnects. If you don't force that reconnect, your LAN devices will correctly acquire IPv6 addresses but won't be able to reach beyond the firewall (looking quite a lot like you've messed up a rule somewhere). To add IPv6 to my Isolated segment, I went through the same process (no reboot needed this time though, because I didn't need to change routing at AAISP's end) and assigned a different `/64` to the isolated segment. * * * ### VLANs The plan was to use three switch ports and two VLANs: * VLAN 403: Isolated wired * VLAN 404: Guest Wifi The IDs used are _entirely_ arbitrary (I chose 403 as an internal joke: I plan to deny everything else access); all that matters is that they're consistent between device configurations. The connection back to the firewall needed to be part of both of these VLANs giving us the following layout: Port | Connected To | VLAN 403 | VLAN 404 ---|---|---|--- 1 | OPNSense | Untagged | Tagged 2 | Isolated Wired | Untagged | No 3 | Guest Wifi | No | Untagged For those who've managed to avoid having to deal with VLANs before, a quick explanation: * `Untagged`: Any traffic without a VLAN tag will be assumed to be part of this VLAN (so traffic on port 2 will be considered part of 403 etc). A port can only be Untagged in a single VLAN * `Tagged`: Traffic will be considered part of this VLAN if it carries a tag declaring it as such.
A port can be set to Tagged in multiple VLANs. The firewall port is in both VLANs but requires that traffic destined for VLAN 404 be tagged as such (any traffic that it emits from 404 will also be tagged accordingly). In packet captures, the tagging looks like this: Note: I _could_ have set `403` to be tagged too. In fact, doing so would have allowed me to use the existing switch <-> firewall LAN uplink, but I wanted to keep them on separate physical interfaces (mainly so that one couldn't bottle-neck the other). * * * #### Config: Netgear Switch My switch is quite long in the tooth now (it's a Netgear GS724T) but configuring VLANs on it is quick and easy. I added VLAN records for each of 403 and 404 and then configured the ports in line with the table above (though the port numbers are different, I went for three adjacent slots across the top row): Simple. Technically, I _could_ also have done this on the R7800 instead because OpenWRT supports VLANs (at least, so long as the underlying hardware also supports them). But, it would have meant that my work network had to transit the hardware which runs the guest wifi - whilst it's unlikely that some visitor's device would compromise the R7800, for the sake of a short cable run it made more sense to avoid it (plus, it keeps all of my VLAN configs in a consistent location rather than spreading them across devices). For the record though, configuring the same thing in OpenWRT's LuCI interface looks like this (`Network -> Switch`): In practice, I'd probably have switched `403` to being tagged and had the guest wifi be untagged. * * * #### Config: OPNSense With the switch now tagging some of the traffic coming into the firewall's interface, the next thing to do was to configure OPNSense to handle the VLAN. After logging in, I went to `Interfaces -> Other Types -> VLAN` and created a new VLAN with ID `404`. There was no need to create an entry for `403` because the switch defaults traffic on that port into that VLAN. I browsed to `Interfaces -> Assignments` and created a new interface from my VLAN: To set up the interface I browsed to `Interfaces -> [GuestWifi]` and gave it an IPv4 and IPv6 address (using dedicated subnets): I then browsed to `Services -> ISC DHCPv6 -> [GuestWifi]` and configured a range to use for assignments along with a DNS server (the firewall itself) to include in the DNS option. Next was Router Advertisements (`Services -> Router Advertisements -> [GuestWifi]`), where I set it to assisted mode and ticked the box to take DNS config from DHCP: Next, I enabled DHCP for IPv4 (`Services -> Kea DHCP -> Kea DHCPv4`). On the `Subnets` tab, I added a new subnet using the Guest Wifi range. * * * ### Firewall Rules At this point, if a client were to connect, it would correctly acquire IPv4 and IPv6 addresses. However, the default firewall rules are restrictive, so it wouldn't actually be able to connect out to the rest of the world. The desired config is quite simple: it should be able to connect to the outside world, but not to other local networks or to the firewall itself (DHCP and DNS notwithstanding). I configured a new ruleset (`Firewall -> Rules -> GuestWifi`): I also added a block rule to `Isolated` to prevent it accessing `GuestWifi`. * * * ##### Client Firewall Rules One thing that I ran into when first setting up IPv6 is that overly restrictive client rules can break DHCPv6. DHCPv6 works by sending a packet to a multicast address, so connection state tracking may not recognise the reply as being a solicited response.
To avoid this, you need to ensure that clients will accept packets to link-local addresses on `UDP 546`: ip6tables -I INPUT \ -d fe80::/64 \ -s fe80::/10 \ -p udp \ -m conntrack \ --ctstate NEW \ -m udp \ --dport 546 \ -j ACCEPT * * * #### Wifi Client Isolation I connected my Ideapad to the R7800 and logged in to LuCI to create the guest wifi network (`Network -> Wireless`). To make sure that connected clients can't interact with each other, I clicked into `Advanced Settings` (the one under `Interface Configuration`) and ticked `Isolate Clients`. This tells the access point (`hostapd`) not to forward traffic from one client to another (though technically a client could still see traffic destined for another by putting its interface into rfmon mode). * * * #### Conclusion We now (once again) have a properly isolated wifi network for visitors to use, protecting our kit from contamination by unknown devices. Although I haven't done so (yet?), this separation also opens the possibility of "disowning" visitor-originated traffic by routing it out via a VPN connection or similar. Even without this, it is still currently possible to distinguish between Guest and home IPv6 traffic because they use different subnets, so if I ever were to receive a complaint I'd be able to tell, at a glance, whether it was one of us or someone else.
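As an aside, the subnet arithmetic is easy to sanity check programmatically. Here's a quick sketch (Python's ipaddress module, using the documentation prefix from the examples above; the choice of `2001:db8:0:12::/64` for the guest network is hypothetical) which lists the sixteen /64s within the routed /60 and shows how guest traffic can be told apart from home traffic purely by subnet:

    import ipaddress

    # The /60 routed to us by the ISP (documentation prefix, as in the examples above)
    routed = ipaddress.ip_network("2001:db8:0:10::/60")

    # Hypothetical pick: the /64 assigned to the guest wifi interface
    guest = ipaddress.ip_network("2001:db8:0:12::/64")

    # Enumerate the sixteen /64s available within the routed /60
    for subnet in routed.subnets(new_prefix=64):
        print(subnet)

    # Decide which side of the fence a source address falls on
    src = ipaddress.ip_address("2001:db8:0:12::abcd")
    print("guest" if src in guest else "home")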

New #Blog: Isolating our Guest Wireless Network in OPNSense
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/house-stuff/creating-isolated-ipv4-and-ipv6-networks-with-vlans-and-opnsense.html?utm_source=rss

#aaisp #networking #opnsense #vlan #wireless

21.03.2025 20:00 — 👍 0    🔁 0    💬 0    📌 0
Preview
Automatically Deleting Old Gitlab Issues _Normally_ , tickets in an issue tracker are expected to exist forever (with the exception of ticket spam, which'll vanish once an admin purges it). However, there are use-cases where older issues hanging around can be undesirable. Back in September, I wrote about automatically creating recurring issues in Gitlab: a daily cronjob adds scheduled tickets to my queue. It's proven to be pretty good at making sure that I don't forget to do "once in a while" things. The thing is, though, once a task is done I don't really need the ticket for very much longer. I _certainly_ don't need months' worth of tickets telling me to check salt levels in the water softener: These issues are created within a dedicated tasks project (`generic-tasks`) and are unlikely to ever contain any relevant history (if, for example, I went and found that the softener was broken, I'd likely raise a new ticket or move the task into the relevant project). **At most** , I might want to know when I last did something (i.e. when did the dogs last get a flea tablet?). Gitlab doesn't include functionality to age out tickets, so I forked my recurring tickets repo and created a simple script to delete older tickets. This post describes how to configure it, though Gitlab's auth model _does_ mean that there's a fairly big drawback. * * * ### Gitlab Authentication * * * #### Owner Permissions In order to be able to delete issues within a project, a user must have either `admin` or `Owner` privileges. Gitlab also won't let you give a user `Owner` privileges on a specific project; instead, you have to grant them **at the group level**. So, if your project is `foo/bar`, you would need to grant your API user `Owner` permissions on every project under `foo` (clearly violating the principle of least privilege). That auth structure _sort of_ makes sense in a multi-tenant environment, where each user gets their own namespace, but doesn't work quite so well in a single-tenant environment. * * * ##### Dedicated Namespace Rather than grant my API user access to everything, I instead created a new group (`tasklist`) and moved my project/repo to be under that group, so that I could grant my API user `Owner` privileges without risking impact to unrelated projects. This isn't something that you should do lightly because URLs (and clone paths) will change: * Log into Gitlab as an admin * Menu -> `Admin` -> `Groups` * `New Group` * Provide a name * `Create Group` * Click into the group * Under `Add user(s) to the group` search for your API user * Set to `Owner` * Click `Add User to Group` Then, to move the project: * Browse to the project * `Settings` * Scroll down to `Advanced` and click `Expand` * Find `Transfer project` * Choose your new namespace * Click `Transfer project` * Check and accept the warning Go and update any links you might have into the project and, if you've cloned it down, edit your git config to update the path. * * * #### Setting Up Access Now that you have a user with `Owner` permissions on the repo, you need to create an API token for the script to use. Log in as that user and then: 1. Click your Avatar 2. Choose `Preferences` 3. Choose `Access Tokens` 4. Set a name 5. Grant `api` scope 6. Click `Create personal access token` 7. Take a note of the token * * * ### Configuring Most configuration is performed in a YAML configuration file.
As well as (optionally) providing details of the Gitlab instance, the configuration provides details of which projects to operate against, the threshold to use and (optional) labels for filtering: projects: - key: tasklist/generic-tasks days: 60 - key: tasklist/foo-bar days: 90 not_labels: ["Foo"] labels: ["Fixed/Done", "Won't Fix"] The threshold used applies to when a ticket was last updated. This means that old, but active, tickets will not be touched. The label filtering works on an any-match-wins basis: a ticket only needs to have one of `labels` to match, or one of `not_labels` to be excluded (exclusion takes priority). * * * #### Gitlab Connection Info You'll need to provide the script with both your Gitlab token and your instance URL. These can be provided via environment variable: export GITLAB_TOKEN="<my token>" export GITLAB_SERVER="https://gitlab.example.com" Alternatively, they can be written into the config file gitlab: url: "https://gitlab.example.com" token: "1234" * * * #### Dry Run Mode If you want the script to simply report which issues it would delete, you can enable dry run mode via an environment variable DRY_RUN=true * * * ### Running The script is designed to be run using the container image. It _is_ possible to run it outside of the container, but you'll need to install python-gitlab and set an env var to point it to your config. * * * #### Docker Docker invocation is simple: docker run --rm \ -e GITLAB_TOKEN="<my token>" \ -e GITLAB_SERVER="https://gitlab.example.com" \ -v $PWD/examples/example_config.yml:/config.yml \ codeberg.org/bentasker/gitlab_expire_issues:0.1 * * * #### Kubernetes Running in Kubernetes has the advantage that you can use a `CronJob` to handle scheduling (the disadvantage, of course, being that you need to have a K8s cluster in the first place). Create a secret to hold the token kubectl create secret generic gitlab-auth \ --from-literal=token=<my token> Define a config map to host the config --- apiVersion: v1 kind: ConfigMap metadata: name: gitlab-recurring-issues data: config.yml: | gitlab: url: https://gitlab.example.com projects: - key: tasklist/generic-tasks days: 90 not_labels: [] labels: [] Then define a CronJob to run the container --- apiVersion: batch/v1 kind: CronJob metadata: name: gitlab-issue-prune spec: schedule: "0 2 * * *" failedJobsHistoryLimit: 5 successfulJobsHistoryLimit: 5 jobTemplate: spec: template: spec: restartPolicy: Never containers: - name: gitlab-issue-prune image: codeberg.org/bentasker/gitlab_expire_issues:0.1 imagePullPolicy: IfNotPresent volumeMounts: - mountPath: /config.yml name: configfile subPath: config.yml env: - name: DRY_RUN value: "false" - name: GITLAB_TOKEN valueFrom: secretKeyRef: name: "gitlab-auth" key: token volumes: - name: configfile configMap: name: gitlab-recurring-issues Apply the config kubectl apply -f gitlab_issue_prune.yml The job should now fire at the required time. * * * #### Conclusion Six months ago, I built a system which quite reliably raises tickets for recurring tasks. However, using it for regular tasks had led to me having quite a few old (but otherwise pointless) tickets laying around. But now, I also have a system which can clean up behind it, so that I'm not perpetually storing tickets for one-off tasks.
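If you'd rather see the moving parts than read the config reference above, the behaviour is easy to sketch with python-gitlab. This is _not_ the script from the repo, just a rough illustration of the logic described earlier (updated-at threshold, any-match-wins label filtering with exclusion taking priority); the project key and threshold are the example values from above:

    import datetime
    import os

    import gitlab

    gl = gitlab.Gitlab(os.environ["GITLAB_SERVER"], private_token=os.environ["GITLAB_TOKEN"])
    dry_run = os.environ.get("DRY_RUN", "false").lower() == "true"

    # Per-project config, mirroring the YAML example above
    projects = [
        {"key": "tasklist/generic-tasks", "days": 60, "labels": [], "not_labels": []},
    ]

    now = datetime.datetime.now(datetime.timezone.utc)

    for conf in projects:
        project = gl.projects.get(conf["key"])
        threshold = now - datetime.timedelta(days=conf["days"])

        for issue in project.issues.list(iterator=True):
            updated = datetime.datetime.fromisoformat(issue.updated_at.replace("Z", "+00:00"))
            if updated > threshold:
                continue  # updated recently: old but active tickets are left alone

            # Exclusion takes priority, otherwise any matching label wins
            if conf["not_labels"] and set(issue.labels) & set(conf["not_labels"]):
                continue
            if conf["labels"] and not set(issue.labels) & set(conf["labels"]):
                continue

            print(("DRY RUN: would delete" if dry_run else "Deleting"), issue.web_url)
            if not dry_run:
                issue.delete()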

New #Documentation: Automatically Deleting Old Gitlab Issues
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/general/autopruning-old-issues-from-projects.html?utm_source=rss

#docker #gitlab #kubernetes #scheduling

19.03.2025 17:00 — 👍 1    🔁 0    💬 0    📌 0
Preview
My Scammer Girlfriend: Baiting A Romance Fraudster At the beginning of the year, a spate of very similar mails appeared in my spam-box. Although originating from different addresses (and sent to different recipients), they all appeared to be the opener for the same romance scam campaign. Romance fraud is increasingly common and campaigns can extort large sums from victims, who are often quite vulnerable and lonely. If you found this page because you think that you might be being targeted, speak to Crimestoppers or Action Fraud. When stories of romance fraud hit the news, we often hear that the victim had become extremely attached to the scammer, but very little about _how_ they got engineered into that position. At its heart, romance fraud relies on social engineering and I was curious to see what techniques were actually being used. I'm no particular stranger to scam baiting, so I decided to masquerade as a mark and see how the campaign was run (as well as what, if anything, _I_ could engineer out of the fraudster). The emails that I'd received were all associated with one persona: "Aidana", who claimed to be a dentist in Kazakhstan. This post analyses the scammer's approach, systems and material, sharing some of what I was able to learn over the course of a few weeks of back and forth. * * * ### Contents Warning: even by my standards, this is a **long** post. To try and assist readability, rather than reporting things chronologically, I've broken this post into sections which focus on a specific area: * The first few (short) sections lay the groundwork and provide context. * Section 4: focuses on the nature of the communications * Section 5: delves into technical analysis of the scammer's emails, systems and materials * Section 6: looks (quickly) at things spotted in other campaigns Most subsections are quite short; it's just that there are quite a few of them. **Contents** * Contents * Section 1: Opsec * Rules of Engagement * Don't Try This At Home * Section 2: The Opener * Image Attachments * Email Wording * Section 3: Making Contact * Delayed Response * Analysing The First Reply * Section 4: Our Conversations * Common Backstory * Common Playbook, Different Wording * Reassure little and often * Getting Told Off * Possible Use of Conditioning Techniques * Playing Fast And Loose With Photos * Developing Familiarity * Phone Calls * Sex * Shared Tasks * The Medical Association * The Request For Money * Ending It * Summary of Communication Techniques * Section 5: Gathering Information * Click My Link * Finding The Source Of The Images * Examining Changes To Images * Carl gets dumped * Scammer's System * Fingerprinting The Scammer's Mail User Agent (MUA) * Scheduled Sending * Confirming the MUA * Message Timings * Email Hours of Activity * Travel Info and PDF Generation * Passport Photo * The Visa Photo * Fake Dental Practice Website * Section 6: Corroborating Information From Other Campaigns * Common Backstory Elements * Section 7: Conclusion * * * ### Section 1: Opsec #### Rules of Engagement When doing things like this, I tend to set myself some rules in order to try and make sure that _I_ don't get carried away. In this case, the rules were: 1. **No active intrusion** : techniques should be limited to passive analysis and social engineering. There's no sense in risking retaliation _or_ arrest and besides, analysis presents a _much_ more interesting challenge. 2. **Techniques not people** : the aim of this is _not_ to identify the individuals behind the scams, it's to see how they work.
Identification involves an inherent risk of mistaken identity and would likely have no tangible benefit (because they're probably not operating from a jurisdiction which might act against them). 3. **No Payments** : this stops as soon as some form of payment is required to be able to proceed. Even a small convincer is unacceptable as it'd be directly funding fraud (plus I _really_ don't want to end up with a CIFAS marker for the sake of a blog post). * * * #### Don't Try This At Home Before I get into it, it seems worth adding a warning for anyone thinking of doing anything similar. As well as observing the rules above, I took a number of opsec measures. It was always my intention to publish findings under my own name, but even then it still wouldn't have been wise to interact with the scammers from my normal devices or connections. If you don't know _what_ measures might be needed, or don't know how to take them, you should avoid trying to do something similar: this is a business for these people; they won't always take kindly to being messed about and you sure as hell don't want them having your real details. * * * ### Section 2: The Opener The opening emails are sent from hacked accounts. There was nothing of particular interest in the headers of these mails. What **is** of note though, is that the style of this opener makes it relatively easy to link it to other campaigns. For example, a new campaign started in February; although it uses a different backstory and images of a different woman, it's visually very similar: * * * #### Image Attachments Each of the opening mails had a single photo attachment which varied (though they were always of the same woman). Unlike with some other spam, the photo attachments were not of an explicit nature. In the openers that I received, there were three different photos: Note: I've since identified and warned the subject of these images. Checking the metadata of each of the images revealed that they carried IPTC tag `iptc.application2.SpecialInstructions`: The `FBMD` prefix of the value is _relatively_ useful in that it tells us where the images originally came from: Facebook/Meta are known to add this tag as part of _their_ effort to stalk the world. The presence of that tag means that it's a near certainty that the images originated from a Facebook or Instagram profile. * * * #### Email Wording The wording used varied between mails, but always followed much the same theme: * She ran across your email somewhere unspecified, a while ago * She is not a scammer, but she needs to know that you aren't either * She will not send nudes (because the profile the images are stolen from doesn't include them) * She lives in Kazakhstan The other thing that the mails **all** had in common was that the recipient was asked to reply to a _different email address_ : Aidana's "regular" mailbox. This is requested because the opener was sent from a hacked account and the scammer needs marks to communicate with an account that they fully control. It's a slightly disconnected and unusual approach though, because other campaigns tend to use `Reply-To` headers to achieve the same thing without it being so obvious to the user. However, that disconnect also afforded **me** an opportunity. A brand new mail thread doesn't include `In-Reply-To` headers, so I could initiate contact from multiple different accounts (and personas) without it being immediately obvious to the scammer that they were receiving replies from accounts that they'd never actually emailed in the first place.
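As an aside, the address games described above are easy to spot if you look at a mail's raw headers. This is just a minimal sketch using Python's standard library (the filename is made up - export any suspect mail from your client as a `.eml` file first):

    from email import policy
    from email.parser import BytesParser

    # Hypothetical filename: a raw mail exported from your mail client
    with open("suspicious.eml", "rb") as fh:
        msg = BytesParser(policy=policy.default).parse(fh)

    sender = str(msg.get("From", ""))
    reply_to = str(msg.get("Reply-To", ""))
    in_reply_to = msg.get("In-Reply-To")

    # Some campaigns use Reply-To to silently divert replies to a different mailbox
    if reply_to and reply_to != sender:
        print(f"Reply-To ({reply_to}) differs from From ({sender})")

    # An In-Reply-To header on mail from a brand new sender can make it appear
    # within an existing thread, masking a change of address
    if in_reply_to:
        print(f"Claims to be a reply to {in_reply_to}")

    # Received headers record the submitting host and IP at each hop
    for received in msg.get_all("Received", []):
        print("Received:", " ".join(str(received).split()))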
* * * ### Section 3: Making Contact I emailed the scammers from multiple accounts, with each using a different writing style and sharing a different backstory. Adopting different personas brought a number of benefits: * it mitigated the risk of not being assessed as a viable mark: if I lost one account, I'd still have the others * if I discovered something interesting, I could use another persona to test and confirm consistency In this post, I'll only refer to two of these personas - the others didn't yield anything _particularly_ different and bouncing between them would probably make for quite confusing reading. The two personas had **very** different writing styles. Carl struggled with spelling (which got worse as conversations progressed). His first message was pretty on the nose: > OMG Aidana, your gorgeous! > > Khazakstan looks amazin, have you got any more pics of where you live? > > I'm 53, divorced and live in Yorkshire in the UK. Its a pretty place to live, but there aren't any women who look nearly as good as your round ere and the evenings can be a bit slow > > btw, how did you find my email, is there someone we both know or was it just luck? either way, nice to meet you Andy, on the other hand, wrote quite clearly and asked questions. His mail also included a gentle hint that he had some money in the bank: > Dear Aidana, > > I'm very sorry that I missed your email last night, but it was an absolute delight to read this morning. > > I know you're not used to making contact first, but I'm really glad that you did. > > My name's Andrew, but I normally go by Andy. At 49 I'm only a little older than you. As you've told me so much about yourself I feel I should do the same: I'm a jolly sort and enjoy a good laugh, > > I live in Buckingham in the UK, although I was born a little further south. > > It's nice living here, yes. A couple of years ago, I managed to retire early, so I spend my days gardening or exploring the local countryside. I'd love to go travelling around the world but haven't yet met someone to go with. > > How about you, have you always lived in Kazakhstan? Have you ever been to the UK? * * * #### Delayed Response I expected that things would start up quite quickly but, instead, days went by with no reply. Eventually, I decided that I'd probably been jilted: maybe my messages hadn't met whatever criteria they were looking for in a victim. A little over 2 weeks later, though, replies started arriving and all of my initial contacts received replies on the same day. I've since learned that this is because the scammer runs campaigns on a monthly cycle, taking a cohort of victims through the process at _about_ the same speed, before starting again the next month with a new set of victims (and a slightly different storyline). * * * #### Analysing The First Reply The very first thing that stood out was that the reply came from a different email address than the one that I'd contacted. To help mask this change, the scammer had copied a quote of my email into the bottom of their mail and had also set an `In-Reply-To` header referring to my original mail: In-Reply-To: <BANIUp32c5MFW4-1GOr=BLAdQpnECDBiTuPn532gTWruaOU172A@mail.gmail.com> The inclusion of this header will cause _some_ mail agents to show it in the original mail thread, masking the change in sender. This mail had a photo attached too: This time, there was no Facebook IPTC tag in the metadata, but it **did** show that the image had been edited in Photoshop in early January: We'll come back to Photoshopping in Section 5.
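For anyone wanting to run similar checks against an image, both observations (the Facebook `FBMD` marker mentioned earlier and evidence of a Photoshop edit) can usually be pulled out of the file's metadata. A rough sketch with Pillow - the filename is hypothetical and this isn't necessarily the tooling that I used:

    from PIL import Image, IptcImagePlugin

    img = Image.open("attachment.jpg")  # hypothetical filename

    # IPTC record 2:40 is SpecialInstructions; Facebook/Instagram stamp a value
    # beginning "FBMD" into it, which is what gave the opener's photos away
    iptc = IptcImagePlugin.getiptcinfo(img) or {}
    instructions = iptc.get((2, 40), b"")
    if isinstance(instructions, list):
        instructions = instructions[0]
    if instructions.startswith(b"FBMD"):
        print("Looks like this image has been through Facebook/Instagram")

    # EXIF tag 305 is Software and 306 is DateTime: an edit in Photoshop will
    # often leave both behind (as it did on the photo attached to this reply)
    exif = img.getexif()
    print("Software:", exif.get(305))
    print("Modified:", exif.get(306))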
Inspection of the email headers also yielded some additional information. Firstly, the `Date` header indicated that the author was likely to be in a UTC+3 timezone: Date: Mon, 3 Feb 2025 12:18:45 +0300 Kazakhstan is in UTC+5, so it was unlikely that the author was there (_shock_ , horror!). If we look at countries with a UTC+3 exposure, we get the following: Although this narrows things down a little, it doesn't tell us all that much - there are a number of countries on there known to be associated with online fraud. Going back to the headers, then: Received: from [192.168.191.209] ([194.147.xxx.xxx]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-438dcc6df1esm184820675e9.27.2025.02.03.01.18.48 for <redacted: my address> (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 03 Feb 2025 01:18:50 -0800 (PST) The format used in `Received` headers is defined in RFC 1123: > The FROM field SHOULD contain both (1) the name of the source host as presented in the HELO command and (2) a domain literal containing the IP address of the source, determined from the TCP connection. This tells us that, after connecting to Gmail, the downstream mailer `HELO`'d as `192.168.191.209` (an RFC1918 address), over a TCP connection originating from `194.147.xxx.xxx` (redacted by me). Both the RFC1918 and the public address remained constant across all subsequent emails. Shodan showed that the IP had ports 80, 111, 443, 1723 and 5006 open. Although 80 (HTTP) and 443 (HTTPS) were open, they weren't serving anything HTTP-like. Whilst the TCP session established just fine, `443` didn't respond to an SSL handshake (I didn't look any further - Rule 1). Port `1723` is used for Point-to-Point tunnelling, and Shodan showed the port returning a PPTP response. Combining that with the RFC1918 address in the SMTP headers, it seemed likely that the box was being used as a VPN endpoint.
For example, the first playbooked reply told me that "Aidana": * Is a Dentist and Maxillofacial Surgeon * Lives in Almaty, Kazakhstan * Has worked abroad in USA, France and Israel * Doesn't smoke, barely drinks * Goes to the gym regularly * Is looking for a genuine man to build a relationship with In later replies, Aidana also confides that * Her father left her and her mother for another woman when she was young, she doesn't hear from him but is still angry * Her mother died of cancer * Aidana doesn't have a Facebook profile because her friend was hit by a car whilst walking across a street absorbed in the app * Her best friend is called Yulia After a little bit of back and forth, Aidana reveals that she's going to give a presentation in order to try and win a dentistry competition, which would fund overseas training in a country of her choice: > Tomorrow I'm going to take part in a competition among medical professionals and the best candidates will receive grants for training and professional development. > > Such events take place in our country every year. Such grants give the opportunity to free-of-charge training abroad in order to acquire a new experience. > > I told you that I have already got such grants for several times and I had an internship in Israel, the US and Europe. So today I tried to finish my work as soon as possible. I think that I'll go to the gym because it helps me to concentrate and express my thoughts correctly. > > This evening will be dedicated to the preparation of my presentation because I really wish I can get this grant. > > This grant gives the right to choose the country the winner wants to have there practice and a clinic, where one will be trained and work. You will no doubt be unsurprised, dear reader, to hear that Aidana ultimately won that competition, and chose to come to the UK in order to build a life with me (well, Andy). * * * #### Common Playbook, Different Wording After the first reply, conversation shifted onto what seemed to be a common and loosely templated playbook. Each mail had a semi-personalised section at the top, but then shifted onto a common story, albeit worded slightly differently. For example, Andy received > I am emailing you from my work computer as my personal computer is broken and it's being repaired now. As it's not a new computer, I was told that there are a lot of issues to be done, but they will do their best to make it work again. > > I devote a bunch of time to my work and it will not be an issue for us to communicate and I can use my work computer to message you with no issues Whilst Carl received > I am writing you an email from my work computer as my home computer is broken and I took it in for repair. My computer is old and the handyman told me that there are many problems to be solved, though he will try to repair it. > > I devote a bunch of time to my work therefore it will not be a problem for us to communicate and I can use my clinic's computer to email you without any problems. The two passages _basically_ say the same thing but expressed in (very) slightly different terms. The reference to using a work computer was _very_ likely intended to provide cover for the fact that "Aidana" only replies to emails once a day. It's _not exactly_ a Gavin and Stacey style love affair... * * * #### Reassure little and often Aidana's communications often contained reassurances about her intentions. 
Some of those were obvious: > I heard a huge amount of stories about Internet fraud and I want to make you sure that I need nothing from you. > > I have nothing to do with your personal details, bank profile or money either. All I'm interested in is building relationship based on love and trust Whereas others are (a little) more thinly veiled: Aidana wanted to be sure that she was not talking to a scammer (a technique which in turn sows the suggestion that _she_ is not one): > I only want you to be sure that all I need is a fair relationship and I have no interest in your money. 3 times I attempted to meet my love on the Internet, but I met only freaks, benedicts or just scammers. It was initially a little difficult to know how to respond to these statements: I didn't want to blow cover by going overboard with overly-early expressions of trust, nor did I want to risk not appearing to be trusting enough. In the end, Andy ignored it, whilst Carl breezed over it with a reply that wasn't really much more than a hand-wave: In practice, I don't think that the scammer paid much attention to either. As the conversation progressed, explicit reassurances like this became less pronounced, with effort instead being expended on reinforcing the unfolding story line. This passage, in particular, made me think of how Netflix producers sometimes ask writers to have a character explain what's going on: > By the way, Yulia is sending you big greetings. She asked me how this would all work and how I would go to your country, because it sounds very expensive. I told her that the medical association and I signed a contract according to which I get the award in the amount of 5000 pounds per month. This is the first part of my grant liabilities. This money is given to me for travel and accommodation expenses. > > Under the terms of this grant, my employer, as one of the sponsors of this organization, is obliged to contribute 50 percent of the total amount. > > After that, the representative of this organization incurs all my expenses connected with the transfer to your country. As soon as I arrive at my new place of work, within 2-3 days I will meet a representative of the medical association in UK. > > Yuliya said that it's great that I do not need to pay for anything by myself and just collect some documents. The only practical purpose this passage serves is to reassure the mark, otherwise a simple "I explained how the funding works to Yuliya" would have been **more** than sufficient. * * * #### Getting Told Off Aidana seemed to have bad luck with technology: in the email after disclosing her computer issues she mentioned that she'd accidentally broken her mobile phone (by dropping it in some kind of fluid at work) and that it'd be a few weeks before she could Whatsapp. Apparently, though, she'd been given permission to call using the work phone instead: > I am asking you write your phone number in your next message and in the next couple days I will be happy to call you. Okay? I want to believe that you have a desire to know how I sound too, do you??? I am sure that a telephone conversation will help to bolster our credence in each other. I wasn't _quite_ prepared for a call yet, so Carl explained that his phone was also broken ("Imagine us both breakin our phones at the same time, must be a sine from heaven!") whilst Andy simply ignored the request. In the next reply, **both** got chastised for it: > My dear, why didn't you answer my question? In my last letter, did I ask for your mobile number????? 
Why didn't you write me your number?????? You don't want me to call you????? I did _eventually_ provide a number (we'll come to that below). * * * ##### Possible Use of Conditioning Techniques Being told off for not providing a phone number did lead to an interesting observation. Every other email had included photos of Aidana, but these did not. Was this deliberate? Over time, I used different accounts to fail to comply with requests in different ways and, sure enough, it _seemed_ to have an impact on whether the scammer attached a photo or not. This only happened during the earlier stages of the scam though - once our pseudo-relationship had started developing, photos were much more consistently sent. It's still a small sample, so perhaps I'm reading too much into it, but it _appears_ that the scammers have built positive conditioning into the early stages of their workflows: if the mark does what Aidana asks, they receive a reward in the form of a photo; if they don't, then no photo is forthcoming. * * * #### Playing Fast And Loose With Photos Unsurprisingly, having sent a few photos, Aidana wanted my personas to provide photos of themselves - presumably to help build and demonstrate trust. **Obviously** , I wasn't going to send photos of myself, so for most accounts I used an AI image generator to generate a set of photos of an appropriate-looking person. I wondered, though, whether the photos were actually being looked at (and whether there was a risk of them turning up elsewhere), so I decided that Carl should experience some technical issues: > Oh sorry Aidana, I fort I'd added photos, tryin again, hopefully that worx. I'm not gud with computers lol Carl attached three "photos": Aidana seemed to be _delighted_: > I want to thank you very much for your photos. I think you're a very handsome man and you look very good. It seems quite unlikely that "Aidana" is into men who look like a scaled-out game of mega-Tetris, but it's impossible to say whether this shows that: * Supplied images aren't looked at * Supplied images are normally looked at, but this wasn't * The images were looked at, but the scammer didn't care and just wanted to progress the playbook
I guess I'm starting to get used to reading your letters every day. They have become something very important for me. As always, I quickly had my breakfast. I keep thinking of you all the time… I feel you are that very person about whom I was dreaming and finally I’ve found and now I don’t want to lose you! Around this time, phrases such as "all my heart" also started to sneak in. By the 8th email, "Aidana" was signing emails off **much** more affectionately: > Love and 1000000000000 kisses to you! A _trillion_ kisses! Whirlwind relationship or what? * * * ##### Phone Calls Aidana had pushed several times for me to provide a mobile number. I suspected that there wasn't _really_ a way to get them to progress through their playbook without providing one: the endgame likely involved use of SMS or Whatsapp to message repeatedly about some urgent issue that Aidana would need me to help financially with. But, dragging things out until we seemed closer to that point also wouldn't work: romance fraudsters often have quick phone calls with their marks - if you look at news stories, victims will often say "we even spoke on the phone". So, in preparation for this, I set up a burner phone and downloaded a pay-as-you-go e-sim onto it (with a plan to contact the phone company once I was done so that they could take the number out of circulation for a while). Aidana was very happy and even provided her own numbers: > My dear, I really want to hear your voice and today or tomorrow I'm going to call you on the phone. I'm asking you to keep your phone close on these days so you don't miss my call. Ok? I will call you from my work phone `$REDACTED`. > > If I can't reach you from my work phone, then I bought a prepaid phone card for international calls with the number +1-843-968-4772. This prepaid card allows you to make calls much cheaper than in a regular phone. So I think I'll call you from this number. I've redacted the "work" number because it's not clear that it's under the control of the scammers and was likely only provided to help build confidence. I checked though, and it is on a Kazakhstan provider. Looking up the US number shows that it's a VOIP number used by Skype, which isn't overly surprising given that Microsoft make it quite difficult to report out-call fraud (though, it appears that Skype is soon to become a thing of the past anyway). The recording doesn't do it justice, but for a woman purporting to be from Kazakhstan, Aidana sounded _an awful_ lot like she was from somewhere else (I've cut my side out of this recording as it seems a little unwise to include my voice): The call was just over 3 minutes long - she didn't drive any particular agenda and basically just made small talk, with a bit of talking about the future mixed in. A little way into the recording, you'll hear "Aidana" say "house?". She'd asked what I was doing, so I'd said that I was on my way to Chatsworth House (having retired early, Andy spends his time visiting stately homes on his own). Unfortunately, it's not audible in the recording, but there was a definite "ding" sound in the background just after. As I hadn't heard any typing, the impression that I got was that someone else had messaged her. It could _totally_ be coincidental, but might also be a sign that there was a silent third participant. There were several subsequent phone calls, all from the same woman and all quite short, though I didn't hear any similar background noises in those.
* * * ##### Sex Now, if you recall, the opening email had been _pretty clear_ that Aidana didn't want to talk about sex. However, as the relationship bloomed, the scammer tried to drive intimacy a bit further: > I am more and more often visited by the thought that you think of me as a woman. Andy Do you like my body? I believe that Sex in a relationship is also very important I would like to order what you think about it. I ask you to tell me your sexual ideas. I want you to do this first since I never wrote about it first. This posed a potential problem. Although I was invested in this project, I _definitely_ wasn't "flirty sex chat with some random scammer" levels of invested. The thought also dawned on me that part of their playbook _could_ even involve "Aidana" calling for phone sex. Either would be crossing lines that I didn't want to cross, meaning that I'd stumbled upon an unexpected 4th rule of engagement: don't talk dirty with scammers. Thankfully though, my persona was British, so I was able to deflect things by essentially saying that, whilst I was attracted, sex is just not something that we talk about. * * * #### Shared Tasks Although Aidana drove the storyline, she occasionally needed Andy or Carl to help out by finding her information necessary to prepare for her trip. For example, she needed to know which airport to fly to: > In order to book the tickets, Andy, I have to know the nearest airport to you She also needed to know details of a few dentist practices that she could potentially complete her training at: > Could you please find in the internet at least several dental clinics that are situated to as close as possible???? I need 2-3 state dental clinics and 2-3 private practices. It is necessary for me to conclude a contract with the International Medical Association. The aim here, of course, is to encourage the victim to feel invested in the story line, because they're _even_ helping with the planning. * * * #### The Medical Association In one of her emails, Aidana informed Andy that the Medical Association would be calling him in order to verify arrangements: > My dear, I told the medical Association that you would meet me at the airport and help me adapt to the United Kingdom and give me some tips about the culture and life of your country. > > The main office of the medical Association in Denmark wants to make sure that you are a good person, and not some kind of maniac. > > I hope you don't mind if a representative of the international medical Association calls you and asks you a few simple questions??? I ask you not to worry, this will just be a normal conversation to make sure that I am not in any danger in the United Kingdom. Andy you must understand that this is necessary. I hope you won't have any problems answering a few questions in English. > > You will get a call from a representative of the medical Association, Erika SARBANTOVICA ( +1-318-602-0467). > > She is currently on a business trip to the USA and therefore will be calling you from this country. That number, just like Aidana's, is a Skype out-call number. The call didn't come through for a couple of days, but eventually my phone rang: Annoyingly, Cube ACR didn't record it very well beyond this point: there was some background noise which seems to have broken things (it seems to have to hook in as a speech recognition helper, so my uneducated guess is that Android just doesn't send it audio that doesn't sound like speech).
When I first received Aidana's email, I assumed that this call would be used to try and collect additional personal information (address etc) under the guise of safety verification. However, in the event, the questions were much simpler: * Will you be able to collect Aidana from the airport? * Are you able to provide/arrange accommodation and help her understand the territory? * She'll need to get to various places, can you provide/arrange transport for the duration of her stay? * Can you help with language and developing her English? Erika ended the call by saying that everything sounded fine and that she was glad that Aidana seemed to have friends to support her in the UK. So, just like the others, the only purpose of this call was to help reinforce the story line. Although this was the only direct contact with a "member" of the medical association, they were fairly key to the story line and the scammers provided various forged documents purporting to be from the European Medicines Agency (who, incidentally, are aware that they're being used in campaigns like these). One of the interesting points here is that it'd be easy for a curious victim to mistakenly _authenticate_ some of the documents: Searching for the name of the top signatory **will** lead to real results: Christopher H Fox is the CEO of the International Association of Dental Research (good lies have an element of truth to them and all that). * * * #### The Request For Money I introduced various opportunities for the scammer to take financial advantage. For example, when Aidana mentioned that her phone was broken, Andy asked if she needed help buying a new one. These all went ignored. However, about 3 weeks in, the inevitable happened and Aidana's plan to move to the UK came under threat. Unfortunately, part of the funding that Aidana was relying on had fallen through because her boss had found out about _our_ relationship and realised that she wasn't coming back: > In the morning I had a meeting with my director of the clinic. > > We had an extremely unpleasant conversation. I wanted to get concrete answers to my questions. I asked him why they still have not transferred the money. My director immediately rebelled and told that he won’t pay for anything. > > He got to know that I am planning to stay with you forever. He explained me that the clinic does not want to waste money on me if I am not going to return to Kazakhstan. > > I was very disappointed to hear that, but I understand what he is talking about. So, not only was our future in jeopardy, but it was partly _my fault_ that Aidana risked missing out. Thankfully, the situation wasn't entirely hopeless: Aidana still had the certificate for the first month's £5000 grant. The problem was, the curator of her trip said that it could only be cashed in once she had arrived in the UK: > I asked him if I can cash out the certificate in Kazakhstan that I could make this payment. He replied that over the past two years UK changed the rules. If I decided to fly to China, there would be no problems. I could get the whole amount in Kazakhstan. But your country allows to cash out the grant certificate only on its territory. > > That is why my curator said that the only way out of the situation is to pay this money for my own. Aidana's a resourceful woman though, so rather than breaking down under the weight of the bad news, she set about trying to solve the issue on her own: > I went to the bank where I keep my money on the deposit.
I was a client of this bank for a long time and I was sure that they will be able to give me a loan for a short period of time. I explained the whole situation to the manager of the bank. I showed him a copy of my grant certificate that I will be able to cash out it and immediately transfer money back from UK. > > The manager denied me for the same reason that my curator. I collected all my saving, it was 900 pounds and went to work to write you a letter. On my way I thought where I can get the rest 600 pounds ???? I'm still paying a loan for my apartment, so I can’t sell it so quickly. Then I realized that I have no other way but borrow this amount from you. Although she could raise most of the money, she couldn't quite get across the line. But, love always wins, so Aidana _knew_ that Andy wouldn't leave her high and dry: > I'm sure that for you Andy it's not big a problem. We both have done so much for our meeting, I do not think you'll deny me now. > > Moreover I will give you this money back in 3-4 days after my arrival to you. I will cash out my certificate and give you all sum back. I can’t even imagine that such a situation could happen with me! I feel so helpless now… So many things have already done, documents, tickets… And now I have this problem that can ruin everything!!! I have no other choice but to ask you for help! > > I am confident that you Andy will not leave me in this situation! Hope that situation will be the first and the last obstacle on your way to be together. This email had 4 attachments: These were * A photo of "Aidana" holding a handwritten note reading "Help me please my love" * An invoice purporting to be from the European Medicines Agency, totalling about £52K * A photo of a UK entry clearance visa, with Aidana's photo and details on it * A PDF claiming to be Aidana's bank statement Aidana also recognised that this was something that really needed a proper conversation: > P.S. My dear, we need to discuss this issue over the phone. I'm asking you to wait for my call today. I'm asking you to keep your phone close. > > I am very upset about this situation and I want to consult with you and discuss everything in detail. I am very ashamed to ask for your help, but I have no one else to turn to. She phoned a few hours after the email had arrived, sounding much more despondent than in earlier calls: Later that day, as promised in the call, details arrived of how and where to send the money. However, the scammer also asked if I could send a little extra to help cover any "unforseen expenses": > My dear, I am glad that we were able to talk on the phone and that you understood my problem and can help me with these 600 pounds. If it's not difficult for you, add an additional 20-30 pounds for unforeseen expenses. There were two ways that the money could be sent: * SWIFT transfer to an account held at Bereke Bank in Kazakhstan * Using Moneygram or Western Union The beneficiary name provided for the bank transfer was the same as on Aidana's docs. However, this doesn't _really_ mean too much: although SWIFT transfers require a beneficiary name, not all banks bother to validate it when a transfer is received - as long as the BIC and IBAN are correct, the money will end up in the destination account. Moneygram and Western Union were also offered as an option but, Aidana explained, needed to be sent to a different beneficiary: > I can't use the Western Union or MoneyGram system because my name is blocked there. 
The fact is that 3 years ago I was in the USA and sent money to my cancer-stricken mother in Kazakhstan through the Western Union and MoneyGram system and I constantly had problems with money transfers. > > Because of some problems, for some reason, money transfers were constantly blocked. And after three attempts, I was blocked in all systems. It was a sad experience using these systems, as my mother urgently needed money to buy medicines. > > But this is not a problem and I asked my friend Diana to help us if we use these systems to transfer money. The friend, Diana, is probably a money mule and, in fact, the address that was provided for her later in the email _appears_ to be that of a hotel. * * * #### Ending It We'd reached the point of no return, because Rule 3 was clear that I couldn't make any payments. I could _perhaps_ have faked a payment receipt and messed them around for a bit, but that wasn't really the aim of this process, so it was time to tie things up. Andy "tried" to send money, but the bank flagged the transaction and froze his account > I tried to transfer you the money, but unfortunately the bank blocked it. They phoned up shortly afterwards and said something about it being a donkey (or maybe mule?) account. > > Are you sure you gave me the right details? I didn't entirely understand what they were saying, but they seem to have been concerned enough to freeze my account. > > I can probably get down to Moneygram with my credit card tomorrow, but if you've got the correct bank details it'd be useful so I can explain it to the bank and unlock my account. Aidana replied and said that the details were definitely correct and that the issue must be at Andy's end > I have had this bank account for 3 years now and have never had any problems. That's why you probably did something wrong and that's why your bank did it. She suggested that he use Moneygram instead, in the process sounding _a little_ like she might work for Moneygram's marketing department > Since we are having problems with a bank transfer, of course it would be better if we use the MoneyGram system. It is a fast and convenient money transfer system, and MG has offices in every city. Attached to the email was another picture of Aidana holding a handwritten note, this time reading "I love you Andy". The photoshopping on this image was a little more clumsy though and one of the original items was still visible in the woman's hand. The next day, Andy replied to say that he'd been to the bank to unblock his account and that they'd sat him down and explained that this was all a scam. > The manager explained to me that you're not really you and that this is all part of some kind of grand scam. I told him that he was wrong and that we'd spoken on the phone, but he said that that was quite common. They said that you're probably not even in Kazakhstan at all! Aidana's reply came within hours > It's insulting to me that you're comparing me to scammers. Did I deserve this? Why do I always have to prove something to you? What kind of nonsense is being said in your bank??? > > I have sent you all my personal documents, passport, visa, plane ticket, grant certificate, website of the clinic where I work and even a photo with your name. Don't you know that no one gives you such information on the Internet, but I trust you 100% and that's why I sent you my bank and personal documents. > > But even so, you don't want to believe me. How else am I supposed to prove to you, Andy, that I love you? 
You're making me cry with your suspicions. To help reassure Andy, she re-attached the photo of her holding a note, the photo of her visa and a new document, purporting to be an order of payment from the European Medicines Agency. She also provided a contact email address for the medical association (who totally use Gmail, _obvs_) so that Andy could verify her story (I did email them, but got no response). Despite being upset, she knew that Andy would change his mind: > I also sent you my plane ticket and my visa. Doesn't that prove to you that I'm coming to your country??? You offend me with your doubts and distrust..((( > > I will be anxiously waiting for your answer. I know that you are a smart and strong man, and you will change your mind and send me this money via MoneyGram. But Andy stopped replying, having gone to try and work out how to mend his broken heart. It was over. * * * #### Summary of Communication Techniques The scam starts with a simple lead: I'm young, I'm hot and I'm looking for something _real_. From that point on, the communication focuses solely on building trust (both in Aidana and the story line itself). The techniques used are not _particularly_ sophisticated, but are applied quite consistently. I presented multiple opportunities for the scammers to try and take advantage of me, either financially or by collecting personal details, but they ignored those in favour of the much bigger potential pay-off at the end of the scam. The early communication is loaded with reassurances: Aidana tells marks about the horrible luck that she's had with online communications in the past and that she needs to know that this time is different. Attempts are also made to reassure the victim that Aidana _really_ doesn't need anything financially; she's only here for a chance at love. Although the early communications are friendly, the terminology used is (more or less) in line with a couple of strangers conversing. However, more familiar terms start to appear quite quickly. A good example of this can be seen in the way that Aidana closes emails: Email | Closing Line ---|--- **1** | Best wishes **2** | I wish you a good day **3** | Hope to hear from you soon and I would email you as soon when I get your letter **4** | Sincerely yours **5** | From all my heart **6** | I kiss you **7** | Send you kisses and hugs! **8** | Love and 1000000000000 kisses to you! **9** | I send you my tender kiss and will wait for your letter **10** | With all my love **11** | I send my hugs and kisses to you **12** | 10000000000000000 kisses **13** | 10000000000000000000 kisses. Waiting impatiently for your reply **14** | Forever yours **15** | Hug you and send you 10000000000000 kisses **16** | I love you Andy! yours lady **17** | Your woman **18** | Your Aidana **19** | Forever yours **20** | Your love **21** | Your Aidana Side Note (I got nerd-sniped): The huge number of kisses in email 13 is _ten quintillion_. Assuming that each kiss is a quick peck, averaging 1 second in duration, it would take _around 317 billion years_ to deliver them all. Even the ten trillion in the other mails would take 317 thousand years. The level of familiarity used within the emails takes quite a sudden leap, and the L bomb is dropped pretty quickly. This rate of development is, obviously, very much in the scammer's interest - they need the victim to be as invested as possible, as quickly as possible. Periodic phone calls are also used to help this along.
Going into this, I knew that there probably _would_ be phone calls, but I had no idea what to expect. I thought that they _might_ be used to try and verify my side of the story (I even printed a small cheat sheet to keep in my pocket), or to take advantage in other ways. But instead, they were literally just small talk, with the human contact being used to help increase the feeling of being in a relationship. Although the rate of relationship progression is quite unnatural, it's masked by Aidana only emailing once per day: We only talked for a little over three weeks, but it _felt_ like we'd been talking for ages. That time-stretching effect, presumably, helps to explain why victims often seem to have fallen _so deeply_ in so little time. Although not definite, there are signs that the scammers may be using conditioning techniques to help encourage marks to invest more in the pseudo-relationship - photos seem to be withheld if the victim ignores a request that Aidana cares about (such as providing photos or a phone number). In order to help the victim feel invested in the story line, Aidana asks them to seek out some information (such as nearby dentists) for her. The scammer also uses guilt - not only was the issue with money implicitly Andy's fault (it was being withheld because Aidana's boss had found out about their relationship), but the emails contained a number of loaded phrases: > My heart is crying and has only dream to see you and to be with you… > > I am very ashamed to ask for your help > > My love, I hope you really understand that your help is vital for me! I rely only on you! You are my moral support and confidence in the future! > > I cried all day today after I read your letter. > > Did I deserve this? Why do I always have to prove something to you? > > You're making me cry with your suspicions > > You offend me with your doubts and distrust The last four in that list, of course, came after Andy raised the possibility that this was all a scam. Having (hopefully) built the feelings of a relationship, the scammer weaponises them to try and ensure compliance. After all, if you've fallen head over heels as part of a whirlwind romance (or mistaken lust for the same), the words "You're making me cry" will likely go a **long** way towards changing your behaviour - humans that have been swept off their feet tend not to be particularly rational. This also means that if, like Andy, a victim becomes aware that they're being scammed, the "break up" will be _extremely_ painful and difficult: the scammer will keep replying (and trying to convince) for as long as the victim is willing to keep engaging. All that the victim can do is walk away (and, of course, be aware that their email address is going to be considered a viable target for future campaigns). * * * ### Section 5: Gathering Information This section focuses more on analysing the materials that the scammer sent over, as well as looking for additional information. * * * #### Click My Link After I'd included photos in a reply, Aidana had an additional request: > I will be very pleased if you'll send me as many of your photos, photos of your home or friends I sensed that this gave me a potential opportunity.
I bought a suitably named domain, pointed it at a cloudy VM and built a very simple page which, before redirecting onwards, ran the following javascript function recordStats(){ inf = btoa(JSON.stringify(gatherInfo())); img = document.createElement('img'); img.src = "house.gif?q=" + inf; img.className = "pix"; document.body.appendChild(img); }; function gatherInfo(){ return { platform: navigator.platform, lang: navigator.language, timezone: new Date().getTimezoneOffset(), ua: navigator.userAgent } } recordStats(); For those who don't JS, what this does is * Capture the OS name, configured language, timezone and user-agent * JSON encapsulate it, base64 encode that and embed an image with the base64 value as the query string (resulting in the base64 string being included in a request to my server). For those who are thinking "you could have collected most of that from request headers", you're right, it's an _extremely_ limited set of collection. It was a toe in the water, though, to test whether they'd be willing to visit a link that I controlled. Although it seemed unlikely that they'd closely inspect links, I didn't want to burn bridges by pushing my luck too far on the first attempt. Note: as this was intended as a test of trust, and only collected information, I didn't feel that it was breaking Rule 1. I replied to Aidana and included my link saying that it was the original listing for my home: > I love your photos, I'll try and find some of me that I actually like for you. In the meantime, it looks like the original listing for my home is still available on the net, so you can see just how wonderful my surroundings are: `$REDACTED_LINK` Relaxing in the garden is really hard to beat. I wasn't _at all_ sure whether they'd click the link - there was no **need** for them to, but doing so _would_ also help them assess Andy's worth as a mark. The next day, a reply came through and, sure enough, my access logs showed that they'd clicked the link: 194.147.xxx.xxx - - [06/Feb/2025:13:00:59 +0000] "GET /[REDACTED] HTTP/1.1" 200 581 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36" "-" "REDACTED" CACHE_- 0.000 tempvm - "-" "-" "-" 194.147.xxx.xxx - - [06/Feb/2025:13:00:59 +0000] "GET /house.gif?q=eyJwbGF0Zm9ybSI6IldpbjMyIiwibGFuZyI6InJ1LVJVIiwidGltZXpvbmUiOi0xODAsInVhIjoiTW96aWxsYS81LjAgKFdpbmRvd3MgTlQgMTAuMDsgV2luNjQ7IHg2NCkgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzEzMi4wLjAuMCBTYWZhcmkvNTM3LjM2In0= HTTP/1.1" 404 196 "REDACTED" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36" "-" "REDACTED" CACHE_- 0.000 tempvm - "-" "-" "-" If we base64 decode and pretty print the query string, we get the following: { "platform": "Win32", "lang": "ru-RU", "timezone": -180, "ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36" } This confirms the `GMT+0300` timezone that we'd observed in the mail headers. The `lang` value is an RFC 5646 language string and tells us that the scammer's browser is set to use Russian. The user-agent tells us that they're running Chrome on either Windows 10 or 11 (the OS version got frozen in UA's after Windows 10). 
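For anyone wanting to reproduce that decoding step, a few lines of Python (standard library only; the variable names are just illustrative) will reverse the javascript's `btoa(JSON.stringify(...))` packing - the value below is the one captured in the log lines above:

    import base64
    import json

    # The beacon value lifted from the access log's query string
    q = "eyJwbGF0Zm9ybSI6IldpbjMyIiwibGFuZyI6InJ1LVJVIiwidGltZXpvbmUiOi0xODAsInVhIjoiTW96aWxsYS81LjAgKFdpbmRvd3MgTlQgMTAuMDsgV2luNjQ7IHg2NCkgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzEzMi4wLjAuMCBTYWZhcmkvNTM3LjM2In0="

    # Undo the base64 encoding and parse the JSON that the page built
    info = json.loads(base64.b64decode(q))
    print(json.dumps(info, indent=2))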
Knowing that the scammer speaks Russian allows us to cross most entries off our list of countries in UTC+3: This leaves us with Ukraine, Belarus and Russia as likely source locations (of course, it could also be a Russian overseas, but there's no good way to account for that possibility). The other thing that was _mildly_ interesting is that the click came immediately before the reply was sent. Because replies to different accounts arrived throughout the afternoon, I had originally theorised that maybe they were replying in bulk and randomising send times, but this suggested that that was not the case. * * * #### Finding The Source Of The Images Reminder: **the images are stolen and this is not actually the person that you might be speaking to** When I first set out on this journey, I did a quick reverse image search of the images attached to the opening emails, but got no meaningful results back. By the time that I was some way into the process, though, "Aidana" had sent me quite a few more photos, so I used standard OSINT techniques to help narrow down my search, ultimately finding the Instagram profile of the user that they'd been stolen from. The woman in the photos is a Russian personal trainer. To avoid being the cause of some angry victim ending up in her DMs, I'm not naming her here, however I _have_ sent a warning that her images are being nicked so that she's not taken by surprise. The real individual being a PT explained why "Aidana" kept mentioning going to the gym - the Instagram profile is **full** of work-out pictures and videos, so the scammers have adopted part of the persona of the person that they're impersonating. Unsurprisingly, "Aidana" eventually started attaching videos of herself during a workout - again ripped from the PT's profile. * * * ##### Examining Changes To Images My earlier examination of metadata had shown that the scammer had edited at least some of the photos (we know it must have been the scammer rather than the PT, because Meta routinely strips metadata from images - really, "Meta" is a bit of an ironic choice of business name). I wanted to understand what they'd changed, why _and when_. Each of the images that Aidana had sent appeared somewhere on the PT's profile, so I grabbed a copy of each for comparison. Remember this image? The reason that the framing feels a bit tight is that the original actually has **two** other people in it. With this knowledge, if you look closely at the cropped version you can see another woman's thighs and part of an arm: Although clearly cropped, this photo didn't contain metadata indicating that it had been edited in Photoshop, so I moved onto one that did - the pink slippers photo: Looking at the original showed that the image had also been cropped, but it wasn't immediately clear _why_ - all that had been cropped out was some food and a nondescript bottle of bubbly. Looking a little closer, though, showed that cropping wasn't what had motivated the edit: She's wearing a wedding ring! The scammers removed it so that marks wouldn't end up thinking that they're talking to someone who isn't single. One of the images that the scammers sent in later emails also contained metadata indicating photoshopping. A quick check revealed that, once again, it had been edited to remove a ring: The metadata on this image, though, indicated that it was edited back in June 2024: When they _finally_ emailed to ask for money, the scammer included a photo of her holding a sign saying "Help me please my love". 
Aside from the words having been shopped in, comparing it to the original shows a number of other discrepancies The metadata for this image shows that it was photoshopped on the 2nd of July 2024. All of this information tells us a few things * The scammers have been mining the PT's profile for quite some time * They continue to track her profile and steal content - the pink slippers photo was posted to her profile for this years New Year * It's quite likely their first successful use of this campaign was in July 2024 (we've no way to know whether the mark paid, but someone presumably reached the final stages, requiring that the "help" image be created). * * * #### Carl gets dumped I mentioned earlier that Carl's spelling degraded over time. Eventually he ended up writing the following > I no wot you mean about strange but pleasant emotions, cos I'm feelin them too - u were my last fort before I fell asleep and my first when i woke up. I know that it sound cringey but ur a bootiful person Aidana an Im so happy if Im makin life feel better for u like u are me. I really feel like ur my favourite person an I'm oldin our talks very dear The initial drop-off in Carl's spelling hadn't been _entirely_ deliberate, but once it started I decided to let it happen to see how much of a headache they were willing to put up with. As a native english speaker, the passage above is hard going, but readable. Someone less fluent will struggle with it because they'll have to stop and think what the words sound like, but they _should_ make it through. Machine translation into Russian (and back, for my benefit), though, leads to something bordering on nonsensical That paragraph seems to have been the final straw: despite having responded, without issue, to every other mail that Carl had sent, Aidana's next email complained that Carl hadn't replied and said that his messages kept ending up in spam > Hello my dear Carl! It made me very sad to not have a letter from you again. Are you busy? Or are there any other reasons for your silence? Maybe you didn't get my letter? I ask you to check your Spam folder. Your letters are also very often in my spam folder. I hope that today I will receive letter from you. Carl replied to say that he had mailed and re-stated the relevant parts, but Aidana stopped replying. Poor old Carl had been dumped, ghosted even! The key thing about this technique is that it provides a non-confrontational way for the scammer to end things. After sending the initial reply, the scammer can simply filter out any further replies, leaving the mark thinking that it's technology that has thwarted their chances, rather than feeling they've been dumped. Obviously, this isn't done out of care for the mark's feelings: it's likely done so that the mark won't feel reluctant to re-engage if the scammers change their mind later. Sure enough, this **wasn't** the end for Carl, because a couple of weeks later a new possible love emailed him: Aside from the fact that it comes from a hacked account, this opener doesn't follow the same pattern as the original. Not only is the writing style different, but this time, rather than asking the mark to reply to a different address, it uses `Reply-To` so that their mail client quietly does it for them: Reply-To: Tosy <sikeruniplunvv@gmx.com> From: <REDACTED> Subject: you are in my heart., The image attached to this mail was much easier to track down than the ones used in the original campaign - it was a profile shot of (Russian) model Diana Korkunova. 
These differences suggest that Carl's details might well have been passed (or, more likely, sold) on to someone else. Andy's account ultimately started receiving these openers too. * * * #### Scammer's System Having exchanged a few emails since my original link, I decided that it was time to have Andy send another in order to try and collect some additional information. The previous link click had provided potentially useful information: the scammer was using a Microsoft Windows machine. Windows has a long history of being **awful** for VPN leaks, opening the possibility that I _might_ be able to induce the scammer's computer into revealing more of their real location. However, doing so would likely require that the scammer be on the page for longer than had been needed for my original redirect and I was worried about blowing Andy's cover. So, I decided to take advantage of (some of) the social harm that Cloudflare has visited upon web users: Thanks to solutions like Cloudflare Turnstile, users are now quite used to page visits being delayed whilst a seemingly unrelated provider checks (or claims to check) that they're human. In fact, I suspect most people are probably just relieved that the page doesn't also show ads... a sad state of affairs really. I designed a simple interstitial page which appeared to be an automated captcha process, explaining away an additional couple of seconds' delay before redirecting on to a page that Andy would be talking about: There should be a video here, you may need to visit in a browser Whilst presenting the visitor with a low-effort but semi-plausible interface, the underlying javascript collected additional information about the visiting browser and used a handful of techniques to try and induce VPN leakage. Frustratingly, it took a few attempts to get them to actually click a link, but I was eventually successful by asking > On that topic, your wonderful news meant that you didn't answer my earlier question. Which of these would you rather be collected in? car one or car two For the record, Aidana preferred the more expensive of the two Aston Martins. I wasn't _entirely_ sure which way _that_ was going to go: it's not _really_ in the scammer's interest for a mark to go out and splurge £70k on a car, because it reduces the money available for stealing. But, I guess they figure that you've probably got access to more money anyway. The link clicks allowed me to ascertain the following: * The system was still a Windows box in a UTC+3 timezone * It still had Russian configured as the language * It had the same user-agent as before * There was no DNS leakage: DNS queries all egressed via the known VPN endpoint * They had Google DNS configured as their resolver * They don't have IPv6 connectivity (**or** their VPN client explicitly disables it) * They were using a connection which averaged `1.6 Mbps` across recent connections * Across all their browser's recent connections, the average round trip time was `250ms` (quite high) * Their system had 8 CPU cores * The browser's reported usable resolution was 1280x760 * Chrome was not in headless mode The browser also reported a connection type of `3g`, however this is a little misleading: the Network Information API spec only provides for cellular connections and chooses between them based on RTT. So, the system could just as easily have been connected to Wi-Fi. The resolution that javascript reports is the portion of the screen that's available to display the webpage, so doesn't include things like the address bar.
If we allow 40 pixels for that, we arrive at the resolution of WXGA (`1280x800`), which has historically been used by a few tablets and some older laptops. This was flying a little close to Rule 1, though, so I decided not to push it further. * * * #### Fingerprinting The Scammer's Mail User Agent (MUA) Back to passive analysis. It used to be that MUAs would include a `User-Agent` header (hell, most webmail clients even used to include an `X-originating-IP`) but, over time, those have fallen by the wayside for privacy reasons. However, that _doesn't_ make it impossible to identify what mail client the scammer is using, because most mail clients handle formatting slightly differently. So, if we grab a set of headers from one of Aidana's emails: Date: Tue, 4 Feb 2025 14:21:08 +0300 From: Aidana <REDACTED> Reply-To: Aidana <REDACTED> Message-ID: <155055577.20250204142108@gmail.com> To: <REDACTED> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----------069134143049E2085" Note: I've stripped `In-Reply-To` etc. for brevity. The first thing that we can see is that the value of `Message-ID` seems to contain a timestamp: Message-ID: <155055577.20250204142108@gmail.com> The format of `Message-ID` is known to differ between MUAs, so we can use a resource like Forensics Wiki to help narrow down which client might have been used. The only MUA in that list which generates an ID of that format is The Bat!. `The Bat!` is a secure email client for Windows 10, which features at-rest encryption of mail. It's easy to see how that might appeal to a scammer and the use of Windows 10 is something that we already know about. However, we can't _reliably_ conclude that this is the MUA in use based on a single data point, so we need to go back to the mail. The other thing which commonly varies between MUAs is the format of boundary markers. Outlook and Microsoft Mail, for example, use a marker of the format `----=_NextPart_{long string}`. Unfortunately, Forensics Wiki didn't have an example of the boundary marker format used by `The Bat!`. However, the authors of `The Bat!` offer a free trial, so I span up a cloudy Windows instance and downloaded a copy before emailing myself: At the receiving end, the headers look like this: From: B <REDACTED> Message-ID: <381612535.20250214122406@gmail.com> To: REDACTED Subject: Hello World! MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----------0DB0781421434046D" That boundary format looks pretty familiar - both my test mail and the scammer's use a short string of hex-encoded bytes. * * * ##### Scheduled Sending This brings us to an interesting shortcoming that I noticed in one of `The Bat!`'s features. The client has a feature called Postponed Sending which (as the name suggests) allows the sending of any given mail to be delayed (you can even set a mailbox-wide default): Earlier in this post, I commented that: > Because replies to different accounts arrived throughout the afternoon, I had originally theorised that maybe they were replying in bulk and randomising send times The postponed sending functionality would be pretty much _perfect_ for this.
I used `The Bat!` to send a delayed email to myself (using the values in the screenshot above): Date: Fri, 14 Feb 2025 12:35:00 +0000 From: B <REDACTED> Message-ID: <1603024618.20250214123320@gmail.com> To: b <REDACTED> Subject: Test 2 There was now an irregularity in the headers, because although the `Date` header matched the scheduled send time, the timestamp in `Message-ID` still reflected when I hit the `Send` button. Presumably, `The Bat!` generates `Message-ID` when the mail is first moved to the outbox. The features page implies that postponed send is only really intended as a convenience function, so it makes sense that they don't bother to rewrite the other headers. * * * ##### Confirming the MUA I now had three points of comparison which allowed me to say, with some confidence, that the scammer was using `The Bat!` as their mail client, because in their emails: * `Message-ID` followed a format that's documented (and observed) as being in use by `The Bat!` * Multipart boundary strings took a format known to be used by `The Bat!` * The metadata of (some) messages contained a discrepancy associated with `The Bat!`'s postponed send functionality Use of this MUA also suggests an explanation as to why the scammer hadn't picked up on my mega-tetris "photos": `The Bat!` can be configured to not download attachments by default, which would allow the scammer to confirm that _something_ had been attached, without having to waste bandwidth, time or local storage by downloading it. `The Bat!` also supports dynamic templating, allowing email bodies to be populated using macros. This feature _could_ provide a potential explanation for why emails to different personas used different words to tell the same story. * * * #### Message Timings The metadata inconsistency also allows us to get an idea of the scammer's habits, because it enables us to see what time of day they're _really_ replying to emails. I wrote a quick Processing Engine plugin to extract the relevant headers from emails, convert them into timestamps and write them into InfluxDB 3 Core (there's a full copy of this script in my article-scripts repo): mbox = mailbox.mbox(args["mbox"]) influxdb3_local.warn("iterate") # Iterate through messages for i, message in enumerate(mbox): if sender_name in message['from']: subject = message['subject'] # Turn the message date header into a timestamp mdate = mktime_tz(parsedate_tz(message['Date'])) # Grab and strip a copy of the message ID for the tag # value message_id = message['Message-ID'].strip("<>") # Chop up the message ID # these are of the format # aaaaa.YYYYMMDDHHMMSS@somewhere mid = message['Message-ID'].split(".")[1].split("@")[0] # Parse and adjust to be UTC # then convert to ns timestamp mid_dt = datetime.strptime(mid, '%Y%m%d%H%M%S')# - timedelta(hours=TZ_OFFSET) mid_ts = int(mid_dt.strftime('%s')) # Calculate whether it differs so that we can set a tag # We allow a couple of seconds leeway just in case the MUA is racey differs = (mdate - mid_ts) > 2 # Create a line recording both. The timestamp is taken from "Date" lp = LineBuilder("scammailtimings").tag("account", account).tag("differs", differs).tag("messageID", message_id).int64_field("id_date", mid_ts).int64_field("date", mdate).int64_field("count", 1).time_ns(mdate * 1000000000) # Write influxdb3_local.write(lp) I added the accounts into Thunderbird and passed the script the path to Thunderbird's local store.
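The plugin excerpt above relies on the Processing Engine providing `args`, `influxdb3_local` and `LineBuilder`, so, for anyone wanting to try the underlying checks outside of InfluxDB, the following is a minimal standalone sketch (the filename, function name and two-second tolerance are illustrative rather than taken from my script). It flags whether a single saved message carries `The Bat!`-style markers and whether its `Date` header disagrees with the `Message-ID` timestamp in the way that postponed sending leaves behind:

    import email
    import re
    from datetime import datetime
    from email.utils import parsedate_to_datetime

    # Markers observed earlier: a Message-ID whose second component is a
    # YYYYMMDDHHMMSS timestamp, and a multipart boundary of ten dashes
    # followed by a short run of hex digits
    MSGID_RE = re.compile(r"<\d+\.(\d{14})@")
    BOUNDARY_RE = re.compile(r"^-{10}[0-9A-Fa-f]+$")

    def inspect(path):
        """Check a single raw message for The Bat!-style markers and for the
        Date vs Message-ID discrepancy left behind by postponed sending."""
        with open(path, "rb") as f:
            msg = email.message_from_binary_file(f)

        msgid_match = MSGID_RE.search(msg.get("Message-ID", ""))
        boundary = msg.get_param("boundary", "", header="Content-Type") or ""
        looks_like_the_bat = bool(msgid_match and BOUNDARY_RE.match(boundary))

        postponed = None
        if msgid_match and msg.get("Date"):
            # Both values are in the sender's local time, so compare them as
            # wall-clock datetimes (allowing a couple of seconds of leeway)
            sent = parsedate_to_datetime(msg["Date"]).replace(tzinfo=None)
            composed = datetime.strptime(msgid_match.group(1), "%Y%m%d%H%M%S")
            postponed = (sent - composed).total_seconds() > 2

        return {"the_bat_markers": looks_like_the_bat, "postponed_send": postponed}

    # Hypothetical usage, pointing at a message saved out of the mailbox
    print(inspect("message.eml"))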
Technically, it's possible to fetch the mail directly using `imaplib` but, to connect to Gmail, you have to jump through a bunch of OAuth hoops and I didn't really feel it was worth the effort. I added a scheduled plugin trigger for each account so that the script would periodically check for messages and automatically update the database: docker exec -it timings influxdb3 create trigger \ --trigger-spec "every:60m" \ --plugin-filename "timings.py" \ --trigger-arguments "mbox=/mail/INBOX,account=andy,sender=Aidana" \ --database timingsdb \ email-timings Having the data in InfluxDB allowed me to easily compare patterns across recipients. For example, the following looks at how much variance there is between Andy and Carl (before he got dumped): SELECT -- convert the ID date to nanosecond epoch -- and from there to a time column to_timestamp(id_date*1000000000) as time, account, count(account) as count FROM scammailtimings WHERE account IN ('andy', 'carl') GROUP by account, id_date ORDER BY time ASC This generates the following graph: If you look closely, you can see that, whilst Carl was active, the bars were pretty close together. This tells us that the scammer consistently hit "Send" in their MUA at around the same time for each recipient. We can then change the query to look at the `Date` header (i.e. when the scammer scheduled the mail to send): SELECT to_timestamp(date*1000000000) as time, account, count(account) as count FROM scammailtimings WHERE account IN ('andy', 'carl') GROUP by account, date ORDER BY time ASC Initially, there were some quite significant differences but, in the run up to Carl getting dumped, they drew closer together. The first emails were written in the early afternoon but scheduled for sending in the small hours of the next day. Running the same query across all of my personas shows a similar pattern across the board: scheduled send times vary but the **actual** send times of emails to each persona are all _extremely_ close to one another. This data tells us that: * There's probably one person behind the emails * They likely do all of their day's work in one sitting * They do so at different times of the day. Although they still make up a non-negligible proportion, delayed sends are in the minority: SELECT differs, count("date") FROM scammailtimings GROUP BY differs It's not entirely clear why, but scheduled sending is primarily used in the early stages - I observed the same happening in a later campaign. * * * ##### Email Hours of Activity We can also use the data to check whether the scammer has any particular hours of operation during the day. This query provides a count of how many days saw activity in the given hour: SELECT COUNT(val) as days_active, hour FROM ( SELECT extract(hour, to_timestamp(id_date*1000000000)) as hour, 1 as val FROM scammailtimings ) GROUP BY hour ORDER BY hour ASC The data can also be used to generate a heatmap to show whether there are any patterns on particular weekdays: As with the previous queries, the value is the number of days which had at least one email sent, rather than a count of emails sent: * * * #### Travel Info and PDF Generation As the playbook progressed, Aidana talked about her upcoming flight to the UK.
So that Andy would know when he needed to collect her from the airport, she provided a PDF copy of her e-ticket: The first thing of note here is that the flight numbers and timings are real; I was able to look them up on the Turkish Airlines website: The e-ticket itself is, of course, fake and there are some inconsistencies within it (the QR code contains text about skyscanner.ru and the baggage allowance is wrong). Using `pdfinfo` to dump the PDF's metadata also revealed some interesting intel: $ pdfinfo 20250221_attachment_2.pdf Title: Microsoft Word - e-tickets London Author: <redacted> Creator: Bullzip PDF Printer (11.10.0.2761) Producer: PDF Printer / www.bullzip.com / FG / Freeware Edition (max 10 users) CreationDate: Thu Feb 20 16:26:39 2025 GMT ModDate: Thu Feb 20 16:26:39 2025 GMT The author info contained a common Russian name. I've redacted it because of Rule 2. One of the final emails included another PDF that purported to be a bank statement - the metadata on that indicated that it was created by the same author. * * * #### Passport Photo The email that had provided the e-ticket also had an image purporting to be of Aidana's passport (the `Fake` markings are mine): Unsurprisingly, it had been photoshopped; however, the metadata on it was more interesting than on most of the other images. All of the previous images had been stolen from an Instagram profile. Instagram routinely strips source metadata (because Meta want to be the only ones hoovering up other people's data). This image, however, hadn't passed through Meta's bowels and so still had its original metadata: The photo was originally taken in January 2023 on an iPhone 7 Plus. There was no geo-location metadata, but further down the (long) list of entries was the original timezone offset: `+0600`: The number of regions in UTC+6 is much smaller than in UTC+3: Omsk Oblast borders Kazakhstan, so it's quite possible that the original photo was taken during "extra checks" of a Kazakh national's crossing. Russia is also the only country which appeared in our UTC+0300 list: Whilst it's _possible_ that a passport photo might get bandied about internationally, it seems fair to conclude that the author of the emails is likely to be in Russia rather than Belarus or Ukraine. * * * #### The Visa Photo As Aidana's plans began to fall apart, to help reassure (and convince) Andy, she sent a photo of her visa to work in the UK. Just like the passport photo, it hadn't passed through Meta and so had its original metadata intact. The metadata suggested that the image had been photoshopped by someone _other than_ our scammer: Not only did it report a different version of Photoshop (Creative Suite 2), but the modification date indicated that the photoshopper was in a different timezone to the scammer: The metadata also showed that the original photo had been taken in that same timezone (using a Meizu M6S). This suggested that the work had been outsourced to someone more able to convincingly photoshop official documents (who, presumably, has their own set of base images to work from). There was definitely some attention to detail: the scammers had even ensured that the passport number on their fake visa matched the one on their fake passport. However, the age of the original photo is, _technically_, an issue. The photo was taken in February 2023, but the UK changed the visa vignette later that year.
This means that there are a number of observable differences between the provided visa and any issued this year: Of course, It's _extremely_ unlikely that a target would spot this issue and the visa is ultimately a pretty convincing fake. * * * #### Fake Dental Practice Website Eventually, Aidana provided me with a link to the website of her employer: > I forgot to give you the website ckdent.tilda.ws of the clinic where I work in Almaty. The bits that I've blurred are both the same phone number as Aidana had provided in an earlier email. Interestingly, the links at the top didn't work - clicking Специалисты (Specialists) updated the page fragment but didn't take me anywhere. Scrolling down though revealed a section with staff images: Айдана is Aidana, the profile text describes her as a "tooth fairy for children". Checking the `Last-Modified` header for this image shows that it was uploaded in June 2024 So, they've likely been using this site for a while: June 2024 is also when they photoshopped a ring out of one of the PT's images and is slightly before the photo of a handwritten note was created. This supports the idea that they developed this persona and storyline around that time. If we check in Google Maps, we can see that they're impersonating a real business - there **is** a Dentist at the listed address Unsurprisingly, their contact number is **not** the same as on the site that the scammer sent. The **other** thing that's important here is that this website appears to be used for other campaigns too. For example "Svetlana" also claims to live in Almaty, works as a dentist and looks _quite a lot_ like one of the staff photos. * * * ### Section 6: Corroborating Information From Other Campaigns At the beginning of this post, I mentioned that recruitment for another identical looking campaign started up in February: I had some personas respond to this too, to confirm whether the observed behaviour seemed to align with what I've written in this post. However, it yielded much more than this: the first reply came through whilst I was working on drafting this post and there were some real similarities in the mail headers: Received: from [192.168.0.103] ([188.187.xxx.xxx]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-43b7371701esm125442985e9.17.2025.03.02.05.30.57 for <redacted> ... snip ... Date: Sun, 2 Mar 2025 10:21:11 +0300 From: "viktory23@milen2.com" <viktory23@milen2.com> To: George <redacted> Message-ID: <1711410497.20250302102111@milen2.com> Subject: Greetings! MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----------06D0E00071963CB3A" We can tell from `Message-ID` and the boundary format that the author is using `The Bat!`. However, there's a slight difference in the `Received` header - this time the connection did **not** come via the VPN box. In fact, that IP belongs to the consumer subnet of a Russian ISP: inetnum: 188.187.168.0 - 188.187.175.255 netname: ERTH-YOLA-PPPOE-3-NET descr: CJSC "Company "ER-Telecom" Yoshkar-Ola descr: Yoshkar-Ola, Russia descr: PPPoE individual customers country: RU admin-c: ETHD12-RIPE org: ORG-CHYB3-RIPE Yoshkar-Ola is the capital of Mari El which sits quite nicely in a UTC+3 timezone. Fun fact: it's also the location of the prison where Prigozhin was filmed promising convicts release if they fought in Russia's attack on Ukraine. 
Ukraine is doubly relevant here because the scammers are also using photos of Ukrainian blogger and model Natalia Krasiuk, and the cheeky bastards are even sending out a photo of her with Ukrainian flag ribbons tied in her hair. Getting back to the mail headers, though, there are two likely possibilities here * This is another scammer, part of the same (local) group as the first * This is the original scammer and they forgot to enable their VPN before replying I replied a couple of times to see whether messages eventually started arriving via the original box, but they did not. Other evidence also supports the idea that they were being sent by a different person: * The messages routinely lacked `In-Reply-To` headers * Not replying didn't lead to me getting told off, the scammer just kept replying and progressing Vika's storyline (even referring to replies that I hadn't sent!) Rule 2 loomed large though, so I didn't dig too far. * * * #### Common Backstory Elements Although the reason for it differs, the personas used by the scammers often have no family: their parents are usually dead or estranged. > Today I want to tell you about my family and childhood. I didn't see my mom and dad. Dad was in a car accident when Mom was pregnant. When my mother gave birth to me, she died. I always dreamed of having parents like other children. When I was 4 years old, my uncle took custody. > > I lived with him for two years. But then he began to abuse alcohol. And he was deprived of custody rights. It seems likely that this is intended to convey that they have no real ties to their current country, so would be free to come and live with the victim. * * * ### Section 7: Conclusion Stitching it all together then: there's a reasonable amount of evidence to suggest that the scammer that I was conversing with is Russian. However, they do not appear to be operating entirely on their own - at the very least, they solicited the assistance of a photoshopper in order to generate some (visibly convincing) forgeries of official documents. From onboarding to money request, the scam took about 3 weeks to play out, though it _felt_ as though I'd been talking to them much longer. During the course of the scam, I received multiple phone calls (some pre-announced, some not) from my Kazakh "girlfriend", along with a call from someone purporting to be an official running safety checks. These calls served no purpose other than to try and reinforce my belief in the storyline. The scam is designed to entrap the lonely and the vulnerable: it's quite easy to see how someone who **wants** to believe that they're finding a companion for life could be drawn in before eventually being manipulated into sending money. For all that I've documented here, many of the intended targets of this type of scam simply don't stand a chance, because the scammer spends _weeks_ building trust before hitting an "unexpected" hump in the road. Even if a victim's suspicions do get aroused, the materials that the scammers provide contain _just enough_ truth to stand up to basic scrutiny. A cursory search would validate the airline ticket, the certificate from the EMA and even the address of Aidana's claimed employer. Where that isn't enough, the scammer uses emotional manipulation to try and get the victim over the line: she's given so much, _how could you_ doubt her? Presumably, the storyline and the techniques must work because the evidence suggests that they've been using the Russian PT's images in their scams since _at least_ June 2024. 
Although observing my rules of engagement meant that I didn't go too far down the rabbit-hole, analysis of the scammer's communications revealed a bit about their habits, including which mail client they use and the fact that they sometimes use its delayed send functionality. Much like bad spelling in 419 emails, the opener acts as a sort of filter: you're either going to fall for it or you're not. Presumably, this gives the scammer something of a false sense of security. They were more readily trusting and certainly leaked quite a bit more metadata than I had been expecting - despite Rule 2 meaning that I didn't go looking for identities, I _still_ ended up with a name and a home IP being essentially handed to me. Of course, it might also just be that they don't care - it's not as if the Russian police are ever going to be knocking on their door, after all.

New #Blog: My Scammer Girlfriend: Baiting A Romance Fraudster
Author: Ben Tasker

https://www.bentasker.co.uk/posts/blog/security/seducing-a-romance-scammer.html?utm_source=rss

#infosec #osint #scams #security

11.03.2025 13:00 — 👍 0    🔁 1    💬 0    📌 0
Preview
Geoblocking the UK with BunnyCDN I recently wrote about performing an Online Safety Act assessment for my Mastodon server. In that post, I noted that Ofcom's vague statements meant that they could feasibly conclude that another of my sites might _potentially_ fall in scope of Part 5. Since then, I've performed an assessment and ultimately concluded that it's impossible to say whether Ofcom would consider it as being in scope or not. Given the potential for multi-million pound fines, the only **safe** way to proceed is to assume that they _would_ consider it in scope and so would have expectations around compliance. The problem with that, is that it's impossible to comply with demands for "highly effective" age verification without significantly infringing the privacy of everyone who visits. It's not like a licensed premises where you check anyone who _looks_ sufficiently young, verifying that no online user is underage means that you have to track the age **of everyone**. That's a disaster waiting to happen, and I'll not be a part of it. Instead, I've taken the decision to move the site _definitively_ out of scope by geo-blocking UK users. This post describes how to configure BunnyCDN to geoblock requests from specific countries. I'll describe both how to block and how to redirect them to a page explaining the reasons behind the block. * * * #### Blocking Rule To create a rule which simply blocks the request: * Browse to your CDN pull zone in Bunny's dashboard * Choose Edge Rules * Provide a meaningful name * Set the action to Block In the match rules section: * Match All * Country Code (2 Letters) * GB Once ready, click `Save Edge Rule`. * * * #### Redirect Rule Simply blocking users is effective, but potentially comes with a support burden: if you're reachable via other means (say, social media), you may find that you start getting messages from confused visitors saying that your site is broken. To avoid that, I created a page to explain the block and suggest that, if they want to bug anyone, visitors should talk to their MP. Technically, it's possible to serve the blockpage from the same domain, but I didn't want to have to mess around with a complex ruleset, so decided to serve it off a different domain (`www.bentasker.co.uk`) instead. To create a redirect: * Browse to your CDN pull zone in Bunny's dashboard * Choose Edge Rules * Provide a meaningful name * Set the action to `Redirect` * Enter the URL to redirect to * Set the status code to either `302` or `307` Then, in the match rules section: * Match All * Country Code (2 Letters) * GB * * * #### Adding Exceptions The reason that we chose `Match all` rather than `Match any` is that it allows us to add additional rules to create exceptions. For example, I'd _quite like_ for my home address to still be able to access the site (I don't count as a user under the OSA because I'm the provider). Because the rulesets are set to `Match All`, we can achieve this by creating a condition which will not match for our excluded IPs: * Click `Add condition` * Set the match type to `Remote IP` * Click `Match none` * Add an IP that you'd like to exclude You can add additional IPs by clicking `Add Property`: Any IP included in this rule will **not** be redirected. * * * #### Bonus: Block Page Status Having created my block page, I decided that I wanted to ensure that it was served with an appropriate status code. HTTP/2 451 `HTTP 451: Unavailable For Legal Reasons` was mooted in RFC 7725 and seems the most appropriate here. 
My site is _also_ served by BunnyCDN, so I decided to add the override there * Browse to pull zone * Edge Rules Then * Set the action to `Set Status Code` * Set `Status Code` to `451` Create a condition * Request URL * Set the URL to be the URL of your block page * * * #### Conclusion It is, all things considered, a fairly ridiculous situation. It's not as if I'm operating Pornhub, it's a small archive of photos from when I was more active as a photographer. It's not even _particularly_ focused on model photography: there's architecture, vehicles and even flowers mixed in. Personally, I don't think that it would fall within Parliament's intended scope, because the act's wording relies on the creator's intent > content of such a nature that it is reasonable to assume that it was produced solely or principally for the purpose of sexual arousal Unfortunately, Ofcom have said that they believe it is, instead, driven by context > Whether content has been produced either solely or principally for the purpose of sexual arousal is likely to be dependent on the nature of the content itself, having taken the relevant contextual factors into account, rather than the intent of the uploading user or any viewer of it. As I noted in my previous post on the subject, this leaves a ton of unanswered questions * Is explicit content OK in a non-sexual context? If not, at what point does content become inherently in-scope? * Do viewer interactions alter the context? If users have left sexually suggestive comments, could that push something in scope? Ofcom's stated position is, essentially, that it'll all get ironed out in court. No-one sensible is going to want to be the test case for _that_ , so all we have in the meantime is chilling effects.

New #Documentation: Geoblocking the UK with BunnyCDN
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/general/geoblocking-uk-users-with-bunnycdn.html?utm_source=rss

#bunnycdn #georestrictions #onlinesafetyact #web

10.03.2025 15:15 — 👍 0    🔁 0    💬 0    📌 0
Preview
Mini Cooper: Battery Replacement Earlier this week, the Mini's battery gave up. When the start-stop button was pressed, none of the usual warning lights illuminated, only the handbrake and airbag warning lights were lit. The starter motor didn't so much as click. I used the trusty Saab to jump start it and then put a multi-meter across the battery to check that the alternator was working: it registered `14.5V`, indicating that the alternator was healthy and generating a charge (though it's not definite, I **have** been caught out in the past). Wiping some dust off the top of the battery revealed a handwritten date: 19th October 2019. So the battery was over 5 years old. This post describes the process of swapping a battery out on the Mini Cooper. The car in the photos is a 2010 Mini Cooper One Convertible (R57). The process shouldn't take any longer than about 20 minutes (I was done in less than 15). * * * #### Battery Location The battery is located under a panel just below the windscreen: * * * #### Tools You'll need a 10mm socket and wrench. It's worth having a small flathead on hand too, in case the battery terminal clamps are stuck. * * * #### Process Lift the plastic cover off (it clicks into place around the edges) and pull the rubber seal out from around the bottom edge of the panel. Then remove the plastic retaining bolt to the right On the other side of the bonnet, just below the windscreen is another retaining bolt The cover should now be free, wiggle the far end round until you're able to pivot it out from underneath the bonnet. Slacken the retaining screw on the negative terminal clamp and then pull it up and out of the way Then do the same for the positive terminal There may be a tube coming out of the top of the battery on the left hand side (there isn't in the photo), if so, carefully disconnect it. Unscrew the retaining bolt at the front - this goes into a foot which helps hold the battery in place Tilt the battery backwards a little and then slide it forwards before lifting up and out * * * #### Installing The New One Sit the new battery in place, sliding it backwards so that it slides under the rear retaining lip. Pull the gas vent hose up the side and press into the vent on the side of the battery Sit the front retainer in place and then screw down Reconnect the positive terminal clamp and tighten its retaining screw. Then carefully reconnect the negative terminal (it will likely spark a little as you make contact) before tightening its retaining screw Work the panel back into place Put the front and rear retaining bolts back in and then click the cover back in place

Mini Cooper: Battery Replacement
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/vehicles/replacing-the-battery-on-a-bmw-mini-cooper.html?utm_source=rss

#battery #bmw #cars #mini #replacement #vehicles

07.03.2025 17:30 — 👍 0    🔁 0    💬 0    📌 0
Automatically Backing Up Zones Stored In ClouDNS I recently moved DNS providers and migrated my domains to using ClouDNS. What I didn't write about in that post, though, was setting up local backups of my zones. This was something that I previously had with DNSMadeEasy (though it had been a bit of a pain to set up), so I wanted to make sure that I retained that ability with ClouDNS. This post describes how to easily automate fetching of ClouDNS records in BIND format. * * * #### Authentication Credentials The ClouDNS API requires that you present some API-specific credentials. To create these, you: * Log in to ClouDNS.net * Go to `API & Resellers` * Under `API Users` click `Add New User` You're then prompted to provide a password and (optionally) an IP address to limit the user to. The option to restrict to specific IPs is a nice addition. Once you've saved the user, it will show up in the API users table along with its `auth-id` and (if you set one) the IP address(es) that the user must connect from: * * * #### Making Requests Once you've got an `auth-id` and password, making requests to the API is easy. The credentials are provided in either the `POST` body or the query string (tip: **never** put them in the query string, they're far more likely to end up getting logged there). For example, to list the DNS zones that ClouDNS hosts for us, we can do the following: curl \ -d "auth-id=${CLOUD_ID}&auth-password=${CLOUD_TOKEN}&page=1&rows-per-page=50" \ https://api.cloudns.net/dns/list-zones.json The response is a list of objects, each describing a configured zone: [ { "name": "bentasker.co.uk", "id": "894172", "type": "master", "group": "None", "hasBulk": false, "zone": "domain", "status": "1", "serial": "2025022423", "isUpdated": 1 }, Fetching records for a given zone is similarly easy: curl \ -d "auth-id=${CLOUD_ID}&auth-password=${CLOUD_TOKEN}&domain-name=${domain}" \ https://api.cloudns.net/dns/records-export.json With the format looking like this: { "status": "Success", "zone": "$ORIGIN bentasker.co.uk.\n@\t3600\tIN\tSOA\tpns61.cloudns.net. * * * #### Backup Script The following script expects that API credentials are provided in environment variables `CLOUD_TOKEN` and `CLOUD_ID`: #!/bin/bash # # Backup records from ClouDNS into a # BIND format zone file # # Backup ClouDNS curl -s -d "auth-id=${CLOUD_ID}&auth-password=${CLOUD_TOKEN}&page=1&rows-per-page=50" https://api.cloudns.net/dns/list-zones.json | jq -r '.[] | .name' | while read -r domain do curl -s -d "auth-id=${CLOUD_ID}&auth-password=${CLOUD_TOKEN}&domain-name=${domain}" https://api.cloudns.net/dns/records-export.json | jq -r '.zone' > ${domain}.zone done This uses `jq` to parse the JSON and write out a BIND format file: $ORIGIN bentasker.co.uk. @ 3600 IN SOA pns61.cloudns.net. support.cloudns.net. 2025022423 7200 1800 1209600 3600 @ 3600 IN NS pns61.cloudns.net. @ 3600 IN NS pns62.cloudns.com. @ 3600 IN NS pns63.cloudns.net. * * * #### Revision Control I've long wrapped my DNS backups in revision control - it means that if I'm making a substantial change, I can trigger a backup and then write a commit message which references the relevant ticket.
To do this, I create a `git` repo with a copy of the backup script in it git init dns_backup cd dns_backup # optional add remote: git remote add origin <url> # Copy and commit the backup script cp ~/cloudns_backup.sh ./ git add cloudns_backup.sh git commit -m "feat: add backup script" There's then a module in my backups which calls the script and commits any results: cd ~/dns-backup/ ./cloudns_backup.sh git add . git commit -m "chore: Auto changes detected" git push origin main * * * #### Conclusion Getting backups of ClouDNS data up and running is **really** straightforward. When I originally set them up for DNSMadeEasy, I had to spend quite a lot of time turning them into a format that'd be useful, whereas ClouDNS will happily spit a BIND format file straight out.

New #Documentation: Automatically Backing Up Zones Stored In ClouDNS
Author: Ben Tasker

https://www.bentasker.co.uk/posts/documentation/general/automatically-backing-up-cloudns-dns-records.html?utm_source=rss

#backups #cloudns #dns

02.03.2025 12:30 — 👍 0    🔁 0    💬 0    📌 0
The end of NATO, Televised This made for hard watching and seemed a significant enough moment to be worth noting. There should be a video here, you may need to visit in a browser In this video, we see a _literal_ hero being abused by a man who hid (from his own supporters) in a bunker and his VP: a man who even once referred to his new boss as America's Hitler. It seems pretty clear that it was an ambush: Vance went in hot and hard **because** they wanted to claim that Zelensky was being difficult, giving them a pretext under which to withdraw aid and support (and, oh look). > .. repeatedly talks over .. It's going to be very hard to do business like this What a cunt. Krasnov's handler will be pleased, though it looks like the rest of the world seems to be seeing it for the disgrace that it is. * * *

The end of NATO, Televised
Author: Ben Tasker

https://www.bentasker.co.uk/posts/videos/trump-and-vance-childlike-bullies.html?utm_source=rss

#ukraine #video

01.03.2025 00:00 — 👍 0    🔁 1    💬 0    📌 0