Mia 🏳️‍⚧️

@mia.pds.parakeet.at

Hi, I'm Mia! Trans (she/her) • programmer (Rust, ATProto @parakeet.at, sometimes more) • photographer • maths/stats nerd • resident of Normal Island. PFP: https://picrew.me/share?cd=QZKgROU6cC

163 Followers  |  71 Following  |  1,357 Posts  |  Joined: 26.02.2024  |  1.7968

Latest posts by mia.pds.parakeet.at on Bluesky

one day I’ll make something that doesn’t need at least two different database implementations running simultaneously. today is not that day

31.10.2025 23:32 - 👍 0    🔁 0    💬 0    📌 0
PDSls Browse the public data on atproto

yeah it's the mass tagging thing again. looks like that account fired off quite a few in the last hour or so pdsls.dev/at://did:plc...

31.10.2025 13:43 - 👍 2    🔁 0    💬 0    📌 0
screenshot of a bluesky user counter showing 40 million users

number go up

31.10.2025 08:35 - 👍 21    🔁 1    💬 0    📌 0

my kingdom for literally just any media ID in this goddamn file. it’s already over 300MB, you can spare an extra few chars per row for it. I have the playlist ID that was on at the time but that’s not bloody useful.

30.10.2025 22:48 - 👍 1    🔁 0    💬 0    📌 0

about ready to throw hands with the utter pillock at apple that decided this was a sensible format for the export.

almost like they don’t want you to use this data for anything…

30.10.2025 22:48 - 👍 0    🔁 0    💬 1    📌 0

(I’m not relying on lastfm data exclusively bc my phone and tablet don’t scrobble)

29.10.2025 22:55 - 👍 0    🔁 0    💬 0    📌 0

it’s currently three steps:
• import from csv into duckdb
• process, get metadata, store into staging table
• create records

allows me to fudge the metadata which I will 100% need to do and I’d rather do it in datagrip before pushing

29.10.2025 22:55 - 👍 0    🔁 0    💬 1    📌 0
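
Editor's sketch of the three steps described in the post above. It is a minimal illustration under loud assumptions, not the author's code: Python's built-in `sqlite3` stands in for DuckDB so the example runs without extra packages, and the CSV columns, table names, and record field names are all hypothetical. The point is the shape: raw import, an editable staging table, and record creation only after the manual fix-ups.

```python
import csv
import io
import sqlite3

# Hypothetical export rows; the real file's columns are not shown in the thread.
RAW_CSV = """track,artist,played_at
Song A,Artist X,2025-10-01T09:15:00Z
Song B,,2025-10-01T09:19:00Z
"""


def import_csv(conn, text):
    """Step 1: load the raw export rows untouched."""
    conn.execute("CREATE TABLE raw_plays (track TEXT, artist TEXT, played_at TEXT)")
    reader = csv.DictReader(io.StringIO(text))
    rows = [(r["track"], r["artist"], r["played_at"]) for r in reader]
    conn.executemany("INSERT INTO raw_plays VALUES (?, ?, ?)", rows)


def stage(conn):
    """Step 2: copy into a staging table where metadata can be corrected by hand."""
    conn.execute(
        "CREATE TABLE staging AS "
        "SELECT track, NULLIF(artist, '') AS artist, played_at FROM raw_plays"
    )


def create_records(conn):
    """Step 3: build records only from rows whose metadata is complete."""
    cur = conn.execute(
        "SELECT track, artist, played_at FROM staging "
        "WHERE artist IS NOT NULL ORDER BY played_at"
    )
    # Field names here are illustrative, not a real lexicon.
    return [{"track": t, "artist": a, "playedAt": p} for t, a, p in cur]


conn = sqlite3.connect(":memory:")  # sqlite3 is a stand-in for DuckDB
import_csv(conn, RAW_CSV)
stage(conn)
# The manual fix-up between staging and record creation (a SQL client in practice).
conn.execute("UPDATE staging SET artist = 'Artist Y' WHERE track = 'Song B'")
records = create_records(conn)
```

Keeping the raw table separate from staging means a bad manual edit can be undone by re-running step 2 rather than re-importing the export.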

This strat will likely break for plays crossing days but that’s not a thing I tend to do, so I can fudge it with lastfm data if that ever happened.

29.10.2025 22:55 - 👍 0    🔁 0    💬 1    📌 0

there’s a file with listen history to a daily lvl (kinda hourly but not really*) so I think imma link into that to smooth the data out (and I’ll get track ID that way too - for unknown reasons, activity doesn’t contain any unique media ID)

*a row for song, date + list of hours that song was played.

29.10.2025 22:55 - 👍 0    🔁 0    💬 1    📌 1
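
Editor's sketch of the linking idea in the post above: expand each daily row (song, date, list of hours) into one lookup key per hour, then use an activity event's name and start time to find the track ID. The row layout, the sample values, and the assumption that both files share a timezone are hypothetical; the real export's columns are not shown in the thread.

```python
from datetime import datetime

# Hypothetical daily rows: (track_id, track_name, date, hours played that day).
DAILY = [
    ("id-1", "Song A", "2025-10-01", [9, 17]),
    ("id-2", "Song B", "2025-10-01", [9]),
]


def build_index(daily_rows):
    """Map (track_name, date, hour) -> track_id, one key per listed hour."""
    index = {}
    for track_id, name, date, hours in daily_rows:
        for hour in hours:
            index[(name, date, hour)] = track_id
    return index


def track_id_for(index, name, started_at):
    """Look up the ID for an activity event; None when no daily row covers it."""
    ts = datetime.fromisoformat(started_at)
    return index.get((name, ts.date().isoformat(), ts.hour))


index = build_index(DAILY)
hit = track_id_for(index, "Song A", "2025-10-01T17:42:10")
# An event just after midnight falls outside the daily row's date, so it misses.
miss = track_id_for(index, "Song A", "2025-10-02T00:03:00")
```

Matching on track name is itself an assumption; two different tracks with the same name played in the same hour would collide in this index.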

but only on start events?

29.10.2025 21:49 - 👍 0    🔁 0    💬 1    📌 0

truly a cursed file - at some point it just stops including all but one of the timestamps for 24h or so

29.10.2025 21:48 - 👍 0    🔁 0    💬 1    📌 0

progressively adding more and more things to the server running parakeet but it seems to be dealing okay.

I would get another but I don’t really want many more atm

28.10.2025 23:54 - 👍 0    🔁 0    💬 0    📌 0

everyone else seems to be using svelte so maybe I’ll finally give it a proper shot instead of just using react like I always do.

28.10.2025 23:54 - 👍 1    🔁 0    💬 2    📌 0

stewing on a fun little idea off the back of me hopefully dumping years of music history into my PDS tomorrow but I fear I may have to write frontend code again

28.10.2025 23:54 - 👍 4    🔁 0    💬 2    📌 0

this should’ve been prefixed with I am not a database engineer but this is what I’ve picked up from too many hours of researching how to get this smaller and faster

28.10.2025 23:31 - 👍 1    🔁 0    💬 1    📌 0

but normal columnar stores flat out don’t work for this usecase imo. you need high write speed too but can’t get it because they want large batched writes not many single ones.

Tiger Data’s PG extensions may provide solutions but I haven’t tested them yet (likewise oriole but that broke last time)

28.10.2025 23:31 - 👍 1    🔁 0    💬 1    📌 0
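
Editor's sketch of the mismatch described in the post above: the application produces many single writes, while a batch-oriented store wants few large ones. A generic write buffer is one common way to bridge the two. The class name, the `sink` callable, and the batch size are hypothetical, and this is not tied to any particular database.

```python
class BatchingWriter:
    """Collects single-row writes and hands them to `sink` in batches."""

    def __init__(self, sink, batch_size):
        self.sink = sink              # callable that accepts a list of rows
        self.batch_size = batch_size
        self.pending = []

    def write(self, row):
        """Accept one row; flush automatically once the batch is full."""
        self.pending.append(row)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered, then start a fresh buffer."""
        if self.pending:
            self.sink(self.pending)
            self.pending = []


batches = []  # stands in for the store's bulk-insert call
writer = BatchingWriter(batches.append, batch_size=3)
for i in range(7):
    writer.write({"seq": i})
writer.flush()  # push the final partial batch
```

The cost of this approach is that buffered rows are neither durable nor visible to readers until the flush, which is part of why it does not simply solve the problem the post describes.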

tbqh idk - it might be possible to do compression per partition (if you could settle on a good partition - date??) or per page (but I’m not in the weeds enough yet to know the implications of this one). When you start venturing further you get interesting Qs about columnar stores and col compression

28.10.2025 23:31 - 👍 1    🔁 0    💬 1    📌 0
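
Editor's sketch of the per-partition idea floated in the post above, using date as the partition key. Rows are grouped by date and each group is compressed on its own, so a read for one day only decompresses that day's blob. JSON plus `zlib` are stand-ins chosen so the example runs with the standard library; the row shape is hypothetical and a real store would use its own format and codec.

```python
import json
import zlib
from collections import defaultdict


def compress_by_date(rows):
    """Group rows by their date and compress each group independently."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["date"]].append(row)
    return {
        date: zlib.compress(json.dumps(group).encode("utf-8"))
        for date, group in groups.items()
    }


def read_partition(partitions, date):
    """Decompress only the partition a query needs."""
    return json.loads(zlib.decompress(partitions[date]))


# Hypothetical rows spread across two days, with repetitive text.
rows = [
    {"date": f"2025-10-{1 + i % 2:02d}", "text": "hello world " * 20, "seq": i}
    for i in range(100)
]
partitions = compress_by_date(rows)
day_one = read_partition(partitions, "2025-10-01")

raw_size = len(json.dumps(rows).encode("utf-8"))
packed_size = sum(len(blob) for blob in partitions.values())
```

The trade-off is granularity: larger partitions give the compressor more repetition to work with, but every read then pays to decompress more data than it needs.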
Post image

28.10.2025 19:30 - 👍 86    🔁 15    💬 5    📌 2

something something being locked must feel great for the database

28.10.2025 19:18 - 👍 3    🔁 0    💬 0    📌 0

I’m 90% sure you could totally run a relay+appview+cdn of the full (Bluesky) network for significantly under £600 and have it be useable (perf wise).
might even be able to do HA/redundancy for that too.

(I know this isn’t necessarily apples/apples but alas)

28.10.2025 13:29 - 👍 3    🔁 0    💬 2    📌 0
a screenshot of Jaeger, a tracing tool. A trace is open and a very tall waterfall is shown (there's a lot of calls to a DB)

pictured: niagara falls

26.10.2025 14:02 - 👍 1    🔁 0    💬 0    📌 0

I worked out the seemingly undocumented arcane incantations to get otel+axum working properly and idk why what I did fixed it.

Worked 1st time with tonic after tho, just need to get the trace id passed over.

want to get it plugged in to the DB too but idk how. think I’d need support inside diesel?

25.10.2025 22:03 - 👍 0    🔁 0    💬 0    📌 0
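
Editor's sketch of what "getting the trace id passed over" between services can look like on the wire. It assumes the W3C Trace Context `traceparent` header, which is one common propagation format; whether the setup in the post uses it is not stated. The helper names are hypothetical and no OpenTelemetry library is involved here, only the header layout: version, 32 hex trace ID, 16 hex parent span ID, 2 hex flags.

```python
import re
import secrets

# Version 00 layout: 00-<trace-id>-<parent-id>-<flags>, all lowercase hex.
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id, span_id, sampled=True):
    """Build the header value the calling service would attach to its request."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return the trace context from a header value, or None if it is invalid."""
    match = TRACEPARENT.match(header)
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the spec.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }


trace_id = secrets.token_hex(16)     # 32 hex chars
caller_span = secrets.token_hex(8)   # 16 hex chars
header = make_traceparent(trace_id, caller_span)
ctx = parse_traceparent(header)
```

The receiving service keeps the same trace ID and records the parsed parent ID as the parent of its own new span, which is what lets a tracing UI stitch the two services into one waterfall.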

I would happily use rust for backend and systems stuff and swift for apps/GUI stuff, that sounds ace tbh.

24.10.2025 23:00 - 👍 1    🔁 0    💬 0    📌 0

half the reason I don’t use android is because I refuse to develop for it because I tried it and hate it (I have lost weeks to JNI) but this is getting closer to nice

(yes I tried flutter and RN and I don’t love either)

24.10.2025 23:00 - 👍 1    🔁 0    💬 1    📌 0

getting close to the possibility of being able to write android apps without constantly wanting to yeet the laptop out the window, nice.

if/when this can link into UI stuff, we’ll be golden.

24.10.2025 23:00 - 👍 1    🔁 0    💬 1    📌 0

it took way too long to get events pushing to jaeger at all - now I (just) need to get all the correct scopes and info recorded.

doing the inter service request linking is going to be an experience, too.

24.10.2025 22:20 - 👍 0    🔁 0    💬 2    📌 0

I stand by my comment from a while ago that opentelemetry is painful. I have it half working but not in the way it should and there’s a random warning sometimes. absolutely wonderful stuff.

how much of this is the tracing and Axum integrations? idk.

24.10.2025 22:17 - 👍 0    🔁 0    💬 1    📌 0

it’d be good to get the car bug fixed first but I need some test data more than I need *all* test data. would like better metrics first tho*

*I’m hoping jacquard and its zero copy deserialisation might improve consumer perf but want to test that properly

24.10.2025 12:28 - 👍 1    🔁 0    💬 1    📌 0

"Oxford commas are a sign you write with ai" I will find such a unique way to rip out your spine that they'll make a movie about it

23.10.2025 16:50 - 👍 8709    🔁 3350    💬 143    📌 317

there's a lot of data that can and should be compressed (and I think you could pull strong ratios out) but it's working out how to do that and not kill the read performance.

It'll be interesting to see if bluesky's kvdb ends up having any large scale compression on it too

24.10.2025 07:02 - 👍 0    🔁 0    💬 0    📌 0
