Current status
21.06.2025 20:55 — 👍 2 🔁 0 💬 0 📌 0
@geirsson.com.bsky.social
Building Agents at Sourcegraph. Posts about coding, AI, and family (3 kids). @olafurpg elsewhere. Based in Oslo, Norway. https://geirsson.com
The best mental model for buying strollers is to think of renting them by the month. The resale value holds pretty high so the final cost isn’t so bad even for the $2k premium twin strollers.
07.06.2025 12:24 — 👍 0 🔁 0 💬 0 📌 0
The usual answer is “no” whenever people ask whether LSP can be used in a novel way.
The protocol’s strength is also its weakness: it’s very much optimized around a single human user interacting with an IDE.
Meanwhile we’ll get ads in ChatGPT.
31.05.2025 21:15 — 👍 0 🔁 0 💬 0 📌 0
Anthropic is the king of function calling and deserves their incredible revenue growth. They’ve paved the way for AI agents, not OpenAI or Google. It’s only a matter of time before Google gets the memo and Gemini starts taking function calling more seriously.
31.05.2025 21:15 — 👍 0 🔁 0 💬 1 📌 0
Three kids done with chickenpox this month.
30.05.2025 05:21 — 👍 2 🔁 0 💬 0 📌 0
On a second iteration, it seems like it's the web tool that's causing trouble. Disabling the web tool makes Claude 4 reach the right syntactic solution, although not with the optimal token edits. Goes to show that you need to be careful about which tools you expose. Less is more.
29.05.2025 20:02 — 👍 1 🔁 0 💬 0 📌 0
Sonnet 3.7 is the only model I've seen that delivers the perfect solution: it replaces the tokens for `.` and `apply` and nothing else. All other models I've tested use the worse tree replacement APIs.
29.05.2025 19:51 — 👍 0 🔁 0 💬 1 📌 0
Surprisingly, Sonnet and Opus 4 both fail on one of my go-to codegen tests for new models:
> Implement a Scalafix rule that converts foo.apply(...) to foo(...) and explain why it's semantic or syntactic
They both think it needs to be semantic (i.e., have access to types and symbols).
Amp Tab is coming along nicely. It's not too far from being able to replace Cursor Tab as my daily driver.
27.05.2025 12:23 — 👍 0 🔁 0 💬 0 📌 0
Last note: even with code that can be unit tested, I still think most of the tests that AI generates are crap. The AI-generated commit messages also miss the point. I'm seeing lots of PRs now where people add AI-generated tests that aren't even testing anything meaningful.
27.05.2025 07:53 — 👍 2 🔁 0 💬 0 📌 0
The Dwarkesh episode still gave a fresh perspective on how these models work, and I have probably underestimated how powerful they will become. If you're still judging AI capabilities by today's products and today's models then you are probably also underestimating how weird things are going to get.
27.05.2025 07:51 — 👍 0 🔁 0 💬 1 📌 0
I am knee deep in the AI hype, and I don't think software engineering will ever be the same again. I love working on ampcode.com and I see daily anecdotes of how AI coding is turning software development upside down for our users.
27.05.2025 07:51 — 👍 1 🔁 0 💬 1 📌 0
Even components that can be unit tested or e2e tested via behavioral assertions have lots of implicit constraints wrt. latency or how features interact with each other in long-running user sessions that are impractical to test in an automated fashion.
27.05.2025 07:51 — 👍 0 🔁 0 💬 1 📌 0
The fallacy is thinking that all software engineers do is deliver code that can be tested in isolation, and AI is very good at doing that now. The problem is that tests only cover maybe 0-50% of real-world constraints.
27.05.2025 07:51 — 👍 0 🔁 0 💬 1 📌 0
I keep shaking my head hearing AI folks claiming software engineering will be automated this year. After listening to this conversation, I at least better understand what they mean by this. These AI researchers are super smart, but they're also sort of clueless about what "software engineering" is.
27.05.2025 07:51 — 👍 0 🔁 0 💬 1 📌 0
The Dwarkesh episode on Claude 4 is the most in-depth, balanced, and (almost) non-hype conversation I have heard on why AI researchers believe AGI is around the corner: open.spotify.com/episode/3H46...
27.05.2025 07:51 — 👍 4 🔁 0 💬 1 📌 0
Memory reminds me of the Facebook feed circa 2016. It was clearly beneficial for the company, and it sure boosted engagement, but I deleted my Facebook account and was better off for it.
25.05.2025 15:15 — 👍 0 🔁 0 💬 0 📌 0
Memory in AI chatbots is overrated. It turns the LLM into a sycophant by tying every response to random pieces of information extracted from past conversations.
I’m sure memory is great for engagement/likability, but it’s turned me off ChatGPT personally.
After starting work on Amp (ampcode.com):
- No meetings
- No code review, just push to main
- Take responsibility for your changes
- Rarely need to create a branch off main
- Auto-release every few hours
- Prioritize user bug reports whenever possible
At the risk of being pedantic, when many people say “one shot” they actually mean zero-shot pass@1.
Technically, one shot means including one example output in the prompt, and most prompts don’t do that.
Not blaming anyone; I even catch myself saying one shot when I mean pass@1.
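To make the distinction concrete, here is a rough sketch. The `ChatMessage` and `Example` shapes and the `buildPrompt` and `passAt1` helpers are made up for illustration and not tied to any provider's API. Zero shot sends only the task, one shot prepends exactly one worked example, and pass@1 is about how many samples you draw per task, not about what's in the prompt.

```typescript
// Hypothetical message shape for illustration; not a real provider API.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// A worked example ("shot") shown to the model before the real task.
interface Example {
  input: string;
  output: string;
}

// Zero shot: examples is empty. One shot: exactly one example. Few shot: several.
function buildPrompt(task: string, examples: Example[] = []): ChatMessage[] {
  const messages: ChatMessage[] = [];
  for (const example of examples) {
    messages.push({ role: "user", content: example.input });
    messages.push({ role: "assistant", content: example.output });
  }
  messages.push({ role: "user", content: task });
  return messages;
}

// pass@1: draw a single sample per task and report the fraction that passed.
function passAt1(firstAttemptPassed: boolean[]): number {
  if (firstAttemptPassed.length === 0) return 0;
  const passed = firstAttemptPassed.filter((ok) => ok).length;
  return passed / firstAttemptPassed.length;
}

const task = "Rewrite `foo.apply(x)` as `foo(x)`.";
const zeroShotPrompt = buildPrompt(task);
const oneShotPrompt = buildPrompt(task, [
  { input: "Rewrite `bar.apply(1)` as `bar(1)`.", output: "bar(1)" },
]);
```

So "it one-shotted the task" usually describes `zeroShotPrompt` plus a single sample that happened to pass.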
Contrary to popular belief, the models are surprisingly bad at writing CSS.
22.05.2025 13:56 — 👍 0 🔁 0 💬 0 📌 0
There’s a different trajectory for people who are excited about AI because it enables them to build more expertise than for those who are excited because it lets them skip building expertise.
Concrete example: I love AI because it helps me learn CSS faster, not because AI writes all my CSS so I don’t have to learn it.
git worktrees are overrated. They're a performance optimization that only makes sense when working in a repo that's super slow to clone.
For normal repos, just clone twice and enjoy benefits like being able to check out the main branch in both clones at the same time.
Sprinkling Copilot dependencies across the VS Code codebase is a great technique to make it more annoying to keep a fork up-to-date. Well played, Microsoft.
19.05.2025 19:07 — 👍 0 🔁 0 💬 0 📌 0
Got o3 to think for a record-breaking amount of time (4m54s) with this prompt.
“How many songs in the Eurovision finals are in English? I have a feeling there are unusually few this year.”
It enumerated the songs and classified the lyrics, very cool. The answer: 11 out of 26 songs were in English.
It’s naive to believe that tech companies are slowing down hiring due to AI productivity gains.
Only a small fraction of people in tech work on projects with meaningful revenue generation, and it’s a better investment to buy GPUs for the winning projects than to employ people for losing projects.
Working on Amp makes it clear that Observables are really all you need: no requests, no notifications, no promises. Wire them straight into a Svelte readable and everything becomes magically reactive. Beautiful.
17.05.2025 10:54 — 👍 2 🔁 0 💬 0 📌 0
Hand-rolled a JSON-RPC-like protocol with JSONL on the wire and it’s so simple that it’s almost a crime that LSP popularized the complicated Content-Length based format on the wire.
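A rough sketch of the difference between the two framings, for comparison. This is not the actual protocol from the post: the `Message` shape and all function names are hypothetical, and the Content-Length side only handles the single header that LSP requires.

```typescript
// Hypothetical message shape, loosely modeled on a JSON-RPC 2.0 request.
interface Message {
  jsonrpc: "2.0";
  id?: number;
  method: string;
  params?: unknown;
}

// JSONL framing: one JSON document per line. JSON.stringify escapes newlines
// inside strings, so a raw "\n" can only ever be a message delimiter.
function encodeJsonl(message: Message): string {
  return JSON.stringify(message) + "\n";
}

function decodeJsonl(buffer: string): { messages: Message[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // trailing partial line, kept for the next read
  const messages = lines
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Message);
  return { messages, rest };
}

// Content-Length framing: header, blank line, then exactly N bytes of body.
// The length counts bytes, not characters, so decoding has to work on bytes.
function encodeContentLength(message: Message): Uint8Array {
  const encoder = new TextEncoder();
  const body = encoder.encode(JSON.stringify(message));
  const header = encoder.encode(`Content-Length: ${body.length}\r\n\r\n`);
  const framed = new Uint8Array(header.length + body.length);
  framed.set(header, 0);
  framed.set(body, header.length);
  return framed;
}

// Returns the index of the "\r\n\r\n" that ends the header block, or -1.
function indexOfHeaderEnd(buffer: Uint8Array, from: number): number {
  for (let i = from; i + 3 < buffer.length; i++) {
    if (buffer[i] === 13 && buffer[i + 1] === 10 && buffer[i + 2] === 13 && buffer[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

function decodeContentLength(buffer: Uint8Array): { messages: Message[]; rest: Uint8Array } {
  const decoder = new TextDecoder();
  const messages: Message[] = [];
  let offset = 0;
  while (true) {
    const headerEnd = indexOfHeaderEnd(buffer, offset);
    if (headerEnd === -1) break; // header not fully received yet
    const header = decoder.decode(buffer.subarray(offset, headerEnd));
    const match = /Content-Length: (\d+)/i.exec(header);
    if (!match) throw new Error("missing Content-Length header");
    const bodyStart = headerEnd + 4;
    const bodyEnd = bodyStart + Number(match[1]);
    if (bodyEnd > buffer.length) break; // body not fully received yet
    messages.push(JSON.parse(decoder.decode(buffer.subarray(bodyStart, bodyEnd))) as Message);
    offset = bodyEnd;
  }
  return { messages, rest: buffer.subarray(offset) };
}
```

The JSONL decoder is a `split` plus `JSON.parse`, and a stream can be debugged with any line-oriented tool. The Content-Length decoder needs header parsing, byte counting, and two separate "not enough data yet" cases.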
17.05.2025 10:52 — 👍 0 🔁 0 💬 1 📌 0
Freshly baked cinnamon buns
Move to a place where neighbors randomly invite you to enjoy freshly baked cinnamon buns on a Thursday morning
15.05.2025 08:03 — 👍 4 🔁 0 💬 0 📌 0