
antirez

@antirez.bsky.social

Reproducible bugs are candies 🍭🍬 I like programming too much not to like automatic programming.

10,241 Followers  |  379 Following  |  907 Posts  |  Joined: 26.04.2023

Latest posts by antirez.bsky.social on Bluesky

The M3 Max is just barely able to keep up with real time, and can transcribe live with --from-mic (new option). Quite impressive how you can talk and it immediately shows stuff on the screen.

07.02.2026 18:34 — 👍 1    🔁 0    💬 0    📌 0
Video thumbnail

Now flux2.c also supports the base model (if you have the time, a >= M3, and the inclination), plus different schedulers, and it is quite fun to run, even at reduced steps, since 50 are... a bit too long to wait for.

07.02.2026 11:28 — 👍 13    🔁 1    💬 0    📌 0

Now flux2.c is consistently faster than the official PyTorch MPS implementation and not far from Draw Things (14s vs 19s at 1024x). DT uses 6-bit quants while flux2.c uses BF16 (2.7x the weights). Happy with the result so far. github.com/antirez/flux...

07.02.2026 11:02 — 👍 14    🔁 2    💬 0    📌 0
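For context, the 2.7x figure above is roughly just the ratio of bits per weight (a back-of-envelope check that ignores any per-block scales the 6-bit format may store):

\[
\frac{16\ \text{bits (BF16)}}{6\ \text{bits (quant)}} \approx 2.67 \approx 2.7\times \text{ the weight bytes to read per step.}
\]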

I use both routinely. However, I must admit that after Opus 4.6, the need to go to GPT2 xhigh (or GPT3-codex xhigh now) feels rarer than it used to.

06.02.2026 22:02 — 👍 1    🔁 0    💬 1    📌 0
Video thumbnail

voxtral.c decoding speed is now at ~80% of the Metal hardware limit.

06.02.2026 19:00 — 👍 32    🔁 1    💬 1    📌 0
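As a rough sanity check of what "hardware limit" means here (my own back-of-envelope numbers, not from the post): single-stream decoding of a dense model is usually memory-bandwidth bound, so the floor on per-token latency is about the weight bytes divided by memory bandwidth. Assuming the ~4B parameters are kept in BF16 (~8 GB) and a top-spec M3 Max with ~400 GB/s:

\[
t_{\text{token}} \gtrsim \frac{8\ \text{GB}}{400\ \text{GB/s}} = 20\ \text{ms}, \qquad \text{so running at } \sim 80\% \text{ of the limit means} \approx 25\ \text{ms per token.}
\]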
Preview: GitHub - antirez/voxtral.c: Pure C inference of the Mistral Voxtral Realtime 4B speech-to-text model

github.com/antirez/voxt...

06.02.2026 15:29 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 0
Post image

06.02.2026 15:18 — 👍 2    🔁 0    💬 1    📌 0

The HN commenters who simultaneously argue "it just decompressed GCC" and "the output is way worse than GCC" don't seem to notice they're contradicting themselves. I didn't expect any better, given the normie-alization of the site in recent times.

06.02.2026 14:42 — 👍 11    🔁 0    💬 2    📌 0

We all need to learn these patterns: how to spawn agents with exactly the right instructions and guidelines so that they loop inside quality-driven, effective rails and produce results.

06.02.2026 14:36 — 👍 16    🔁 2    💬 0    📌 0
Post image

I wrote a SPEED markdown file with instructions for Claude Code (Opus 4.6) describing a loop process to improve the speed of the Voxtral Metal backend, then left home to pick up my daughter, have lunch, ... Back home, the code is 2x faster, and it is still going.

06.02.2026 14:33 — 👍 24    🔁 3    💬 5    📌 0
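Purely as an illustration of the pattern described in the two posts above (this is not the actual SPEED file; every target and command name below is hypothetical), such a loop-instruction file might look like:

```markdown
# SPEED.md: optimization loop (hypothetical sketch)

Goal: reduce ms/token of the Metal decode path without changing the transcription output.

Loop until three consecutive iterations show no measurable gain:

1. Build and run the benchmark (e.g. a `make bench` target) and record ms/token.
2. Profile, and pick the single hottest kernel or CPU<->GPU copy.
3. Apply ONE focused change (kernel fusion, fewer synchronizations, better threadgroup sizes, ...).
4. Re-run the benchmark AND the correctness check (e.g. `make test`).
   - If output changed or speed regressed: revert and try a different idea.
   - Otherwise: commit, putting the before/after numbers in the commit message.

Never trade correctness for speed. Keep a log of failed attempts so they are not retried.
```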
Post image

No mercy for abstractions that suck. Don't let your LLM be tempted.

06.02.2026 10:53 — 👍 23    🔁 0    💬 1    📌 0

Voxtral (from @MistralAI) transcription quality is quite incredible: the way it handles punctuation and all the rest makes transcribed audio messages so much more understandable. I implemented a few fixes in the FFT and now there is no longer a skipped-tokens issue in voxtral.c.

06.02.2026 10:11 — 👍 42    🔁 3    💬 2    📌 1
Post image

05.02.2026 23:53 — 👍 4    🔁 0    💬 1    📌 0

Maybe you were wondering why, in recent years, serious grown-up folks would spend nights writing a well-made Commodore 64 demo or game. Now that your dear LLM can write a serious C compiler in Rust but can't write a well-made C64 game or demo, maybe you are starting to get it.

05.02.2026 23:45 — 👍 23    🔁 1    💬 2    📌 0

Yesterday @MistralAI released an open-weights transcription model able to work in real time, Voxtral Mini 4B. Today, following the Whisper.cpp lesson, here is a C inference pipeline ready to use as a library. I hope you'll enjoy it:

github.com/antirez/voxt...

05.02.2026 21:00 — 👍 59    🔁 5    💬 4    📌 1
Post image

Mmmm... 44ms per token. Not stellar but nice.

05.02.2026 15:10 — 👍 10    🔁 0    💬 1    📌 0

Anthropic's ads post: they realized that with Claude Code they have a solid business model, and that an ads-based model is not comparable. They made a business decision, and then used what they turned down (ads) to improve the company's image. As simple as that.

04.02.2026 22:37 — 👍 15    🔁 0    💬 1    📌 0

I'm pretty confident that Claude Code, since it spawns multiple sub-agents for subtasks, has lower performance.

04.02.2026 21:01 — 👍 14    🔁 0    💬 5    📌 0

For comparison, in Italy it is impossible not to understand somebody speaking with a different accent, and that was true even before TV existed. You can struggle if they are speaking different *dialects*, but those are effectively different languages: Sicilian, for instance, predates Italian by centuries.

04.02.2026 18:07 — 👍 3    🔁 0    💬 1    📌 0
Post image

Italian is an intrinsically understandable language, by people and machines.

04.02.2026 16:12 — 👍 29    🔁 3    💬 1    📌 0

Why the hesitation to fix what Siri continuously gets wrong? You can implement things with different layers of isolation / security as the models evolve. But they are not able to do anything useful. It's a failure of the product folks.

04.02.2026 14:28 — 👍 1    🔁 0    💬 0    📌 0

The absolute biggest disaster is Apple, of course. Imagine pushing Siri, a few years ago, a product without the technology existing, and now, after the tech has been there for some time, totally failing at the integration. If you don't fire CEOs for that, when?

04.02.2026 14:24 — 👍 9    🔁 0    💬 1    📌 0

I told you several times here: the problem is that the *product* departments of big companies are mostly fried. They don't have the talent, for some reason. The technology is there, but those folks are so unimaginative, and so stopped by legal & ethical limits, that they can't use it.

04.02.2026 14:22 — 👍 4    🔁 0    💬 2    📌 0

We will soon see such systems integrated as apps in our phones; it will be the first big shift in the way mobile works since the iPhone.

04.02.2026 14:21 — 👍 2    🔁 0    💬 2    📌 0

Explain one thing to me: how can a sane human being who has watched Claude Code / Codex / ... at work for a few months have *any* doubt about the ability of those systems to work as a personal assistant, given enough access to your stuff? It requires far less capability than the other main task.

04.02.2026 14:21 — 👍 15    🔁 0    💬 5    📌 0

It's an MPS implementation, so it is using the GPU in my M3 Max for speed and parallelism.

04.02.2026 09:01 — 👍 1    🔁 0    💬 1    📌 0
Post image

With this prompt trick, you can use flux.2 4B klein (distilled) as a handy and powerful super-resolution model. In the example: 128x128 -> 1024x1024.

03.02.2026 23:28 — 👍 26    🔁 0    💬 2    📌 0

Right now, as powerful as LLMs are, and as well as they work in agentic workflows, they are completely unable to come up with great products. Since this is a limitation that most intelligent human beings also have, it is not impossible that it will stay this way for a long time.

03.02.2026 12:32 — 👍 1    🔁 0    💬 1    📌 0

Yes, but since the future of AI is very hard to predict, I split speculation (which I do, a lot, especially on my YouTube channel) from practice. On the practice side, I consider the current tools we have with the idea that they will get better without necessarily changing nature.

03.02.2026 12:31 — 👍 1    🔁 0    💬 1    📌 0

Also, different people have vastly different product-design capabilities, so the average vibe-coded tool written this way will be really bad: hardly usable, with terrible security, ...

03.02.2026 12:04 — 👍 1    🔁 0    💬 1    📌 0
