Please, the world needs more videos like the color one. I really enjoyed it, learned a ton of stuff from it.
Rust is just a tool, don't care too much about the views, the algorithm sucks and all platforms are filled with bots anyway.
Me: This function is too slow. Find a faster algorithm.
Cursor: Hold my beer.
Me: *Slacking off with colleagues*
Cursor: Ping.
Me: 🤯
That's for sure, and it aligns with my thinking. LLMs are good at filling in the "unspecified" part of what we ask of them. But we need the compiler/type checker, which is mathematical and rigorous, alongside to guide them.
09.05.2025 15:49
Cursor with Claude has been surprisingly good at getting things right in 2/3 shots. I don't have enough experience with the others to judge.
09.05.2025 15:47
We just released text-generation-inference 3.3.0. This release adds prefill chunking for VLMs. We also made Gemma 3 faster and use less VRAM by switching to flashinfer for prefills with images.
github.com/huggingface/...
I had the same observation.
Even in Rust, if I don't carefully hand-hold the LLM, it will tend to spiral out of control, producing random crap.
But, not unlike a junior, if you explain carefully what you want, it tends to get it right, or self-correct relatively well. Just don't ask for too much at once.
Well, TS only goes so far: any use of `any` (which is unfortunately quite common) and you lose all the benefits.
Also, there are no runtime checks for the types. It has happened too many times that the culprit was not my codebase but my sanitization of browser data (which TS doesn't protect against).
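The contrast with a language that forces runtime validation can be sketched in a few lines of Rust. This is a minimal, hypothetical example (the `Payload` struct and `parse_payload` names are made up for illustration): untrusted "browser data" arrives as a string and must pass through a fallible parse before it can be used, so malformed input is caught at the boundary instead of corrupting state later.

```rust
// Hypothetical sketch: untrusted input must go through a fallible parse,
// so bad data is rejected at runtime, unlike a TS `as` cast which is erased.
#[derive(Debug, PartialEq)]
struct Payload {
    count: u32,
}

fn parse_payload(raw: &str) -> Result<Payload, String> {
    let count = raw
        .trim()
        .parse::<u32>()
        .map_err(|e| format!("invalid count: {e}"))?;
    Ok(Payload { count })
}

fn main() {
    // Well-formed input parses into a typed value.
    assert_eq!(parse_payload("42").unwrap(), Payload { count: 42 });
    // Malformed input is an Err, not a silently wrong value.
    assert!(parse_payload("not-a-number").is_err());
    println!("ok");
}
```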
Hot take: Rust is really good for vibe coding, much better than Python or JS. Why? The compiler will not let crap pass.
Yes, the LLM can still get it wrong and fail.
But the elegant error messages will nudge the LLM, so I don't have to do it constantly.
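A tiny sketch of the kind of mistake the compiler refuses to let through (hypothetical names, not from any real project): generated code that tries arithmetic directly on an `Option<i32>` will not compile, and the error message points the model straight at the unhandled `None` case.

```rust
// If an LLM writes `lookup(&scores, "carol") + 1`, rustc rejects it:
// `Option<i32>` is not `i32`, so the missing-key case must be handled.
fn lookup(map: &[(&str, i32)], key: &str) -> Option<i32> {
    map.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

fn main() {
    let scores = [("alice", 10), ("bob", 7)];
    // The compiling version makes the fallback explicit.
    let missing = lookup(&scores, "carol").map(|v| v + 1).unwrap_or(0);
    let bumped = lookup(&scores, "alice").map(|v| v + 1).unwrap_or(0);
    assert_eq!(missing, 0);
    assert_eq!(bumped, 11);
    println!("carol: {missing}, alice: {bumped}");
}
```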
Dipping my toe into vibe coding myself to get a feel for it.
My first project is writing something like Superwhisper, because I couldn't find anything that worked well enough on Wayland.
github.com/Narsil/whisp...
It also works on Mac for kicks (Whisper.cpp backed)
Want to run DeepSeek R1?
Text-generation-inference v3.1.0 is out and supports it out of the box.
Both on AMD and Nvidia!
Text-generation-inference v3.0.2 is out.
Basically we can run transformers models (that support flash) at roughly the same speed as native TGI ones.
What this means is broader model support.
Today it unlocks
Cohere2, Olmo, Olmo2 and Helium
Congrats Cyril Vallez
github.com/huggingface/...
Zero config
That's it. Remove all the flags you are using and you're likely to get the best performance. By evaluating the hardware and model, TGI carefully selects automatic values to give the best performance. In production, we no longer use any flags in our deployments.
13x faster
On long prompts (200k+ tokens), conversation replies take 27.5s in vLLM, while they take only 2s in TGI. How so? We keep the initial conversation around, so when a new reply comes in, we can answer almost instantly. The overhead is ~5µs. Thanks to Daniel de Kok for the beast of a data structure.
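The idea behind keeping the conversation around can be sketched as a prefix cache. This is a simplified, hypothetical illustration (the real implementation in TGI is a far more sophisticated structure): cache the tokens already processed for a conversation, and when a follow-up arrives, skip prefill for the longest cached prefix and only pay for the new suffix.

```rust
use std::collections::HashMap;

// Hypothetical prefix cache, not TGI's actual code: maps a token prefix
// to the number of tokens whose KV state is already computed.
struct PrefixCache {
    entries: HashMap<Vec<u32>, usize>,
}

impl PrefixCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    // Record that this conversation's tokens have been processed.
    fn insert(&mut self, tokens: &[u32]) {
        self.entries.insert(tokens.to_vec(), tokens.len());
    }

    // How many leading tokens of the new request are already cached,
    // i.e. how much prefill work can be skipped.
    fn longest_cached_prefix(&self, tokens: &[u32]) -> usize {
        (0..=tokens.len())
            .rev()
            .find(|&n| self.entries.contains_key(&tokens[..n]))
            .unwrap_or(0)
    }
}

fn main() {
    let mut cache = PrefixCache::new();
    let conversation = vec![1, 2, 3, 4, 5];
    cache.insert(&conversation);

    // A follow-up reuses the whole conversation as its prefix.
    let follow_up = vec![1, 2, 3, 4, 5, 6, 7];
    let reused = cache.longest_cached_prefix(&follow_up);
    assert_eq!(reused, 5); // only tokens 6 and 7 need prefilling
    println!("reused {reused} of {} tokens", follow_up.len());
}
```

An unrelated request shares no prefix and falls back to a full prefill, which is why the speedup applies specifically to continued conversations.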
3x more tokens.
By reducing our memory footprint, we're able to ingest many more tokens, and more dynamically, than before. A single L4 (24GB) can handle 30k tokens on Llama 3.1-8B, while vLLM barely gets 10k. A lot of work went into reducing the footprint of the runtime.
Performance leap: TGI v3 is out. It processes 3x more tokens and is 13x faster than vLLM on long prompts. Zero config!
10.12.2024 10:08
We just deployed Qwen/QwQ-32B-Preview on HuggingChat! It's Qwen's latest experimental reasoning model.
It's super interesting to see the reasoning steps, and with really impressive results too. Feel free to try it out here: huggingface.co/chat/models/...
I'd love to get your feedback on it!
I'm disheartened by how toxic and violent some of the responses here were.
There was a mistake, a quick follow-up to mitigate it, and an apology. I worked with Daniel for years, and he is one of the people most concerned with the ethical implications of AI. Some replies are Reddit-level toxic. We need empathy.
It's pretty sad to see the negative sentiment towards Hugging Face on this platform due to a dataset posted by one of its employees. I want to write a small piece. 🧵
Hugging Face empowers everyone to use AI to create value and is against the monopolization of AI; it's a hosting platform above all.