Mobius Labs

@mobius-labs.bsky.social

Making models fast, small and taming inference. Loves multimodality. Proponents of Open Source and Open Intelligence. https://blog.mobiuslabs.com/ for some of our recent work. X: https://x.com/Mobius_Labs

53 Followers  |  246 Following  |  5 Posts  |  Joined: 25.11.2024

Latest posts by mobius-labs.bsky.social on Bluesky

Our re-distilled DeepSeek R1 (1.5B) outperforms the original distilled model! Get it at huggingface.co/mobiuslabsgm.... We're distilling more models and look forward to releasing them soon!

24.01.2025 17:32 - 👍 0    🔁 0    💬 0    📌 1
Lecture 34: Low Bit Triton Kernels
YouTube video by GPU MODE

Watch this video for insights into our experience during development:
www.youtube.com/watch?v=7c3c...

05.12.2024 14:43 - 👍 0    🔁 0    💬 0    📌 0
Release 0.4.0 · mobiusml/gemlite
Improved performance on the A100 and H100. Flexible bitpacking support (32-bit / 8-bit, over cols or rows). Best config caching over all kernels. Helper functions for easier usage. GEMV_SPLITK kern...

Introduced new kernels, max-autotuning, and several other improvements. Check out the release details at github.com/mobiusml/gem...

05.12.2024 14:43 - 👍 0    🔁 0    💬 1    📌 0
GitHub - mobiusml/gemlite: Fast low-bit matmul kernels in Triton
Contribute to mobiusml/gemlite development by creating an account on GitHub.

Releasing a new version of Gemlite (github.com/mobiusml/gem...) with significantly improved performance on datacenter GPUs (A100/H100), delivering up to 7–8x faster prefill and 3–6x faster batch decoding compared to PyTorch's tinygemm.

05.12.2024 14:42 - 👍 4    🔁 2    💬 1    📌 1
Release faster-whisper 1.1.0 · SYSTRAN/faster-whisper
New Features: New batched inference that is 4x faster and accurate, Refer to README on usage instructions. Support for the new large-v3-turbo model. VAD filter is now 3x faster on CPU. Feature Extr...

Really happy to contribute to the batched version of faster-whisper that is 4x faster and more accurate 🚀🚀🚀

github.com/SYSTRAN/fast...

25.11.2024 11:32 - 👍 2    🔁 1    💬 0    📌 0
