@arig23498.bsky.social
MLE @ Hugging Face
5/ Each of the technologies mentioned above has its own pros and cons. The processor running in your system (a phone, a laptop, etc.) ends up being a weighted sum of all of the above.
It baffles me to think about all of this. 🤯
4/N Multi-Threading in a Single Core
Within a core, we can have multiple register blocks (execution contexts) holding the state of different threads. This way, if one thread stalls (say, on a memory access), the processor quickly switches to another.
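(Not the hardware mechanism itself, but a software analogy: in the Python sketch below, while one thread is stalled, the other keeps making progress, much like a core hopping between its register contexts.)

```python
import threading
import time

def worker(name: str, delay: float) -> None:
    # Simulates a context that stalls (think: waiting on memory).
    time.sleep(delay)
    print(f"{name} done after {delay}s")

# While thread-1 is stalled, thread-2 runs; nothing sits idle.
t1 = threading.Thread(target=worker, args=("thread-1", 2.0))
t2 = threading.Thread(target=worker, args=("thread-2", 0.5))
t1.start(); t2.start()
t1.join(); t2.join()
```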
3/N The SIMD Paradigm
If a single core has duplicate ALUs, we can operate on a batch of data in a single clock tick. The catch? Every ALU must execute the same operation.
Single Instruction, Multiple Data.
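A quick way to feel this from Python is NumPy: one logical instruction applied across a whole array. (Whether it actually lands on SIMD units depends on your NumPy build and CPU, so treat this as an illustration.)

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.ones_like(a)

# Same instruction ("add"), many data elements, one call.
# NumPy hands the loop to compiled kernels that can map it
# onto SIMD units (SSE/AVX/NEON) on the host CPU.
c = a + b
```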
2/N Multi-Core Processors:
A single processor consists of a control unit, an arithmetic logic unit (ALU), and some registers. How about we duplicate this whole block several times? That is the multi-core architecture. As a programmer, you need to explicitly specify which code runs on which core.
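A minimal sketch of "you say which code runs where", using Python's `multiprocessing` (the work split here is made up for illustration):

```python
from multiprocessing import Pool

def square(x: int) -> int:
    return x * x

if __name__ == "__main__":
    # The programmer explicitly fans the work out; the OS places
    # the four worker processes onto separate cores.
    with Pool(processes=4) as pool:
        print(pool.map(square, range(8)))
```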
1/N Superscalar Processors:
Your program is a list of instructions, and that list almost always contains independent ones. A superscalar processor identifies them and executes them separately in the same clock tick.
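Illustrative only; an interpreter like CPython won't exploit this, but the hardware does it for the compiled equivalent:

```python
x, y = 5, 7

# Independent: a superscalar core can issue both in the same tick.
a = x * 2
b = y + 3  # does not depend on a

# Dependent chain: d must wait for c, so these cannot overlap.
c = x * 2
d = c + 3
```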
Some pointers on parallel computing:
A small thread 🧵👇
HF model collection for transformers:
huggingface.co/collections/...
HF model collection for OpenCLIP and timm:
huggingface.co/collections/...
And of course big_vision checkpoints:
github.com/google-resea...
Paper:
arxiv.org/abs/2502.14786
HF blog post from @arig23498.bsky.social et al. with a gentle intro to the training recipe and a demo:
huggingface.co/blog/siglip2
Thread with results overview from Xiaohua (only on X, sorry - these are all in the paper):
x.com/XiaohuaZhai/...
📢 Yesterday we released SigLIP 2!
TL;DR: Improved high-level semantics, localization, dense features, and multilingual capabilities in a drop-in replacement for v1.
Bonus: Variants supporting native aspect ratio and variable sequence length.
A thread with interesting resources 👇
Build a Qwen 2.5 VL API endpoint with Hugging Face Spaces and Docker! by @arig23498.bsky.social
Build a proof-of-concept API, hosting Qwen2.5-VL-7B-Instruct on Hugging Face Spaces using Docker.
huggingface.co/blog/ariG234...
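Once such a Space is up, calling it could look roughly like this; the URL and JSON shape below are hypothetical, so check the blog post for the actual contract:

```python
import requests

# Hypothetical endpoint and payload, for illustration only.
API_URL = "https://your-username-qwen2-5-vl.hf.space/predict"
payload = {
    "image_url": "https://example.com/cat.png",
    "prompt": "Describe this image.",
}
resp = requests.post(API_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json())
```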
I forgot to mention that you can use the same code to access any `warm` model on the Hub.
Here is a list of all the `warm` models: huggingface.co/models?infer...
Happy vibe checking 🤗
[N/N]
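For reference, the core of that "same code" is just a few lines of `huggingface_hub` (the prompt and parameters here are illustrative):

```python
from huggingface_hub import InferenceClient

# Works for any `warm` model on the Hub; just swap the model ID.
client = InferenceClient("Qwen/QwQ-32B-Preview")

response = client.chat_completion(
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```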
I have created a quick notebook that uses `huggingface_hub` to access the model through this Inference API.
huggingface.co/datasets/ari...
[4/N]
But today was my lucky day. I noticed that the model was already loaded on the Serverless Inference API and ready to be used.
No more spinning up my GPUs and stress testing them (happy GPU noises)
[3/N]
My usual workflow is to visit the Hugging Face Hub model card (here that was hf.co/Qwen/QwQ-32B-Preview) and copy the working code sample.
I am sure this is how most of you work with a new model as well (if not, I would love to hear from you)
[2/N]
The Qwen team is doing so much for the community by keeping research open and constructive.
They listen to the community and put effort into building competitive models.
I was intrigued by their latest `Qwen/QwQ-32B-Preview` model and wanted to play with it.
[1/N]
I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:
1️⃣ Understanding tool calling with Llama 3.2 🧠
2️⃣ Using Text Generation Inference (TGI) with Llama models 🦙
(links in the next post)
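As a teaser for 1️⃣: tool calling boils down to handing the chat template a tool schema. A rough sketch with `transformers` (the `get_weather` tool is made up):

```python
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """
    Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"  # stub; a real tool would call a weather API

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# The template renders the tool's JSON schema into the prompt so the
# model can respond with a structured tool call.
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, tokenize=False
)
print(prompt)
```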
I like the evaluation part. Are there some evals you particularly like?
What is THE pain point in training Vision Language Models according to you?
I will go first: the data pipeline.
Re-caption your webdataset with Qwen2-VL
github.com/sayakpaul/si...
To the video generation enthusiasts, Mochi 1 Preview is now supported in `diffusers`
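A minimal sketch following the usual `diffusers` pipeline pattern (exact arguments may differ from the official example, so treat this as a starting point):

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps the model fit on smaller GPUs

frames = pipe(
    "A chameleon slowly changing colors, macro shot",
    num_frames=85,
).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
```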
awesome, thanks a lot for sharing
`bitsandbytes` makes it really easy to quantize models
Note: MB should be GB in the diagram.
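For example, loading a causal LM in 4-bit is a single config object (the model ID and dtype below are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Store weights in 4-bit NF4, run compute in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # any causal LM on the Hub
    quantization_config=bnb_config,
)
```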
Read about the Qwen2.5-Coder Series
huggingface.co/blog/ariG234...
Training ranking models for better retrieval from stores is GOD-level thinking.
I am diving head-first into Vision Language Models. Comment below the papers that I should definitely read.
Welcome the @huggingface.bsky.social integration in PyCharm. From instant model cards to navigating the local cache, working with Hugging Face models becomes a lot easier with PyCharm.
Bonus: Claim a 3-month PyCharm subscription with the code PyCharm4HF
Blog Post: huggingface.co/blog/pycharm...