
Aritra Roy Gosthipaty

@arig23498.bsky.social

MLE @ Hugging Face

1,258 Followers  |  131 Following  |  24 Posts  |  Joined: 06.11.2024

Latest posts by arig23498.bsky.social on Bluesky

5/ Each of the techniques mentioned above has its own pros and cons. The processor in your system (a phone, a laptop, etc.) combines a weighted mix of all of them.

It baffles me to think about all of this. 🤗

03.03.2025 18:05 — 👍 0    🔁 0    💬 0    📌 0

4/N Multithreading in a Single Core

Within a core, we can keep multiple register blocks (hardware contexts), each tracking a different instruction stream. If one thread stalls, the processor quickly switches to another.

03.03.2025 18:05 — 👍 0    🔁 0    💬 1    📌 0
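A loose software analogy (my sketch, not from the post): CPython threads behave this way for I/O-bound work, where a thread that is waiting gives up the core to another.

```python
import threading
import time

def fetch(name, delay):
    # time.sleep stands in for a stalled operation (waiting on memory, disk, network);
    # while this thread waits, the core can run another thread's instructions.
    time.sleep(delay)
    print(f"{name} finished after {delay}s")

threads = [threading.Thread(target=fetch, args=(f"task-{i}", 1)) for i in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
# All four 1-second waits overlap, so total wall time is ~1s, not ~4s.
print(f"elapsed: {time.perf_counter() - start:.2f}s")
```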

3/N The SIMD Paradigm

Within a single core, if we duplicate the ALUs, we can operate on a whole bunch of data elements in a single clock tick. The catch? Every lane has to perform the same operation.

Single Instruction, Multiple Data

03.03.2025 18:05 — 👍 0    🔁 0    💬 1    📌 0
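A rough software-level analogy (my sketch, assuming NumPy is installed): a vectorized array op applies one operation across many elements, and NumPy's compiled kernels can map that onto the CPU's SIMD units.

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.float32)
y = np.arange(1_000_000, dtype=np.float32)

# One "instruction" (elementwise add) applied to many data elements at once.
# NumPy's compiled loop can use SIMD instructions (e.g. AVX) under the hood.
z = x + y

# The scalar equivalent issues the same operation one element at a time:
# z = np.empty_like(x)
# for i in range(len(x)):
#     z[i] = x[i] + y[i]
```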

2/N Multi-Core Processors:

A single processor core consists of a control unit, an arithmetic unit, and some registers. What if we duplicate this whole block several times? That is the multi-core architecture. As a programmer, you need to explicitly specify which code runs on which core.

03.03.2025 18:05 — 👍 0    🔁 0    💬 1    📌 0
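A software-level sketch of the "you decide which code runs where" part (mine, using Python's standard multiprocessing; each worker process can be scheduled on a different core):

```python
import os
from multiprocessing import Pool

def partial_sum(chunk):
    # Each chunk is handled by a separate worker process,
    # which the OS can place on a different core.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = os.cpu_count() or 4
    chunk_size = len(data) // n_workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    with Pool(processes=n_workers) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)  # same result as sum(data), computed across cores
```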

1/N Superscalar Processors:

Your program is a list of instructions, and that list almost always contains independent instructions. A superscalar processor identifies them and executes them in parallel within the same clock tick.

03.03.2025 18:05 — 👍 0    🔁 0    💬 1    📌 0
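A toy illustration (mine, and purely conceptual, since the CPU sees compiled machine code rather than Python):

```python
# Independent: neither statement reads the other's result,
# so a superscalar core could issue the equivalent instructions in the same cycle.
a = 3 * 7
b = 10 + 5

# Dependent: this needs both a and b, so it has to wait for them.
c = a + b
print(c)
```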

Some pointers on parallel computing:

A small thread 🧵👇

03.03.2025 18:05 — 👍 0    🔁 0    💬 1    📌 0
SigLIP2 - a google Collection

HF model collection for transformers:
huggingface.co/collections/...

HF model collection for OpenCLIP and timm:
huggingface.co/collections/...

And of course big_vision checkpoints:
github.com/google-resea...

22.02.2025 15:34 — 👍 2    🔁 1    💬 0    📌 0
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper:
arxiv.org/abs/2502.14786

HF blog post from @arig23498.bsky.social et al. with a gentle intro to the training recipe and a demo:
huggingface.co/blog/siglip2

Thread with results overview from Xiaohua (only on X, sorry - these are all in the paper):
x.com/XiaohuaZhai/...

22.02.2025 15:34 — 👍 1    🔁 1    💬 1    📌 0

📒2️⃣ Yesterday we released SigLIP 2!

TL;DR: Improved high-level semantics, localization, dense features, and multilingual capabilities, as a drop-in replacement for v1.

Bonus: Variants supporting native aspect ratios and variable sequence lengths.

A thread with interesting resources 👇

22.02.2025 15:34 — 👍 12    🔁 1    💬 1    📌 0
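A hedged usage sketch (mine; the checkpoint name is assumed from the collection linked above, and SigLIP 2 slots into the same zero-shot image classification pipeline as v1):

```python
from transformers import pipeline

# Assumed checkpoint from the google/siglip2 collection; swap in any SigLIP 2 model.
classifier = pipeline(
    task="zero-shot-image-classification",
    model="google/siglip2-base-patch16-224",
)

result = classifier(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    candidate_labels=["two cats", "a dog", "a plane"],
)
print(result)  # list of {label, score} dicts, highest score first
```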

Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! by @arig23498.bsky.social

Build a proof-of-concept API, hosting Qwen2.5-VL-7B-Instruct on Hugging Face Spaces using Docker.

huggingface.co/blog/ariG234...

29.01.2025 14:00 — 👍 5    🔁 1    💬 0    📌 0
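To give a feel for hitting such an endpoint, here is a hypothetical client-side sketch (the Space URL and payload schema below are placeholders, not the blog's actual ones):

```python
import requests

# Placeholder URL: replace with the actual Space endpoint from the blog post.
API_URL = "https://your-username-qwen25-vl-api.hf.space/generate"

payload = {
    "image_url": "https://example.com/cat.jpg",   # image the model should look at
    "prompt": "Describe this image in one sentence.",
}

response = requests.post(API_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json())
```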
Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo

huggingface.co/blog/logits-...

23.12.2024 10:53 — 👍 7    🔁 2    💬 0    📌 0
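A small sketch of the underlying mechanism (mine, using transformers' generic `LogitsProcessor` interface rather than the zoo's ready-made processors, which plug into `generate` the same way; the model id is just a small stand-in for the demo):

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class BanTokenProcessor(LogitsProcessor):
    """Toy processor: push one token's score to -inf at every decoding step."""

    def __init__(self, banned_token_id: int):
        self.banned_token_id = banned_token_id

    def __call__(self, input_ids, scores):
        scores[:, self.banned_token_id] = -float("inf")
        return scores

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # small model, just for the demo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
banned_id = tokenizer(" Paris", add_special_tokens=False).input_ids[0]  # first token of " Paris"

out = model.generate(
    **inputs,
    max_new_tokens=20,
    logits_processor=LogitsProcessorList([BanTokenProcessor(banned_id)]),
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```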

I forgot to mention that you can use the same code to access any `warm` model on the Hub.

Here is a list of all the `warm` models: huggingface.co/models?infer...

Happy vibe checking 😇

[N/N]

03.12.2024 06:41 — 👍 2    🔁 0    💬 0    📌 0
qwq-inference-api.ipynb · ariG23498/quick-notebooks at main

I have created a simple and quick notebook that uses `huggingface_hub` to access the model through this Inference API.

huggingface.co/datasets/ari...

[4/N]

03.12.2024 06:41 — 👍 3    🔁 0    💬 1    📌 0
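The gist of it, as a minimal sketch (my own, assuming `huggingface_hub`'s `InferenceClient` and a valid HF token in your environment):

```python
from huggingface_hub import InferenceClient

# Uses the Serverless Inference API; the same pattern works for any "warm" model on the Hub.
client = InferenceClient(model="Qwen/QwQ-32B-Preview")

response = client.chat_completion(
    messages=[{"role": "user", "content": "How many r's are there in 'strawberry'?"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```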

But today was my lucky day. I noticed that the model was already loaded on the Serverless Inference API and was ready to be used.

No more spinning up my GPUs and stress testing them (happy GPU noises)

[3/N]

03.12.2024 06:41 — 👍 0    🔁 0    💬 1    📌 0

My usual workflow is to visit the model card on the Hugging Face Hub (here that was hf[dot]co/Qwen/QwQ-32B-Preview) and copy the working code sample.

I am sure this is how most of you work with a new model as well (if not, I would love to hear from you).

[2/N]

03.12.2024 06:41 — 👍 0    🔁 0    💬 1    📌 0

The Qwen team is doing so much for the community by keeping research open and constructive.

They listen to the community and put effort into building competitive models.

I was intrigued by their latest `Qwen/QwQ-32B-Preview` model and wanted to play with it.

[1/N]

03.12.2024 06:41 — 👍 10    🔁 1    💬 1    📌 0

I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:

1️⃣ Understanding tool calling with Llama 3.2 🔧
2️⃣ Using Text Generation Inference (TGI) with Llama models 🦙

(links in the next post)

29.11.2024 10:10 — 👍 12    🔁 3    💬 1    📌 0
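A rough sketch of the tool-calling flow (mine, not the project's code; it assumes a Llama 3.2 instruct checkpoint and transformers' chat-template tool support):

```python
from transformers import AutoTokenizer

def get_current_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: The city to look up.
    """
    return f"It is sunny in {city}."  # stub; a real tool would call an API

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# The chat template turns the function signature + docstring into a tool definition
# the model was trained to understand; the model then replies with a tool call to parse.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_current_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```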

I like the evaluation part. Are there any evals you particularly like?

26.11.2024 11:18 — 👍 0    🔁 0    💬 1    📌 0

What is THE pain point in training Vision Language Models, according to you?

I will go first: the data pipeline.

26.11.2024 10:52 — 👍 1    🔁 0    💬 3    📌 0

🙋‍♂️ ariG23498

23.11.2024 15:38 — 👍 2    🔁 0    💬 0    📌 0
Adding support for Qwen model by ariG23498 · Pull Request #3 · sayakpaul/simple-image-recaptioning

Re-caption your webdataset with Qwen2-VL

github.com/sayakpaul/si...

23.11.2024 12:48 — 👍 14    🔁 0    💬 0    📌 0
Faster Text Generation with Self-Speculative Decoding

huggingface.co/blog/layerskip

20.11.2024 20:21 — 👍 8    🔁 3    💬 0    📌 0
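The blog covers self-speculative decoding (LayerSkip), where early layers of the same model draft tokens. As a point of reference, here is a sketch of the general assisted-generation API in transformers with a separate draft model (mine; the model pairing is hypothetical, and the blog's self-speculative variant uses different arguments, so follow the post for the exact recipe):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical pairing: a small draft model proposes tokens, the big model verifies them.
target_id = "meta-llama/Llama-3.2-3B-Instruct"
draft_id = "meta-llama/Llama-3.2-1B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype=torch.bfloat16)
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Speculative decoding works by", return_tensors="pt")
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```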

To the video generation enthusiasts: Mochi 1 Preview is now supported in `diffusers`.

15.11.2024 10:19 — 👍 6    🔁 0    💬 0    📌 0
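A minimal sketch of trying it out (mine, not from the post; the settings are illustrative, so check the diffusers docs for the recommended ones):

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # helps fit the model on a single consumer GPU

frames = pipe(
    prompt="A close-up of a hummingbird hovering over a red flower, slow motion",
    num_frames=84,
).frames[0]

export_to_video(frames, "mochi.mp4", fps=30)
```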

awesome, thanks a lot for sharing 🙌

13.11.2024 16:37 — 👍 1    🔁 1    💬 0    📌 0

`bitsandbytes` makes it really easy to quantize models

Note: MB should be GB in the diagram.

13.11.2024 12:03 — 👍 7    🔁 1    💬 1    📌 0
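For anyone who hasn't tried it, the whole trick is one config object passed to `from_pretrained`. A minimal sketch (the model id is just a small stand-in; any causal LM on the Hub works the same way):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with bf16 compute; load_in_8bit=True is the 8-bit variant.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Quantization lets you", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```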

Read about the Qwen2.5-Coder Series

huggingface.co/blog/ariG234...

12.11.2024 07:08 — 👍 1    🔁 0    💬 0    📌 0

Training ranking models for better retrieval from stores is GOD level thinking.

08.11.2024 05:36 — 👍 0    🔁 0    💬 0    📌 0

I am diving head-first into Vision Language Models. Comment below with the papers I should definitely read.

07.11.2024 05:52 — 👍 2    🔁 0    💬 0    📌 0
Hugging Face + PyCharm

Welcome the @huggingface.bsky.social integration in PyCharm. From instant model cards to navigating the local cache, working with Hugging Face models becomes a lot easier with PyCharm.

Bonus: Claim a 3-month PyCharm subscription using the code PyCharm4HF.

Blog Post: huggingface.co/blog/pycharm...

06.11.2024 11:25 — 👍 0    🔁 0    💬 0    📌 0
