Ivana Balazevic

@ibalazevic.bsky.social

Senior Research Scientist at Google DeepMind, working on Gemini. PhD from University of Edinburgh. ibalazevic.github.io

913 Followers  |  134 Following  |  4 Posts  |  Joined: 16.11.2024

Latest posts by ibalazevic.bsky.social on Bluesky

Disentanglement is an intriguing phenomenon that arises in generative latent variable models for reasons that are not fully understood.

If you're interested in learning why, I highly recommend giving Carl's blog a read!

18.12.2024 17:08 - 👍 3    🔁 1    💬 0    📌 0
Research Scientist, Language London, UK

I am hiring for RS/RE positions! If you are interested in language-flavored multimodal learning, evaluation, or post-training apply here 🦎 boards.greenhouse.io/deepmind/job...

I will also be at #NeurIPS2024 so come say hi! (Please email me to find time to chat)

06.12.2024 23:07 - 👍 28    🔁 7    💬 1    📌 1

Our big_vision codebase is really good! And it's *the* reference for ViT, SigLIP, PaliGemma, JetFormer, ... including fine-tuning them.

However, it's criminally undocumented. I tried using it outside Google to fine-tune PaliGemma and SigLIP on GPUs, and wrote a tutorial: lb.eyer.be/a/bv_tuto.html

03.12.2024 00:18 - 👍 118    🔁 19    💬 3    📌 2

I think this comes down to the model behind p(x,y). If features of x cause y, e.g. aspects of a website (x) -> clicks (y); age/health -> disease, then p(y|x) is a (regression) fn of x. But if x|y is a distrib'n of different y's (e.g. cats) then p(y|x) is given by Bayes rule (squint at softmax).

02.12.2024 08:20 - 👍 7    🔁 1    💬 1    📌 0
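
The "squint at softmax" remark in the post above can be checked numerically. The following is a minimal sketch, not taken from the post: the function names and the three-class numbers are hypothetical, chosen only for illustration. It computes p(y|x) once directly from Bayes' rule and once as a softmax over the logits log p(x|y) + log p(y), and the two agree.

```python
import math

def posterior_from_bayes(log_likelihoods, log_priors):
    """p(y|x) = p(x|y) p(y) / sum_y' p(x|y') p(y'), one entry per class y."""
    joint = [math.exp(ll + lp) for ll, lp in zip(log_likelihoods, log_priors)]
    evidence = sum(joint)  # p(x), the normalising constant
    return [j / evidence for j in joint]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values of p(x|y) and p(y) for three classes (illustration only).
log_likelihoods = [math.log(0.2), math.log(0.05), math.log(0.01)]
log_priors = [math.log(0.5), math.log(0.3), math.log(0.2)]

bayes = posterior_from_bayes(log_likelihoods, log_priors)

# Treat log p(x|y) + log p(y) as the logit of class y.
logits = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
soft = softmax(logits)

print(bayes)
print(soft)
```

Under this reading, a softmax classifier can be seen as Bayes' rule applied to class-conditional models whose log joint plays the role of the logits.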

Read our paper:
Context-Aware Multimodal Pretraining

Now on ArXiv

Can you turn vision-language models into strong any-shot models?

Go beyond zero-shot performance in SigLixP (x for context)

Read @confusezius.bsky.social thread below…

And follow Karsten … a rising star!

28.11.2024 17:03 - 👍 36    🔁 4    💬 0    📌 0

We maintain strong zero-shot transfer of CLIP / SigLIP across model size and data scale, while achieving up to 4x few-shot sample efficiency and up to +16% performance gains!

Fun project with @confusezius.bsky.social, @zeynepakata.bsky.social, @dimadamen.bsky.social and @olivierhenaff.bsky.social.

28.11.2024 14:43 - 👍 20    🔁 3    💬 0    📌 1

Just a heads up to everyone: @deep-mind.bsky.social is unfortunately a fake account and has been reported. Please do not follow it or repost anything from it.

25.11.2024 23:24 - 👍 82    🔁 35    💬 9    📌 3

Could you add me please? :)

24.11.2024 20:33 - 👍 2    🔁 0    💬 2    📌 0

Me too please :)

22.11.2024 00:30 - 👍 1    🔁 0    💬 1    📌 0

@ibalazevic is following 20 prominent accounts