Our #CVPR2025 workshop on Emergent Visual Abilities and Limits of Foundation Models (EVAL-FoMo) is taking place this afternoon (1-6pm) in room 210.
Workshop schedule: sites.google.com/view/eval-fo...
Also at #ICLR2025: See this work in action! We're demoing "Gecko," showing how we use capability-based evaluators to help users customize evals & select models.
Find us at the Google booth, Fri 4/26, 12:00-12:30 PM.
At #ICLR2025, we're diving into what makes prompt adherence evaluators work for image/video generation.
Check out our poster Friday at 3 PM: iclr.cc/virtual/2025/p…
Generative models are powerful evaluators/verifiers, impacting evaluation and post-training. Yet, making them effective, particularly for highly similar models/checkpoints, is challenging. The devil is in the details.
23.04.2025 21:23

I know multiple people who need to hear this piped into their offices during working hours
10.03.2025 05:02

I wrote about why funding for the NSF and NIH is important to me and to my hometown (Charlottesville, VA) at @cvilletomorrow.bsky.social. Thank you to @sciencehomecoming.bsky.social for inspiring me to do this!
www.cvilletomorrow.org/if-federal-f...
Our 2nd Workshop on Emergent Visual Abilities and Limits of Foundation Models (EVAL-FoMo) is accepting submissions. We are looking forward to talks by our amazing speakers that include @saining.bsky.social, @aidanematzadeh.bsky.social, @lisadunlap.bsky.social, and @yukimasano.bsky.social. #CVPR2025
13.02.2025 16:02

if you would like to attend #ICLR2025 but have financial barriers, apply for financial assistance!
our priority categories are student authors and contributors from underrepresented demographic groups & geographic regions.
deadline is march 2nd.
iclr.cc/Conferences/...
Our representational alignment workshop returns to #ICLR2025! Submit your work on how ML/cogsci/neuro systems represent the world & what shapes these representations.
w/ @thisismyhat.bsky.social @dotadotadota.bsky.social, @sucholutsky.bsky.social @lukasmut.bsky.social @siddsuresh97.bsky.social
The RE application is now open: boards.greenhouse.io/deepmind/job...
And here is the link to the RS position:
boards.greenhouse.io/deepmind/job...
What was the most impactful/visible/useful release on evaluation in AI in 2024?
06.01.2025 12:10

bye, felix
kyunghyuncho.me/bye-felix/
A brilliant colleague and wonderful soul, Felix Hill, recently passed away. This was a shock, and in an effort to sort some things out, I wrote them down. Maybe this will help someone else, but at the very least it helped me. Rest in peace, Felix, you will be missed. www.janexwang.com/blog/2025/1/...
03.01.2025 04:02

[Photo: Felix Hill and some other DMers and I after cold water swimming at Parliament Hill Lido a few years ago]
Felix Hill was such an incredible mentor (and occasional cold water swimming partner) to me. He's a huge part of why I joined DeepMind and how I've come to approach research. Even a month later, it's still hard to believe he's gone.
02.01.2025 19:01

[Image: Beautiful crow against a black background]
It seems to me that the time is ripe for a Bluesky thread about how (and maybe even why) to befriend crows.
(1/n)
Here's Veo 2, the latest version of our video generation model, as well as a substantial upgrade for Imagen 3.
(Did I mention we are hiring on the Generative Media team, btw?)
blog.google/technology/g...
I've been getting a lot of questions about autoregression vs diffusion at #NeurIPS2024 this week! I'm speaking at the adaptive foundation models workshop at 9AM tomorrow (West Hall A), about what happens when we combine modalities and modelling paradigms.
adaptive-foundation-models.org
The released benchmark can also be used as a counting VQA benchmark and for evaluating auto-metrics.
Paper: arxiv.org/abs/2406.14774
[Figure: Our main task categories]
We design 3 main tasks with varying degrees of difficulty and evaluate 13 models across different families. Models show rudimentary numerical reasoning skills, limited to small numbers and simple prompt formats; many models are affected by non-numerical prompt manipulations.
09.12.2024 19:08

What do text-to-image models know about numbers? Find out in our new paper "Evaluating Numerical Reasoning in Text-to-Image Models," to be presented at #NeurIPS2024 (Wed 4:30-7:30 PM, #5304).
Dataset: github.com/google-deepm... (1,386 prompts; 52,721 images; 479,570 annotations)
Stop by our #NeurIPS tutorial on Experimental Design & Analysis for AI Researchers!
neurips.cc/virtual/2024/tutorial/99528
Are you an AI researcher interested in comparing models/methods? Then your conclusions rely on well-designed experiments. We'll cover best practices + case studies.
If you will be at #NeurIPS2024 @neuripsconf.bsky.social and would like to see our models in action, come say hi and check out our demo at the GDM booth!
Wednesday, Dec. 11th @ 9:30-10:00.
Lots of other great things to see as well! Check it out:
deepmind.google/discover/blo...
I am hiring for RS/RE positions! If you are interested in language-flavored multimodal learning, evaluation, or post-training, apply here: boards.greenhouse.io/deepmind/job...
I will also be at #NeurIPS2024, so come say hi! (Please email me to find time to chat.)
Our big_vision codebase is really good! And it's *the* reference for ViT, SigLIP, PaliGemma, JetFormer, ... including fine-tuning them.
However, it's criminally undocumented. I tried using it outside Google to fine-tune PaliGemma and SigLIP on GPUs, and wrote a tutorial: lb.eyer.be/a/bv_tuto.html