🔥 Introducing Gemini 2.5, our most intelligent model with impressive capabilities in advanced reasoning and coding.
Now integrating thinking capabilities, 2.5 Pro Experimental is our most performant Gemini model yet. It's #1 on the LM Arena leaderboard. 🔥
25.03.2025 17:25 - 👍 215 🔁 65 💬 34 📌 11
folks working on one or more of the following
🖼️ Image Descriptions to improve Image-Text alignment
AND/OR
💬 Multi/Cross Lingual image-text understanding/generation
AND/OR
Geo-Cultural representation and learning
Please DM if you are willing to discuss the current state/challenges/future-work.
25.11.2024 06:57 - 👍 3 🔁 1 💬 1 📌 0
New starter pack! go.bsky.app/GZ4hZzu
28.10.2024 09:43 - 👍 42 🔁 17 💬 6 📌 5
Too soon but [emoji]
24.11.2024 17:24 - 👍 1 🔁 0 💬 0 📌 0
🙋 Could I be added? Thanks :)
24.11.2024 16:53 - 👍 1 🔁 0 💬 0 📌 0
We had a great experience presenting our work on ImageInWords to the community at #EMNLP2024. Thank you everyone for stopping by! Looking forward to future work and seeing image descriptions as a foundational multi-modal task! @emnlpmeeting.bsky.social @deep-mind.bsky.social #NLProc #Multimodal
23.11.2024 22:53 - 👍 9 🔁 0 💬 0 📌 0
All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc
19.11.2024 03:48 - 👍 107 🔁 37 💬 1 📌 3
Research Engineer, GenMedia
Mountain View, California, US
hello new followers! we're actively hiring on our generative media team in Mountain View: boards.greenhouse.io/deepmind/job...
we work on image, video, audio, etc… come work with us if you're interested! apply asap :)
22.11.2024 06:08 - 👍 15 🔁 4 💬 1 📌 0
ImageInWords: Unlocking Hyper-Detailed Image Descriptions
Despite the longstanding adage "an image is worth a thousand words," generating accurate hyper-detailed image descriptions remains unsolved. Trained on short web-scraped image text, vision-language mo...
📢 Excited to unveil our latest research, ImageInWords (IIW)! We're pushing the boundaries of image descriptions with a new seeded, sequential, human-in-the-loop approach producing SoTA, articulate, hyper-detailed descriptions.
arXiv: arxiv.org/abs/2405.02793
#NLProc #ComputerVision #Multimodal
21.11.2024 00:26 - 👍 7 🔁 1 💬 0 📌 0
Workshop on Vision Language Models For All: Building Geo-Diverse and Culturally Aware Vision-Language Models @ CVPR 2025
https://sites.google.com/view/vlms4all
Senior Research Director at Google DeepMind in our San Francisco office. I created Magenta (magenta.withgoogle.com) and sometimes find time to be a musician.
Manchester Centre for AI FUNdamentals | UoM | Alumn UCL, DeepMind, U Alberta, PUCP | Deep Thinker | Posts/reposts might be non-deep | Carpe espresso ☕
Committed to the daily re-imagining of what a university press can be since 1962.
Website: https://mitpress.mit.edu // The Reader (our home for excerpts, essays, & interviews): https://thereader.mitpress.mit.edu
Research Scientist Meta/FAIR, Prof. University of Geneva, co-founder Neural Concept SA. I like reality.
https://fleuret.org
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
Research Engineer @ Google DeepMind
Senior Scientist, Department of Biology, Emory University, Atlanta, GA.
Molecular Biologist. RNA Scientist. Yeast Geneticist 🧬. British 🇬🇧. Runner 🏃🏻…. Opinions are my own.
Interests: RNA Decay, RNA Processing, RNA & Disease.
UT Austin journalism professor; former NYT, WP. he/him.
Theoretical Astro/Physicist:
https://chanda.science
First book:
https://tinyurl.com/DisorderedCosmos
PREORDER MY NEXT BOOK:
https://tinyurl.com/EdgeOfSpaceTime
Newsletter:
news.chanda.science
all Black/all Jewish. 🏳️‍🌈/agender/woman.
Posts by/for me
Professor for CS at the Tuebingen AI Center and affiliated Professor at MIT-IBM Watson AI lab - Multimodal learning and video understanding - GC for ICCV 2025 - https://hildekuehne.github.io/
Post All & Anything Liverpool Related.
Transfers | Updates | Rumours | #lfc #Liverpool
Liverpool content repost for bsky users.
Not affiliated with LFC.
A progressive voice for a metropolitan community.
Part of the Joe Media Group.
building chatgpt at openai
prev quizlet, twitter, mit
ankushg.com
data science @ openai
here for data, transit, urbanism, nyc
Engineering at OpenAI. Formerly working on Fuchsia at Google
ai research @ thinking machines. realtime video+voice. i like trains and bikes. sometimes I climb rocks and throw pottery.