The Gemini 2.5 Technical Report is out: storage.googleapis.com/deepmind-med...
17.06.2025 20:09 · @j5b.bsky.social (ML & NLP at Google DeepMind)
Introducing Gemini 2.5, our most intelligent model with impressive capabilities in advanced reasoning and coding.
Now integrating thinking capabilities, 2.5 Pro Experimental is our most performant Gemini model yet. It's #1 on the LM Arena leaderboard.
We've been teaching Gemini to think.
Try it here: aistudio.google.com/prompts/new_...
Happy birthday, Gemini!
06.12.2024 22:10
We release Tülu 3, a family of fully open, state-of-the-art post-trained models, alongside their data, code, and training recipes, serving as a comprehensive guide for modern post-training techniques!
21.11.2024 17:29
Good software is an enabler for good science!
Inspired by the post below, I like to point people to libraries like github.com/patrick-kidg... as a template for what a modern Python library looks like: `pre-commit`, ruff, pyright, pyproject.toml, an open-source license, etc.
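To make that concrete, here is a minimal, hypothetical pyproject.toml sketch of the kind of setup that post is pointing at. The project name, version, license choice, and tool settings are illustrative placeholders, not taken from any particular library:

```toml
# Hypothetical pyproject.toml for a modern Python library (placeholder values).
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "example-lib"                 # placeholder name
version = "0.1.0"
description = "A small example library."
readme = "README.md"
license = { text = "Apache-2.0" }    # an open-source license
requires-python = ">=3.10"
dependencies = []

[project.optional-dependencies]
# Development tools mentioned in the post, installed via `pip install -e ".[dev]"`.
dev = ["ruff", "pyright", "pre-commit", "pytest"]

[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I"]             # pycodestyle errors, pyflakes, import sorting

[tool.pyright]
typeCheckingMode = "strict"
```

A matching `.pre-commit-config.yaml` would then run ruff on every commit, so formatting and lint errors are caught before they ever reach CI.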
Fun, insightful, useful, cheap: "Thinking Like a Large Language Model: Become an AI Manager" — a.co/d/7xMTtJM
17.11.2024 17:04
A comparison of the LLMs' mean ratings along presentational and epistemological dimensions.
We compared notable LLMs such as InstructGPT, ChatGPT, GPT-4, PaLM 2 (text-bison), and Falcon-180B. They excel at presenting climate information, but there's room for improvement in the epistemic qualities of their answers.
06.10.2023 17:28
This is a tough task for human raters. Our study finds that AI can effectively assist human raters, offering promising avenues for scalable oversight on difficult problems like this.
06.10.2023 17:27
Excited to share our latest paper: we explore how large language models tackle questions on climate change, introducing an evaluation framework grounded in #SciComm research.
Read the preprint: arxiv.org/abs/2310.02932