Kuzman Ganchev @ganchev - Bluesky Profile

Latest posts by ganchev.bsky.social on Bluesky

Study in Nature: “Across 30 out of 32 evaluation axes from the specialist physician perspective & 25 out of 26 evaluation axes from the patient-actor perspective, AMIE [Google Medical LLM] was rated superior to PCPs [primary care docs] while being non-inferior on the rest.”

(& AMIE is an older LLM)

04.05.2025 13:27 — 👍 70 🔁 15 💬 4 📌 7

Gemma explained: What’s new in Gemma 3 - Google Developers Blog. Google's Gemma 3 model includes vision-language support and architectural changes for resource-friendly multimodal language models.

Gemma 3 explained: Longer context, image support, and a new 1B model. → goo.gle/4lV8iaw

Other key enhancements:
🔸 Best model that fits in a single consumer GPU or TPU host
🔸 KV-cache memory reduction with 5-to-1 interleaved attention
🔸 And more!

Read the blog for the full details on Gemma 3.

30.04.2025 21:46 — 👍 22 🔁 8 💬 1 📌 0
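To make the "5-to-1 interleaved attention" bullet in the post above concrete, here is a rough sketch of why interleaving local and global attention layers shrinks the KV cache. It counts cached token positions per layer, assuming five sliding-window (local) layers for every full-attention (global) layer. The window size of 1024, the layer ordering, and the function names are illustrative assumptions, not details taken from the post.

```python
def interleaved_kv_positions(num_layers, context_len, local_per_global=5, window=1024):
    """Count cached token positions summed over all layers.

    Assumption (hypothetical layout): every (local_per_global + 1)-th layer
    is a global layer that caches the full context; the other layers are
    local and cache at most `window` positions.
    """
    total = 0
    for layer in range(num_layers):
        is_global = (layer + 1) % (local_per_global + 1) == 0
        total += context_len if is_global else min(window, context_len)
    return total


def all_global_kv_positions(num_layers, context_len):
    """Baseline: every layer caches the full context."""
    return num_layers * context_len


if __name__ == "__main__":
    layers, context = 6, 8192  # toy values chosen for illustration
    interleaved = interleaved_kv_positions(layers, context)
    baseline = all_global_kv_positions(layers, context)
    print(f"interleaved: {interleaved} positions, baseline: {baseline} positions")
    print(f"fraction of baseline: {interleaved / baseline:.3f}")
```

Under these assumptions the saving grows with context length, since the local layers stop growing once the context exceeds the window. For contexts shorter than the window, the two layouts cache the same number of positions.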

There's a link to a really nice interactive viewer for a sample of the data (it will only make sense after you read the post). There are some examples that I would have expected (where something is implied but not directly stated), but also a surprising number of kind of topical things.

17.12.2024 16:12 — 👍 3 🔁 1 💬 0 📌 0

Want to get started using PaliGemma 2?

🎤 developers.googleblog.com/en/introduci...
🤗 huggingface.co/blog/paligem...
💾 kaggle.com/models/googl...
🔧 github.com/google-resea...

7/7

05.12.2024 18:19 — 👍 7 🔁 1 💬 0 📌 0

GitHub - varungodbole/prompt-tuning-playbook: A playbook for effectively prompting post-trained LLMs

Wanted to share that Varun Godbole recently released a prompting playbook. The title says prompt tuning, but this is text prompts, not soft prompts.

github.com/varungodbole...

11.11.2024 15:51 — 👍 14 🔁 7 💬 0 📌 0

ALTA: Compiler-Based Analysis of Transformers. We propose a new programming language called ALTA and a compiler that can map ALTA programs to Transformer weights. ALTA is inspired by RASP, a language proposed by Weiss et al. (2021), and Tracr (Lin...

I’m pretty excited about this one!

ALTA is A Language for Transformer Analysis.

Because ALTA programs can be compiled to transformer weights, ALTA provides constructive proofs of transformer expressivity. It also offers new analytic tools for *learnability*.

arxiv.org/abs/2410.18077

24.10.2024 03:31 — 👍 53 🔁 16 💬 2 📌 0

Zed - The editor for what's next. Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.

Not news, but I recently saw the zed.dev demo and it looks amazing. Has anyone used it or something similar?

25.10.2024 14:43 — 👍 3 🔁 0 💬 0 📌 0

@ganchev is following 20 prominent accounts

Ethan Mollick
@emollick

Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence. Book: https://a.co/d/bC2kSj1 Substack: https://www.oneusefulthing.org/ Web: https://mgmt.wharton.upenn.edu/profile/emollick

@gracaninja

Noah A. Smith
@nlpnoah

Researcher in NLP, ML, computer music. Prof @uwcse @uwnlp & helper @allen_ai @ai2_allennlp & familiar to two cats. Single reeds, tango, swim, run, cocktails, מאַמע־לשון, GenX. Opinions not your business.

Christopher Manning
@chrmanning

Stanford Linguistics and Computer Science. Director, Stanford AI Lab. Founder of @stanfordnlp.bsky.social . #NLP https://nlp.stanford.edu/~manning/

Yoav Artzi
@yoavartzi.com

LM/NLP/ML researcher ¯\_(ツ)_/¯ yoavartzi.com / associate professor @ Cornell CS + Cornell Tech campus @ NYC / nlp.cornell.edu / associate faculty director @ arXiv.org / researcher @ ASAPP / starting @colmweb.org / building RecNet.io

Hanna Wallach
@hannawallach

VP and Distinguished Scientist at Microsoft Research NYC. AI evaluation and measurement, responsible AI, computational social science, machine learning. She/her. One photo a day since January 2018: https://www.instagram.com/logisticaggression/

Jenn Wortman Vaughan
@jennwv

Sr. Principal Research Manager at Microsoft Research, NYC // Machine Learning, Responsible AI, Transparency, Intelligibility, Human-AI Interaction // WiML Co-founder // Former NeurIPS & current FAccT Program Co-chair // Brooklyn, NY // http://jennwv.com

Tal Linzen
@tallinzen

NYU professor, Google research scientist. Good at LaTeX.

Mohit Bansal
@mohitbansal

Parker Distinguished Professor, @UNC. Program Chair #EMNLP2024. Director http://MURGeLab.cs.unc.edu (@uncnlp). @Berkeley_AI @TTIC_Connect @IITKanpur #NLP #CV #AI #ML https://www.cs.unc.edu/~mbansal/

Luke Zettlemoyer
@lukezettlemoyer

Professor at UW; Researcher at Meta. LMs, NLP, ML. PNW life.

Mark Cuban
@mcuban

Entrepreneur Costplusdrugs.com

Eugene Vinitsky 🍒
@eugenevinitsky

Anti-cynic. Towards a weirder future. Reinforcement Learning, Autonomous Vehicles, transportation systems, the works. Asst. Prof at NYU https://emerge-lab.github.io https://www.admonymous.co/eugenevinitsky

Karen Ullrich (s/h) ✈️ Neurips
@karen-ullrich

Research scientist at FAIR NY ❤️ LLMs + Information Theory. Previously, PhD at UoAmsterdam, intern at DeepMind + MSRC.

Jaime Teevan
@teevan

Chief Scientist & Technical Fellow at Microsoft. Professor at UW. Mother to four wild boys. #AI #HCI #Productivity #FutureOfWork

Kushal Chauhan
@kushalchauhan

Research Engineer at Google DeepMind

Thomas Kipf
@tkipf

Research at Google DeepMind. Ex-Physicist. Controllable World Simulators (GNNs, Structured World Models, Neural Assets). TLM Veo Capabilities (Ingredients & more). 📍 San Francisco, CA

Kyle Levin
@kmlevin

data scientist at Google DeepMind

Hamza Merzić
@hamzamerzic.info

Staff Research Engineer @ Google DeepMind PhD Candidate @ UCL 🇵🇸

Jacob Eisenstein
@jacobeisenstein

natural language processing and computational linguistics at google deepmind.

Colin Carroll
@colcarroll

Runner, biker, hiker. Software engineer @DeepMind, and open source enthusiast. Sometimes crafts things out of wood. he/his.