Heiko Hotz

@heikohotz.bsky.social

AI Engineer @ Google 👨‍💻 | Educator 👨‍🏫 | Traveller ✈️ | Hobby photographer 📷 | Foodie 🌮 | Film fan 🍿 | Boardgamer 🎲 | Londoner 💂‍♂️
Medium: https://heiko-hotz.medium.com/
Github: https://github.com/heiko-hotz
LI: https://www.linkedin.com/in/heikohotz/

488 Followers  |  628 Following  |  466 Posts  |  Joined: 30.11.2024  |  1.9308

Latest posts by heikohotz.bsky.social on Bluesky

Oh, hello! May I meet you?

18.11.2025 15:58 | 👍 1    🔁 0    💬 0    📌 0

I really like tiny (you could even say "nano") bananas. They are so full of flavour 😋

19.08.2025 07:07 | 👍 1    🔁 0    💬 1    📌 0

Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world's most prestigious competition for young...

Read more about it here: deepmind.google/discover/blo...

21.07.2025 22:23 | 👍 1    🔁 0    💬 0    📌 0

This year, the advanced Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions – all within the 4.5-hour competition time limit.

21.07.2025 22:23 | 👍 0    🔁 0    💬 1    📌 0

At IMO 2024, AlphaGeometry and AlphaProof required experts to first translate problems from natural language into domain-specific languages, such as Lean, and vice-versa for the proofs.

21.07.2025 22:23 | 👍 0    🔁 0    💬 1    📌 0

This achievement is a significant advance over last year's result.

21.07.2025 22:22 | 👍 0    🔁 0    💬 1    📌 0

An advanced version was able to solve 5 out of 6 problems.

21.07.2025 22:22 | 👍 0    🔁 0    💬 1    📌 0

Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆

21.07.2025 22:22 | 👍 1    🔁 0    💬 1    📌 0

Let AI Tune Your Voice Assistant | Towards Data Science
A practical guide to automating prompt engineering for voice assistants

The result? Saves tons of time, money, and builds super reliable voice assistants that have undergone a rigorous evaluation process. No more guesswork! 📈
Full details + code here: towardsdatascience.com/let-ai-tune-...

15.07.2025 07:06 | 👍 1    🔁 1    💬 0    📌 0

GOOD NEWS: I built an #AutomatedPromptEngineering (APE) pipeline specifically for voice AI! 🤖✨ My new @towardsdatascience blog post dives deep.
What it does:
✅ Creates diverse audio tests
✅ Automates performance eval
✅ LLM optimizes your prompts! 👇

15.07.2025 07:06 | 👍 0    🔁 0    💬 1    📌 0

Evaluating voice-driven agents got you pulling your hair out? 😩 Evaluating voice agents is WILD. Accents, noise, weird speech... how do you even test?! Manual prompt engineering for that? A total nightmare. 👇

15.07.2025 07:05 | 👍 2    🔁 0    💬 1    📌 0

Thanks for sharing @towardsdatascience.com 🤗

15.07.2025 07:03 | 👍 1    🔁 0    💬 0    📌 0

WWDC INTERVIEW: Craig & Joz on Why Siri's Not Ready, AI Vision and iPadOS Shocker!
YouTube video by Tom's Guide

Tom's Guide: www.youtube.com/watch?v=Pt3q...

12.06.2025 15:41 | 👍 0    🔁 0    💬 0    📌 0

Apple Execs Defend Siri Delays, AI Plan and Apple Intelligence | WSJ
YouTube video by The Wall Street Journal

WSJ interview: www.youtube.com/watch?v=NTLk...

12.06.2025 15:41 | 👍 0    🔁 0    💬 1    📌 0

Building a v1 GenAI app on an existing platform while overhauling the foundation for a better 'V2' is a common strategy. But explaining this to everyday consumers is challenging. These kinds of interviews really help communicate that effectively. What did you think?

12.06.2025 15:40 | 👍 0    🔁 0    💬 1    📌 0

To give an example, right out of the gate she asks, "Last year you announced a smarter AI-driven Siri. Where is she?"
From a developer's point of view, Apple's answers made a lot of sense: a 'V1' worked, but didn't meet their high quality/reliability standards when users went 'off the beaten path'.

12.06.2025 15:40 | 👍 0    🔁 0    💬 1    📌 0

This year, however, Craig Federighi and Greg Joswiak were interviewed by other outlets, including Tom's Guide, TechRadar, and The Wall Street Journal. I particularly liked Joanna Stern's interview and her style: direct, concise, and challenging.

12.06.2025 15:40 | 👍 0    🔁 0    💬 1    📌 0

While it's not fair to characterise Gruber as an "Apple fanboy," I consistently found his questions too long-winded and too softball. By the end, it often felt (to me, at least) like just a few folk were a bit too cosy on stage.

12.06.2025 15:39 | 👍 0    🔁 0    💬 1    📌 0

WWDC interviews with Apple executives just got a facelift - and it is refreshing!
For years, high-level Apple execs would come to John Gruber's (from Daring Fireball) Talk Show at WWDC. I often found these interviews less than insightful, and sometimes even annoying.

12.06.2025 15:39 | 👍 0    🔁 0    💬 1    📌 0

i definitely hear you on that one 😅 out of curiosity - what are the benefits you are looking to gain from an agent framework (in general)?

16.01.2025 15:19 | 👍 0    🔁 0    💬 1    📌 0

Introducing Gemini-Powered Slide Creation by Voice!

In this quick demo, I've integrated a "Slide Creation Agent" into my personal project, Project Pastra. Watch how it effortlessly generates slides based on voice instructions.

14.01.2025 07:47 | 👍 4    🔁 1    💬 0    📌 0

GitHub - heiko-hotz/gemini-multimodal-live-dev-guide: A developer guide for Gemini's Multimodal Live API

Not perfect by any means, but much better already than "traditional" voice assistants, and we are only at the beginning of this journey.

You can try it yourself with the Developer Guide for Gemini's Multimodal Live API 🤗

github.com/heiko-hotz/g...

08.01.2025 07:43 | 👍 1    🔁 0    💬 0    📌 0

I believe that multimodal AI models have the potential to change that. They allow me to speak much more freely about what I want them to do and oftentimes they understand and execute in the way I expected them to.

08.01.2025 07:43 | 👍 0    🔁 0    💬 1    📌 0

But soon I realised that these voice assistants still require a rigid syntax: I would have to phrase commands in a very specific way for the voice assistant to understand what I meant.

08.01.2025 07:43 | 👍 0    🔁 0    💬 1    📌 0

To me it was a magical moment when I got my first Amazon Echo in 2015 and could just shout words into the air and got a response.

08.01.2025 07:43 | 👍 0    🔁 0    💬 1    📌 0

Gemini's Multimodal Live API with Calendar Tool
YouTube video by Heiko Hotz

Multimodal AI models have the potential to finally deliver on the dream of language being the ultimate human-computer interface 🎙️

youtu.be/0OEDHAjY6LM

08.01.2025 07:42 | 👍 0    🔁 0    💬 1    📌 0

Fade Out
YouTube video by Secret Level

Fade Out. Directed by Jason Zada. Created with Google's Veo 2.

youtu.be/9yQXkdA3u8k?...

30.12.2024 20:56 | 👍 1    🔁 0    💬 0    📌 0

GitHub - heiko-hotz/gemini-multimodal-live-dev-guide: A developer guide for Gemini's Multimodal Live API

Check it out and start building your own voice assistant 🤗

github.com/heiko-hotz/g...

30.12.2024 09:26 | 👍 0    🔁 0    💬 0    📌 0

It is a full-featured web application for real-time conversations with audio and video input, memory, and tool use! And it works great on mobile phones, too.

30.12.2024 09:26 | 👍 0    🔁 0    💬 1    📌 0

But Gemini's Multimodal Live API actually lets you build a comparable experience today! I'm proud to share a step-by-step developer guide that will help you build Project Pastra.

30.12.2024 09:26 | 👍 1    🔁 0    💬 1    📌 0