View all the results on our blog and GitHub
firebender.com/blog/kotlin-...
github.com/firebenders/...
Introducing Kotlin-bench V2, the first benchmark that evaluates agentic coding models for Android and Kotlin.
Claude Opus 4.5 is the most intelligent model for Android.
Gemini 3 Flash delivers the best cost-to-intelligence ratio: strong intelligence at roughly 10x lower cost.
View the full leaderboard here: firebender.com/leaderboard
Try out Claude 4 Sonnet in Firebender: plugins.jetbrains.com/plugin/25224...
Claude 4 Sonnet is officially the best AI model for Android and Kotlin development.
Claude 4 Sonnet solved 26% of Kotlin-bench tasks, outperforming OpenAI's o3.
Claude 4 Sonnet & Opus are available in Firebender today for all users of JetBrains IDEs. Try them out and let us know what you think!
o4-mini and o3 are now in Firebender 0.9.20 on Android Studio/IntelliJ.
Agent benchmarks are coming soon for Kotlin-bench.
Updated our Kotlin-bench leaderboard with results for Grok 3 and GPT-4.1!
TL;DR: Grok 3 is a very capable coding model for Android & Kotlin development. GPT-4.1 shows improvement but still trails behind other major competitors.
See the full leaderboard here:
firebender.com/leaderboard
I just released Kotlin-bench, the first-ever benchmark that evaluates LLMs against real-world Kotlin & Android GitHub issues.
Gemini 2.5 topped the leaderboard, solving 14% of issues; Claude 3.7 thinking came second with 12%.
Code, datasets, and results here: firebender.com/blog/kotlin-...
The future of Android development will be multiple agents that code, fix bugs, and implement UI changes from Figma, all autonomously. Firebender 0.9.6 proves this, and here's why:
03.03.2025 23:23
First impressions after about an hour of using it.
1. Absolutely love how it fixes its own errors.
2. Autocomplete feels much faster than Copilot.
3. Very eager to make changes outside the scope of the file I'm working on. That might be user error, and it might be fixable with rules.
DMing
03.03.2025 01:26
Try it out and let us know what you think! We just made it IntelliJ-compatible in addition to our standard Android Studio offering.
Very curious how it helps with your KMP/CMP work
Yep!!
You can specify guidelines and rules for how you want the AI to write tests, what architecture pattern you expect, and more.
More info here
docs.firebender.com/context/rules