In light of the recent controversies around Meta's AI bot or xAI's Ani, but also the broader debate on model sycophancy, I'm reposting this
18.08.2025 12:51
@pitti.io.bsky.social
Just trying to kill boredom without killing anyone in the process | Anything unrelated to actual (super niche) area of expertise | Dubito ergo sum
Anthropic's recent evals for the release of Sonnet 4 and Opus 4 represent a good opportunity to re-share a blogpost from last year entitled "Artificial Intelligence: what everyone can agree on"
It touched on benchmark-gaming
It's helpful when you know exactly what you need… and when what you need is reasonably simple… and you know how to fix the buttons that do not work and the design.
I may use it as a starting point for something else actually
It feels more like early Replit than Cursor
I gave it another chance on a case where I know that React TypeScript could be a good option (and I've kind of worked on this already so I can judge the choices made)
24.05.2025 16:09
I suppose that the wow effect is there for anyone who does not know how to code… But for now it is not a productivity tool, it's a POC
You can deploy to Cloud Run (not tested) so I think that Google's strategy is clear. They will invest in it. Let's wait for the next upgrade
So you can't actually work on YOUR project with your files. This just builds React apps from scratch for you. That's it. I tried to ask for a Django app to check, I got another tsx app
24.05.2025 16:09
UX is a bit weird (like closing right panel on the left… many things like this). I never found out how to save a project created from scratch
Projects come with specific files (even for empty templates), you can't delete them
It would literally build an app based on a prompt. Create files, edit them, reorganize them… very Cursor-like experience at a shallow level.
24.05.2025 16:09
I've been testing the new Google AI Studio feature "Build" as a promising mitigant to Gemini's code slop tendency (which the AI Studio UI makes a horrible experience)
Unfortunately this is very far from ready to do actual work
Strongly advise to wait a couple of weeks (details below)
If I believe the marketing materials, the latest Claude models can work continuously for longer than the Twitter backend does
24.05.2025 13:48
So jack morris is in fact john morris
21.05.2025 21:13
Earlier this month, I wrote about the Cambrian explosion in the voice modality (TTS and STS) and the evolution of frontier labs' commercial strategies
www.pitti.io/blogs/not-to...
Part Three tackles the conceptual dangers arising from the spread of sophisticated influence tools as AI continues to advance.
www.pitti.io/blogs/aligne...
Part Two sheds a technical light on the profound transformations in the information value chain over the past 25 years.
www.pitti.io/blogs/aligne...
Part One of this series explores the potential disconnect between technological and societal innovation, and what makes a tool truly transformative for both.
www.pitti.io/blogs/aligne...
One of my mutuals here suggested a Bluesky Wednesday as a way to revive interest in this platform.
Reposting here a blog series bringing together various thoughts and fragments posted elsewhere around AI alignment.
Some context first
www.pitti.io/blogs/aligne...
I did not find this platform very welcoming last time but I'm willing to give it another try.
21.02.2025 13:37
Reposting here an essay from early Feb on existential economic questions for generations X, Y and Z.
Touching on:
- Aging population
- Sovereign debt
- Technology
- A new world order
www.pitti.io/articles/exi...
This place is clearly not a safe haven as its decentralized nature is a double-edged sword
AI seems even more divisive than on other platforms. No punches pulled over here (and it's not great)
In September, I wrote about what everyone can agree on.
www.pitti.io/articles/ai-...
Thanks for pointing this out. I'm running it locally in 8-bit (in MLX), so in theory it should be better than in the blog.
It is somewhat helpful but it makes so many small mistakes that it actually takes longer to fix than to write it from scratch with a bit of autocompletion
Qwen2.5-Coder-32B doesn't really live up to expectations
Can't handle a Redux store in a fairly simple app with 3 components
When a new model comes out and I want to vibecheck it with a single prompt, my prompt is:
Please provide a detailed analysis of the strengths and weaknesses of the business model of a typical hairdresser
I feel that the situation is slightly different for commercial services where they force you to fill in a form to place an order, collect data that they don't need and never delete it.
But I have no solution to offer. It is what it is. Same for cookies (it's super bad but not as bad as banners)
Oh yes, I think there are a lot of lessons to be learnt from GDPR. Good intentions but completely disconnected from practical realities (it's even practically impossible to enforce the law). For social media platforms, education of end-users is the only way to protect them.
28.11.2024 11:08
Lawmakers naively thought that, by putting so much burden on companies, companies would choose not to collect data and/or to delete it quickly
In reality they chose to put up horrible banners, making users pay with their time for EU lawmakers' shortsightedness, and kept collecting data like before without ever deleting it
And to be clear, the rationale for data laws in Europe is not "don't touch my data", it's "guarantee data ownership to individuals". The way it is implemented is a disaster. But the rights to ask for location, deletion and portability are good imo. Portability is underrated (and never enforced)
28.11.2024 10:38
Presenting it like this probably misses a lot of the historical context but it's an important factor. You understand why the Snowden revelations shocked Europe
Younger generations are more comfortable with it but they are not the ones making the law.
And what about a merge of 2 open LLMs trained on synthetic data produced by an open LLM trained on private data?
And what about the code to train the open LLM, which was itself 75% generated by an LLM?
It's over