(1/4) Ever wondered what tech policy might look like if it were informed by research on collective intelligence and complex systems? π§ π§βπ»
Join @jbakcoleman.bsky.social, @lukethorburn.com, and myself in San Diego on Aug 4th for the Collective Intelligence x Tech Policy workshop at @acmci.bsky.social!
19.05.2025 11:01 β π 18 π 12 π¬ 1 π 3
Why an overreliance on AI-driven modelling is bad for science
Without clear protocols to catch errors, artificial intelligenceβs growing role in science could do more harm than good.
New commentary in @nature.com from professor Arvind Narayanan (@randomwalker.bsky.social) & PhD candidate Sayash Kapoor (@sayash.bsky.social) about the risks of rapid adoption of AI in science - read: "Why an overreliance on AI-driven modelling is bad for science" π
#CITP #AI #science #AcademiaSky
09.04.2025 18:19 β π 18 π 10 π¬ 0 π 0
AI as Normal Technology
In a new essay from our "Artificial Intelligence and Democratic Freedoms" series, @randomwalker.bsky.social & @sayash.bsky.social make the case for thinking of #AI as normal technology, instead of superintelligence. Read here: knightcolumbia.org/content/ai-a...
15.04.2025 14:34 β π 38 π 17 π¬ 1 π 5
Why an overreliance on AI-driven modelling is bad for science
Without clear protocols to catch errors, artificial intelligenceβs growing role in science could do more harm than good.
βThe rush to adopt AI has consequences. As its use proliferatesβ¦some degree of caution and introspection is warranted.β
In a comment for @nature.com, @randomwalker.bsky.social and @sayash.bsky.social warn against an overreliance on AI-driven modeling in science: bit.ly/4icM0hp
16.04.2025 15:42 β π 6 π 4 π¬ 0 π 0
Why an overreliance on AI-driven modelling is bad for science
Without clear protocols to catch errors, artificial intelligenceβs growing role in science could do more harm than good.
Science is not collection of findings. Progress happens through theories.As we move from findings to theories things r less amenable to automation. Proliferation of scientific findings based on AI hasn't acceleratedβ& might even have inhibitedβhigher levels of progress www.nature.com/articles/d41...
09.04.2025 15:45 β π 124 π 49 π¬ 3 π 3
x.com
This is the specific use case I have in mind (Operator shouldn't be the *only* thing developers use, but rather that it can be a helpful addition to a suite of tools): x.com/random_walke...
03.02.2025 18:12 β π 2 π 0 π¬ 0 π 0
AI companies are pivoting from creating gods to building products. Good.
Turning models into products runs into five challenges
It is also better for end users. As
@randomwalker.bsky.social and I have argued, focusing on products (rather than just models) means companies must understand user demand and build tools people want. It leads to more applications that people can productively use: www.aisnakeoil.com/p/ai-compani...
03.02.2025 18:10 β π 3 π 0 π¬ 0 π 0
Finally, the new product launches from OpenAI (Operator, Search, Computer use, Deep research) show that it doesn't just want to be in the business of creating more powerful AI β it also wants a piece of the product pie. This is a smart move as models become commoditized.
03.02.2025 18:10 β π 2 π 0 π¬ 1 π 0
This also highlights the need for agent interoperability: who would want to teach a new agent 100s of tasks from scratch? If web agents become widespread, preventing agent lock-in will be crucial.
(I'm working on fleshing out this argument with
@sethlazar.org + Noam Kolt)
03.02.2025 18:10 β π 1 π 0 π¬ 1 π 0
Seen this way, Operator is a *tool* to easily create new web automation using natural language.
It could expand the web automation that businesses already use, making it easier to create new ones.
So it is quite surprising that Operator isn't available on ChatGPT Teams yet.
03.02.2025 18:09 β π 0 π 0 π¬ 1 π 0
OpenAI allows you to delegate daily tasks to Operator
Instead of thinking of Operator as a "universal assistant" that completes all tasks, it is better to think of it as a task template tool that automates specific tasks (for now).
Once a human has overseen a task a few times, we can estimate Operator's ability to automate it.
03.02.2025 18:09 β π 1 π 0 π¬ 1 π 0
screenshot of the save task template for Operator
OpenAI also allows you to "Save" tasks you completed using Operator. Once you complete a task and provide feedback to complete it successfully, you don't need to repeat it the next time.
I can imagine this becoming powerful (though it's not very detailed right now).
03.02.2025 18:09 β π 0 π 0 π¬ 1 π 0
x.com
3) In many cases, the challenge isn't Operator's ability to complete a task, it is eliciting human preferences. Chatbots aren't a great form factor for that.
But there are many tasks where reliability isn't important. This is where today's agents shine. For example: x.com/random_walke...
03.02.2025 18:08 β π 3 π 1 π¬ 1 π 0
Could more training data lead to automation without human oversight? Not quite:
1) Prompt injection remains a pitfall for web agents. Anyone who sends you an email can control your agent.
2) Low reliability means agents fail on edge cases
03.02.2025 18:08 β π 2 π 1 π¬ 1 π 0
But being able to see agent actions and give feedback with a human in the loop converts Operator from an unreliable agent, like the Humane Pin or Rabbit R1, to a workable but imperfect product.
Operator is as much as UX advance as it is a tech advance.
03.02.2025 18:08 β π 2 π 0 π¬ 1 π 0
In the end, Operator struggled to file my expense reports even after an hour of trying and prompting. Then I took over, and my reports were filed 5 minutes later.
This is the bind for web agents today: not reliable enough to be automatable, not quick enough to save time.
03.02.2025 18:08 β π 3 π 1 π¬ 1 π 1
OpenAI also trained Operator to ask the user for feedback before taking consequential actions, though I am not sure how robust this is β a simple instruction to avoid asking the user changed its behavior, and I can easily imagine this being exploited by prompt injection attacks.
03.02.2025 18:07 β π 0 π 0 π¬ 1 π 0
Operator tries to delete receipts.
But things went south quickly. It couldn't match the receipts to the amounts. Even after prompts directing it to missing receipts, it couldn't download them. It almost deleted previous receipts from other expenses!
03.02.2025 18:07 β π 0 π 0 π¬ 1 π 0
screenshot of concur with the categories for the expense filled in
It navigated to the correct URLs, asked me to log into my OpenAI and Concur accounts. Once in my accounts, it downloaded receipts from the correct URL, and even started uploading the receipts under the right headings!
03.02.2025 18:07 β π 0 π 0 π¬ 1 π 0
screenshot of a conversation with Operator
I asked Operator to file reports for my OpenAI and Anthropic API expenses for the last month. This is a task I do manually each month, so I knew exactly what it would need to do. To my surprise, Operator got the first few steps exactly right:
03.02.2025 18:06 β π 1 π 0 π¬ 1 π 0
screenshot of Operator writing "Hello World" in an online notepad.
OpenAI's Operator is a web agent that can solve arbitrary tasks on the internet *with human supervision*. It runs on a virtual machine (*not* your computer). Users can see what the agent is doing on the browser in real-time. It is available to ChatGPT Pro subscribers.
03.02.2025 18:05 β π 6 π 0 π¬ 1 π 0
Graph of web tasks along difficulty and severity (cost of errors)
I spent a few hours with OpenAI's Operator automating expense reports. Most corporate jobs require filing expenses, so Operator could save *millions* of person-hours every year if it gets this right.
Some insights on what worked, what broke, and why this matters for the future of agents π§΅
03.02.2025 18:04 β π 34 π 10 π¬ 6 π 3
Is AI progress slowing down?
Making sense of recent technology trends and claims
Excellent post discussing whether "AI progress is slowing down".
www.aisnakeoil.com/p/is-ai-prog...
And if you're not subscribed to @randomwalker.bsky.social and @sayash.bsky.social 's great newsletter, what are you waiting for?
19.12.2024 23:57 β π 56 π 15 π¬ 0 π 1
Book cover
Excited to share that AI Snake Oil is one of Nature's 10 best books of 2024! www.nature.com/articles/d41...
The whole first chapter is available online:
press.princeton.edu/books/hardco...
We hope you find it useful.
18.12.2024 12:12 β π 130 π 30 π¬ 4 π 6
Screenshot from the blog post
Improving the information environment is inextricably linked to the larger project of shoring up democracy and its institutions. No quick fix can βsolveβ our information problems. But we should reject the simplistic temptation to blame AI.
16.12.2024 15:10 β π 8 π 2 π¬ 1 π 0
But blaming technology is not a fix. Political polarization has led to greater mistrust of the media. People prefer sources that confirm their worldview and are less skeptical about content that fits their worldview. Journalism revenues have fallen drastically.
16.12.2024 15:09 β π 4 π 0 π¬ 1 π 0
So why do we keep hearing warnings about an AI-fueled misinformation apocalypse? Blaming technology is appealing since it makes solutions seem simple. If only we could roll back harmful tech, we could drastically improve the information environment!
16.12.2024 15:09 β π 10 π 2 π¬ 1 π 0
Screenshot of a 1912 news articles with the headline "would make use of fake photos crime", from https://newsletter.pessimistsarchive.org/p/the-1912-war-on-fake-photos
We've heard warnings about new tech leading to waves of misinfo before. GPT-2 in 2019, LLaMA in 2023, Pixel 9 this year, and even photo editing and re-touching back in 1912. None of the predicted waves of misinfo materialized.
16.12.2024 15:09 β π 9 π 5 π¬ 1 π 0
Screenshot from Rest Of World article about the impact of AI in the 2024 Indian elections: https://restofworld.org/2024/exporter-india-deepfake-trolls/
Screenshot from CIGI article about the impact of AI in the 2024 Indonesia elections: https://www.cigionline.org/articles/its-time-to-reframe-disinformation-indonesias-elections-show-why/
Similar trends were seen worldwide. In India, AI was used for trolling rather than misinformation. In Indonesia, AI was used to create cartoon avatars that softened a candidate's image. Of course, the cost of creating avatars without AI is minuscule for presidential campaigns.
16.12.2024 15:08 β π 5 π 0 π¬ 1 π 0
I write mostly about the intersection of tech & art/culture which these days means I spend nearly all my time trying to address the exploitation underlying current AI models. A secular humanist interrogating modern religions.
Applied scientist working on LLM evaluation and publishing in AI ethics. Formerly: technical writing, philosophy. Urbanism nerd in my spare time. Opinions here my own. he/they π³οΈββ§οΈ. https://boltzmann-brain.github.io/
Subscribe to www.lawdork.com for SCOTUS, Trump, LGBTQ, criminal justice, and other legal news. / Email: lawdorknews@gmail.com / Signal: crg.32 / About me: Sober. Queer. Bipolar. Buckeye. / He/him.
Hiker, AI Ethicist, Responsible Tech Advocate in Security
besides vlogging hikes and travels, I mostly advise and work on AI policy, ethics & human-centric tech in security (war, armed conflict & border control...)
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
Searching for the numinous
Australian Canadian, currently living in the US
https://michaelnotebook.com
I lead an incredible team building voting machines everyone can trust. https://voting.works
Optimistic about judicious uses of tech. Systems, security, privacy, cryptography, and the web are my jam.
Previously: Clever, Square, Mozilla, Harvard, MIT.
Policy Director, Knight First Amendment Institute at Columbia University | Senior Non-Resident Fellow, Center for International Policy. Opinions mine.
Professor a NYU; Chief AI Scientist at Meta.
Researcher in AI, Machine Learning, Robotics, etc.
ACM Turing Award Laureate.
http://yann.lecun.com
PhD candidate @ University of Michigan School of Information
https://aminaxabdu.github.io/
π Misinformation, social media & the news ποΈ
π¨π Postdoc at the University of Zurich, previously Reuters Institute & ENS π«π·
Through August 11, new monthly or annual Sustaining Donors get an EFF35 Challenge Coin! With your help, EFF is here to stay. https://eff.org/35years
Journalist, currently at The New York Times. I cover privacy, technology, A.I., and the strange times we live in. Named after the Led Zeppelin song. Author of YOUR FACE BELONGS TO US. (Yes, in my head it will always be All Your Face Are Belong To Us)
At wired.com where tomorrow is realized || Sign up for our newsletters: https://wrd.cm/newsletters
Find our WIRED journalists here: https://bsky.app/starter-pack/couts.bsky.social/3l6vez3xaus27
Law professor and writer. Expert on tech, privacy, AI, and sarcasm.
π¦: https://twitter.com/tiffanycli
π§΅: https://www.threads.net/@tiffany.c.li
π: https://mastodon.social/@tiffanycli
TiffanyLi.com if you insist
Southmayd Prof @YaleLawSch + Philosophy @Yale. Ed, @LegalTheory + Stanford Encyclopedia of Phil. βLegalityβ, βThe Internationalistsβ (with @oonahathaway), βFancy Bear Goes Phishing.β Overuses βneurosymbolic.β
Astrophysicist turned advocate; guiding the globe to a safer, stronger Internet. Distinguished Technologist @internetsociety.bsky.social https://josephhall.org/
News and information from the European Commission. Social media and data protection policy: http://europa.eu/!MnfFmT
Law Prof at UC Davis Law; co-host of the 99pi Breakdown of the Constitution https://99percentinvisible.org/book-club/
From 1982 to 2015, I invested in tech. Since 2016, I have been working for reform of the tech industry.
Author of NYT bestseller Zucked: Waking Up to the Facebook Catastrophe.
Musician in Moonalice (https://www.moonalice.com/splash)
Phillies fan.