Sayash Kapoor

@sayash.bsky.social

CS PhD candidate at Princeton. I study the societal impact of AI. Website: cs.princeton.edu/~sayashk Book/Substack: aisnakeoil.com

8,347 Followers 950 Following 36 Posts Joined Jun 2023
1 week ago
Panel 1: Text: “Imagine an alternate universe in which people don’t have words for different forms of transportation, only the collective noun ‘vehicle’.” Illustration: a stick figure stands next to a much more detailed motorcycle, with a speech bubble saying “Woah! Sweet vehicle!”
Panel 2: Text: “They use that word to refer to: cars, buses, bikes, spacecraft, and all other ways of getting from place A to place B.” Illustration: A car, a school bus, a bicycle, and a space shuttle, all with a stamp that says “vehicle” on them. Panel 1: Text: “Conversations in this world are confusing.” Illustration: A speech bubble coming from the left: “Can you drive a vehicle?”, a speech bubble coming from the right: “Definitely!”, an illustration of a car crashed into a tree. Left speech bubble: “I thought you said you could drive!” Right speech bubble: “I can! I’m just used to ones with two wheels!”
Panel 2: Text: “There are furious debates about whether or not vehicles are environmentally friendly… even though no one realizes that one side is talking about bikes and the other is talking about trucks.” Illustration: Left speech bubble: “Vehicles produce so much pollution!” Right speech bubble: “That’s an exaggeration! They are actually very green!” Panel 1: Text: “There is a breakthrough in rocketry, but the media focuses on how vehicles have gotten faster, so people call their car (‘car’ is crossed out and replaced with ‘vehicle’) dealer to ask when faster models will be available.” Illustration: A TV news report with a picture of a rocket ship and a chyron saying “Breaking: Vehicles reach 1000 mph!”. Below that is a drawing of two stick figures talking at a car dealership. One says, “So I can take this to space, right?”
Panel 2: Text: “Meanwhile, fraudsters have capitalized on the fact that consumers don’t know what to believe when it comes to vehicle technology, so scams are rampant in the vehicle sector.” Illustration: A stick figure with a mean smile and a sparkle next to his eye pats a car that has plane wings taped to it. A speech bubble says, “Oh yeah! You can fly this baby across the ocean!” Panel 1: Text: “Now replace the word “vehicle” with “artificial intelligence” and we have a pretty good descriptor of the world we live in.” Illustration: One crowd of people says “AI is bad for the environment!” Underneath them is a large box labeled “Size of AI people are concerned about”; another crowd of people says “AI is used for climate research!” Underneath them is a much smaller box saying “Size of AI used for climate research”. In the foreground, a person watches the debate with several question marks overhead.
Panel 2: Credits. “Text from AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference by Arvind Narayanan and Sayash Kapoor. Art by Ayla Taylor. www.aylataylor.com”

A silly little comic based on the opening section of AI Snake Oil by @randomwalker.bsky.social and @sayash.bsky.social.

1 month ago
AI Won’t Automatically Make Legal Services Cheaper Three bottlenecks between AI capability and access to justice.

Advanced AI will not automatically help consumers achieve desired legal outcomes at lower costs. Justin Curl, @sayash.bsky.social, & @randomwalker.bsky.social identify three bottlenecks that stand between AI capability advances and more accessible legal services in the latest Lawfare Research paper.

9 months ago

(1/4) Ever wondered what tech policy might look like if it were informed by research on collective intelligence and complex systems? 🧠🧑‍💻

Join @jbakcoleman.bsky.social, @lukethorburn.com, and myself in San Diego on Aug 4th for the Collective Intelligence x Tech Policy workshop at @acmci.bsky.social!

11 months ago
Why an overreliance on AI-driven modelling is bad for science Without clear protocols to catch errors, artificial intelligence’s growing role in science could do more harm than good.

New commentary in @nature.com from Professor Arvind Narayanan (@randomwalker.bsky.social) & PhD candidate Sayash Kapoor (@sayash.bsky.social) about the risks of rapid adoption of AI in science - read: "Why an overreliance on AI-driven modelling is bad for science" 🔗

#CITP #AI #science #AcademiaSky

11 months ago
AI as Normal Technology

In a new essay from our "Artificial Intelligence and Democratic Freedoms" series, @randomwalker.bsky.social & @sayash.bsky.social make the case for thinking of #AI as normal technology, instead of superintelligence. Read here: knightcolumbia.org/content/ai-a...

10 months ago
Why an overreliance on AI-driven modelling is bad for science Without clear protocols to catch errors, artificial intelligence’s growing role in science could do more harm than good.

“The rush to adopt AI has consequences. As its use proliferates…some degree of caution and introspection is warranted.”

In a comment for @nature.com, @randomwalker.bsky.social and @sayash.bsky.social warn against an overreliance on AI-driven modeling in science: bit.ly/4icM0hp

11 months ago
Why an overreliance on AI-driven modelling is bad for science Without clear protocols to catch errors, artificial intelligence’s growing role in science could do more harm than good.

Science is not a collection of findings. Progress happens through theories. As we move from findings to theories, things become less amenable to automation. The proliferation of AI-based scientific findings hasn't accelerated, and might even have inhibited, higher levels of progress. www.nature.com/articles/d41...

1 year ago

This is the specific use case I have in mind (Operator shouldn't be the *only* thing developers use, but rather that it can be a helpful addition to a suite of tools): x.com/random_walke...

1 year ago
AI companies are pivoting from creating gods to building products. Good. Turning models into products runs into five challenges

It is also better for end users. As @randomwalker.bsky.social and I have argued, focusing on products (rather than just models) means companies must understand user demand and build tools people want. It leads to more applications that people can productively use: www.aisnakeoil.com/p/ai-compani...

1 year ago

Finally, the new product launches from OpenAI (Operator, Search, Computer use, Deep research) show that it doesn't just want to be in the business of creating more powerful AI — it also wants a piece of the product pie. This is a smart move as models become commoditized.

1 year ago

This also highlights the need for agent interoperability: who would want to teach a new agent 100s of tasks from scratch? If web agents become widespread, preventing agent lock-in will be crucial.

(I'm working on fleshing out this argument with @sethlazar.org + Noam Kolt)

1 year ago

Seen this way, Operator is a *tool* for easily creating new web automations using natural language.

It could expand on the web automations that businesses already use, making it easier to create new ones.

So it is quite surprising that Operator isn't available on ChatGPT Teams yet.

1 year ago
OpenAI allows you to delegate daily tasks to Operator

Instead of thinking of Operator as a "universal assistant" that completes all tasks, it is better to think of it as a task template tool that automates specific tasks (for now).

Once a human has overseen a task a few times, we can estimate Operator's ability to automate it.

1 year ago
screenshot of the save task template for Operator

OpenAI also allows you to "Save" tasks you completed using Operator. Once you've completed a task (providing feedback along the way so it finishes successfully), you don't need to repeat those steps the next time.

I can imagine this becoming powerful (though it's not very detailed right now).

1 year ago

3) In many cases, the challenge isn't Operator's ability to complete a task; it's eliciting human preferences. Chatbots aren't a great form factor for that.

But there are many tasks where reliability isn't important. This is where today's agents shine. For example: x.com/random_walke...

1 year ago

Could more training data lead to automation without human oversight? Not quite:

1) Prompt injection remains a pitfall for web agents. Anyone who sends you an email can control your agent.
2) Low reliability means agents fail on edge cases

1 year ago

But being able to see agent actions and give feedback with a human in the loop converts Operator from an unreliable agent, like the Humane Pin or Rabbit R1, to a workable but imperfect product.

Operator is as much a UX advance as it is a tech advance.

1 year ago

In the end, Operator struggled to file my expense reports even after an hour of trying and prompting. Then I took over, and my reports were filed 5 minutes later.

This is the bind for web agents today: not reliable enough to be automatable, not quick enough to save time.

1 year ago

OpenAI also trained Operator to ask the user for feedback before taking consequential actions, though I am not sure how robust this is — a simple instruction to avoid asking the user changed its behavior, and I can easily imagine this being exploited by prompt injection attacks.

1 year ago
Operator tries to delete receipts.

But things went south quickly. It couldn't match the receipts to the amounts. Even after prompts directing it to the missing receipts, it couldn't download them. It almost deleted previous receipts from other expenses!

1 year ago
screenshot of concur with the categories for the expense filled in

It navigated to the correct URLs, asked me to log into my OpenAI and Concur accounts. Once in my accounts, it downloaded receipts from the correct URL, and even started uploading the receipts under the right headings!

1 year ago
screenshot of a conversation with Operator

I asked Operator to file reports for my OpenAI and Anthropic API expenses for the last month. This is a task I do manually each month, so I knew exactly what it would need to do. To my surprise, Operator got the first few steps exactly right:

1 year ago
screenshot of Operator writing "Hello World" in an online notepad.

OpenAI's Operator is a web agent that can solve arbitrary tasks on the internet *with human supervision*. It runs on a virtual machine (*not* your computer). Users can see what the agent is doing on the browser in real-time. It is available to ChatGPT Pro subscribers.

1 year ago
Graph of web tasks along difficulty and severity (cost of errors)

I spent a few hours with OpenAI's Operator automating expense reports. Most corporate jobs require filing expenses, so Operator could save *millions* of person-hours every year if it gets this right.

Some insights on what worked, what broke, and why this matters for the future of agents 🧵

1 year ago
Is AI progress slowing down? Making sense of recent technology trends and claims

Excellent post discussing whether "AI progress is slowing down".

www.aisnakeoil.com/p/is-ai-prog...

And if you're not subscribed to @randomwalker.bsky.social and @sayash.bsky.social's great newsletter, what are you waiting for?

1 year ago
Book cover

Excited to share that AI Snake Oil is one of Nature's 10 best books of 2024! www.nature.com/articles/d41...
The whole first chapter is available online:
press.princeton.edu/books/hardco...
We hope you find it useful.

1 year ago
We Looked at 78 Election Deepfakes. Political Misinformation is not an AI Problem. Technology Isn’t the Problem—or the Solution.

Grateful to @katygb.bsky.social for feedback on the draft. Read the full essay (w/@randomwalker.bsky.social): www.aisnakeoil.com/p/we-looked-...

1 year ago
Screenshot from the blog post

Improving the information environment is inextricably linked to the larger project of shoring up democracy and its institutions. No quick fix can “solve” our information problems. But we should reject the simplistic temptation to blame AI.

1 year ago

But blaming technology is not a fix. Political polarization has led to greater mistrust of the media. People prefer sources that confirm their worldview and are less skeptical of content that fits it. Journalism revenues have fallen drastically.

1 year ago

So why do we keep hearing warnings about an AI-fueled misinformation apocalypse? Blaming technology is appealing since it makes solutions seem simple. If only we could roll back harmful tech, we could drastically improve the information environment!
