
Nick Vincent

@nickmvincent.bsky.social

Studying people and computers (https://www.nickmvincent.com/). Blogging about data and steering AI (https://dataleverage.substack.com/).

275 Followers  |  299 Following  |  58 Posts  |  Joined: 07.04.2023

Latest posts by nickmvincent.bsky.social on Bluesky

Preview
Oh Canada! An AI Happy Hour @ ICML 2025 · Luma Whether you're Canadian or one of our friends from around the world, please join us for some drinks and conversation to chat about life, papers, AI, and…

Around ICML with loose evening plans and an interest in "public AI", Canadian sovereign AI, or anything related? Swing by the Internet Archive Canada between 5p and 7p: lu.ma/7rjoaxts

16.07.2025 23:30 — 👍 3    🔁 2    💬 0    📌 0
Preview
On AI-driven Job Apocalypses and Collective Bargaining for Information Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.

Finally, I recently shared a preprint that relates deeply to the above ideas, on Collective Bargaining for Information: arxiv.org/abs/2506.10272, and have a blog post on this as well: dataleverage.substack.com/p/on-ai-driv...

24.06.2025 12:33 — 👍 4    🔁 1    💬 0    📌 0
Preview
Algorithmic Collective Action With Two Collectives [crosspost] This post was written by Aditya Karan, with support from Nick Vincent and Karrie Karahalios to accompany a FAccT 2025 paper. It was originally published on Jun 19, 2025 via the Crowd Dynamics Lab blog...

And we have a blog post on algorithmic collective action with multiple collectives! dataleverage.substack.com/p/algorithmi...

24.06.2025 12:33 — 👍 3    🔁 1    💬 1    📌 0
Preview
Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking It's ranking information all the way down.

These blog posts expand on attentional agency:
- genAI as ranking chunks of info: dataleverage.substack.com/p/google-and...
- utility of AI stems from people: dataleverage.substack.com/p/each-insta...
- connection to evals: dataleverage.substack.com/p/how-do-we-...

24.06.2025 12:33 — 👍 1    🔁 1    💬 1    📌 0
Preview
Algorithmic Collective Action with Two Collectives | Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency

[FAccT-related link round-up]: It was great to present on measuring Attentional Agency with Zachary Wojtowicz at FAccT. Here's our paper on ACM DL: dl.acm.org/doi/10.1145/...

On Thurs, Aditya Karan will present on collective action (dl.acm.org/doi/10.1145/...) at 10:57 (New Stage A)

24.06.2025 12:33 — 👍 5    🔁 2    💬 1    📌 0

“Attentional agency” — talk in New Stage B at FAccT, in the session right now!

24.06.2025 07:48 — 👍 1    🔁 0    💬 0    📌 0

Off to FAccT; excited to see faces old and new!

21.06.2025 21:50 — 👍 6    🔁 0    💬 0    📌 0
Preview
On AI-driven Job Apocalypses and Collective Bargaining for Information Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.

Another blog post: a link roundup on AI's impact on jobs and power concentration, another proposal for Collective Bargaining for Information, and some additional thoughts on the topic:

dataleverage.substack.com/p/on-ai-driv...

05.06.2025 17:25 — 👍 2    🔁 0    💬 0    📌 0
Preview
Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking It's ranking information all the way down.

Post 2: dataleverage.substack.com/p/each-insta...

28.05.2025 20:43 — 👍 1    🔁 1    💬 0    📌 0

Do some aspects seem wrong? (In the next 2 posts, I get into how these ideas interact w/ reinforcement learning.)

27.05.2025 15:45 — 👍 1    🔁 0    💬 1    📌 0
Preview
Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms We propose a framework for measuring attentional agency, which we define as a user's ability to allocate attention according to their own desires, goals, and intentions on digital platforms that use s...

arxiv.org/abs/2405.14614

Follow-ups coming very soon (already drafted): would love to discuss these ideas with folks. Is this all repetitive with past data labor/leverage work? Are some aspects obvious to you?

27.05.2025 15:45 — 👍 2    🔁 0    💬 1    📌 0

This has implications for Internet policy, for understanding where the value in AI comes from, and for thinking about why we might even consider a certain model to be "good"!

This first post leans heavily on recent work with Zachary Wojtowicz and Shrey Jain, to appear at the upcoming FAccT.

27.05.2025 15:45 — 👍 1    🔁 0    💬 1    📌 0
Preview
Google and TikTok rank bundles of information; ChatGPT ranks grains. Google and others solve our attentional problem by ranking discrete bundles of information, whereas ChatGPT ranks more granular chunks. This lens can help us reason about AI policy.

New data leverage post: "Google and TikTok rank bundles of information; ChatGPT ranks grains."

dataleverage.substack.com/p/google-and...

This will be post 1/3 in a series about viewing many AI products as all competing around the same task: ranking bundles or grains of records made by people.
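
To make the bundles-vs-grains framing concrete, here is a tiny sketch in Python (entirely illustrative; the relevance function and example documents are mine, not from the post): a bundle-ranker orders whole documents against a query, while a grain-ranker splits the same documents into sentence-sized chunks and orders the chunks directly.

def overlap(query, text):
    """Crude relevance: count of (punctuation-stripped) tokens shared with the query."""
    tokens = lambda s: {w.strip(".,") for w in s.lower().split()}
    return len(tokens(query) & tokens(text))

docs = {
    "doc1": "How search engines work. Ranking pages by links and text.",
    "doc2": "A recipe blog. Ranking flour brands is beside the point.",
}
query = "ranking text"

# Bundle ranking: the unit returned is a whole document (a Google-like move).
bundles = sorted(docs, key=lambda d: overlap(query, docs[d]), reverse=True)
print("bundles:", bundles)  # ['doc1', 'doc2']

# Grain ranking: the unit returned is a sentence-sized chunk (a ChatGPT-like move).
grains = [(d, s.strip()) for d, text in docs.items()
          for s in text.split(".") if s.strip()]
grains.sort(key=lambda g: overlap(query, g[1]), reverse=True)
print("top grain:", grains[0])  # ('doc1', 'Ranking pages by links and text')

The difference the series dwells on is just the unit of ranking; questions about attribution and bargaining arguably change as that unit shrinks.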

27.05.2025 15:45 — 👍 3    🔁 1    💬 1    📌 1
Preview
Algorithmic Collective Action with Two Collectives Given that data-dependent algorithmic systems have become impactful in more domains of life, the need for individuals to promote their own interests and hold algorithms accountable has grown. To have ...

Pre-print now on arxiv and to appear at FAccT 2025:

arxiv.org/abs/2505.00195

"Algorithmic Collective Action with Two Collectives --
Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram"

02.05.2025 18:44 — 👍 1    🔁 1    💬 0    📌 0
Post image

Sharing a new paper (led by Aditya Karan):

There's growing interest in algorithmic collective action, where a "collective" acts through data to impact a recommender system, classifier, or other model.

But... what happens if two collectives act at the same time?
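
For intuition, here's a deliberately toy simulation of that question in Python (my own stand-in setup, not the paper's experiments): each collective plants a trigger token in its slice of the training data and relabels those examples, hoping the model learns trigger -> target label; we then ask whether collective A's success changes when collective B acts at the same time.

import random

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

random.seed(0)
WORDS = [f"w{i}" for i in range(50)]

def make_corpus(n=2000):
    """Background data: random 'documents' with random binary labels."""
    docs = [" ".join(random.choices(WORDS, k=10)) for _ in range(n)]
    labels = [random.randint(0, 1) for _ in range(n)]
    return docs, labels

def plant(docs, labels, trigger, target, fraction=0.05):
    """A collective appends its trigger token and relabels its share of the data."""
    for i in random.sample(range(len(docs)), int(len(docs) * fraction)):
        docs[i] += " " + trigger
        labels[i] = target

def success(model, trigger, target, n=200):
    """How often the trigger steers predictions to the collective's target label."""
    probes = [" ".join(random.choices(WORDS, k=10)) + " " + trigger for _ in range(n)]
    return sum(int(p == target) for p in model.predict(probes)) / n

for scenario in ("A alone", "A and B together"):
    docs, labels = make_corpus()
    plant(docs, labels, "triggerA", target=1)      # collective A acts
    if scenario == "A and B together":
        plant(docs, labels, "triggerB", target=0)  # collective B acts too
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(docs, labels)
    print(scenario, "-> A's success rate:", success(model, "triggerA", 1))

The paper studies much richer versions of this interaction; the sketch only shows the shape of the measurement (one collective's success rate, with and without the other acting).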

02.05.2025 18:44 — 👍 2    🔁 1    💬 1    📌 0
Preview
Public AI, Data Appraisal, and Data Debates A consortium of Public AI labs can substantially improve data pricing, which may also help to concretize debates about the ethics and legality of training practices.

New early draft post: "Public AI, Data Appraisal, and Data Debates"

"A consortium of Public AI labs can substantially improve data pricing, which may also help to concretize debates about the ethics and legality of training practices."

dataleverage.substack.com/p/public-ai-...

03.04.2025 17:52 — 👍 1    🔁 0    💬 0    📌 0
Preview
Model Plurality Current research in “plural alignment” concentrates on making AI models amenable to diverse human values. But plurality is not simply a safeguard against bias or an engine of efficiency: it’s a key in...

“Algo decision-making systems are ‘leviathans’, harmful not for their arbitrariness or opacity, but the systematicity of decisions”

- @christinalu.bsky.social on the need for plural #AI model ontologies (sounds technical, but has big consequences for human #commons)

www.combinationsmag.com/model-plural...

02.04.2025 07:57 — 👍 6    🔁 2    💬 2    📌 0
Preview
Evaluation Data Leverage: Advances like "Deep Research" Highlight a Looming Opportunity for Bargaining Power Research agents and increasingly general reasoning models open the door for immense "evaluation data leverage".

New Data Leverage newsletter post. It's about... data leverage (specifically, evaluation-focused bargaining) and products du jour (deep research, agents).

dataleverage.substack.com/p/evaluation...

03.03.2025 18:26 — 👍 2    🔁 0    💬 0    📌 0

Here's my round-up as a markdown file: github.com/nickmvincent...

Here's the newsletter post, Tipping Points for Content Ecosystems: dataleverage.substack.com/p/tipping-po...

14.02.2025 18:25 — 👍 0    🔁 0    💬 0    📌 0

I have some new co-authored writing to share, along with a round-up of important articles for the "content ecosystems and AI" space.

I'm doing an experiment with microblogging directly to a GitHub repo that I can share across platforms...

14.02.2025 18:25 — 👍 0    🔁 0    💬 1    📌 0
PS Events: AI Action Summit (YouTube video by Project Syndicate)

Global Dialogues has launched at the Paris #AIActionSummit.

Watch @audreyt.org give the announcement via @projectsyndicate.bsky.social

youtu.be/XkwqYQL6V4A?... (starts at 02:47:30)

10.02.2025 19:57 — 👍 6    🔁 4    💬 2    📌 1
Preview
AI Labs Could Open Source Data Protection Technologies There's still incredible tension in the current data paradigm, but sharing "data protection" technologies, like those used by OpenAI to accuse DeepSeek of model theft, can help cut a path forward.

AI labs and tech companies should open-source their data protection techniques so that content creators can benefit from new and old advances in this space: dataleverage.substack.com/p/ai-labs-co...
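
For a sense of what "data protection technologies" can look like, here is one classic, purely illustrative example in Python: canary strings. (We don't know OpenAI's actual methods; this is just a simple representative of the genre, and every name below is hypothetical.) A creator plants a unique, high-entropy marker in published content, then later checks whether a model reproduces it, which would suggest the content ended up in training data.

import secrets

def make_canary() -> str:
    """A unique, high-entropy marker to embed in published content."""
    return f"canary-{secrets.token_hex(8)}"

canary = make_canary()
published_page = f"My article text... {canary} ...more article text."

def model_generate(prompt: str) -> str:
    """Hypothetical stand-in for querying the suspect model's API."""
    return "...model output..."

# Later: probe the model and scan its outputs for the planted marker.
output = model_generate("Complete this article: My article text...")
if canary in output:
    print("Canary leaked: evidence the page was in the training data.")

Open-sourcing even simple tooling like this is the kind of step the post asks for, so content creators can run the same checks the labs can.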

31.01.2025 19:26 — 👍 2    🔁 0    💬 0    📌 0

Given it seems clear that data protection technologies (such as the techniques OpenAI used to gather this evidence) will play a role in the near term, I put together another post with a simple proposal that could reduce some of the tension in the current paradigm.

31.01.2025 19:26 — 👍 2    🔁 0    💬 1    📌 0

On Mon, I wrote a post on the live-by-the-sword, die-by-the-sword nature of the current data paradigm. On Wed, there was quite a development on this front -- OpenAI came out with a statement that they have evidence that DeepSeek "used" OpenAI models in some fashion (this was faster than I expected!)

31.01.2025 19:26 — 👍 3    🔁 0    💬 1    📌 0
Preview
AI Labs Could Open Source Data Protection Technologies There's still incredible tension in the current data paradigm, but sharing "data protection" technologies, like those used by OpenAI to accuse DeepSeek of model theft, can help cut a path forward.

Really appreciate all the AI lab data paradigm / hypocrisy discussion on the show! BTW, you might enjoy this academic-y newsletter post (dataleverage.substack.com/p/ai-labs-co...) in which I quote your recent tweet on the topic (and the prequel from Monday: dataleverage.substack.com/p/live-by-th...)

31.01.2025 19:22 — 👍 0    🔁 0    💬 0    📌 0

For other kinds of benchmarks, influence is much more localized (a small set of data contributes directly to, e.g., factual history knowledge). So reasoning is highly collective (we've all contributed) but in theory still ablatable / subject to leverage and scaling.

28.01.2025 14:54 — 👍 0    🔁 0    💬 0    📌 0

Don't disagree directly w/ these points! But basically would add: for reasoning, influence is widely distributed amongst training data (but I'd guess that e.g. code and philosophy materials punch above their weight). But even for this, data scaling applies (more data -> better at a set of such examples).

28.01.2025 14:54 — 👍 0    🔁 0    💬 1    📌 0

Maybe I created confusion in my first response -- I'm not particularly attached to the compositor framing, and am definitely not trying to argue for a plagiarism framing. Rather, unlike humans, it's much easier (I think!) to attribute the "reasoning breakpoint" to specific documents and efforts

28.01.2025 14:30 — 👍 0    🔁 0    💬 0    📌 0

(I say this is bordering on tautology because it's effectively true for any data-dependent system that I could "ablate" down to having only one training document -- but I think it's relevant, as it's part of the point I want to appear *more* in public discussions of AI policy)

28.01.2025 14:23 — 👍 0    🔁 0    💬 0    📌 0

The bordering-on-tautological longer argument I'd make is: given enough resources, I'm confident a team could eventually do enough data ablations to remove this capability, and in doing so more accurately pinpoint the specific upstream human efforts.
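
As a deliberately toy illustration of that ablation logic in Python (the classifier, probe, and corpus below are all hypothetical stand-ins): group training documents by the upstream human effort that produced them, retrain with each group removed, and watch which removal destroys the "capability".

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Training documents grouped by upstream source; labels are a toy task.
corpus = {
    "code":       [("def add(a, b): return a + b", 1),
                   ("the while loop iterates until done", 1)],
    "philosophy": [("therefore the premise entails the conclusion", 1),
                   ("modus ponens is a valid inference", 1)],
    "recipes":    [("whisk the eggs and fold in the flour", 0),
                   ("simmer the sauce for ten minutes", 0)],
    "smalltalk":  [("nice weather we are having today", 0),
                   ("see you later have a good one", 0)],
}

probe = ["the premise entails what conclusion"]  # stand-in for a capability eval

for ablated in (None, "code", "philosophy", "recipes", "smalltalk"):
    kept = [(doc, label) for group, items in corpus.items() if group != ablated
            for doc, label in items]
    X, y = zip(*kept)
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(list(X), list(y))
    print(f"ablate {ablated!r}: capability = {int(model.predict(probe)[0] == 1)}")

Scaled to real models, "retrain with each group removed" is of course the expensive part, which is exactly why the argument above is about what a team could do in principle, given enough resources.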

28.01.2025 14:23 — 👍 0    🔁 0    💬 1    📌 0
