
Nick Vincent

@nickmvincent.bsky.social

Studying people and computers (https://www.nickmvincent.com/). Blogging about data and steering AI (https://dataleverage.substack.com/).

288 Followers  |  302 Following  |  69 Posts  |  Joined: 07.04.2023

Latest posts by nickmvincent.bsky.social on Bluesky

RSL: Really Simple Licensing. The open content licensing standard for the AI-first Internet.

Anyone compiling discussions/thoughts on emerging licensing schemes and preference signals? E.g. rslstandard.org and github.com/creativecomm...? Externalizing some notes here: datalicenses.org, but want to find where these discussions are happening!

18.09.2025 18:43 · 👍 2    🔁 1    💬 0    📌 0

Excited to be giving a talk on data leverage to the Singapore AI Safety Hub. Trying to capture updated thoughts from recent years, and have long wanted to better connect leverage/collective bargaining to the safety context.

14.08.2025 08:05 · 👍 0    🔁 0    💬 0    📌 0
About the workshop – ACA@NeurIPS

About a week away from the deadline to submit to the

✨ Workshop on Algorithmic Collective Action (ACA) ✨

acaworkshop.github.io

at NeurIPS 2025!

14.08.2025 07:56 · 👍 1    🔁 0    💬 0    📌 0
How do we know our AI output is good? Double checks, bar charts, vibes, and training data. Connecting evaluation and dataset documentation via the lens of "AI as ranking".

Follow-up, tying together "AI as ranking chunks of human records" with "eval leverage" and "dataset details as quality signals": dataleverage.substack.com/p/how-do-we-...

And related, "eval leverage": dataleverage.substack.com/p/evaluation...

08.08.2025 22:31 · 👍 1    🔁 0    💬 0    📌 0

(1) ongoing challenges in benchmarking, (2) challenges in communicating benchmarks to the public, (3) dataset documentation, and (4) post-hoc dataset "reverse engineering"

The original post: dataleverage.substack.com/p/selling-ag...

08.08.2025 22:31 · 👍 1    🔁 0    💬 1    📌 0

who paid that Dr for a verified attestation with provenance can use this attestation as a quality signal: a promise to consumers about the exact nature of the evaluation. A "9/10 dentists recommend" for a chatbot.

More generally, I think there are interesting connections between current discourse &

08.08.2025 22:31 · 👍 1    🔁 0    💬 1    📌 0

For some types of info, we can maybe treat them as open and focus on selling convenient/"nice" packages (à la Wikimedia Enterprise)

But attestations provide another object to transact over. Valuable info (a Dr giving thumbs up/down on medical responses) may leak, but the AI developer

08.08.2025 22:31 · 👍 1    🔁 0    💬 1    📌 0

So in a post-AI world, to help people transact over work that produces information, we likely need:
- individual property-ish rights over info (not a great way to go, IMO)
- rights that enable collective bargaining (good!)
- or...

08.08.2025 22:31 · 👍 2    🔁 0    💬 1    📌 0

The core challenge: many inputs into AI are information, and thus hard to design efficient markets for. Info is hard to exclude (pre-training data remains very hard to exclude, but even post-training data may be hard without sufficient effort)

08.08.2025 22:31 · 👍 1    🔁 0    💬 1    📌 0

It looks like some skepticism was warranted (not much progress towards this vision yet). I do think "dataset details as quality signals" is still possible though, and could play a key role in addressing looming information economics challenges.

08.08.2025 22:31 · 👍 1    🔁 1    💬 1    📌 0

🧵 In several recent posts, I speculated that eventually, dataset details may become an important quality signal for consumers choosing AI products.

"This model is good for asking health questions, because 10,000 doctors attested to supporting training and/or eval". Etc.

08.08.2025 22:31 · 👍 3    🔁 1    💬 1    📌 0
Oh Canada! An AI Happy Hour @ ICML 2025 · Luma. Whether you're Canadian or one of our friends from around the world, please join us for some drinks and conversation to chat about life, papers, AI, and…

Around ICML with loose evening plans and an interest in "public AI", Canadian sovereign AI, or anything related? Swing by the Internet Archive Canada between 5p and 7p lu.ma/7rjoaxts

16.07.2025 23:30 · 👍 3    🔁 2    💬 0    📌 0
On AI-driven Job Apocalypses and Collective Bargaining for Information Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.

Finally, I recently shared a preprint that relates deeply to the above ideas, on Collective Bargaining for Information: arxiv.org/abs/2506.10272, and have a blog post on this as well: dataleverage.substack.com/p/on-ai-driv...

24.06.2025 12:33 · 👍 4    🔁 1    💬 0    📌 0
Algorithmic Collective Action With Two Collectives [crosspost] This post was written by Aditya Karan, with support from Nick Vincent and Karrie Karahalios to accompany a FAccT 2025 paper. It was originally published on Jun 19, 2025 via the Crowd Dynamics Lab blog...

And we have a blog post on algorithmic collective action with multiple collectives! dataleverage.substack.com/p/algorithmi...

24.06.2025 12:33 · 👍 3    🔁 1    💬 1    📌 0
Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking It's ranking information all the way down.

These blog posts expand on attentional agency:
- genAI as ranking chunks of info: dataleverage.substack.com/p/google-and...
- utility of AI stems from people: dataleverage.substack.com/p/each-insta...
- connection to evals: dataleverage.substack.com/p/how-do-we-...

24.06.2025 12:33 · 👍 1    🔁 1    💬 1    📌 0
Algorithmic Collective Action with Two Collectives | Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency

[FAccT-related link round-up]: It was great to present on measuring Attentional Agency with Zachary Wojtowicz at FAccT. Here's our paper on ACM DL: dl.acm.org/doi/10.1145/...

On Thursday, Aditya Karan will present on collective action (dl.acm.org/doi/10.1145/...) at 10:57 (New Stage A)

24.06.2025 12:33 · 👍 5    🔁 2    💬 1    📌 0

"Attentional agency" · talk in New Stage B at FAccT, in the session right now!

24.06.2025 07:48 · 👍 1    🔁 0    💬 0    📌 0

Off to FAccT; excited to see faces old and new!

21.06.2025 21:50 · 👍 6    🔁 0    💬 0    📌 0

Another blog post: a link roundup on AI's impact on jobs and power concentration, another proposal for Collective Bargaining for Information, and some additional thoughts on the topic:

dataleverage.substack.com/p/on-ai-driv...

05.06.2025 17:25 · 👍 2    🔁 0    💬 0    📌 0

Post 2: dataleverage.substack.com/p/each-insta...

28.05.2025 20:43 · 👍 1    🔁 1    💬 0    📌 0

Do some aspects seem wrong (in the next 2 posts, I get into how these ideas interact w/ reinforcement learning)?

27.05.2025 15:45 · 👍 1    🔁 0    💬 1    📌 0
Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms We propose a framework for measuring attentional agency, which we define as a user's ability to allocate attention according to their own desires, goals, and intentions on digital platforms that use s...

arxiv.org/abs/2405.14614

Follow ups coming very soon (already drafted): would love to discuss these ideas with folks. Is this all repetitive with past data labor/leverage work? Are some aspects obvious to you?

27.05.2025 15:45 · 👍 2    🔁 0    💬 1    📌 0

This has implications for Internet policy, for understanding where the value in AI comes from, and for thinking about why we might even consider a certain model to be "good"!

This first post leans heavily on recent work with Zachary Wojtowicz and Shrey Jain, to appear at this upcoming FAccT

27.05.2025 15:45 · 👍 1    🔁 0    💬 1    📌 0
Google and TikTok rank bundles of information; ChatGPT ranks grains. Google and others solve our attentional problem by ranking discrete bundles of information, whereas ChatGPT ranks more granular chunks. This lens can help us reason about AI policy.

New data leverage post: "Google and TikTok rank bundles of information; ChatGPT ranks grains."

dataleverage.substack.com/p/google-and...

This will be post 1/3 in a series about viewing many AI products as all competing around the same task: ranking bundles or grains of records made by people.

27.05.2025 15:45 · 👍 3    🔁 1    💬 1    📌 1
Algorithmic Collective Action with Two Collectives Given that data-dependent algorithmic systems have become impactful in more domains of life, the need for individuals to promote their own interests and hold algorithms accountable has grown. To have ...

Pre-print now on arxiv and to appear at FAccT 2025:

arxiv.org/abs/2505.00195

"Algorithmic Collective Action with Two Collectives --
Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram"

02.05.2025 18:44 · 👍 1    🔁 1    💬 0    📌 0

Sharing a new paper (led by Aditya Karan):

There's growing interest in algorithmic collective action, in which a "collective" acts through data to impact a recommender system, classifier, or other model.

But... what happens if two collectives act at the same time?

02.05.2025 18:44 · 👍 2    🔁 1    💬 1    📌 0
Public AI, Data Appraisal, and Data Debates A consortium of Public AI labs can substantially improve data pricing, which may also help to concretize debates about the ethics and legality of training practices.

New early draft post: "Public AI, Data Appraisal, and Data Debates"

"A consortium of Public AI labs can substantially improve data pricing, which may also help to concretize debates about the ethics and legality of training practices."

dataleverage.substack.com/p/public-ai-...

03.04.2025 17:52 · 👍 1    🔁 0    💬 0    📌 0
Model Plurality. Current research in "plural alignment" concentrates on making AI models amenable to diverse human values. But plurality is not simply a safeguard against bias or an engine of efficiency: it's a key in...

"Algo decision-making systems are 'leviathans', harmful not for their arbitrariness or opacity, but for the systematicity of their decisions"

- @christinalu.bsky.social on the need for plural #AI model ontologies (sounds technical, but has big consequences for human #commons)

www.combinationsmag.com/model-plural...

02.04.2025 07:57 · 👍 6    🔁 2    💬 2    📌 0
Evaluation Data Leverage: Advances like "Deep Research" Highlight a Looming Opportunity for Bargaining Power Research agents and increasingly general reasoning models open the door for immense "evaluation data leverage".

New Data Leverage newsletter post. It's about... data leverage (specifically, evaluation-focused bargaining) and products du jour (deep research, agents).

dataleverage.substack.com/p/evaluation...

03.03.2025 18:26 · 👍 2    🔁 0    💬 0    📌 0

Here's my round-up as a markdown file: github.com/nickmvincent...

Here's the newsletter post, Tipping Points for Content Ecosystems: dataleverage.substack.com/p/tipping-po...

14.02.2025 18:25 · 👍 0    🔁 0    💬 0    📌 0
