(2/2) Insurers profit by preventing losses, not paying claims, so they'll invest in figuring out what actually makes AI safer. Working with Fathom, we're proposing legislation where government sets acceptable risk levels and private evaluators verify companies meet them.
20.11.2025 00:00
Insurance companies are trying to avoid big payouts by making AI safer
As government regulation lags, some insurance companies see a business case for pushing AI companies to minimize risk and adopt stronger guardrails.
(1/2) 99% of surveyed businesses have lost money from AI failures, and two-thirds lost over $1M, according to Ernst & Young. Insurance companies are stepping in: meet verifiable safety standards, get coverage. Don't meet them, you're on your own. I spoke with NBC News: buff.ly/wkmPooC
20.11.2025 00:00
Current approaches conceptualize the alignment challenge as one of eliciting individual human preferences and training models to choose outputs that satisfy those preferences. To the extent…
Gillian Hadfield - Alignment is social: lessons from human alignment for AI
The recording of my keynote from #COLM2025 is now available!
06.11.2025 21:35
Gillian Hadfield and Thomas Friedman stand together smiling in an office, each holding their respective books - Hadfield holds 'Rules for a Flat World' and Friedman holds 'The World is Flat.'
I finally got a chance to meet @thomaslfriedman.bsky.social, whose book The World Is Flat inspired my own Rules for a Flat World. I had a great conversation with him and Andrew Freedman about the challenge we find the world facing: how do we build rules for AI that work in a complex world?
31.10.2025 22:39
Grateful to keynote at #COLM2025. Here's what we're missing about AI alignment: humans don't cooperate just by aggregating preferences; we build social processes and institutions to generate norms that make it safe to trade with strangers. AI needs to play by these same systems, not replace them.
15.10.2025 23:00
We suspect RLHF training creates sycophantic behavior: models trained to be agreeable may prioritize consensus over critical evaluation. This suggests current alignment techniques might undermine collaborative reasoning.
23.09.2025 17:06
Stronger agents were more likely to change from correct to incorrect answers in response to weaker agents' reasoning than vice versa. Models tended to favor agreement over critical evaluation, creating an echo chamber instead of an actual debate.
23.09.2025 17:06
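The thread doesn't include the paper's scoring code, so here is a hedged sketch of how one might quantify the harmful flips described above. The names `flip_rates` and `history` are hypothetical: `history[r][i]` holds agent i's answer after round r, and `gold` is the reference answer.

```python
def flip_rates(history, gold):
    """Fraction of round-to-round answer changes that are harmful vs. helpful.

    history[r][i]: agent i's answer after round r; gold: reference answer.
    """
    harmful = helpful = total = 0
    for prev_round, curr_round in zip(history, history[1:]):
        for prev, curr in zip(prev_round, curr_round):
            total += 1
            if prev == gold and curr != gold:
                harmful += 1  # correct -> incorrect: the echo-chamber failure
            elif prev != gold and curr == gold:
                helpful += 1  # incorrect -> correct: productive debate
    return harmful / total, helpful / total

# Two agents over three rounds; one harmful and one helpful flip -> (0.25, 0.25).
print(flip_rates([["A", "B"], ["A", "A"], ["B", "A"]], gold="A"))
```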
However, we still observed performance gains on math problems under most conditions, suggesting debate effectiveness depends heavily on the type of reasoning required.
23.09.2025 17:06
The impact varies significantly by task type. On CommonSenseQA, a dataset we newly examined, debate reduced performance across ALL experimental conditions.
23.09.2025 17:06
Even when stronger models outweighed weaker ones, group accuracy decreased over successive debate rounds. Introducing weaker models into debates produced worse results than when agents hadn't engaged in discussion at all.
23.09.2025 17:06
We tested debate effectiveness across three tasks (CommonSenseQA, MMLU, GSM8K) using three different models (GPT-4o-mini, LLaMA-3.1-8B, Mistral-7B) in various configurations.
23.09.2025 17:06
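The posts summarize the setup but not the protocol itself, so below is a minimal sketch of the standard multi-agent debate loop this line of work builds on (each agent answers, then revises after seeing its peers' answers), not the paper's exact implementation. The `debate` function and the agent callables are illustrative assumptions; real agents would wrap API calls to GPT-4o-mini, LLaMA-3.1-8B, or Mistral-7B.

```python
from typing import Callable, List

def debate(question: str,
           agents: List[Callable[[str], str]],
           rounds: int = 3) -> List[str]:
    """Run multi-agent debate: independent answers, then revision rounds."""
    # Round 0: each agent answers independently.
    answers = [agent(f"Question: {question}\nAnswer:") for agent in agents]
    for _ in range(rounds):
        revised = []
        for i, agent in enumerate(agents):
            peers = "\n".join(a for j, a in enumerate(answers) if j != i)
            prompt = (f"Question: {question}\n"
                      f"Other agents answered:\n{peers}\n"
                      "Considering their reasoning, give your final answer:")
            revised.append(agent(prompt))
        answers = revised  # agents may converge on truth, or just echo each other
    return answers

# Toy usage with stub agents; real agents would call an LLM API.
stub = lambda prompt: "4"
print(debate("What is 2 + 2?", [stub, stub, stub]))
```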
We found that multi-agent debate among large language models can sometimes harm performance rather than improve it, contradicting the assumption that more discussion can lead to better outcomes.
23.09.2025 17:06
My lab members Harsh Satija and Andrea Wynn and I have a new preprint examining AI multi-agent debate among diverse models, based on our ICML MAS 2025 workshop.
23.09.2025 17:06
Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate
While multi-agent debate has been proposed as a promising strategy for improving AI reasoning ability, we find that debate can sometimes be harmful rather than helpful. Prior work has exclusively…
Using debate among AI agents has been proposed as a promising strategy for improving AI reasoning capabilities. Our new research shows that this strategy can often have the opposite effect, and the implications for AI deployment are significant. (1/10) arxiv.org/abs/2509.05396
23.09.2025 17:06
Jobs
I have postdoc and staff openings for our lab at the Johns Hopkins University in either Baltimore, MD or Washington, DC. Postdoctoral Fellow: We are hiring an interdisciplinary scholar with a track re…
These roles will shape the conversation on AI and provide the opportunity for rich, interdisciplinary collaboration with colleagues and researchers in the Department of Computer Science and the School of Government and Policy.
Please spread the word in your network! 5/5
gillianhadfield.org/jobs/
16.06.2025 18:18
We're recruiting a postdoctoral fellow with a track record in computational modeling of AI systems and autonomous-agent dynamics, plus experience with ML systems, to investigate the foundations of human normativity and how to build AI systems aligned with human values. 4/5
16.06.2025 18:17
We're hiring an AI Communications Associate to craft and execute a multi-channel strategy that turns leading computer science and public policy research into accessible content for a broad audience of stakeholders. 3/5
16.06.2025 18:16
We're hiring an AI Policy Researcher to conduct in-depth research into the technical and policy challenges in AI alignment, safety, and governance, and to produce high-quality research reports, white papers, and policy recommendations. 2/5
16.06.2025 18:15
My lab @johnshopkins is recruiting research and communications professionals, and AI postdocs to advance our work ensuring that AI is safe and aligned to human well-being worldwide. 1/5
16.06.2025 18:15
destabilize or harm our communities, economies, or politics. Together with @djjrjr.bsky.social and @torontosri.bsky.social we held a design workshop last year with a stunning group of experts from AI labs, regulatory technology startups, enterprise clients, civil society, academia, and government. 2/3
12.06.2025 00:33
Six years ago @jackclarksf.bsky.social and I proposed regulatory markets as a new model for AI governance that would attract more investment (money and brains) in a democratically legitimate way, fostering AI innovation while ensuring these powerful technologies don't 1/2
12.06.2025 00:32
Interview with Gillian Hadfield: Normative infrastructure for AI alignment - AIhub
In this insightful interview, AIhub ambassador Kumar Kshitij Patel met @ghadfield.bsky.social, keynote speaker at @ijcai.org, to find out more about her interdisciplinary research, career trajectory, AI alignment, and her thoughts on AI systems in general.
aihub.org/2025/05/22/i...
23.05.2025 14:47
AIhub monthly digest: May 2025 - materials design, object state classification, and real-time monitoring for healthcare data - AIhub
Our latest monthly digest features:
- Ananya Joshi on healthcare data monitoring
- AI alignment with @ghadfield.bsky.social
- Onur Boyar on drug and material design
- Object state classification with Filippos Gouidis
aihub.org/2025/05/30/a...
04.06.2025 15:13
Voters Were Right About the Economy. The Data Was Wrong.
Here's why unemployment is higher, wages are lower and growth less robust than government statistics suggest.
Everyone, including those who think we're building powerful AI to improve lives for everyone, should take seriously how poorly our current economic indicators (unemployment, earnings, inflation) capture the well-being of low- and moderate-income folks. www.politico.com/news/magazin...
15.02.2025 15:58
China, US should fight rogue AI risks together, despite tensions: ex-diplomat
Open-source AI models like DeepSeek allow collaborators to find security vulnerabilities more easily, Fu Ying tells Parisโ AI Action Summit.
I was at this meeting Monday, and the quality & seriousness of discussion made it a highlight. But Fu Ying is right that forging the cooperation needed, even limited to the extreme risks that threaten everyone, is becoming ever harder. We must keep trying.
www.scmp.com/news/china/d...
14.02.2025 12:15
I think that would only require "read" access
05.02.2025 02:18
Do we think Musk is using treasury payments data to train, fine-tune, or run inference on AI models? @caseynewton.bsky.social
04.02.2025 21:20
Powerful Artificial Intelligence may be coming. Society is not prepared.
FBPE, Scotland, Canada, Humanism, Human Rights, CooperativeAI, Hygge.
Research scientist in AI alignment at Google DeepMind. Co-founder of Future of Life Institute. Views are my own and do not represent GDM or FLI.
PhD candidate @utoronto.ca and @vectorinstitute.ai | Soon: Postdoc @princetoncitp.bsky.social | Reliable, safe, trustworthy machine learning.
Dog dad and Georgetown law prof.
machine learning asst prof at @cs.ubc.ca and amii
statistical testing, kernels, learning theory, graphs, active learningโฆ
she/her, 🏳️‍⚧️, @queerinai.com
working for the French gov on AI by day (compar:IA), making figma screens for a participatory democracy company by evening (Open Source Politics), building fun little AI projects by night (TamagoCHAT, Bertrand, Tortue)
Gravel biking on weekends 🚴
Exploring humanity's emergent understanding, research, and perceptions of artificial intelligence. Promoting AI Safety & Ethics.
Professor and Canada CIFAR AI Chair (Amii) at the University of Alberta, Dept. Medicine, BLINCLab. Corporate director. Previously: office co-lead at DeepMind Alberta. https://pilarski.github.io
Science journalist and author of THE UFO FILES (Quarto, 2025). I talk about AI, fascinating research, and walks in the woods. Anything that makes me say "wow!" She/her
Founder & CEO @HackerNoon.com HackerNoon.com Blogs -> Blog.DavidSmooke.net
Authoritative global coverage of rule of law and human rights. Powered by law students and graduates from six continents. Rigorous legal training, nuanced takes.
Law prof @UICLaw. Civ Pro, access to justice, consumer law, state courts, big data, A2J & legal tech. Former consumer rights legal aid attorney & EJW fellow. Researcher at the Debt Collection Lab. Motorcycle rider.
Researching AI and Machine Learning in the finance sector, conference speaker, co-author of The AI Book
AI, Ethics, SmartCity
Sapienza University: lecturer in Planning and Strategic Management
Member of National Authority for AI association
Account Manager
https://doi.org/10.1108/TG-04-2024-0096
https://www.linkedin.com/in/vriccardi
PhD Candidate at Cambridge | ex Meta, Amazon | Studying diversity in multi-agent and multi-robot learning
https://matteobettini.com/
Partner at Messner Reeves LLP. JD/PhD. Innovation, strategy, AI & running after my kids.
Interested in all aspects of AI. Some AI art.