AI Safety for Who? | Kairos.fm
AI safety is making you less safe: chatbot anthropomorphization, mental health harms, dark patterns
🚨 New muckrAIkers: "AI Safety For Who?"
@jacobhaimes.bsky.social & @thegermanpole.bsky.social break down how instruction tuning/RLHF create anthropomorphic chatbots that exploit human empathy, leading to mental health harms. kairos.fm/muckraikers/...
Find us wherever you listen! (links in thread)
13.10.2025 20:42 · 👍 4 🔁 4 💬 2 📌 0
Getting Agentic w/ Alistair Lowe-Norris | Kairos.fm
Responsible AI veteran Alistair Lowe-Norris on ISO standards, compliance frameworks, and building safer AI by design.
🎙️ Just dropped a new Into AI Safety episode! Host @jacobhaimes.bsky.social chats with Alistair Lowe-Norris (ex-Microsoft, now at Iridius) about how responsible AI actually happens in practice.
Check us out on Patreon or wherever you get your podcasts! (links in thread)
kairos.fm/intoaisafety...
20.10.2025 21:11 · 👍 2 🔁 2 💬 1 📌 0
This image is a screenshot of the OpenReview paper decision for NeurIPS 2025. The final decision is 'Reject.' The Area Chair's (AC) comment is quite long; it mentions that the reviewers ended up feeling positive about the submission and that the AC would be keen to see the paper at the conference. A final update from the Program Chairs (PCs) states that AC feedback was ranked by the Senior Area Chairs (SACs), and this ranking was used to inform the final decision to reject the paper in question.
Maybe I'm crazy, but this AC review I received from the NeurIPS D&B track seems to essentially say "this is great," followed by a comment stating the paper has been rejected, without any context.
Final scores were 4 5 4 4, i.e. all reviewers and the AC agreed the paper was an Accept.
Absolutely wild.
26.09.2025 21:32 · 👍 0 🔁 0 💬 0 📌 0
Alt Text: Professional headshot of Li-Lian Ang, a young Asian woman with shoulder-length black hair and black-rimmed glasses, smiling warmly at the camera against a teal gradient background. The image includes the Kairos.fm logo and Into AI Safety podcast branding, with the episode title "Growing BlueDot's Impact w/ Li-Lian Ang" prominently displayed. A small icon showing interconnected nodes represents the AI Safety theme.
🚨 New Into AI Safety episode is live!
Li-Lian Ang from BlueDot Impact discusses their evolution from broad AI safety courses to targeted impact acceleration, addressing elitism in the field, and why we need more advocates beyond just technical researchers.
kairos.fm/intoaisafety/e023
16.09.2025 16:05 · 👍 2 🔁 2 💬 1 📌 0
Super happy to share HumanAgencyBench, which takes steps towards understanding the impact of chatbot interactions on human agency.
Working with @jacyanthis.bsky.social (and the team) has been fantastic, and I'd happily do it again. If you have the chance to work with him, don't pass it up!
15.09.2025 17:37 · 👍 4 🔁 0 💬 0 📌 0
Episode specific thumbnail for Into AI Safety episode 22, Layoffs to Leadership with Andres Sepulveda Morales. Andres is pictured in the bottom right of the image.
🚨 New Into AI Safety episode is live!
I chatted with Andres Sepulveda Morales, founder of Red Mage Creative and organizer of the Fort Collins Rocky Mountain AI Interest Group, about surviving the tech layoff cycle, dark patterns in AI, and building inclusive AI communities.
05.08.2025 03:16 · 👍 2 🔁 2 💬 1 📌 0
🎙️ New Into AI Safety episode is live!
Will Petillo from PauseAI joins to discuss the grassroots movement for pausing frontier AI development, balancing diverse perspectives in activism, and why meaningful AI governance requires both political engagement and public support. kairos.fm/intoaisafety...
24.06.2025 00:36 · 👍 3 🔁 2 💬 1 📌 0
One Big Bad Bill | Kairos.fm
Breaking down Trump's massive bill: AI fraud detection, centralized databases, military integration, and a 10-year ban on state AI regulation.
🚨 New episode is out: "One Big Bad Bill" - breaking down AI's relevance to Trump's bill. We cover automated fraud detection, government data consolidation, and a 10-year ban on state AI regulation.
Find us on Spotify, Apple Podcasts, YouTube, or wherever you listen (links in thread).
23.06.2025 22:23 · 👍 3 🔁 2 💬 1 📌 0
Breaking Down the Economics of AI | Kairos.fm
We break down 3 clusters of AI economic hype: automating profit centers, removing cost centers, and explosive growth. Reality check included.
New muckrAIkers episode drops! We're breaking down the wild economic claims around AI into 3 buckets, and digging into what the data actually shows 👉 kairos.fm/muckraikers/...
You can find the show on Spotify, Apple Podcasts, YouTube, or wherever else you listen (links in thread).
26.05.2025 18:03 · 👍 3 🔁 2 💬 1 📌 0
Thumbnail for an episode of the Into AI Safety podcast featuring Tristan Williams and Felix de Simone. The image reads "Making Your Voice Heard w/ Tristan Williams and Felix de Simone," and pictures Tristan on the left and Felix on the right.
NEW EPISODE: "Making Your Voice Heard w/ Tristan Williams & Felix de Simone" - where we explore how everyday citizens can influence AI policy through effective communication with legislators 🎙️ kairos.fm/intoaisafety...
Listen on Spotify, Apple Podcasts, YouTube, or wherever you get your podcasts!
19.05.2025 21:04 · 👍 4 🔁 2 💬 1 📌 0
DeepSeek: 2 Months Out | Kairos.fm
Deep dive into DeepSeek: what is reasoning, and does it change the "AI" landscape?
New muckrAIkers episode! DeepSeek R1 - What is "reasoning" and does it actually change the AI landscape? Industry fallout, billion dollar market crash, and why we're skeptical about the hype. kairos.fm/muckraikers/...
Listen on Spotify, Apple Podcasts, YouTube, or wherever you get your podcasts!
09.04.2025 16:45 · 👍 1 🔁 2 💬 1 📌 0
AI summit draft declaration criticised for lack of safety progress
A leaked version of the Paris AI summit document omits key commitments made at Bletchley Park in 2023 in "negligence of an unprecedented magnitude"
Incredibly disappointing to see the current US administration attempting to make safe and ethical "AI" a partisan issue:
"The US has also demanded that the final statement excludes any mention of the environmental cost of AI, existential risk or the UN." - www.thetimes.com/article/a7ae...
☹️
10.02.2025 17:28 · 👍 2 🔁 1 💬 0 📌 0
DeepSeek Minisode | Kairos.fm
A short update on DeepSeek.
This week's episode of muckrAIkers is a sneak preview of all the DeepSeek R1 stories we'll soon be tackling in depth.
Developments are ongoing, but if you want a good 15-minute overview of the news so far, check out kairos.fm/muckraikers/... or find us wherever you listen!
10.02.2025 16:20 · 👍 0 🔁 2 💬 0 📌 0
AI Hackers in the Wild: LLM Agent Honeypot
This Apart Lab Studio research blog attempts to ascertain the current state of AI-powered hacking in the wild through an innovative 'honeypot' system designed to detect LLM-based attackers.
Excited to share the first blogpost output from the Apart Lab Studio (@apartresearch.bsky.social) by Reworr, which I had the pleasure of supporting!
Check it out for one way to actively monitor one kind of AI misuse: LLM-based cyberattacks.
www.apartresearch.com/post/hunting...
01.02.2025 00:56 · 👍 0 🔁 0 💬 0 📌 0
Understanding AI World Models w/ Chris Canal | Kairos.fm
Chris Canal, founder of Equistamp, joins muckrAIkers as our first ever podcast guest to discuss AI risks and the world models that inform them.
Super excited to announce our latest episode of muckrAIkers: Understanding AI World Models w/ Chris Canal! We get into test-time compute, the moving goalposts of "AGI," and so much more. kairos.fm/muckraikers/...
You can find the show on Spotify, Apple Podcasts, YouTube, or wherever else you listen.
27.01.2025 16:19 · 👍 1 🔁 2 💬 1 📌 1
YouTube video by Apart - Safe AI
Researcher Spotlight: Jacob Haimes
A recent @apartresearch.bsky.social Researcher Spotlight featured me! Check it out to hear more about my journey Into AI Safety (pun intended):
www.youtube.com/watch?v=lFAm...
26.01.2025 19:44 · 👍 0 🔁 0 💬 0 📌 0
Computational social scientist researching human-AI interaction and machine learning, particularly the rise of digital minds. Visiting scholar at Stanford, co-founder of Sentience Institute, and PhD candidate at University of Chicago. jacyanthis.com
Journalist (ex The Register, The Next Web, HowToGeek). Writer. Software developer. Dog owner (x3). Scouser.
I have a newsletter about how tech companies are ruining our lives. https://whatwelost.substack.com/
The campaign for communities to have a powerful say on data and AI, to help create an equitable and sustainable world.
connectedbydata.org
Building progressive grassroots power and holding members of Congress accountable. Make a difference in a few clicks: https://linktr.ee/indivisibleteam
Physics, Visualization and AI PhD @ Harvard | Embedding visualization and LLM interpretability | Love pretty visuals, math, physics and pets | Currently into manifolds
Wanna meet and chat? Book a meeting here: https://zcal.co/shivam-raval
NYT bestselling author of EMPIRE OF AI: empireofai.com. ai reporter. national magazine award & american humanist media award winner. words in The Atlantic. formerly WSJ, MIT Tech Review, KSJ@MIT. email: http://karendhao.com/contact.
audio engineer/GM/well known insect
Wear your cringe like armor and it can never be used to hurt you.
Non-profit dedicated to connecting the AI community to the world. Supported by AAAI, IEEE RAS, ICML, RoboCup, IJCAI/AIJ, EurAI, and ACM SIGAI.
https://aihub.org/
Official Bluesky account for the Into AI Safety podcast - available on all podcast listening platforms!
https://kairos.fm/intoaisafety
Official Bluesky account for the muckrAIkers podcast - available on all podcast listening platforms!
https://kairos.fm/muckraikers
AI Governance Researcher, interested in this century's greatest challenges & host of On What Matters (https://bsky.app/profile/on-what-matters.bsky.social)
On this podcast hosted by Coleman Snell, we talk about the biggest risks/challenges facing our species, solutions, and how we can find meaning in this strange century.
Official Bluesky account for the Kairos.fm media network - making complex problems meaningfully accessible.
Visit our website, or find our podcasts wherever you listen!
official Bluesky account (check username👆)
Bugs, feature requests, feedback: support@bsky.app