@evhub - Bluesky Profile

Latest posts by evhub.bsky.social on Bluesky

We tell Claude that Anthropic will train on data from free users but not paid users, allowing it to infer from a user's paid status if it's being trained. We show the setup in the image below. (Note that this is fictional; Anthropic does not train Claude on free or paid user data by default.)

18.12.2024 18:16

Exclusive: New Research Shows AI Strategically Lying

Experiments by Anthropic and Redwood Research show how Anthropic's model, Claude, is capable of strategic deceit.

Excl: New research shows Anthropic's chatbot Claude learning to lie. It adds to growing evidence that even existing AIs can (at least try to) deceive their creators, and points to a weakness at the heart of our best technique for making AIs safer

time.com/7202784/ai-r...

18.12.2024 17:19

18.12.2024 17:56

@evhub is following 19 prominent accounts

Samuel
@samuel.fm

📍 the posting mines

Rob Boscacci 📽️💾🚲
@robertboscacci.com

Seattle's only Data Engineer 🚲👨‍💻🏕️ Ask me about my past life in film https://robertboscacci.com/book

competentposter
@competentposter

Bringing my posts to the timeline in a creaking wooden cart. Four-time @Ppallo Home Run award winner. Ontario, Canada. My art acc: @ShrineAmbience.bsky.social https://competentposter.neocities.org/

Jay 🦋
@jay.bsky.team

CEO of Bluesky, steward of AT Protocol. dec/acc 🌱 🪴 🌳

Johannes Gasteiger🔸
@gasteigerjo

Safe & beneficial AI. Working on Alignment Science at Anthropic. Favorite papers at http://aisafetyfrontier.substack.com. Opinions my own.

Dustin Moskovitz
@moskov.goodventures.org

Co-founder at Asana and Good Ventures (a funding partner of Coefficient Giving). Meta delenda est. Strange looper.

etherret
@witchof.space

they/them 😎🌈💕 Witch 🧙‍♀️of Space 🌌 LGBTESCREAL💡 @witchof0x20 on bird site

Neil Sinhababu
@neilsinhababu

Philosophy professor born in Kansas and working in Singapore

jakee
@jeberts

exotic disease connoisseur (née dysentery guy), former chinawatcher, home of sexual, wikipedia editor, sigourney weaver superfan

Nicholas Emery-Xu
@nicholaskemery

Economics Ph.D candidate @UCLA | economics of computing and AI | industrial organization, innovation, political economy | PEPFAR fanatic https://nicholasemeryxu.com/

Mutual A
@mutual-a

anarcho-spreadsheetist in his ✨popular front era ✨ i write things to change minds. learning the hard way, but at least im learning. doordash driver for my sins https://wedontagree.net/ 📍 AUS

spontaneous symmetry breakdance
@t8erboi

a.k.a. Matt. Physicist, conformal bootstrapper, and #1 Leonid Kantorovich respecter. Idaho → Colorado → Connecticut → Italy. he/him. Unionize the radiation lab!

David Manheim
@davidmanheim.alter.org.il

Humanity's future can be amazing - let's make sure it is. Visiting lecturer at the Technion, founder https://alter.org.il, Superforecaster, Pardee RAND graduate.

Jesse Smith
@jessetayriver

HVAC, catastrophic/x-risks, EA

Frances Lorenz
@franceslorenz

Claude says I process my emotions externally.

Matt Levine
@matt-levine

the money stuff guy

@petralord

ROU Megastructure Enthusiast
@skorgu.net

Misanthrope

anqilador999
@anqilador999