
Cameron Holmes

@cameronholmes.bsky.social

AI Alignment Research Manager @MATS. Market participant, EA.

485 Followers  |  172 Following  |  28 Posts  |  Joined: 16.09.2024  |  2.0393

Latest posts by cameronholmes.bsky.social on Bluesky


Post image

New paper: Finetuning on narrow domains leaves traces behind. By looking at the difference in activations before and after finetuning, we can interpret what it was finetuned for. And so can our interpretability agent! 🧡

20.10.2025 15:11 | 👍 5    🔁 1    💬 1    📌 1

I think it's pretty hard to disentangle them really, I was initially skeptical of the (very convenient) argument from the labs about them not being orthogonal, but I'm increasingly buying it.

01.01.2025 09:30 | 👍 8    🔁 0    💬 1    📌 0
When will the first general AI system be devised, tested, and publicly announced?

Right, predictions were 30+ years well into 2020.

www.metaculus.com/questions/51...

24.12.2024 21:27 | 👍 7    🔁 0    💬 0    📌 0

Banger

11.12.2024 23:23 | 👍 7    🔁 0    💬 0    📌 0

Congratulations!

01.12.2024 23:59 | 👍 1    🔁 0    💬 0    📌 0

It would have been an all-too-convenient refrain for the "I don't believe in that sci-fi nonsense" AI Safety scepticism line

01.12.2024 19:55 | 👍 2    🔁 0    💬 1    📌 0
Post image

She shall know your ways as if born to them

24.11.2024 17:47 | 👍 4    🔁 0    💬 0    📌 0

Truly an excellent milestone.

Although concessions do follow in (incorrect) episode preferences. Sleepytime falling out of favour was crushing.

23.11.2024 23:54 | 👍 0    🔁 0    💬 0    📌 0

Yeah I try to follow a similar approach

23.11.2024 18:56 | 👍 2    🔁 0    💬 0    📌 0

The fora draw a clear distinction between upvotes / agreement votes so I think the culture of upvoting contributions stems from that maybe?

23.11.2024 18:49 | 👍 0    🔁 0    💬 0    📌 0

Bizarre that was included in the screenshot, doesn't seem like it belongs to the same class as the others at all.

23.11.2024 13:33 | 👍 1    🔁 0    💬 0    📌 0
Tradle | Have fun with OEC Data | The Observatory of Economic Complexity Became an instant classic, guess the country based on an export distribution | In OEC we work hard and also party hard.

Crushing it

#Tradle #992 1/6
🟩🟩🟩🟩🟩
oec.world/en/games/tra...

22.11.2024 16:19 | 👍 0    🔁 0    💬 0    📌 0

Now grappling with whether I'd be in that group or not.

21.11.2024 23:56 | 👍 2    🔁 0    💬 1    📌 0

Not everyone, then we'd have to read them. If only those inclined to make one did then the mere existence of the doc would probably clarify 90% of scenarios.

"Oh they've got one of those docs, we are probably cool"

21.11.2024 23:52 | 👍 2    🔁 0    💬 1    📌 0

The Settlers of Catan Problem

21.11.2024 21:40 | 👍 2    🔁 0    💬 0    📌 0
Post image Post image Post image Post image

I've been really noticing autumn this year

21.11.2024 14:31 | 👍 4    🔁 0    💬 1    📌 0
Will Google sell or divest Chrome by 2029? 25% chance. Resolves to YES if Google sells or divests Chrome, for any reason, by Jan 1, 2029. Resolves to NO if it does not. Asking because of this: https://x.com/unusual_whales/status/185864351...

Prediction markets not giving great odds of this going through:

manifold.markets/ZviMowshowit...

19.11.2024 08:01 | 👍 1    🔁 0    💬 1    📌 0

Hoping this generalizes into alignment research

17.11.2024 19:40 | 👍 4    🔁 0    💬 1    📌 0

The HS2 bat tunnel is even worse value for money once you factor in the updated direct cash transfer effectiveness estimates.

13.11.2024 21:56 | 👍 3    🔁 0    💬 0    📌 0
Animal Welfare vs Global Health Debate Week - EA Forum Animal Welfare vs Global Health Debate Week was an event which ran from 7-13 October (2024) on the EA Forum. Posts with this tag can be arguments, investigations, research summaries, book-reviews, q...

It's an edit of EA forum debate week interface:

forum.effectivealtruism.org/topics/anima...

13.11.2024 09:37 | 👍 5    🔁 0    💬 0    📌 0
Draft Amnesty Week - EA Forum Draft Amnesty Week is an event running on the EA Forum between March 11th and 17th 2024, where Forum users can publish scrappy, draft-y, or incomplete posts with impunity. This tag is to be used on ...

Good idea, for more details that might help: forum.effectivealtruism.org/topics/draft...

13.11.2024 09:35 | 👍 1    🔁 0    💬 1    📌 0
Post image

13.11.2024 06:26 | 👍 7    🔁 0    💬 1    📌 0

Oh yes, an entirely intentional error on my part in the spirit of #DAW 😬

13.11.2024 06:17 | 👍 3    🔁 0    💬 0    📌 0

deleting dating apps because i want to meet someone the old fashioned way (we caught a wild pig together while not sharing a common language, then met 12 years later under the tree we planted)

13.11.2024 06:13 | 👍 2    🔁 0    💬 0    📌 0

This is a Draft Amnesty Week #DAW draft. It may not be polished, up to my usual standards, fully thought through, or fully fact-checked.

13.11.2024 06:12 | 👍 0    🔁 0    💬 0    📌 1

FYI it's Draft Amnesty Week on TPOB, where users can publish scrappy, draft-y, or incomplete posts with impunity. #DAW

13.11.2024 06:08 | 👍 3    🔁 0    💬 4    📌 0

Circles, but it's just a different app for each emigrating TPOT

13.11.2024 05:56 | 👍 4    🔁 0    💬 0    📌 0

On new beginnings: This week I handed in my notice, ending 10 years in Product Management, capital markets, to start as an Alignment Research Manager in January! 🎉

09.11.2024 07:34 | 👍 7    🔁 0    💬 1    📌 0

Good ~Morning Agus

09.11.2024 07:29 | 👍 2    🔁 0    💬 0    📌 0

@cameronholmes is following 20 prominent accounts