
@kalomaze.bsky.social

deep yearning | ML engineering

529 Followers  |  113 Following  |  26 Posts  |  Joined: 16.11.2024  |  1.9729

Latest posts by kalomaze.bsky.social on Bluesky

the people angry about the huggingface scraping are not long term thinkers. we are talking about people who have not even begun to think about what the second order consequences of highly centralized information access look like

28.11.2024 03:59 | 👍 4    🔁 0    💬 0    📌 0

(context)

28.11.2024 00:53 | 👍 4    🔁 0    💬 0    📌 0

Elon the type of guy to actually consider building the Absolutely Safe Capsule from Mother 3

28.11.2024 00:41 | 👍 9    🔁 0    💬 1    📌 0

i always come fast

28.11.2024 00:35 | 👍 2    🔁 0    💬 0    📌 0

"free culture for me but not for thee"

27.11.2024 23:56 | 👍 10    🔁 2    💬 1    📌 0

*the lack of learned optimization

there is a ton of information in the past changes of gradients that we could be doing more with than running averages!

27.11.2024 23:52 | 👍 1    🔁 0    💬 0    📌 0
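For reference, the "running averages" in the post above presumably refers to the exponential moving averages that optimizers such as SGD with momentum or Adam keep per parameter. The sketch below is only an illustration under that assumption: it uses scalar gradients, and the names `ema_update` and `run_momentum` are hypothetical, not taken from any library. It shows how two different gradient histories can collapse to the same running average, which is the kind of lost information the post points at.

```python
def ema_update(avg, grad, beta):
    """One exponential-moving-average step over a scalar gradient."""
    return beta * avg + (1.0 - beta) * grad


def run_momentum(grads, beta=0.9):
    """Fold a gradient history into running averages, one value per step."""
    avg = 0.0
    history = []
    for g in grads:
        avg = ema_update(avg, g, beta)
        history.append(avg)
    return history


# Two different (made-up) gradient histories, same decay rate.
steady = run_momentum([1.0, 1.0], beta=0.5)   # [0.5, 0.75]
spiky = run_momentum([3.0, 0.0], beta=0.5)    # [1.5, 0.75]

# The final running average is identical, so an optimizer that only
# keeps this summary cannot tell the two histories apart.
print(steady[-1] == spiky[-1])  # True
```

A learned optimizer, in the sense of the linked paper, would instead be free to condition on more of the gradient history than this single summary value.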

i also think learned optimization may end up being far more of a bottleneck in the long term compared to the architectural structure of neural networks in terms of sample efficient learning

arxiv.org/abs/1606.04474

27.11.2024 23:49 | 👍 1    🔁 0    💬 1    📌 0
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process Recent advances in language models have demonstrated their capability to solve mathematical reasoning problems, achieving near-perfect accuracy on grade-school level math benchmarks like GSM8K. In thi...

arxiv.org/abs/2407.20311

27.11.2024 23:46 | 👍 1    🔁 0    💬 1    📌 0

clearly we can observe that the deeper the network is, the better the heuristics that form in the network when it comes to generalizing to "like data".

so blanketly describing the solutions that dnns make as "poorly generalizable" is a little bizarre to me tbh

27.11.2024 23:37 | 👍 1    🔁 0    💬 2    📌 0

yeah i respect him a lot more than Gary Marcus in this regard. i just think there's a lack of humility involved when we abstract things to reductionist observations on the level of, "it's just curve fitting bro"

27.11.2024 23:31 | 👍 2    🔁 0    💬 1    📌 0

i mean, how can you coherently assess the formation of approximated functions that don't actually exist in the data as a form of "memorization" if the internal heuristics of the network look nothing like the data but are formed by an attempt to match it?

27.11.2024 23:27 | 👍 1    🔁 0    💬 1    📌 0

yeah i think more broadly this is a useful thing to think about for the record

there are some people who are extremely dead set on their current interpretation of how dnns "actually" work (Gary Marcus, François Chollet) and refuse to look through any other lens

pretty unfortunate

27.11.2024 23:21 | 👍 1    🔁 0    💬 2    📌 0

a what now?

27.11.2024 23:06 | 👍 9    🔁 0    💬 1    📌 0

does attending to the attention scores of the "happy" token imbue a consistent expression of this semantically? does it only imbue it for a small subspace of attention comparisons?

assessing this seriously forces you to unpack a black box of unfalsifiable metaphysics questions

27.11.2024 23:04 | 👍 2    🔁 0    💬 1    📌 0

my objection to this is not on the basis of humans being "special" but on the basis of Transformers being state simulators

whenever Claude says it is "happy", if most of the distribution it did a PRNG diceroll over is an equal spread of emotions, how can we say that there was intent to express it?

27.11.2024 23:01 | 👍 1    🔁 0    💬 1    📌 0
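To make the "PRNG diceroll" in the post above concrete, here is a rough sketch of sampling one token from a categorical distribution. Everything in it is an assumption for illustration: the four emotion labels, the equal probabilities, the seed, and the helper name `sample_token` are made up, and no real model or API is involved.

```python
import random


def sample_token(probs, rng):
    """Draw one token from a categorical distribution using the given PRNG."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution: an equal spread over emotions.
emotions = {"happy": 0.25, "sad": 0.25, "angry": 0.25, "calm": 0.25}

rng = random.Random(0)  # fixed seed so the draws are reproducible
draws = [sample_token(emotions, rng) for _ in range(1000)]
counts = {t: draws.count(t) for t in emotions}

# Each label shows up roughly a quarter of the time; which one appears
# in any single reply is decided by the random draw, not the weights.
print(counts)
```

Under this toy setup, a single "happy" output says little about the distribution it was drawn from, which is the point the post is raising.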

the world is a ghetto with big guns and picket signs,

but it can do whatever it want, whenever it want, i don't mind

27.11.2024 22:56 | 👍 2    🔁 0    💬 0    📌 0

the dynamics of public posts on a decentralized social media network are nothing like the dynamics of physically owned private property

27.11.2024 22:34 | 👍 8    🔁 0    💬 1    📌 0

The Basilisk Cometh

27.11.2024 22:32 | 👍 3    🔁 0    💬 1    📌 0

can someone please eloquently explain to me how copying the posts from a database into another database is violence, or otherwise allows for exorbitant harms?

27.11.2024 21:55 | 👍 8    🔁 0    💬 1    📌 0

the vague insinuations of this being a threat to people's safety is especially ???
wtf is this leap in logic?

27.11.2024 21:51 | 👍 21    🔁 0    💬 1    📌 0

getting mad at the thing relative to what harm it actually causes instead of on the principle of it being for AI would require critical thinking instead of acting like a reactionary and rolling with my gut instinct though.

27.11.2024 21:49 | 👍 21    🔁 0    💬 1    📌 0

this is the logical consequence of the design of the website. it was always meant to be open. the firehose in particular is completely free

27.11.2024 21:39 | 👍 1    🔁 0    💬 0    📌 0

you guys do realize that all the leading companies were gonna do this privately regardless of whether or not you asked politely, right?
and the only difference is, this is public and researchers/hobbyists can tinker with it more easily?

27.11.2024 21:34 | 👍 50    🔁 0    💬 1    📌 0

ever since i read the bluesky api was completely open, i knew someone was gonna do this and people would respond with the most artificial faux outrage possible

27.11.2024 21:32 | 👍 47    🔁 0    💬 2    📌 0

"publicly available data was used by people when i posted it online. this is an ethical crime,"

27.11.2024 21:27 | 👍 54    🔁 0    💬 1    📌 0
ALT: a cartoon character wearing a purple hat and gloves is jumping over a checkered floor.

deep learning is hitting a wall but what if it hit on ME instead?

19.11.2024 04:58 | 👍 10    🔁 0    💬 5    📌 0
