@ragavan.bsky.social
Deep learning since 2012. Leading a new AI lab at Jack Dorsey's Block

Can you share some of those bad code examples? What model did you use?
18.05.2025 03:18

Reminds me of this
17.05.2025 05:56

From what I heard, viral infections can sometimes actually make allergies or the immune system worse. It's possible some are beneficial, but I'm not aware of any (I'm also not an expert). As an alternative to viruses, you can also get tons of exposure to bacteria, fungi, etc. for the hygiene hypothesis.
24.12.2024 02:23

I feel like in old-school TensorFlow 1.x it wouldn't compute the common graph nodes twice. Don't know about PyTorch though.
19.12.2024 17:43
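
A minimal sketch of that TF 1.x behavior, assuming the tf.compat.v1 shim for graph mode; the counting helper (`expensive`) is mine, added just to observe how often the shared node actually executes:

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # old-school graph-mode semantics

calls = []

def expensive(x):
    calls.append(1)                  # count how many times this node actually runs
    return np.float32(x * 2.0)

a = tf.constant(3.0)
shared = tf.py_func(expensive, [a], tf.float32)  # the "common graph node"
b = shared + 1.0                     # two consumers of the same node
c = shared + 2.0

with tf.Session() as sess:
    sess.run([b, c])                 # fetch both outputs in a single run

print(len(calls))                    # 1 -> the shared node was not computed twice
```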

At Neurips this year, no one seemed to bat an eye at me wearing an N95 all day. I actually feel like the prevalence of N95s was higher at the conference than the general population!
19.12.2024 17:00

Human-computer interaction already sounds risque enough!
19.12.2024 13:13

Oh, never mind, I misread your post. 90 percent of discussions, not people. My bad.
18.12.2024 14:43

Who are you talking to such that 90% say CICO (calories in, calories out) as a simple statement isn't true? From my sampling, almost everyone says it's true. There's a small percentage of doctors with a nuanced point of view (that it's technically true but not a good strategy, which is different from saying it's false).
18.12.2024 14:15

I heard visa problems for many :(
14.12.2024 06:45

How about Neonatal Insane Clown Unit?
Scary in multiple ways.
I imagine it's worse for you, probably getting harassed a lot?
08.12.2024 13:34

Although @karpathy.bsky.social and Jurgen are right, the REAL EARLIEST form of multi-headed self-attention was first used in the early Neolithic period, by heating bones and looking at where the cracks formed. The number of cracks as the heads, and the something something farming.
07.12.2024 02:50

Is it even true for base models 100% of the time? I can imagine that with certain architectures/training, you don't actually get a pure statistical representation of the words; you get some learned function/hypothesis that is much simpler/compressed.
05.12.2024 16:32

"I must be a robot. Why else would human women refuse to date me"
05.12.2024 14:17

Ah, I see. So you're saying that if you ran an experiment and showed a bunch of people the same paragraph, but told half of them the paragraph was written by an LLM and told the other half it was written by a human, they would feel differently about it?
I imagine that's true. Fun experiment to try
I thought I understood, but now I'm not sure.
I do think LLM writing is weird/hollow at times. Just sounding/looking right but not saying a lot.
But why can't future models learn the gap?
That's cool! But I just meant Alex did CUDA kernels for DL before CUDAMat. Not before everyone else.
My other claim was just that Alex's implementation in CUDA (for DL) was very well engineered, and it was the most efficient of the ones I'm aware of (for DL).
I see where we crossed wires now
That's the only EyeTap project I remembered! But I didn't know they used deep learning with CUDA so early. When did they start doing that? Were the GPUs remote? Or just experiments?
01.12.2024 22:44

I genuinely can't remember anymore.
Catastrophic forgetting.
I do remember hearing about "Artificial Neural Networks" (the old-school term no one uses anymore[?]) and ZISC when I was young, and remember thinking they sounded super cool. But that's all I remember haha
Which EyeTap project are you referring to?
01.12.2024 17:45

Separately, Alex was writing CUDA kernels for neural nets before CUDAMat.
Also I think Alex's code was much faster for deep conv nets.
I'm not saying the next shift has to be for NNs. But I can imagine that, just like NNs were a dark horse before, there could be other old techniques with issues/gaps that need to be addressed to trigger another shift.
01.12.2024 16:06

What I meant by another AlexNet moment was a huge proof point and shift to a major new algo. Deep nets were around for many years before. GPUs for DNNs were used as well. But AlexNet was on a major unsolved problem and triggered a shift for everyone to move away from old techniques to new.
01.12.2024 16:04

Extreme analogy:
Testing an XOR gate vs. an LLM.
You can fully specify all inputs and outputs for the XOR gate (see the sketch after this post). Hard with an LLM.
But maybe I'm taking safety a little too literally.
From a system (not model) point of view, and for medical/critical-system applications, I can imagine end-to-end safety makes sense whether it's AI or not.
The difference, I imagine, is that the degree of testing is very different when there's a huge combination of possible outputs (like we see in LLMs).
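
A toy sketch of the XOR side of the analogy (plain Python; the function and table names are mine): the gate's entire input space is four points, so it can be verified exhaustively, which has no counterpart for an LLM:

```python
def xor_gate(a: int, b: int) -> int:
    return a ^ b

# The complete specification: every possible input and its required output.
TRUTH_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

for (a, b), expected in TRUTH_TABLE.items():
    assert xor_gate(a, b) == expected, f"failed on {(a, b)}"

print("XOR verified over 100% of its input space")
# An LLM's input space (all token sequences) can't be enumerated like this,
# so end-to-end testing can only ever sample it.
```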
I imagine big corps will always have a large advantage
Maybe not if:
1. Another AlexNet moment that causes another major HW/trade secret switch. And that would only last a short while.
2. DL gains slow massively; efficient algos make up the difference
New open GPU algos would just get adopted.
I didn't realize bash.org is gone. If only someone made a dataset of it :(
27.11.2024 23:43

Could you use the firehose API for this? Someone did a more visual version: firehose3d.theo.io
26.11.2024 16:06
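
A rough sketch of tapping the firehose, assuming the public relay endpoint wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos and the `websockets` package; frames are DAG-CBOR encoded, so real use would also need a CAR/CBOR decoder rather than just counting bytes:

```python
import asyncio
import websockets

FIREHOSE_URL = "wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos"

async def tail_firehose(limit: int = 10) -> None:
    # Read a handful of raw event frames from the firehose and report their sizes.
    async with websockets.connect(FIREHOSE_URL) as ws:
        for i in range(limit):
            frame = await ws.recv()  # raw bytes, one event per frame
            print(f"frame {i}: {len(frame)} bytes")

if __name__ == "__main__":
    asyncio.run(tail_firehose())
```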

I'm certain this is true. There are plenty of things that humans can't do that deep learning models could do. For example, predicting health events from signals like ECG, or even imagery. I think there was a model that predicted covid (vs other) based on audio recordings. Humans can't do that.
26.11.2024 16:01

Thinking about it more, would the labeling data be public? If so, then the bots could potentially make adversarial models to sneak through.
Sounds fun though!