
Ragavan Thurairatnam πŸ›Έ NeurIPS

@ragavan.bsky.social

Deep learning since 2012. Leading a new AI lab at Jack Dorsey's Block

82 Followers  |  475 Following  |  33 Posts  |  Joined: 09.12.2023

Latest posts by ragavan.bsky.social on Bluesky

Can you share some of those bad code examples? What model did you use?

18.05.2025 03:18 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Reminds me of this

17.05.2025 05:56 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

From what I heard, viral infections can sometimes actually make allergies or the immune system worse. It's possible some are beneficial, but I'm not aware of any (I'm also not an expert). As an alternative to viruses, you can also get tons of exposure to bacteria, fungi, etc. for the hygiene hypothesis

24.12.2024 02:23 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I feel like old-school TensorFlow 1.x wouldn't compute the common graph nodes twice. Don't know about PyTorch though
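A toy sketch of what I mean, in plain Python (this is my assumption about how a TF 1.x-style graph executor avoids recomputation, not actual TensorFlow code; all names here are illustrative): evaluate the DAG with a memo table so each node runs once, even when two fetched outputs depend on the same subgraph.

```python
# Record every time the shared node actually executes, so we can
# verify it runs only once despite feeding two outputs.
calls = []

def make_graph():
    # node name -> (function, dependency names); 'shared' feeds both outputs
    return {
        "x":      (lambda: 2, []),
        "shared": (lambda x: (calls.append("shared"), x * 10)[1], ["x"]),
        "out1":   (lambda s: s + 1, ["shared"]),
        "out2":   (lambda s: s + 2, ["shared"]),
    }

def run(graph, targets):
    # Memoized DAG evaluation: each node is computed at most once per run.
    memo = {}
    def eval_node(name):
        if name not in memo:
            fn, deps = graph[name]
            memo[name] = fn(*(eval_node(d) for d in deps))
        return memo[name]
    return [eval_node(t) for t in targets]

results = run(make_graph(), ["out1", "out2"])
assert results == [21, 22]
assert calls == ["shared"]  # the shared subgraph ran exactly once
```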

19.12.2024 17:43 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

At Neurips this year, no one seemed to bat an eye at me wearing an N95 all day. I actually feel like the prevalence of N95s was higher at the conference than the general population!

19.12.2024 17:00 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Human-computer interaction already sounds risque enough!

19.12.2024 13:13 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Oh, never mind, I misread your post. 90 percent of discussions, not people. My bad

18.12.2024 14:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Who are you talking to such that 90% say CICO as a simple statement isn't true? From my sampling, almost everyone says it's true. There's a small percentage of doctors who have a nuanced point of view (that it's technically true but not a good strategy, which is different than saying it's false).

18.12.2024 14:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I heard visa problems for many :(

14.12.2024 06:45 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

How about Neonatal Insane Clown Unit?

Scary in multiple ways.

08.12.2024 22:01 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I imagine it's worse for you, probably getting harassed a lot?

08.12.2024 13:34 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Although @karpathy.bsky.social and Jurgen are right, the REAL EARLIEST form of multi-headed self-attention was first used in the early Neolithic period by heating bones and looking at where the cracks formed. The number of cracks as the heads and the something something farming

07.12.2024 02:50 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Is it even true for base models 100% of the time? I can imagine with certain architectures/training, you don't actually get a pure statistical representation of the words, you get some learned function/hypothesis that is much simpler/compressed

05.12.2024 16:32 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

"I must be a robot. Why else would human women refuse to date me"

05.12.2024 14:17 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Ah, I see. So you're saying that if you ran an experiment and showed a bunch of people the same paragraph, but told half of them the paragraph was written by an LLM and told the other half it was written by a human, they would feel different about it?

I imagine that's true. Fun experiment to try

04.12.2024 00:33 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I thought I understood, but now I'm not sure.

I do think LLM writing is weird/hollow at times. It just sounds/looks right but might not say a lot.

But why can't future models learn the gap?

04.12.2024 00:17 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

That's cool! But I just meant Alex did CUDA kernels for DL before CUDAMat. Not before everyone else.

My other claim was just that Alex's CUDA implementation for DL was very well engineered and the most efficient of the ones I'm aware of.

I see where we crossed wires now

01.12.2024 23:36 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

That's the only EyeTap project I remembered! But I didn't know they used deep learning with CUDA so early. When did they start doing that? Were the GPUs remote? Or just experiments?

01.12.2024 22:44 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I genuinely can't remember anymore.
Catastrophic forgetting.

I do remember hearing about "Artificial Neural Networks" (the old school term no one uses anymore[?]) and ZISC when I was young and remember thinking they sounded super cool. But that's all I remember haha

01.12.2024 18:24 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Which EyeTap project are you referring to?

01.12.2024 17:45 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Separately, Alex was writing CUDA kernels for neural nets before CUDAMat.

Also, I think Alex's code was much faster for deep conv nets.

01.12.2024 16:12 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I'm not saying the next shift has to be for NNs. But I can imagine just like NNs were a dark horse before, there could be other old techniques that have issues/gaps that need to be addressed to trigger another shift

01.12.2024 16:06 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

What I meant by another AlexNet moment was a huge proof point and shift to a major new algo. Deep nets were around for many years before. GPUs for DNNs were used as well. But AlexNet was on a major unsolved problem and triggered a shift for everyone to move away from old techniques to new.

01.12.2024 16:04 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Extreme analog:

Testing an XOR gate vs. an LLM.
You can fully specify all inputs and outputs for the XOR gate. That's hard with an LLM.

But maybe I'm taking safety a little too literally.
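A minimal sketch of what "fully specifying" means for the XOR case (names here are just illustrative): the whole input space is four cases, so a test can literally cover every behavior the gate will ever show, which is exactly what you can't do for an LLM.

```python
def xor_gate(a: int, b: int) -> int:
    # A two-input XOR gate over bits.
    return a ^ b

# Enumerate the ENTIRE input space: only 4 cases exist.
truth_table = {(a, b): xor_gate(a, b) for a in (0, 1) for b in (0, 1)}
expected = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Exhaustive verification: every possible input/output pair is checked.
assert truth_table == expected
```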

01.12.2024 07:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

From a system (not model) point of view and for medical/critical system applications, I can imagine end-to-end safety makes sense whether it's AI or not.

The difference I can imagine is that the degree of testing is very different if there's a huge combination of possible outputs (like we see in LLMs)

01.12.2024 07:19 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I imagine big corps will always have a large advantage

Maybe not if:
1. Another AlexNet moment that causes another major HW/trade secret switch. And that would only last a short while.
2. DL gains slow massively; efficient algos make up the difference

New open GPU algos would just get adopted.

01.12.2024 05:24 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I didn't realize bash.org is gone. If only someone made a dataset of it :(

27.11.2024 23:43 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Bluesky Firehose in 3D (live)

Could you use the firehose API for this? Someone did a more visual version: firehose3d.theo.io

26.11.2024 16:06 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I'm certain this is true. There are plenty of things that humans can't do that deep learning models can. For example, predicting health events from signals like ECG, or even imagery. I think there was a model that predicted COVID (vs. other illnesses) based on audio recordings. Humans can't do that.

26.11.2024 16:01 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Thinking about it more, would the labeling data be public? If so then the bots could potentially make adversarial models to sneak through.

Sounds fun though!

25.11.2024 18:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
