
Gowthami Somepalli

@gowthami.bsky.social

PhD-ing at UMD. Knows a little about multimodal generative models. Check out my website to learn more: https://somepago.github.io/

2,251 Followers  |  190 Following  |  29 Posts  |  Joined: 19.08.2023

Latest posts by gowthami.bsky.social on Bluesky

Domain Ontologies: Indispensable for Knowledge Graph Construction AI slop is all around and increasingly extraction of useful information will face difficulties as we start to feed more noise into the already noisy world of knowledge. We are in an era of unpreced…

What’s the right resolution for such ontologies? 1,000-10,000 seems like the sweet spot.

H/t @aneeshsathe.com
aneeshsathe.com/2025/01/15/d...

21.01.2025 17:47 — 👍 5    🔁 2    💬 0    📌 0
Deep Learning Classics and Trends - Google Groups

About to send my last DLCT email of the year today (in 2 hours).

Join the 7-year-old mailing list if you haven't heard of it. (And if you have heard of it but haven't joined, I trust that it's a well-thought-out decision that suits you best.)

groups.google.com/g/deep-learn...

19.12.2024 16:12 — 👍 13    🔁 2    💬 0    📌 0

The recording of my #NeurIPS2024 workshop talk on multimodal iterative refinement is now available to everyone who registered: neurips.cc/virtual/2024...

My talk starts at 1:10:45 into the recording.

I believe this will be made publicly available eventually, but I'm not sure when exactly!

18.12.2024 04:38 — 👍 36    🔁 4    💬 1    📌 0
[M2L 2024] Transformers - Lucas Beyer
YouTube video by the Mediterranean Machine Learning (M2L) summer school

One of the best tutorials for understanding Transformers!

πŸ“½οΈ Watch here: www.youtube.com/watch?v=bMXq...

Big thanks to @giffmana.ai for this excellent content! πŸ™Œ

08.12.2024 09:58 β€” πŸ‘ 54    πŸ” 8    πŸ’¬ 0    πŸ“Œ 0

Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet, and I wrote a tutorial blog post on flow matching, with lots of illustrations and intuition: dl.heeere.com/conditional-...

We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423

27.11.2024 09:00 — 👍 354    🔁 102    💬 12    📌 11
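For the curious, the core training objective such tutorials cover can be sketched in a few lines of NumPy. This is a generic conditional flow-matching sketch under the common linear-interpolation path, not code from the blog post; the `model` callable and data are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(model, x1, rng):
    """One conditional flow-matching objective evaluation.

    Uses the linear interpolation path x_t = (1 - t) * x0 + t * x1
    with x0 ~ N(0, I); the regression target for the velocity field
    is then x1 - x0. `model(xt, t)` is any callable predicting a
    velocity of the same shape as xt.
    """
    x0 = rng.standard_normal(x1.shape)       # noise sample
    t = rng.uniform(size=(x1.shape[0], 1))   # one time per example
    xt = (1 - t) * x0 + t * x1               # point on the path
    target = x1 - x0                         # conditional velocity
    pred = model(xt, t)
    return np.mean((pred - target) ** 2)

# Toy check with a dummy "model" that predicts zero velocity
# on a fake data batch of 4 two-dimensional examples.
x1 = rng.standard_normal((4, 2))
loss = flow_matching_loss(lambda xt, t: np.zeros_like(xt), x1, rng)
```

In a real setup, `model` would be a neural network and this loss would be minimized over many sampled batches and times; sampling then integrates the learned velocity field from noise to data.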

congratulations, @ian-goodfellow.bsky.social, for the test-of-time award at @neuripsconf.bsky.social!

this award reminds me of how GAN started with this one email ian sent to the Mila (then Lisa) lab mailing list in May 2014. super insightful and amazing execution!

27.11.2024 18:31 — 👍 188    🔁 27    💬 3    📌 3

Maybe I’m cynical 🙈, but this feels more like a KPI-meeting activity than something that’s actually useful. There are thousands of open datasets on HF, curated with a task in mind, that are barely used.

27.11.2024 01:01 — 👍 5    🔁 0    💬 0    📌 0

Trying to build a "books you must read" list for my lab that everyone gets when they enter. Right now it's:

- Sutton and Barto
- The Structure of Scientific Revolutions
- Strunk and White
- Maybe "Prediction, Learning, and Games", TBD

Kinda curious what's missing in an RL / science curriculum

25.11.2024 17:43 — 👍 141    🔁 11    💬 36    📌 1
On Subjective Uncertainty Quantification and Calibration in Natural Language Generation Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This is due to the need to identify task-specif...

This is a simple and good paper that somehow nobody working on these things cites, or even seems to be aware of: arxiv.org/abs/2406.05213. It's a simple idea that seems useful; it formulates subjective uncertainty for natural language generation in a decision-theoretic setup.

25.11.2024 02:16 — 👍 27    🔁 3    💬 2    📌 1

A real-time (or very fast) open-source txt2video model dropped: LTXV.

HF: huggingface.co/Lightricks/L...
Gradio: huggingface.co/spaces/Light...
Github: github.com/Lightricks/L...

Look at that prompt example, though. You need to be a proper writer to get that quality.

23.11.2024 20:03 — 👍 89    🔁 9    💬 6    📌 1
1. Computing standard errors of the mean using the Central Limit Theorem

2. When questions are drawn in related groups, computing clustered standard errors

3. Reducing variance by resampling answers and by analyzing next-token probabilities

4. When two models are being compared, conducting statistical inference on the question-level paired differences, rather than the population-level summary statistics

5. Using power analysis to determine whether an eval (or a random subsample) is capable of testing a hypothesis of interest

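Points 1 and 4 above can be sketched in plain Python. The per-question scores below are made up purely for illustration; the point is that pairing the two models' scores question by question typically gives a tighter standard error than comparing the two accuracies independently.

```python
import math

# Hypothetical per-question correctness scores (0/1) for two models
# on the same 10-question eval; data is invented for illustration.
model_a = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
model_b = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]

def mean_and_sem(scores):
    """Point 1: mean and its CLT-based standard error of the mean."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)

acc_a, sem_a = mean_and_sem(model_a)
acc_b, sem_b = mean_and_sem(model_b)

# Point 4: inference on question-level paired differences,
# not on the two population-level accuracies separately.
diffs = [a - b for a, b in zip(model_a, model_b)]
diff_mean, diff_sem = mean_and_sem(diffs)

print(f"model A: {acc_a:.2f} ± {sem_a:.2f}")
print(f"model B: {acc_b:.2f} ± {sem_b:.2f}")
print(f"paired diff: {diff_mean:.2f} ± {diff_sem:.2f}")
```

With these toy numbers, the paired standard error comes out smaller than naively combining the two independent standard errors, because the models agree on most questions.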

Perhaps an unpopular opinion, but I don't think the problem with Large Language Model evaluations is the lack of error bars.

22.11.2024 14:25 — 👍 111    🔁 5    💬 9    📌 2

let me say it once more: "the gap between OAI/Anthropic/Meta/etc. and a large group of companies all over the world you've never cared to know of, in terms of LM pre-training? tiny"

22.11.2024 15:29 — 👍 77    🔁 8    💬 12    📌 1

👋

22.11.2024 19:02 — 👍 1    🔁 0    💬 0    📌 0

The return of the Autoregressive Image Model: AIMv2 now going multimodal.
Excellent work by @alaaelnouby.bsky.social & team with code and checkpoints already up:

arxiv.org/abs/2411.14402

22.11.2024 09:44 — 👍 46    🔁 8    💬 1    📌 0
Extending Video Masked Autoencoders to 128 frames Video understanding has witnessed significant progress with recent video foundation models demonstrating strong performance owing to self-supervised pre-training objectives; Masked Autoencoders (MAE) ...

Interesting paper on arxiv this morning: arxiv.org/abs/2411.13683
It's a video masked autoencoder that learns which tokens to mask, so it processes fewer of them and scales to longer videos. It's a #NeurIPS2024 paper, apparently.
I wonder if there could be such a strategy in the purely generative setup.

22.11.2024 07:56 — 👍 47    🔁 4    💬 0    📌 0

Very true, completely forgot about this. However, I don't believe this model is a true reflection of what VLMs trained from scratch are capable of… or maybe my hypothesis is wrong. 🤷‍♀️

22.11.2024 07:22 — 👍 0    🔁 0    💬 0    📌 0

💯! I haven't seen a single VLM where everything is trained from scratch.

22.11.2024 06:01 — 👍 0    🔁 0    💬 3    📌 0

I have the same thing on, and it's giving me follow notifications but not comments (which is very stupid! 🥲)

22.11.2024 05:58 — 👍 1    🔁 0    💬 0    📌 0

I’m not getting notifications for comments here, anyone facing the same issue?

22.11.2024 03:19 — 👍 0    🔁 0    💬 1    📌 0
What softwares do I actually use on my Mac as a software enthusiast? β€’ Mimansa Jaiswal With several years of using a Mac, it took me time to settle down on a set of apps and softwares that I can heartily recommend. I try almost 30 a month, but end up using around 20 in total for everyth...

You might enjoy this list I have:

22.11.2024 01:06 — 👍 15    🔁 3    💬 3    📌 0

@kampta.bsky.social is a relevant add.

21.11.2024 21:34 — 👍 0    🔁 0    💬 0    📌 0
GitHub - kuleshov-group/awesome-discrete-diffusion-models: A curated list for awesome discrete diffusion models resources. A curated list for awesome discrete diffusion models resources. - kuleshov-group/awesome-discrete-diffusion-models

Discrete diffusion has become a very hot topic again this year. Dozens of interesting ICLR submissions and some exciting attempts at scaling. Here's a bibliography on the topic from the Kuleshov group (my open office neighbors).

github.com/kuleshov-gro...

21.11.2024 18:39 — 👍 76    🔁 10    💬 1    📌 0

Of course! :)

21.11.2024 20:25 — 👍 1    🔁 0    💬 0    📌 0

Added you.

21.11.2024 18:14 — 👍 0    🔁 0    💬 0    📌 0

Added you!

21.11.2024 18:14 — 👍 0    🔁 0    💬 0    📌 0

Added!

21.11.2024 18:13 — 👍 1    🔁 0    💬 1    📌 0

Done!

21.11.2024 18:13 — 👍 1    🔁 0    💬 0    📌 0

Added!

21.11.2024 18:13 — 👍 1    🔁 0    💬 1    📌 0

I only got to know today that this awesome diffusion starter pack exists! I'll try to fill up my generative models pack with some complementary folks. :)

21.11.2024 18:10 — 👍 6    🔁 0    💬 0    📌 0

Can people create accounts here without an invite now? 🤔

21.11.2024 07:56 — 👍 5    🔁 0    💬 5    📌 0
