@conceptofmind.bsky.social

Awesome to see @nvidia @NVIDIAAI using our research for their open-source long-context models.
13.04.2025 18:19

Talking about DeepSeek and their connection to Tsinghua. Tsinghua and CMU have an older (2017) but still great series on high-performance parallel computing. The playlist can be found here: https://buff.ly/4gktKSd
26.01.2025 19:49

Happy to announce that our paper, Bridging the Data Provenance Gap Across Text, Speech, and Video, was accepted to @iclr_conf. #ICLR2025
22.01.2025 20:01

I have an extremely easy evaluation that all top models currently score 0% on. This is the easiest set of evaluations in our entire suite; AGI would be able to solve the hardest problems effortlessly. Once o3 becomes available in the API, I will put out a public baseline.
21.12.2024 19:31

Does anyone have ideas about why T5 or CLIP is being used for text encoding in diffusion training instead of a much stronger encoder or embedding model?
15.12.2024 20:19
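
For anyone curious what the T5 route being asked about looks like in practice: below is a minimal sketch, assuming the Hugging Face transformers library. The model name (google/flan-t5-base), the encode_prompts helper, and the CLIP-style max_length of 77 are illustrative assumptions, not any particular model's recipe.

```python
# Sketch: producing per-token text embeddings for diffusion conditioning
# with a T5 encoder. Model name and sequence length are placeholders.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-base").eval()

@torch.no_grad()
def encode_prompts(prompts: list[str], max_length: int = 77) -> torch.Tensor:
    """Tokenize prompts and return embeddings of shape [batch, seq, d_model],
    the tensor a diffusion U-Net/DiT would consume via cross-attention."""
    batch = tokenizer(
        prompts,
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    out = encoder(input_ids=batch.input_ids, attention_mask=batch.attention_mask)
    # Zero embeddings at padded positions so downstream attention masks agree.
    return out.last_hidden_state * batch.attention_mask.unsqueeze(-1)

emb = encode_prompts(["a photo of an astronaut riding a horse"])
print(emb.shape)  # e.g. torch.Size([1, 77, 768]) for flan-t5-base
```

One commonly cited reason swaps are not free: the denoiser's cross-attention layers are trained against a specific encoder width and embedding distribution, so a stronger encoder means a new projection and a retrained conditioning pathway.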

There is absolutely no shortage of pre-training data.
15.12.2024 00:24

AWS ParallelCluster is honestly such an incredibly useful tool for large-scale distributed training:
14.12.2024 04:25
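
A minimal sketch of the workflow the post alludes to, assuming ParallelCluster v3's pcluster CLI (pip install aws-parallelcluster) and AWS credentials are already set up. The cluster name, subnet ID, key pair, and instance types are placeholders, and the embedded config is a stripped-down example rather than a production setup.

```python
# Sketch: create a Slurm-based GPU cluster with AWS ParallelCluster v3.
# All names/IDs below are placeholders.
import subprocess
from pathlib import Path

# Minimal ParallelCluster v3 config: one Slurm queue that scales GPU
# compute nodes from 0 up to 4 on demand.
CONFIG = """\
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    SubnetId: subnet-0123456789abcdef0   # placeholder
  Ssh:
    KeyName: my-keypair                  # placeholder
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: train
      ComputeResources:
        - Name: gpu
          InstanceType: p4d.24xlarge
          MinCount: 0
          MaxCount: 4
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0     # placeholder
"""

config_path = Path("cluster-config.yaml")
config_path.write_text(CONFIG)

# `pcluster create-cluster` provisions the head node, Slurm, and the
# auto-scaling compute fleet described by the config above.
subprocess.run(
    [
        "pcluster", "create-cluster",
        "--cluster-name", "train-cluster",
        "--cluster-configuration", str(config_path),
    ],
    check=True,
)
```

Once the cluster is up, jobs are submitted from the head node with the usual sbatch/srun, and with MinCount: 0 the GPU fleet scales back down when idle.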

Hugging Face repo: huggingface.co/CompVis/clea...
05.12.2024 17:26

GitHub repo: github.com/CompVis/clea...
05.12.2024 17:26

Be sure to check out this awesome work by @stefanabaumann.bsky.social, @rmsnorm.bsky.social, and @koljabauer.bsky.social.
05.12.2024 17:26