@conceptofmind.bsky.social

Awesome to see @nvidia @NVIDIAAI using our research for their open-source long-context models.
13.04.2025 18:19

Talking about DeepSeek and their connection to Tsinghua. Tsinghua and CMU have an older (2017) but still great series on high-performance parallel computing. The playlist can be found here: https://buff.ly/4gktKSd
26.01.2025 19:49

Happy to announce that our paper, Bridging the Data Provenance Gap Across Text, Speech, and Video, was accepted to @iclr_conf. #ICLR2025
22.01.2025 20:01

I have an extremely easy evaluation that all top models currently score 0% on. This is the easiest set of evaluations in our entire suite; AGI would be able to solve the hardest problems effortlessly. Once o3 becomes available in the API, I will put out a public baseline.
21.12.2024 19:31

Does anyone have ideas about why T5 or CLIP is being used for text encoding in diffusion training instead of a much stronger encoder or embedding model?
15.12.2024 20:19
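
For anyone curious what the T5 route being asked about looks like in practice: below is a minimal sketch, assuming the Hugging Face transformers library. The model name (google/flan-t5-base), the encode_prompts helper, and the CLIP-style max_length of 77 are illustrative assumptions, not any particular model's recipe.

```python
# Sketch: producing per-token text embeddings for diffusion conditioning
# with a T5 encoder. Model name and sequence length are placeholders.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-base").eval()

@torch.no_grad()
def encode_prompts(prompts: list[str], max_length: int = 77) -> torch.Tensor:
    """Tokenize prompts and return embeddings of shape [batch, seq, d_model],
    the tensor a diffusion U-Net/DiT would consume via cross-attention."""
    batch = tokenizer(
        prompts,
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    out = encoder(input_ids=batch.input_ids, attention_mask=batch.attention_mask)
    # Zero embeddings at padded positions so downstream attention masks agree.
    return out.last_hidden_state * batch.attention_mask.unsqueeze(-1)

emb = encode_prompts(["a photo of an astronaut riding a horse"])
print(emb.shape)  # e.g. torch.Size([1, 77, 768]) for flan-t5-base
```

One commonly cited reason swaps are not free: the denoiser's cross-attention layers are trained against a specific encoder width and embedding distribution, so a stronger encoder means a new projection and a retrained conditioning pathway.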

There is absolutely no shortage of pre-training data.
15.12.2024 00:24

AWS ParallelCluster is honestly such an incredibly useful tool for large-scale distributed training:
14.12.2024 04:25
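
A minimal sketch of the workflow the post alludes to, assuming ParallelCluster v3's pcluster CLI (pip install aws-parallelcluster) and AWS credentials are already set up. The cluster name, subnet ID, key pair, and instance types are placeholders, and the embedded config is a stripped-down example rather than a production setup.

```python
# Sketch: create a Slurm-based GPU cluster with AWS ParallelCluster v3.
# All names/IDs below are placeholders.
import subprocess
from pathlib import Path

# Minimal ParallelCluster v3 config: one Slurm queue that scales GPU
# compute nodes from 0 up to 4 on demand.
CONFIG = """\
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    SubnetId: subnet-0123456789abcdef0   # placeholder
  Ssh:
    KeyName: my-keypair                  # placeholder
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: train
      ComputeResources:
        - Name: gpu
          InstanceType: p4d.24xlarge
          MinCount: 0
          MaxCount: 4
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0     # placeholder
"""

config_path = Path("cluster-config.yaml")
config_path.write_text(CONFIG)

# `pcluster create-cluster` provisions the head node, Slurm, and the
# auto-scaling compute fleet described by the config above.
subprocess.run(
    [
        "pcluster", "create-cluster",
        "--cluster-name", "train-cluster",
        "--cluster-configuration", str(config_path),
    ],
    check=True,
)
```

Once the cluster is up, jobs are submitted from the head node with the usual sbatch/srun, and with MinCount: 0 the GPU fleet scales back down when idle.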

Hugging Face repo: huggingface.co/CompVis/clea...
05.12.2024 17:26

GitHub repo: github.com/CompVis/clea...
05.12.2024 17:26

Be sure to check out this awesome work by @stefanabaumann.bsky.social, @rmsnorm.bsky.social, and @koljabauer.bsky.social.
05.12.2024 17:26