ritheshkumar - Bluesky Statics

@ritheshkumar.bsky.social

Researcher in audio and speech generative models (SampleRNN, MelGAN, DAC, …). Research Scientist @AdobeResearch. Ex @DescriptApp, @Mila_Quebec. https://ritheshkumar.com

94 Followers | 183 Following | 1 Posts | Joined: 20.11.2024

Posts by (@ritheshkumar.bsky.social)

Thanks for this! Would love to be added

26.12.2024 18:10 — 👍 1 🔁 0 💬 0 📌 0

The code for "Simplified and Generalized Masked Diffusion for Discrete Data" (Jiaxin Shi et al.) has been released, and a lecture by @arnauddoucet.bsky.social on this topic is also available!

🐍 Code: github.com/google-deepm...
📄 Article: arxiv.org/abs/2406.04329
📼 Video: www.youtube.com/watch?v=qj9B...

14.12.2024 12:47 — 👍 26 🔁 6 💬 0 📌 0

new paper! 🗣️Sketch2Sound💥

Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals.

paper: arxiv.org/abs/2412.08550
web: hugofloresgarcia.art/sketch2sound

12.12.2024 14:43 — 👍 23 🔁 9 💬 2 📌 5

🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️Make a typewriter sound like a piano 🎹
🐱Make a cat meow like a lion roars! 🦁
⏱️Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/

27.11.2024 02:58 — 👍 42 🔁 12 💬 2 📌 6

I initiated a starter pack for Audio ML. Let me know if you'd like to be added/removed.
go.bsky.app/LGmct4z

18.11.2024 04:46 — 👍 68 🔁 22 💬 46 📌 1

Made a feed that tries to index paper threads only: bsky.app/profile/psee.... To get into the feed, make a post with "arxiv.org" in the post somewhere + don't be a bot. My tiny contribution to the recent migration! Built w/ @skyfeed.app. Planning on some paper threads of my own soon...

24.11.2024 04:01 — 👍 7 🔁 2 💬 0 📌 1