Thanks for this! Would love to be added
26.12.2024 18:10 — 👍 1 🔁 0 💬 0 📌 0
@ritheshkumar.bsky.social
Researcher in audio and speech generative models (SampleRNN, MelGAN, DAC, …) Research Scientist @AdobeResearch. Ex @DescriptApp, @Mila_Quebec https://ritheshkumar.com
The code for Simplified and Generalized Masked Diffusion for Discrete Data (Jiaxin Shi et al) has been released and a lecture by @arnauddoucet.bsky.social on this topic is also available!
🐍 Code: github.com/google-deepm...
📄 Article: arxiv.org/abs/2406.04329
📼 Video: www.youtube.com/watch?v=qj9B...
new paper! 🗣️Sketch2Sound💥
Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals.
paper: arxiv.org/abs/2412.08550
web: hugofloresgarcia.art/sketch2sound
🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️Make a typewriter sound like a piano 🎹
🐱Make a cat meow like a lion roars! 🦁
⏱️Perfectly time existing SFX 💥 to a video.
arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/
I initiated a starter pack for Audio ML. Let me know if you'd like to be added/removed.
go.bsky.app/LGmct4z
Made a feed that tries to index paper threads only: bsky.app/profile/psee.... To get into the feed, make a post with "arxiv.org" in the post somewhere + don't be a bot. My tiny contribution to the recent migration! Built w/ @skyfeed.app. Planning on some paper threads of my own soon...
24.11.2024 04:01 — 👍 7 🔁 2 💬 0 📌 1