Ziyang Chen @czyang - Bluesky Profile

Ziyang Chen

@czyang.bsky.social

Ph.D. Student @ UMich EECS. Multimodal learning, audio-visual learning and computer vision. Prev research Intern @Adobe and @Meta https://ificl.github.io/

60 Followers | 22 Following | 4 Posts | Joined: 26.11.2024

Latest posts by czyang.bsky.social on Bluesky

This work was done during my internship at Adobe Research. Big thanks to all my collaborators @pseeth.bsky.social, Bryan Russell, @urinieto.bsky.social, David Bourgin, @andrewowens.bsky.social, and @justinsalamon.bsky.social!

27.11.2024 02:58 — 👍 5 🔁 0 💬 0 📌 0

We jointly train our model on high-quality text-audio pairs as well as videos, enabling our model to generate full-bandwidth professional audio with fine-grained creative control and synchronization.

27.11.2024 02:58 — 👍 3 🔁 0 💬 1 📌 0

MultiFoley is a unified framework for video-guided audio generation leveraging text, audio, and video conditioning within a single model. As a result, we can do text-guided foley, audio-guided foley (e.g. sync your favorite sample with the video), and foley audio extension.

27.11.2024 02:58 — 👍 2 🔁 0 💬 1 📌 0

🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️Make a typewriter sound like a piano 🎹
🐱Make a cat meow like a lion's roar! 🦁
⏱️Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/

27.11.2024 02:58 — 👍 41 🔁 12 💬 1 📌 6

@czyang is following 20 prominent accounts

Dima Damen
@dimadamen

Professor of Computer Vision, @BristolUni. Senior Research Scientist @GoogleDeepMind - passionate about the temporal stream in our lives. http://dimadamen.github.io

Melissa Franch, PhD
@mfranch

Postdoc in the Hayden lab at Baylor College of Medicine studying neural computations of natural language & communication in humans. Sister to someone with autism. she/her. melissafranch.com

Jordi Pons
@jordiponsdotme

Music, audio, and deep learning research at Stability AI ~ Building bridges between audio signal processing wisdom and deep learning. artintech.substack.com www.jordipons.me

Nicholas J Bryan
@nicholasjbryan

@xzhai

Lucas Beyer (bl16)
@giffmana.ai

Researcher (OpenAI. Ex: DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://admonymous.co/giffmana 📍 Zürich, Suisse 🔗 http://lucasb.eyer.be

@dangengdg

Aleksander Hołyński
@holynski

UC Berkeley + Google DeepMind holynski.org

Sander Dieleman
@sedielem

Blog: https://sander.ai/ 🐦: https://x.com/sedielem Research Scientist at Google DeepMind (WaveNet, Imagen 3, Veo, ...). I tweet about deep learning (research + software), music, generative models (personal account).

Hilde Kuehne
@hildekuehne

Professor for CS at the Tuebingen AI Center and affiliated Professor at MIT-IBM Watson AI lab - Multimodal learning and video understanding - GC for ICCV 2025 - https://hildekuehne.github.io/

Faro Stöter
@faroit

AudioML research scientist at https://audioshake.ai, before: post-doc @inria@social.numerique.gouv.fr, Editor at https://bsky.app/profile/joss-openjournals.bsky.social All in 17.68% of grey, located in Frankfurt (Germany)

Hao-Wen (Herman) Dong 董皓文
@hermandong

Assistant Professor at University of Michigan | PhD from UC San Diego | Human-Centered Generative AI for Content Creation

Joan Serrà
@serrjoa

Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine. https://serrjoa.github.io/

Jia-Bin Huang
@jbhuang0604

Associate Professor at UMD CS. YouTube: https://youtube.com/@jbhuang0604 Interested in how computers can learn and see.

@ritheshkumar

Researcher in audio and speech generative models (SampleRNN, MelGAN, DAC, …) Research Scientist @AdobeResearch. Ex @DescriptApp, @Mila_Quebec https://ritheshkumar.com

Gautham Mysore
@gauthamjmysore

Head of Audio and Video AI Research at Adobe Research

Oriol (Uri) Nieto
@urinieto

Researcher at Adobe Research. Machine learning on audio. Screamer. Oaklander born in Barcelona. Titan. He/they 🌈 www.urinieto.com

Prem Seetharaman
@pseeth

Researcher in computer audition, machine learning, and HCI. Sr. Research Scientist, @AdobeResearch. Previously @DescriptApp, @Northwestern. https://pseeth.github.io/

Andrew Owens
@andrewowens

Associate professor @ Cornell Tech

Jon Barron
@jonbarron

Principal research scientist at Google DeepMind. Synthesized views are my own. 📍SF Bay Area 🔗 http://jonbarron.info This feed is a mostly-incomplete mirror of https://x.com/jon_barron, I recommend you just follow me there.