@riccardofosco.bsky.social

Sound effects, audio & video | PhD at @ISPAMM, @Sapienza | Former @C4DM, @QMUL

46 Followers  |  130 Following  |  3 Posts  |  Joined: 20.12.2024

Posts by @riccardofosco.bsky.social

Yet another pre-Christmas release!! 🎅🎄
Here is Stable-V2A, which generates semantically and temporally aligned sound effects from silent video frames.
🎶🥁🎛️

Huge thanks to @riccardofosco.bsky.social, Christian Marinoni, and all co-authors 🤟

23.12.2024 14:01 | 👍 3    🔁 1    💬 0    📌 0

Super interesting work on #GenAI #Video2Audio, with impressive results, from my friends @riccardofosco.bsky.social and Christian Marinoni, together with @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua Reiss, and @dacom.bsky.social!

👇 Go check it out!

20.12.2024 18:37 | 👍 4    🔁 1    💬 0    📌 0

Great work with Christian Marinoni, @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua D. Reiss, and @dacom.bsky.social

20.12.2024 18:20 | 👍 2    🔁 0    💬 0    📌 0

This project explores how to generate realistic sound effects for a silent video. Our model combines:
🔹 Video-based RMS envelope prediction, and
🔹 Audio synthesis with Stable Audio and ControlNet, enabling high-quality sound design synchronized to the visual input.
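
For readers unfamiliar with the term, an RMS envelope is a coarse loudness-over-time curve: the root-mean-square level of the signal measured in short, overlapping frames. The post describes predicting such an envelope from video. The sketch below only shows how the envelope itself can be computed from an audio waveform. The function name `rms_envelope`, the frame and hop sizes, and the toy signal are illustrative assumptions, not details taken from Stable-V2A.

```python
import math


def rms_envelope(samples, frame_length=1024, hop_length=512):
    """Frame-wise root-mean-square (RMS) level of a mono signal.

    frame_length and hop_length are illustrative defaults, not values
    taken from the Stable-V2A paper.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    envelope = []
    # Slide a window over the signal; frames that would run past the end are dropped.
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        mean_square = sum(x * x for x in frame) / frame_length
        envelope.append(math.sqrt(mean_square))
    return envelope


# Toy input (hypothetical): one second of silence, then one second of a 440 Hz tone.
sample_rate = 16000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
toy_envelope = rms_envelope(silence + tone)
```

On this toy input the envelope is zero over the silent second and settles near 0.35 (the tone's amplitude divided by √2) once the tone starts, which is the kind of onset information a temporal control signal needs to carry.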

20.12.2024 18:19 | 👍 2    🔁 0    💬 1    📌 0
Stable-V2A: Synchronized Sound Effects Synthesis
Stable-V2A is a two-stage model for synthesizing synchronized sound effects with support for temporal and semantic controls.

🌟 Excited to Share Our Latest Work! 🎥🎶

Here we present Stable-V2A: Synthesis of Synchronized Sound Effects with Temporal and Semantic Controls

arXiv: arxiv.org/abs/2412.15023
Video presentation and results: ispamm.github.io/Stable-V2A

20.12.2024 18:18 | 👍 5    🔁 2    💬 1    📌 2