@riccardofosco.bsky.social

Sound effects, audio & video | PhD at @ISPAMM, @Sapienza | Former @C4DM, @QMUL

45 Followers  |  130 Following  |  3 Posts  |  Joined: 20.12.2024

Latest posts by riccardofosco.bsky.social on Bluesky

Yet another pre-Christmas release!! 🎅🎄
Here is Stable-V2A, which generates sound effects from silent video frames with semantic and temporal alignment to the visuals.
🎶🥁🎛️

Huge thanks to @riccardofosco.bsky.social, Christian Marinoni and all co-authors 🤟

23.12.2024 14:01  |  👍 3    🔁 1    💬 0    📌 0

Super interesting work on #GenAI #Video2Audio with impressive results from my friends @riccardofosco.bsky.social and Christian Marinoni, together with @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua Reiss and @dacom.bsky.social!

👇 Go check it out!

20.12.2024 18:37  |  👍 4    🔁 1    💬 0    📌 0

Great work with Christian Marinoni, @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua D. Reiss and @dacom.bsky.social

20.12.2024 18:20  |  👍 2    🔁 0    💬 0    📌 0

This project explores how to generate realistic sound effects for a silent video. Our model combines:
🔹 Video-based RMS envelope prediction, and
🔹 Audio synthesis with Stable Audio and ControlNet, enabling high-quality sound design synchronized to the visual input.
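
For readers unfamiliar with the term, an RMS envelope is a frame-by-frame loudness curve. The sketch below is not the authors' code: it only illustrates how such an envelope can be computed from an audio signal, whereas the model described in the post predicts it from video. The function name `rms_envelope` and the parameters `frame_length` and `hop_length` are hypothetical choices for this illustration.

```python
import math


def rms_envelope(samples, frame_length=1024, hop_length=512):
    """Return one RMS value per frame of a mono signal (illustrative only)."""
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    envelope = []
    # Slide a window over the signal; signals shorter than one frame
    # are treated as a single, shorter frame.
    last_start = max(len(samples) - frame_length, 0)
    for start in range(0, last_start + 1, hop_length):
        frame = samples[start:start + frame_length]
        if not frame:
            break  # empty input: nothing to measure
        mean_square = sum(x * x for x in frame) / len(frame)
        envelope.append(math.sqrt(mean_square))
    return envelope


# Toy signal: four silent samples followed by four full-scale samples.
signal = [0.0] * 4 + [1.0] * 4
print(rms_envelope(signal, frame_length=4, hop_length=4))
```

A curve like this marks when sound events start and how loud they are, which is presumably the kind of temporal information that lets a synthesis stage stay in sync with the on-screen action.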

20.12.2024 18:19  |  👍 2    🔁 0    💬 1    📌 0
Stable-V2A: Synchronized Sound Effects Synthesis
Stable-V2A is a two-stage model for synthesizing synchronized sound effects with support for temporal and semantic controls.

🌟 Excited to Share Our Latest Work! 🎥🎶

Here we present Stable-V2A: Synthesis of Synchronized Sound Effects with Temporal and Semantic Controls

arXiv: arxiv.org/abs/2412.15023
Video presentation and results: ispamm.github.io/Stable-V2A

20.12.2024 18:18  |  👍 5    🔁 2    💬 1    📌 2

@riccardofosco is following 19 prominent accounts