(@riccardofosco) — Bluesky Profile

1 year ago

Yet another pre-Christmas release!! 🎅🎄
Here is 𝐒𝐭𝐚𝐛𝐥𝐞-𝐕𝟐𝐀, which generates sound effects from silent video frames that are semantically and temporally aligned with the video.
🎶🥁🎛️

Huge thanks to @riccardofosco.bsky.social Christian Marinoni and all co-authors 🤟


1 year ago

Super interesting work on #GenAI #Video2Audio with impressive results from my friends @riccardofosco.bsky.social and Christian Marinoni, together with @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua Reiss and @dacom.bsky.social!

👇 Go check it out!


1 year ago

Great work with Christian Marinoni, @emilianpos.bsky.social, @mcomunita.bsky.social, Luca Cosmo, Joshua D. Reiss and @dacom.bsky.social


1 year ago

This project explores how to generate realistic sound effects for a silent video. Our model combines:
🔹 Video-based RMS envelope prediction, and
🔹 Audio synthesis with Stable Audio and ControlNet, enabling high-quality sound design synchronized to the visual input.
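
For readers unfamiliar with the term, an RMS envelope is a frame-by-frame loudness curve. The sketch below computes one from raw audio samples in plain Python. It is only an illustration of the concept, not the project's code: the function name `rms_envelope`, the frame and hop sizes, and the toy signal are all assumptions, and the model described above predicts the envelope from video rather than measuring it from audio.

```python
import math


def rms_envelope(samples, frame_size=512, hop_size=128):
    """Return one RMS value per frame of a mono signal (a list of floats).

    frame_size and hop_size are illustrative defaults, not values
    taken from the Stable-V2A paper.
    """
    if frame_size <= 0 or hop_size <= 0:
        raise ValueError("frame_size and hop_size must be positive")
    envelope = []
    last_start = max(len(samples) - frame_size + 1, 1)
    for start in range(0, last_start, hop_size):
        frame = samples[start:start + frame_size]
        if not frame:
            break  # empty input: nothing to measure
        mean_square = sum(x * x for x in frame) / len(frame)
        envelope.append(math.sqrt(mean_square))
    return envelope


# Toy signal (hypothetical): half a second of silence, then a 440 Hz tone.
sr = 16000
signal = [0.0] * 8000 + [
    0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(8000)
]
env = rms_envelope(signal)
print(len(env), round(env[0], 3), round(env[-1], 3))
```

The envelope stays at zero during the silence and rises once the tone starts, which is the kind of onset and intensity information a temporal control signal can carry.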


1 year ago

Stable-V2A: Synchronized Sound Effects Synthesis
Stable-V2A is a two-stage model for synthesizing synchronized sound effects with support for temporal and semantic controls.

🌟 Excited to Share Our Latest Work! 🎥🎶

Here we present Stable-V2A: Synthesis of Synchronized Sound Effects with Temporal and Semantic Controls

arXiv: arxiv.org/abs/2412.15023
Video presentation and results: ispamm.github.io/Stable-V2A
