Glad to have presented the FakeMusicCaps dataset and the risks and issues behind deepfake music detection!
Link to the paper --> arxiv.org/pdf/2409.10684
Link to the dataset --> zenodo.org/records/1506...
#AIMusic #GenerativeAI #Deepfake #AI #GenerativeMusic
20.05.2025 14:26
Riccardo Passoni, Francesca Ronchini, Luca Comanducci, Romain Serizel, Fabio Antonacci: Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models https://arxiv.org/abs/2505.07615 https://arxiv.org/pdf/2505.07615 https://arxiv.org/html/2505.07615
13.05.2025 06:01
Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models
Riccardo Passoni, Francesca Ronchini, Luca Comanducci, Romain Serizel, Fabio Antonacci
We analyze the energy consumption of seven text-to-audio diffusion models, identifying the trade-off between audio quality and energy usage at inference time by finding Pareto-optimal solutions.
13.05.2025 10:16
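The Pareto-optimality idea can be illustrated with a tiny sketch. The model names and (energy, quality) numbers below are invented for illustration only; they are not measurements from the paper:

```python
# Hypothetical per-model measurements: (energy per generation, quality score).
# Lower energy is better, higher quality is better.
models = {
    "A": (0.8, 0.90),
    "B": (0.5, 0.85),
    "C": (0.9, 0.70),
    "D": (0.3, 0.60),
    "E": (0.6, 0.88),
}

def pareto_front(points):
    """Return the names of non-dominated models: a model is dominated if
    another uses no more energy AND has no worse quality, with at least
    one strict improvement."""
    front = []
    for name, (e, q) in points.items():
        dominated = any(
            e2 <= e and q2 >= q and (e2 < e or q2 > q)
            for n2, (e2, q2) in points.items()
            if n2 != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)

print(pareto_front(models))  # → ['A', 'B', 'D', 'E']; only C is dominated
```

Here model C is dominated by A (A is both cheaper and higher quality), so every other model represents a valid quality/energy trade-off.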
Towards HRTF Personalization using Denoising Diffusion Models
Juan Camilo Albarracรญn Sรกnchez, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci
DDPMs, conditioned on anthropometric measurements, generated personalized HRIRs, demonstrating feasibility for HRTF personalization and comparable performance to existing methods.
07.01.2025 09:48
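For readers unfamiliar with DDPMs, here is a minimal sketch of one conditioned reverse (denoising) step. The noise predictor is a dummy stand-in, and the anthropometric vector and signal shapes are made up for illustration; this is not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard DDPM noise schedule (toy length T for illustration).
T = 10
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t, c):
    # Stand-in for a trained noise predictor conditioned on the
    # anthropometric vector c; returns zeros here just to run.
    return 0.0 * x_t + 0.0 * c.sum()

def reverse_step(x_t, t, c):
    """Sample x_{t-1} from x_t using the standard DDPM posterior mean."""
    eps = eps_model(x_t, t, c)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:  # no noise is added at the final step
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

c = np.array([0.9, 0.1, 0.5])   # hypothetical anthropometric measurements
x = rng.standard_normal(256)    # start from noise, HRIR-length waveform
for t in reversed(range(T)):
    x = reverse_step(x, t, c)
```

The key point is that the conditioning vector `c` is fed to the noise predictor at every step, steering the denoising trajectory toward a personalized HRIR.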
thanks to all co-authors!!!
21.12.2024 12:54
In Towards HRTF Personalization using Denoising Diffusion Models we use #DDPMs to generate personalized head-related transfer functions #HRTFs by conditioning the model on anthropometric measurements and working directly in the raw audio domain.
21.12.2024 12:54
In MambaFoley we combine #DDPMs and the #Mamba selective state-space model to generate Foley sounds directly in the waveform domain, while controlling the timing of the generated sounds.
21.12.2024 12:54
2 papers accepted at #ICASSP25 🇮🇳 hope to see you in India!
- MambaFoley: Foley Sound Generation using Selective State-Space Models (arxiv.org/pdf/2409.09162)
- Towards HRTF Personalization using Denoising Diffusion Models (arxiv soon)
#ICASSP25 #ICASSP #generativeai
21.12.2024 12:52
new paper! 🗣️ Sketch2Sound 🔥
Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals.
paper: arxiv.org/abs/2412.08550
web: hugofloresgarcia.art/sketch2sound
12.12.2024 14:43
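One simple way to get a time-varying control signal from a sonic imitation is a frame-wise RMS loudness envelope. This is only an illustrative stand-in, not Sketch2Sound's actual feature extractor, and the frame/hop sizes are arbitrary:

```python
import numpy as np

def rms_loudness(x, frame=1024, hop=256):
    """Frame-wise RMS envelope of a mono signal: one possible
    time-varying loudness control signal."""
    n = 1 + max(0, len(x) - frame) // hop
    env = np.empty(n)
    for i in range(n):
        seg = x[i * hop : i * hop + frame]
        env[i] = np.sqrt(np.mean(seg ** 2))
    return env

# Toy "imitation": a quiet half followed by a loud half.
sr = 16000
t = np.arange(sr) / sr
x = np.concatenate([0.1 * np.sin(2 * np.pi * 220 * t),
                    0.8 * np.sin(2 * np.pi * 220 * t)])
env = rms_loudness(x)  # envelope rises where the imitation gets louder
```

A generator conditioned on `env` would then follow the imitation's loudness contour over time.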
Challenge tasks for DCASE2025 - DCASE
The DCASE Steering Group has reviewed the task proposals...
The tasks for DCASE challenge 2025 have been announced.
dcase.community/articles/cha...
Stay tuned for more details.
10.12.2024 10:13
While the technology behind these experiments may now be a little outdated, I still believe that the chapter is a nice snapshot of experimental research in networked music performance (NMP) from a few years ago!
03.12.2024 07:48
Here we consider the case of chamber music and experiment with adaptive metronomes, to enhance the musicians' synchrony, and with binaural audio, to make the playing experience more immersive!
03.12.2024 07:48
nice idea! would love to be added if you think it might be useful!
21.11.2024 20:13
PhD, Senior Research Scientist @Sony
Former @Microsoft, @Musixmatch
Working on #DeepLearning, #SignalProcessing, #GenerativeModels.
Deep learning for audio signal processing and acoustics at Bang&Olufsen
francesclluis.com
Research Scientist @SonyAI
PhD from Seoul National University
Previous intern @MERL, @Sony, and @Supertone
The shortest distance between you and science. Brought to you by the science, health and environmental reporting students at NYU Journalism. https://scienceline.org/
A close-knit community, a rich academic heritage, a creative powerhouse, a thought-provoking place.
Assistant Professor of Psychology and Music Technology at NYU. Associate Director of MARL (https://steinhardt.nyu.edu/marl). CLaME affiliate (https://clame.nyu.edu/).
Cognitive neuroscience of music, reward, and language.
https://www.ripolleslab.com/
Blog: https://sander.ai/
🐦: https://x.com/sedielem
Research Scientist at Google DeepMind (WaveNet, Imagen 3, Veo, ...). I tweet about deep learning (research + software), music, generative models (personal account).
Audio Tech | XR | Engineer @ Dolby
The School of Electronic Engineering and Computer Science at Queen Mary University of London. Sharing news, opportunities and events for students, staff, alumni and friends - https://www.qmul.ac.uk/eecs/
Research scientist at Google DeepMind
universal musical approximator. research scientist at gorgle derpmind, magenta team. https://ethman.github.io
Professor in Design Engineering and Music, Imperial College London. Researcher, composer, violist, engineer, instrument designer.
Leader of the Augmented Instruments Laboratory instrumentslab.org.
Co-founder and director of Bela.io.
Prof in computer science at Sorbonne University. Director of the lab "Science and Technology of Music and Sound" at Ircam in Paris.