Yiming Yang, Guangyong Wang, Haixin Guan, Yanhua Long: Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Sc... https://arxiv.org/abs/2602.15519 https://arxiv.org/pdf/2602.15519 https://arxiv.org/html/2602.15519
18.02.2026 06:35
Takao Kawamura, Daisuke Niizumi, Nobutaka Ono: What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model https://arxiv.org/abs/2602.15307 https://arxiv.org/pdf/2602.15307 https://arxiv.org/html/2602.15307
18.02.2026 06:35
Sonal Kumar, Prem Seetharaman, Ke Chen, Oriol Nieto, Jiaqi Su, Zhepei Wang, Rithesh Kumar, Dinesh Manocha, Nicholas J. Bryan, Zeyu Jin, Justin Salamon: TAC: Timestamped Audio Captioning https://arxiv.org/abs/2602.15766 https://arxiv.org/pdf/2602.15766 https://arxiv.org/html/2602.15766
18.02.2026 06:34
Jonah Casebeer, Ge Zhu, Zhepei Wang, Nicholas J. Bryan: A Generative-First Neural Audio Autoencoder https://arxiv.org/abs/2602.15749 https://arxiv.org/pdf/2602.15749 https://arxiv.org/html/2602.15749
18.02.2026 06:34
Qiangong Zhou, Nagasaka Tomohiro: UniTAF: A Modular Framework for Joint Text-to-Speech and Audio-to-Face Modeling https://arxiv.org/abs/2602.15651 https://arxiv.org/pdf/2602.15651 https://arxiv.org/html/2602.15651
18.02.2026 06:34
Samir Sadok, Laurent Girin, Xavier Alameda-Pineda: The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs https://arxiv.org/abs/2602.15491 https://arxiv.org/pdf/2602.15491 https://arxiv.org/html/2602.15491
18.02.2026 06:34
Zineb Lahrichi, Gaëtan Hadjeres, Gaël Richard, Geoffroy Peeters: S-PRESSO: Ultra Low Bitrate Sound Effect Compression With Diffusion Autoencoders And Offline Quantization https://arxiv.org/abs/2602.15082 https://arxiv.org/pdf/2602.15082 https://arxiv.org/html/2602.15082
18.02.2026 06:34
Wanyu Zang, Yang Yu, Meng Yu: Structure-Aware Piano Accompaniment via Style Planning and Dataset-Aligned Pattern Retrieval https://arxiv.org/abs/2602.15074 https://arxiv.org/pdf/2602.15074 https://arxiv.org/html/2602.15074
18.02.2026 06:34
[2026-02-18 Wed (UTC), 6 new articles found for cs.SD Sound]
18.02.2026 06:34
Yacouba Kaloga, Marina Laganaro, Ina Kodrasi: CLAP-Based Automatic Word Naming Recognition in Post-Stroke Aphasia https://arxiv.org/abs/2602.14584 https://arxiv.org/pdf/2602.14584 https://arxiv.org/html/2602.14584
17.02.2026 06:35
Sandy H. S. Herho, Rusmawan Suwarman, Nurjanna J. Trilaksono, Iwan P. Anwar, Faiz R. Fajary: Preliminary sonification of ENSO using traditional Javanese gamelan scales https://arxiv.org/abs/2602.14560 https://arxiv.org/pdf/2602.14560 https://arxiv.org/html/2602.14560
17.02.2026 06:49
Jandad Jahani, Mursal Dawodi, Jawid Ahmad Baktash: From Scarcity to Scale: A Release-Level Analysis of the Pashto Common Voice Dataset https://arxiv.org/abs/2602.14062 https://arxiv.org/pdf/2602.14062 https://arxiv.org/html/2602.14062
17.02.2026 06:30
Ligong Lei, Wenwen Lu, Xudong Pang, Zaokere Kadeer, Aishan Wumaier: Multimodal Consistency-Guided Reference-Free Data Selection for ASR Accent Adaptation https://arxiv.org/abs/2602.13263 https://arxiv.org/pdf/2602.13263 https://arxiv.org/html/2602.13263
17.02.2026 06:29
Parth Khadse, Sunil Kumar Kopparapu: Probing Human Articulatory Constraints in End-to-End TTS with Reverse and Mismatched Speech-Text Directions https://arxiv.org/abs/2602.14664 https://arxiv.org/pdf/2602.14664 https://arxiv.org/html/2602.14664
17.02.2026 06:34
H. M. Shadman Tabib, et al.: Bengali-Loop: Community Benchmarks for Long-Form Bangla ASR and Speaker Diarization https://arxiv.org/abs/2602.14291 https://arxiv.org/pdf/2602.14291 https://arxiv.org/html/2602.14291
17.02.2026 06:34
Ma, Xu, Ma, Yang, Li, Kim, Xu, Li, Busso, Yu, Chng, Chen: The Interspeech 2026 Audio Reasoning Challenge: Evaluating Reasoning Process Quality for Audio Reasoning Models and Agents https://arxiv.org/abs/2602.14224 https://arxiv.org/pdf/2602.14224 https://arxiv.org/html/2602.14224
17.02.2026 06:34
Keinichi Fujita, Yusuke Ijima: Investigation for Relative Voice Impression Estimation https://arxiv.org/abs/2602.14172 https://arxiv.org/pdf/2602.14172 https://arxiv.org/html/2602.14172
17.02.2026 06:34
Reda Bensaid, Amine Ouasfi, Yassir Bendou, Ilyass Moummad, Vincent Gripon, François Leduc-Primeau, Adnane Boukhayma: MUKA: Multi Kernel Audio Adaptation Of Audio-Language Models https://arxiv.org/abs/2602.14127 https://arxiv.org/pdf/2602.14127 https://arxiv.org/html/2602.14127
17.02.2026 06:34
Zhang, Lei, Hu, He, Deng, Luo, Zhu, Feng, Liu, He, Sun, Wu, Wang: Eureka-Audio: Triggering Audio Intelligence in Compact Language Models https://arxiv.org/abs/2602.13954 https://arxiv.org/pdf/2602.13954 https://arxiv.org/html/2602.13954
17.02.2026 06:34
Aju Ani Justus, Ruchit Agrawal, Sudarsana Reddy Kadiri, Shrikanth Narayanan: voice2mode: Phonation Mode Classification in Singing using Self-Supervised Speech Models https://arxiv.org/abs/2602.13928 https://arxiv.org/pdf/2602.13928 https://arxiv.org/html/2602.13928
17.02.2026 06:34
Shen, Jayashankar, Hanna, Kanda, Wang, Žmolíková, Xie, Moritz, Xu, Gaur, Wornell, He, Wu: GSRM: Generative Speech Reward Model for Speech RLHF https://arxiv.org/abs/2602.13891 https://arxiv.org/pdf/2602.13891 https://arxiv.org/html/2602.13891
17.02.2026 06:34
Sripathi Sridhar, Prem Seetharaman, Oriol Nieto, Mark Cartwright, Justin Salamon: Audiocards: Structured Metadata Improves Audio Language Models For Sound Design https://arxiv.org/abs/2602.13835 https://arxiv.org/pdf/2602.13835 https://arxiv.org/html/2602.13835
17.02.2026 06:34
Minhui Lu, Joshua D. Reiss: Learning Vocal-Tract Area and Radiation with a Physics-Informed Webster Model https://arxiv.org/abs/2602.13834 https://arxiv.org/pdf/2602.13834 https://arxiv.org/html/2602.13834
17.02.2026 06:34
Picinali, Baumgartner, Gaveau, Greco, Liebe, Oomen, Braun: Enhancing spatial hearing with cochlear implants: exploring the role of AI, multimodal interaction and perceptual training https://arxiv.org/abs/2602.13787 https://arxiv.org/pdf/2602.13787 https://arxiv.org/html/2602.13787
17.02.2026 06:34
Siqian Tong, Xuan Li, Yiwei Wang, Baolong Bi, Yujun Cai, Shenghua Liu, Yuchen He, Chengpeng Hao: AuTAgent: A Reinforcement Learning Framework for Tool-Augmented Audio Reasoning https://arxiv.org/abs/2602.13685 https://arxiv.org/pdf/2602.13685 https://arxiv.org/html/2602.13685
17.02.2026 06:34
Zhe Ye, Xiangui Kang, Jiayi He, Chengxin Chen, Wei Zhu, Kai Wu, Yin Yang, Jiwu Huang: BreathNet: Generalizable Audio Deepfake Detection via Breath-Cue-Guided Feature Refinement https://arxiv.org/abs/2602.13596 https://arxiv.org/pdf/2602.13596 https://arxiv.org/html/2602.13596
17.02.2026 06:34
Xu Zhang, Longbing Cao, Runze Yang, Zhangkai Wu: Learning Physiology-Informed Vocal Spectrotemporal Representations for Speech Emotion Recognition https://arxiv.org/abs/2602.13259 https://arxiv.org/pdf/2602.13259 https://arxiv.org/html/2602.13259
17.02.2026 06:34
[2026-02-17 Tue (UTC), 14 new articles found for cs.SD Sound]
17.02.2026 06:34
Giovanni Bologni, Nicolás Arrieta Larraza, Richard Heusdens, Richard C. Hendriks: A two-step approach for speech enhancement in low-SNR scenarios using cyclostationary beamforming and DNNs https://arxiv.org/abs/2602.12986 https://arxiv.org/pdf/2602.12986 https://arxiv.org/html/2602.12986
16.02.2026 06:35
Louise Zhuang, Samuel Beuret, Ben Frey, Saachi Munot, Jeremy J. Dahl: A Wavefield Correlation Approach to Improve Sound Speed Estimation in Ultrasound Autofocusing https://arxiv.org/abs/2602.12805 https://arxiv.org/pdf/2602.12805 https://arxiv.org/html/2602.12805
16.02.2026 06:48