@naoyukikandaslp - Bluesky Profile

Latest posts by naoyukikandaslp.bsky.social on Bluesky

I was just notified that our E2 TTS paper received the Best Paper Award at IEEE #SLT2024! Many thanks to all the remarkable collaborators who made this happen!

Paper: arxiv.org/abs/2406.18009
Demo: aka.ms/e2tts

05.12.2024 03:38 — 👍 5 🔁 2 💬 0 📌 0

Ah, no, TS3-Codec was trained with 10-second audio segments, while BigCodec-S was trained with 2.5-second audio segments (Section 4.5). This was a somewhat tricky (and perhaps debatable) part of the configuration, and we did our best to tune the hyperparameters within the constraints of GPU memory.

03.12.2024 06:18 — 👍 1 🔁 0 💬 0 📌 0

Thanks! To the extent that we checked, yes. The important point is limiting the attention window.

03.12.2024 06:04 — 👍 0 🔁 0 💬 1 📌 0

TS3-Codec: yet another audio codec from my former team—simple, fast, and high-quality.

Simple—just a stack of Transformer and linear layers; no convolutions.

Faster and better—superior audio reconstruction quality with fewer MACs compared to strong convolution-based baselines.

03.12.2024 03:53 — 👍 0 🔁 0 💬 1 📌 0

Research Scientist Intern, AI Research - Speech & Audio (PhD)
Meta's mission is to build the future of human connection and the technology that makes it possible.

Our GenAI-Speech team at Meta is hiring RS interns for summer 2025 to work on speech, LLMs, dialog generation, and other exciting stuff! Check out the job posting here: www.metacareers.com/jobs/3841154...

22.11.2024 03:41 — 👍 10 🔁 1 💬 0 📌 0

@naoyukikandaslp is following 20 prominent accounts

Hideto Kazawa
@hidetokazawa

Curious about the things that cannot be spoken of / Google DeepMind Research Engineer

Daisuke Okanohara / 岡野原大輔
@hillbig

Co-founder and CER of Preferred Networks (PFN). CEO of PFCC. Interested in deep learning and AI, science, and business.

Samuele Cornell
@popcornell

WAVLab@CMU
@wavlab

Shinji Watanabe's Audio and Voice Lab | WAVLab @LTIatCMU @SCSatCMU | Speech Recognition, Speech Enhancement, Spoken Language Understanding, and more.

Shinji Watanabe
@shinjiw

I'm working at CMU (2021-). I was working at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and Audio Processing is my main research topic.

William Chen
@wanchichen

PhD Student @ltiatcmu.bsky.social. I work in speech processing. wanchichen.github.io

Yuya Unno
@unnonouno

@yoshiki-masuyama

Greta Tuckute
@gretatuckute

Studying language in biological brains and artificial ones at the Kempner Institute at Harvard University. www.tuckute.com

Catherine Breslin
@catherinebreslin

AI scientist & consultant :: prev Amazon Alexa, Toshiba, Cam Uni :: voice & language tech :: powered by coffee :: photographer :: Cambridge UK https://www.catherinebreslin.co.uk

Ramon Astudillo
@ramon-astudillo

Principal Research Scientist at IBM Research AI in New York. Speech, Formal/Natural Language Processing. Currently LLM post-training, structured SDG and RL. Opinions my own and non-stationary. ramon.astudillo.com

Odette Scharenborg
@odettes

Full professor of inclusive speech communication at TU Delft, The Netherlands. Former president of the International Speech Communication Association (ISCA). General Chair of @interspeech.bsky.social Rotterdam, 2025. Mother of 3🌈

Eric Fosler-Lussier
@ericfos

Professor/Admin @ Ohio State. All opinions expressed on this channel are my personal opinions and do not represent those of my employer.

Jonathan Le Roux
@jonathanleroux

Speech and audio research scientist @MERL. saneworkshop.org co-founder. IguanaTex developer. 🌐 jonathanleroux.org 🐙 github.com/Jonathan-LeRoux/ 🎓 scholar.google.com/citations?user=aUpxty8AAAAJ&hl=en

Jesse Engel
@jesseengel

Guitarist, Researcher Google DeepMind. Opinions are my own.

@markbcartwright

Prem Seetharaman
@pseeth

Researcher in computer audition, machine learning, and HCI. Sr. Research Scientist, @AdobeResearch. Previously @DescriptApp, @Northwestern. https://pseeth.github.io/

Christian Steinmetz
@csteinmetz1

AI for Music • Research Scientist @ Suno

Hervé Bredin (a.k.a. the pyannote guy)
@hbredin

I created the pyannote open-source toolkit. Co-founder and CSO at pyannoteAI.

Keisuke Imoto
@keisukeimoto