Facial gestures are enacted through a cortical hierarchy of dynamic and stable codes | Science www.science.org/doi/10.1126/...
#neuroskyence
@avsp.bsky.social
The official(ish) account of the Auditory-VIsual Speech Association (AVISA). AV speech references, but mostly what interests me. avisa.loria.fr
Some lips are red
Some eyes are blue
Seeing you speak
means I can identify more of the key words you said in noise
Watching Yourself Talk: Motor Experience Sharpens Sensitivity to Gesture-Speech Asynchrony
Tiziana Vercillo, Judith Holler, Uta Noppeney
www.biorxiv.org/content/10.6...
Citation Classic
"Phonetic and phonological representation of stop consonant voicing"
Patricia Keating (1984)
Citations: 859+
Structured view of [voice] feature to phonetic implement...
https://www.jstor.org/stable/pdf/413642.pdf
#SpeechScience
How does a deep neural network look at lexical stress in English words? pubs.aip.org/asa/jasa/art... CNNs trained to predict stress position from a spectrographic representation of disyllabic words -> 92% accuracy on held-out tests; interpretability analysis -> stressed vowel's 1st & 2nd formants key
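A stress classifier of this kind is typically a small convolutional stack over the spectrogram. Here is a toy numpy sketch of such a forward pass — untrained random weights, layer sizes chosen arbitrarily, nothing taken from the paper itself:

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive single-channel 'valid' 2-D convolution of spectrogram x with kernel k."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def predict_stress(spectrogram, kernel, w, b):
    """Toy forward pass: conv -> ReLU -> pool the two time halves -> sigmoid.
    Returns P(stress on the 2nd syllable) for a disyllabic word."""
    fmap = np.maximum(conv2d_valid(spectrogram, kernel), 0.0)
    half = fmap.shape[1] // 2
    # mean-pool activation in the first vs second half of the time axis
    feats = np.array([fmap[:, :half].mean(), fmap[:, half:].mean()])
    return 1.0 / (1.0 + np.exp(-(feats @ w + b)))

# random "spectrogram" (freq bins x time frames) and untrained weights
rng = np.random.default_rng(0)
spec = rng.random((64, 40))
p = predict_stress(spec, rng.standard_normal((5, 5)), rng.standard_normal(2), 0.0)
```

The formant-sensitivity finding would, in a real model, come from saliency-style analysis of the learned feature maps; this toy version only shows the conv -> ReLU -> pool -> sigmoid shape of the pipeline.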
11.02.2026 22:16
Our latest paper, "Visual language models show widespread visual deficits on neuropsychological tests", is now out in Nature Machine Intelligence: www.nature.com/articles/s42...
Non-paywalled version:
arxiv.org/abs/2504.10786
Tweet thread below from first author @genetang.bsky.social...
X-modal processing of auditory & visual symbol representations in the temporo-parietal cortex
www.researchsquare.com/article/rs-8...
Slow event-related 3T fMRI: passive listening/viewing task with auditory/visual letters & numbers; overlapping activation in auditory cortex for auditory letters/numbers
Attention decoding at the cocktail party: Preserved in hearing aid users, reduced in cochlear implant users www.sciencedirect.com/science/arti... 29 HA users, 24 CI users & 29 age-matched TH listeners; EEG while attending 1 of 2 talkers (female/male) in free-field; linear backward & forward models between EEG & the speech envelope
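A backward (stimulus-reconstruction) model of the kind referenced is essentially ridge regression from time-lagged EEG onto the speech envelope. A minimal numpy sketch on synthetic data — channel count, lag range, and regularizer are arbitrary placeholders, not values from the study:

```python
import numpy as np

def lagged_design(eeg, lags):
    """Stack time-lagged copies of each EEG channel into a design matrix.
    eeg: (T, C) array; lags: iterable of non-negative sample lags."""
    T, C = eeg.shape
    lags = list(lags)
    X = np.zeros((T, C * len(lags)))
    for li, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        shifted[:lag] = 0.0  # zero-pad instead of wrapping around
        X[:, li * C:(li + 1) * C] = shifted
    return X

def backward_model(eeg, envelope, lags, lam=1.0):
    """Ridge-regression decoder mapping lagged EEG to the speech envelope."""
    X = lagged_design(eeg, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

# toy data: 500 samples, 8 "channels" that track a random envelope plus noise
rng = np.random.default_rng(1)
env = rng.standard_normal(500)
eeg = env[:, None] * rng.random(8) + 0.1 * rng.standard_normal((500, 8))

w = backward_model(eeg, env, lags=range(5))
recon = lagged_design(eeg, range(5)) @ w
r = np.corrcoef(recon, env)[0, 1]  # reconstruction accuracy
```

A forward model is the same regression run the other way: predicting each EEG channel from lagged copies of the envelope.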
08.02.2026 21:27
Toward Fuller Integration of Respiratory Rhythms Into Research on Infant Vocal & Motor Development nyaspubs.onlinelibrary.wiley.com/doi/10.1111/... Assays motor control, physiology, speech & language acquisition; proposes respiration is core in early rhythmic coordination linking vocalization & movement
07.02.2026 03:23
Human newborns form musical predictions based on rhythmic but not melodic structure journals.plos.org/plosbiology/... TRF analyses showed high inter-individual variability in overall neural tracking of musical stimuli; note-by-note predictability was tracked (not when shuffled), a rhythmic, not melodic, effect
06.02.2026 10:00
Explaining the Musical Advantage in Speech Perception Through Beat Perception and Working Memory nyaspubs.onlinelibrary.wiley.com/doi/10.1111/... "Our findings clarify the cognitive and temporal foundations of the musician advantage and highlight the value of considering musical engagement"
05.02.2026 22:47
Individuals with congenital amusia show degraded performance in a nonword repetition task with lexical tones www.sciencedirect.com/science/arti... Nonword repetition task for syllable-tone combinations, with nonword length gradually increased from 1 to 7 syllables; accuracy & errors analysed
05.02.2026 07:21
Early multimodal behavioral cues in autism: a micro-analytical exploration of actions, gestures and speech during naturalistic parent-child interactions https://pubmed.ncbi.nlm.nih.gov/41631016/
03.02.2026 14:49
Speech reading - "get me outta here"
02.02.2026 23:59
In 1961, physicist John Kelly programmed an IBM 704 to sing 'Daisy Bell' - the first song ever sung by a computer. This inspired HAL 9000's song in 2001: A Space Odyssey!
Historic: youtube.com/watch?v=41U78QP8nBk
#SpeechScience #Technology
"We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done" scholar.google.com.au/scholar?oi=b...
01.02.2026 00:42
Thank you. What a wonderful representation of HARPY's knowledge graph, how it was based on an expert formal lexical grammar, rules for phone junctures & how the frequency contrast & matching worked. Superb!
31.01.2026 23:27
Check out HARPY: 1971 DARPA project:
Speech understanding system goals (November 1971):
Accept connected speech
Many cooperative speakers
Use 1000 word vocabulary
Task-oriented grammar
Constraining task
Less than 10% Semantic Errors
Requiring ~300 MIPSS www.youtube.com/watch?v=NiiD...
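HARPY is also commonly credited with introducing beam search: rather than expanding every path through its precompiled pronunciation network, it kept only the best-scoring partial paths at each step. A minimal sketch of that idea over a toy weighted graph — all state names and scores here are made up:

```python
# Beam search over a weighted state graph: at each step keep only the
# `beam_width` best-scoring partial paths instead of all of them.
def beam_search(graph, start, goal, beam_width=2):
    """graph: dict mapping node -> list of (next_node, score) edges.
    Returns the best-scoring (path, total_score) found, or None."""
    beams = [([start], 0.0)]
    best = None
    while beams:
        candidates = []
        for path, score in beams:
            node = path[-1]
            if node == goal:
                if best is None or score > best[1]:
                    best = (path, score)
                continue
            for nxt, s in graph.get(node, []):
                candidates.append((path + [nxt], score + s))
        # prune: keep only the top-scoring partial paths
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return best

# toy "pronunciation network" (hypothetical states and acoustic-match scores)
net = {
    "START": [("p1", 0.9), ("p2", 0.4)],
    "p1": [("p3", 0.5), ("END", 0.2)],
    "p2": [("p3", 0.9)],
    "p3": [("END", 0.8)],
}
path, score = beam_search(net, "START", "END", beam_width=2)
```

Shrinking `beam_width` trades guaranteed optimality for speed, which was the point for a near-real-time speech system.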
Cutaneous alternating current stimulation can cause a phasic modulation of speech perception https://pubmed.ncbi.nlm.nih.gov/41617605/
31.01.2026 05:48
something, something, phone ... er, ring?
28.01.2026 10:50
@speechpapers.bsky.social I see you're posting AV speech stuff
28.01.2026 07:27
The cortical contribution to the speech-FFR is not modulated by visual information
https://www.biorxiv.org/content/10.64898/2026.01.26.701703v1
Audio-visual speech-in-noise tests for evaluating speech reception thresholds: A scoping review https://pubmed.ncbi.nlm.nih.gov/41592005/
28.01.2026 02:34
For context, see Karadöller D. Z., Sümer B., Özyürek A. (2025). First-language acquisition in a multimodal framework: Insights from speech, gesture, and sign journals.sagepub.com/doi/pdf/10.1... Miles et al argue "An embodied multi-articulatory multimodal framework is needed"
26.01.2026 21:22
An embodied multi-articulatory multimodal language framework: A commentary on Karadöller et al
journals.sagepub.com/doi/10.1177/...
"we believe it shows that our understanding of the role of gesture in language is incomplete and lacks crucial insight when co-sign gesture is not accounted for"
The involvement of endogenous brain rhythms in speech processing www.sciencedirect.com/science/arti... Reviews oscillation-based theories (dynamic attending, active sensing, asymmetric sampling in time, segmentation theories) & evidence > Naturalistic paradigms and resting-state data key to progress
23.01.2026 21:03
Children Sustain Their Attention on Spatial Scenes When Planning to Describe Spatial Relations Multimodally in Speech & Gesture onlinelibrary.wiley.com/doi/10.1111/... "How do children allocate visual attention to scenes as they prepare to describe them multimodally in speech and co-speech gesture?"
20.01.2026 22:55
Effects of Visual Input in Virtual Reality on Voice Production: Comparing Trained Singers & Untrained Speakers www.jvoice.org/article/S089... Study examined if visual spatial cues in immersive virtual reality (room size, speaker-to-listener distance) are associated with changes in vocal production
19.01.2026 12:28
Distinct Temporal Dynamics of Speech & Gesture Processing: Insights From ERP Across L1 and L2 psycnet.apa.org/fulltext/202... "results point to potentially distinct neural and temporal dynamics in processing speech versus gestures" -> speech processing earlier as gestures recruit later stages (?)
17.01.2026 09:28