Man-Machine Speech Communication

19th National Conference, NCMMSC 2024, Urumqi, China, August 15-18, 2024, Proceedings

Li, Ya; Hamdulla, Askar; Ling, Zhenhua; He, Liang; Chen, Xie

Springer Nature Switzerland AG

02/2025

396

Softcover

9789819610440

Pre-release: ships 15 to 20 days after publication

Description not available.
.- The Attention-Based Fusion of Master-Auxiliary Network for Speech Enhancement.

.- M-CMGAN: Attempting to Use Mamba on Speech Enhancement.

.- A Backend-friendly On-device Multi-channel Speech Enhancement System with IPD and PHM.

.- SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments.

.- ASD-Diff: Unsupervised Anomalous Sound Detection With Masked Diffusion Model.

.- Emergence of Hemispheric Asymmetries and Predictive Coding in the Neural Mechanism of Speech Perception.

.- Phoneme Semantic Backdoor Attacks with Multiple Task Learning for Speech Classification Task.

.- AESR: Speech Recognition With Speech Emotion Recogniting Learning.

.- A Comparative Analysis of Diphthong Acquisition in Standard Chinese by Learners from 'the Belt and Road'.

.- ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.

.- Transformer-based Model for Auditory EEG Decoding.

.- A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions.

.- Sound Zone Control Based on a Kronecker Second-Order Tensor Decomposition.

.- MCDubber: Multimodal Context-Aware Expressive Video Dubbing.

.- TeleSpeechPT: Large-Scale Chinese Multi-Dialect And Multi-Accent Speech Pre-Training.

.- Investigation into the Impact of Speaker Adversarial Perturbation on Speech Recognition.

.- Pruning and Quantization Enhanced Densely Connected Neural Network for Efficient Acoustic Echo Cancellation.

.- Improved DOA Estimation of Sound Source of Small Amplitudes using a Single Acoustic Vector Sensor.

.- Investigation on Training Strategy for Cross-Modal Large Language Models with Speech and Text.

.- ExARN: Target Speaker Extraction with Attentive Recurrent Networks.

.- Tone Perception by Putonghua-Learning Preschool Children in South Xinjiang Uyghur Autonomous Region.

.- Study on Prosodic Disambiguation of VP/NP Syntactic Structure by Chinese EFL Learners.

.- An electroencephalogram-based study of neural responses to imagined speech in Mandarin.

.- A Speech Corpus of Putonghua-Learning Preschoolers From the Uygur Ethnic Group in South Xinjiang Uygur Autonomous Region of China.

.- Evaluation of Data Inconsistency for Multi-modal Sentiment Analysis.

.- LDMME: Latent Diffusion Model for Music Editing.

.- Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech.

.- Speech emotion recognition based on multi acoustic feature fusion.

.- DA-KWFormer: A Domain Adaptation Network with K-Weight Transformer for Speech Emotion Recognition.

.- An Unsupervised Domain Adaptation Method based on Distribution Alignment for Speaker Verification.

.- Cross-Model Knowledge Distillation and Metadata Fusion for Respiratory Sound Classification.

.- Effect of Focus on Vowel Duration and Formant in Cantonese.

.- A Quantitative Parameter of Pronunciation, TVVF.
Subjects:
speech processing; speech perception; Phonetics, phonology and prosody; speech recognition; speech synthesis and conversion; speech enhancement; speech security; speech emotion recognition; sound event detection; speech coding; spoken dialog system; speech science; Audio signal Analysis; Speech large language model