Man-Machine Speech Communication
19th National Conference, NCMMSC 2024, Urumqi, China, August 15-18, 2024, Proceedings
Li, Ya; Hamdulla, Askar; Ling, Zhenhua; He, Liang; Chen, Xie
Springer Nature Switzerland AG
02/2025
396
Paperback
9789819610440
Pre-release: ships 15 to 20 days after publication
.- M-CMGAN: Attempting to Use Mamba on Speech Enhancement.
.- A Backend-friendly On-device Multi-channel Speech Enhancement System with IPD and PHM.
.- SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments.
.- ASD-Diff: Unsupervised Anomalous Sound Detection With Masked Diffusion Model.
.- Emergence of Hemispheric Asymmetries and Predictive Coding in the Neural Mechanism of Speech Perception.
.- Phoneme Semantic Backdoor Attacks with Multiple Task Learning for Speech Classification Task.
.- AESR: Speech Recognition With Speech Emotion Recogniting Learning.
.- A Comparative Analysis of Diphthong Acquisition in Standard Chinese by Learners from 'the Belt and Road'.
.- ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.
.- Transformer-based Model for Auditory EEG Decoding.
.- A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions.
.- Sound Zone Control Based on a Kronecker Second-Order Tensor Decomposition.
.- MCDubber: Multimodal Context-Aware Expressive Video Dubbing.
.- TeleSpeechPT: Large-Scale Chinese Multi-Dialect And Multi-Accent Speech Pre-Training.
.- Investigation into the Impact of Speaker Adversarial Perturbation on Speech Recognition.
.- Pruning and Quantization Enhanced Densely Connected Neural Network for Efficient Acoustic Echo Cancellation.
.- Improved DOA Estimation of Sound Source of Small Amplitudes using a Single Acoustic Vector Sensor.
.- Investigation on Training Strategy for Cross-Modal Large Language Models with Speech and Text.
.- ExARN: Target Speaker Extraction with Attentive Recurrent Networks.
.- Tone Perception by Putonghua-Learning Preschool Children in South Xinjiang Uyghur Autonomous Region.
.- Study on Prosodic Disambiguation of VP/NP Syntactic Structure by Chinese EFL Learners.
.- An electroencephalogram-based study of neural responses to imagined speech in Mandarin.
.- A Speech Corpus of Putonghua-Learning Preschoolers From the Uygur Ethnic Group in South Xinjiang Uygur Autonomous Region of China.
.- Evaluation of Data Inconsistency for Multi-modal Sentiment Analysis.
.- LDMME: Latent Diffusion Model for Music Editing.
.- Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech.
.- Speech emotion recognition based on multi acoustic feature fusion.
.- DA-KWFormer: A Domain Adaptation Network with K-Weight Transformer for Speech Emotion Recognition.
.- An Unsupervised Domain Adaptation Method based on Distribution Alignment for Speaker Verification.
.- Cross-Model Knowledge Distillation and Metadata Fusion for Respiratory Sound Classification.
.- Effect of Focus on Vowel Duration and Formant in Cantonese.
.- A Quantitative Parameter of Pronunciation, TVVF.