Speech and Computer

24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022, Proceedings

Agrawal, Shyam S.; Prasanna, S. R. Mahadeva; Samudravijaya, K.; Karpov, Alexey

Springer International Publishing AG

11/2022

720

Paperback

English

9783031209796

15 to 20 days

1110

Description not available.
Thematic Diversity of Everyday Russian Discourse: a Case Study Based on the ORD corpus
Neural Embedding Extractors for Text-Independent Speaker Verification
Deep Speaker Embeddings based Online Diarization
Overlapped Speech Detection Using AM-FM based Time-Frequency Representations
Significance of Dimensionality Reduction in CNN-based Vowel Classification from Imagined Speech using Electroencephalogram Signals
Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language
An Initial Study on Birdsong Re-synthesis using Neural Vocoders
Speech Music Overlap Detection using Spectral Peak Evolutions
Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English
ClusterVote: Automatic Summarization Dataset Construction with Document Clusters
Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples
Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts
Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks
Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore Vs BLEU Score
DyCoDa: A Multi-Modal Data Collection of Multi-User Remote Survival Game Recordings
On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection
Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection
Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech
An Electroglottographic Method for Assessing the Emotional State of the Speaker
Significance of Distance on Pop Noise for Voice Liveness Detection
CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings
Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach
Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems
Detection of Speech Related Disorders by Pre-Trained Embedding Models Extracted Biomarkers
Multi-Label Dysfluency Classification
Harnessing Uncertainty - Multi-Label Dysfluency Classification with Uncertain Labels
Continuous Wavelet Transform for Severity-Level Classification of Dysarthria
Significance of Energy Features for Severity Classification of Dysarthria
An Analytic Study on Clustering-based Pseudo-Labels for Self-Supervised Deep Speaker Verification
Investigation of Transfer Learning for End-to-End Russian Speech Recognition
Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific
Categorization of Threatening Speech Acts
Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem
Multi-level Fusion of Fisher Vector Encoded BERT and wav2vec 2.0 Embeddings for Native Language Identification
Fake Speech Detection using OpenSMILE Features
Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction
Classifying Mahout and Social Interactions of Asian Elephants based on Trumpet Calls
Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic
Fake Speech Detection using Modulation Spectrogram
Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks
A Multi-Modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain
Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task
Low-resource Emotional Speech Synthesis: Transfer Learning, Data requirements and Adversarial Training
Fuzzy Classifier For Speech Assessment in Speech Rehabilitation
Analysis-by-Synthesis Modeling of Bengali Intonation
Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech
Retrieval-based Dialogue Agents
Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms
Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation
Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification
Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners
Emotional Speech Recognition Based on Lip-Reading
Exploring The Use of Machine Learning for Resume Recommendations
The Role of Pause in Interaction: A Case of Polylogue
Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words
Effects of Depth of Field on Focus using a Virtual Reality Escape Room
Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces
Device Robust Acoustic Scene Classification using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network
Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology
Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022
This title belongs to the subject(s) indicated below.
acoustic signal processing;artificial intelligence;automatic speech recognition;computer systems;computer vision;correlation analysis;databases;image processing;information retrieval;linguistics;machine learning;mathematics;natural languages;neural networks;signal processing;speech analysis;speech communication;speech processing;speech recognition;speech synthesis