MultiMedia Modeling

31st International Conference on Multimedia Modeling, MMM 2025, Nara, Japan, January 8-10, 2025, Proceedings, Part I

Kompatsiaris, Ioannis; Yamasaki, Toshihiko; Yanai, Keiji; Xu, Changsheng; Chu, Wei-Ta; Riegler, Michael; Ide, Ichiro; Nitta, Naoko

Springer Nature Switzerland AG

02/2025

456

Paperback

9789819620531

Pre-release - ships 15 to 20 days after publication

Regular Papers.- A Dual-Branch Model for Color Constancy.- A Multi-Aspect Multi-Granularity Pronunciation Assessment Method Based on Branchformer Encoder and Hierarchical Aggregation.- A Multi-Expert Collaborative Framework for Multimodal Named Entity Recognition.- A Novel Human Abnormal Posture Detection Method Based on Spatial-Topological Feature Fusion of Skeleton.- AD2AT: Audio Description to Alternative Text, a Dataset of Alternative Text from Movies.- AMFT-YOLO: A Adaptive Multi-Scale YOLO Algorithm with Multi-Level Feature Fusion for Object Detection in UAV Scenes.- AMPLE: Emotion-Aware Multimodal Fusion Prompt Learning for Fake News Detection.- An Analytical Method for Rendering Plenoptic Cameras 2.0 on 3D Multi-Layer Displays.- Balancing Efficiency and Accuracy: An Analysis of Sampling for Video Copy Detection.- BiCA-YOLO: Bidirectional Feature Enhancement and Cross Coordinate Attention for Small Object Detection.- BLCC: A Benchmark for Multi-LiDAR and Multi-Camera Calibration.- Boosting Human Pose Estimation via Heatmap Refinement.- Camouflaged Object Detection Based on Localization Guidance and Multi-Scale Refinement.- Chain of Thought Guided Few-shot Fine-tuning of LLMs for Multimodal Aspect-based Sentiment Classification.- CLIP Multi-modal Hashing for Multimedia Retrieval.- Comparative Analysis of Relevance Feedback Techniques for Image Retrieval.- Cross-View Geo-Localization via Learning Correspondence Semantic Similarity Knowledge.- DART: Depth-Enhanced Accurate and Real-Time Background Matting.- Data-free Functional Projection of Large Language Models onto Social Media Tagging Domain.- Deep Dual Internal Learning for Hyperspectral Image Super-Resolution.- Detoxification of Unlabeled Dataset: Reducing Implicit Class Imbalance Using Pseudo-Jacobian of GAN's Generator.- DistillSleep: Leverage Self-Distillation to Improve Performance After Representation Learning for Sleep Staging.- DocMamba: Robust Document Image Dewarping via Selective State Space Sequence Modeling.- Dual-Task Feedback Learning for Tongue Detection via Super-Resolution Integration.- Dynamic Exploration Graph: A Novel Approach for Efficient Nearest Neighbor Search in Evolving Multimedia Datasets.- EIA: Edge-aware Imperceptible Adversarial Attacks on 3D Point Clouds.- Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste.- ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing.- Flat Local Minima for Continual learning on Semantic Segmentation.- FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation.- Frequency-aware Convolution for Sound Event Detection.- Frequency-Based Unsupervised Low-Light Image Enhancement Framework.- GFA-UDIS: Global-to-Flow Alignment for Unsupervised Deep Image Stitching.
machine learning;image analysis;semantic information;computer programming;multimedia content analysis;multimedia mining;signal processing and communications;multimedia abstraction and summarization;security and content protection;multimedia applications;media content browsing and retrieval tools;multi-camera and multi-view;multimedia databases, content delivery and transport;audio, image, video processing, coding and compression;multimodal analysis for retrieval applications;multimedia fusion methods;semantic analysis of multimedia and contextual data;media representation and algorithms;multimedia content generation;multimedia analytics applications