MultiMedia Modeling

31st International Conference on Multimedia Modeling, MMM 2025, Nara, Japan, January 8-10, 2025, Proceedings, Part II

Kompatsiaris, Ioannis; Yamasaki, Toshihiko; Yanai, Keiji; Xu, Changsheng; Chu, Wei-Ta; Riegler, Michael; Ide, Ichiro; Nitta, Naoko

Springer Nature Switzerland AG

12/2024

455

Softcover

9789819620609

15 to 20 days

Description not available.
Regular Papers
gFlow: Distributed Real-Time Reverse Remote Rendering System Model
Grounding Deliberate Reasoning in Multimodal Large Language Models
GWUNet: A UNet with Gated Attention and Improved Wavelet Transform for Thyroid Nodules Segmentation
HCV: Lightweight Hybrid CNN-Vision Transformer for Visual Object Tracking
HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse
Hybrid Scalable Video Coding with Neural Compression and Enhancement for Streaming Media
Hyper-NeuS: Hypernetworks for Neural SDF Implicit Surface Reconstruction by Volume Rendering
Image-Generation AI Model Retrieval by Contrastive Learning-based Style Distance Calculation
Improving Singing Voice Transcription Generalization with AI Generated Accompaniments
Infrared Small Target Detection with Feature Refinement and Context Enhancement
Innovative Lifelog Visualization and Exploration in Virtual Reality - A Comparative Study
Integrating S1&S2 Framework for Enhanced Semantic Match in Person Re-identification
Intra-Class Compact Facial Expression Recognition Based on Amplitude Phase Separation
Joint Decision Network with Modality-Specific and Dual Interactive Features for Fake News Detection
Kiite World: Socializing Map-Based Music Exploration Through Playlist Sharing and Synchronized Listening
KuzushijiDiffuser: Japanese Kuzushiji Font Generation with FontDiffuser
LIESA: Low-light Image Enhancement with Semantic Awareness
Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network
Lightweight Motion-Aware Video Super-Resolution for Compressed Videos
LITA: LMM-guided Image-Text Alignment for Art Assessment
LLMs-based Augmentation for Domain Adaptation in Long-tailed Food Datasets
Making strides Security in Multimodal Fake News Detection Models: A Comprehensive Analysis of Adversarial Attacks
MambaTalk: Speech-driven 3D Facial Animation with Mamba
MC-YOLO: Multi-scale Transmission Line Defect Target Recognition Network
MDT-Net: A Mask Decoder Tuning Strategy for CLIP-based Zero-shot 3D Classification
MICAN: Multi-modal Inconsistency-based Cooperation Attention Network for Fake News Detection
MineTinyNet-YOLO: An Efficient Small Object Detection Method for Complex Underground Coal Mine Scenarios
Mix-YOLONet: Deep Image Dehazing for Improving Object Detection
MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms
MLP-AMDC: A MLP Architecture for Adaptive-Mask-based Dual-Camera Snapshot Hyperspectral Imaging
MM-CARP: Multimodal Model with Cross-modal Retrieval-Augmented and Visual Region Perception
Modality-Specific Hashing: Transform Cross-Modal Retrieval into Single-Modal Retrieval
machine learning; image analysis; semantic information; computer programming; multimedia content analysis; multimedia mining; signal processing and communications; multimedia abstraction and summarization; security and content protection; multimedia applications; media content browsing and retrieval tools; multi-camera and multi-view; multimedia databases, content delivery and transport; audio, image, video processing, coding and compression; multimodal analysis for retrieval applications; multimedia fusion methods; semantic analysis of multimedia and contextual data; media representation and algorithms; multimedia content generation; multimedia analytics applications