Computational Visual Media

13th International Conference, CVM 2025, Hong Kong SAR, China, April 19-21, 2025, Proceedings, Part III

Didyk, Piotr; Hou, Junhui

Springer Nature Switzerland AG

05/2025

454

Softcover

English

9789819658145

Pre-release: ships 15 to 20 days after publication

Image and Video Analysis


DepthFisheye: Efficient Fine-Tuning of Depth Estimation Models for Fisheye Cameras.- DIMATrack: Dimension Aware Data Association for Multi-Object Tracking.- Efficient Transformer Network for Visible and Ultraviolet Object Tracking.- LightGR-Transformer: Light Grouped Residual Transformer for Multispectral Object Detection.- ADMMOA: Attribute-Driven Multimodal Optimization for Face Recognition Adversarial Attacks.- Training-Free Language-Guided Video Summarization via Multi-Grained Saliency Scoring.


Multimodal Learning


Reinforced Label Denoising for Weakly-Supervised Audio-Visual Video Parsing.- Bridging the Modality Gap: Advancing Multimodal Human Pose Estimation with Modality-Adaptive Pose Estimator and Novel Benchmark Datasets.- Momentum-Based Uni-Modal Soft-Label Alignment and Multi-Modal Latent Projection Networks for Optimizing Image-Text Retrieval.- Multi-Granularity and Multi-Modal Prompt Learning for Person Re-Identification.- Local and Global Feature Cross-attention Multimodal Place Recognition.- IML-CMM - A Multimodal Sentiment Analysis Framework Integrating Intra-Modal Learning and Cross-Modal Mixup Enhancement.


Geometrical Processing


MCFG with GUMAP: A Simple and Effective Clustering Framework on Grassmann Manifold.- Joint UMAP for Visualization of Time-Dependent Data.- Unsupervised Domain Adaptation on Point Cloud Classification via Imposing Structural Manifolds into Representation Space.


Applications


Learning Adaptive Basis Fonts to Fuse Content Features for Few-shot Font Generation.- TaiCrowd: A High-Performance Simulation Framework for Massive Crowd.- Feature Disentanglement and Fusion Model for Multi-Source Domain Adaptation with Domain-Specific Features.- A Trademark Retrieval Method Based on Self-Supervised Learning.- Weaken Noisy Feature: Boosting Semi-Supervised Learning by Noise Estimation.- Multi-Dimension Full Scene Integrated Visual Emotion Analysis Network.- Gap-KD: Bridging the Significant Capacity Gap Between Teacher and Student Model.
Animation and physical simulation; Cognition of visual media; Content security of visual media; Datasets and benchmarking of visual media; Editing and composition of visual media; Enhancement and re-rendering of visual media; Geometric computing for image and video; Geometry modeling and processing; Generative models; Low-level analysis, motion, and tracking of visual media; Image and video retrieval; Interactive editing of visual media; Machine learning for visual media; Recognition and understanding of visual media; Representation learning for computer vision; Rendering; Social networks and social media; Scene analysis and understanding; Visualization and visual analytics; Vision and other modalities