ACMMM 2024 Conference Submissions
GalleryGPT: Analyzing Paintings with Large Multimodal Models
Yi Bin, Wenhao Shi, Yujuan Ding, Zhiqiang Hu, Zheng Wang, Yang Yang, See-Kiong Ng, Heng Tao Shen
Published: 20 Jul 2024, Last Modified: 05 Aug 2024
MM2024 Oral
Readers: Everyone
RDLNet: A Novel and Accurate Real-world Document Localization Method
Yaqiang Wu, Zhen Xu, Yong Duan, Yanlai Wu, Qinghua Zheng, Hui Li, Xiaochen Hu, Lianwen Jin
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
SATPose: Improving Monocular 3D Pose Estimation with Spatial-aware Ground Tactility
Lishuang Zhan, Enting Ying, Jiabao Gan, Shihui Guo, BoYu Gao, Yipeng Qin
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
RelScene: A Benchmark and baseline for Spatial Relations in text-driven 3D Scene Generation
Zhaoda Ye, Xinhan Zheng, Yang Liu, Yuxin Peng
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
Xingqi Wang, Xiaoyuan Yi, Xing Xie, Jia Jia
Published: 20 Jul 2024, Last Modified: 06 Aug 2024
MM2024 Poster
Readers: Everyone
SCREEN: A Benchmark for Situated Conversational Recommendation
Dongding Lin, Jian Wang, Chak Tou Leong, Wenjie Li
Published: 20 Jul 2024, Last Modified: 05 Aug 2024
MM2024 Poster
Readers: Everyone
Edge-assisted Real-time Dynamic 3D Point Cloud Rendering for Multi-party Mobile Virtual Reality
Ximing Wu, Kongyange Zhao, Teng Liang, Xu Chen
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval
Zhenyu Yang, Shengsheng Qian, Dizhan Xue, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition
Cam Van Thi Nguyen, Son Le The, Tuan Anh Mai, Duc-Trong Le
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
AMG-Embedding: a Self-Supervised Embedding Approach for Audio Identification
Yuhang Su, Wei Hu, Fan Zhang, Qiming Xu
Published: 20 Jul 2024, Last Modified: 05 Aug 2024
MM2024 Poster
Readers: Everyone
Diverse consensuses paired with motion estimation-based multi-model fitting
Wenyu Yin, Shuyuan Lin, Yang Lu, Hanzi Wang
Published: 20 Jul 2024, Last Modified: 05 Aug 2024
MM2024 Poster
Readers: Everyone
EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations
Zhenxi Song, Ruihan Qin, Huixia Ren, Zhen Liang, Yi Guo, Min Zhang, Zhiguo Zhang
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Oral
Readers: Everyone
Control-Talker: A Rapid-Customization Talking Head Generation Method for Multi-Condition Control and High-Texture Enhancement
Yiding Li, Lingyun Yu, Li Wang, Hongtao Xie
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
HINER: Neural Representation for Hyperspectral Image
Junqi Shi, Mingyi Jiang, Ming Lu, Tong Chen, Xun Cao, Zhan Ma
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control
Shuqi Dai, Ming-Yu Liu, Rafael Valle, Siddharth Gururani
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
HMR-Adapter: A Lightweight Adapter with Dual-Path Cross Augmentation for Expressive Human Mesh Recovery
Wenhao Shen, Wanqi Yin, Hao Wang, Chen Wei, Zhongang Cai, Lei Yang, Guosheng Lin
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments
Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Oral
Readers: Everyone
Deeply Fusing Semantics and Interactions for Item Representation Learning via Topology-driven Pre-training
Shiqin Liu, Chaozhuo Li, Xi Zhang, Minjun Zhao, Yuanbo Xu, Jiajun Bu
Published: 20 Jul 2024, Last Modified: 05 Aug 2024
MM2024 Poster
Readers: Everyone
A Unimodal Valence-Arousal Driven Contrastive Learning Framework for Multimodal Multi-Label Emotion Recognition
Wenjie Zheng, Jianfei Yu, Rui Xia
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Oral
Readers: Everyone
SkipVSR: Adaptive Patch Routing for Video Super-Resolution with Inter-Frame Mask
Zekun Ai, Xiaotong Luo, Yanyun Qu, Yuan Xie
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
AlignCLIP: Align Multi Domains of Texts Input for CLIP models with Object-IoU Loss
Lu Zhang, Ke Yan, Shouhong Ding
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
LDStega: Practical and Robust Generative Image Steganography based on Latent Diffusion Models
Yinyin Peng, Yaofei Wang, Donghui Hu, Kejiang Chen, Xianjin Rong, Weiming Zhang
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models
Haibo Yang, Yang Chen, Yingwei Pan, Ting Yao, Zhineng Chen, Chong-Wah Ngo, Tao Mei
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Multimodal Fusion via Hypergraph Autoencoder and Contrastive Learning for Emotion Recognition in Conversation
Zijian Yi, Ziming Zhao, Zhishu Shen, Tiehua Zhang
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone
Hierarchical Debiasing and Noisy Correction for Cross-domain Video Tube Retrieval
Jingqiao Xiu, Mengze Li, Wei Ji, Jingyuan Chen, Hanbin Zhao, Shin'ichi Satoh, Roger Zimmermann
Published: 20 Jul 2024, Last Modified: 21 Jul 2024
MM2024 Poster
Readers: Everyone