Presentation Schedule
All times are in the Central time zone.
Date: Friday, June 24, 2022 2:30PM – 5:00PM
NOTE: Poster IDs refer to the poster number and poster time slot.
Session Title | Poster ID | Title | Authors |
Representation Learning | 46b | Unified Contrastive Learning in Image-Text-Label Space |
Jianwei Yang; Chunyuan Li; Pengchuan Zhang; Bin Xiao; Ce Liu; Lu Yuan; Jianfeng Gao |
47b | AlignMixup: Improving Representations by Interpolating Aligned Features |
Shashanka Venkataramanan; Ewa Kijak; Laurent Amsaleg; Yannis Avrithis |
|
48b | On the Road to Online Adaptation for Semantic Image Segmentation |
Riccardo Volpi; Pau De Jorge; Diane Larlus; Gabriela Csurka |
49b | ADAS: A Direct Adaptation Strategy for Multi-Target Domain Adaptive Semantic Segmentation |
Seunghun Lee; Wonhyeok Choi; Changjae Kim; Minwoo Choi; Sunghoon Im |
|
50b | Kernelized Few-Shot Object Detection With Efficient Integral Aggregation |
Shan Zhang; Lei Wang; Naila Murray; Piotr Koniusz |
|
51b | Neural Mean Discrepancy for Efficient Out-of-Distribution Detection |
Xin Dong; Junfeng Guo; Ang Li; Wei-Te Ting; Cong Liu; H.T. Kung |
|
52b | A Structured Dictionary Perspective on Implicit Neural Representations |
Gizem Yüce; Guillermo Ortiz-Jiménez; Beril Besbinar; Pascal Frossard |
|
53b | LARGE: Latent-Based Regression Through GAN Semantics |
Yotam Nitzan; Rinon Gal; Ofir Brenner; Daniel Cohen-Or |
|
54b | Rethinking Controllable Variational Autoencoders |
Huajie Shao; Yifei Yang; Haohong Lin; Longzhong Lin; Yizhuo Chen; Qinmin Yang; Han Zhao |
|
55b | Learning Canonical F-Correlation Projection for Compact Multiview Representation |
Yun-Hao Yuan; Jin Li; Yun Li; Jipeng Qiang; Yi Zhu; Xiaobo Shen; Jianping Gou |
|
56b | Cross-Architecture Self-Supervised Video Representation Learning |
Sheng Guo; Zihua Xiong; Yujie Zhong; Limin Wang; Xiaobo Guo; Bing Han; Weilin Huang |
|
57b | Improving Video Model Transfer With Dynamic Representation Learning | Yi Li; Nuno Vasconcelos | |
58b | Self-Supervised Image Representation Learning With Geometric Set Consistency |
Nenglun Chen; Lei Chu; Hao Pan; Yan Lu; Wenping Wang |
|
59b | HLRTF: Hierarchical Low-Rank Tensor Factorization for Inverse Problems in Multi-Dimensional Imaging |
Yisi Luo; Xi-Le Zhao; Deyu Meng; Tai-Xiang Jiang |
|
60b | Point-BERT: Pre-Training 3D Point Cloud Transformers With Masked Point Modeling |
Xumin Yu; Lulu Tang; Yongming Rao; Tiejun Huang; Jie Zhou; Jiwen Lu |
|
61b | DiGS: Divergence Guided Shape Implicit Neural Representation for Unoriented Point Clouds |
Yizhak Ben-Shabat; Chamin Hewa Koneputugodage; Stephen Gould |
|
62b | Neural Convolutional Surfaces |
Luca Morreale; Noam Aigerman; Paul Guerrero; Vladimir G. Kim; Niloy J. Mitra |
|
63b | Representing 3D Shapes With Probabilistic Directed Distance Fields |
Tristan Aumentado-Armstrong; Stavros Tsogkas; Sven Dickinson; Allan D. Jepson |
|
64b | H4D: Human 4D Modeling by Learning Neural Compositional Representation |
Boyan Jiang; Yinda Zhang; Xingkui Wei; Xiangyang Xue; Yanwei Fu |
|
65b | Learning Memory-Augmented Unidirectional Metrics for Cross-Modality Person Re-Identification |
Jialun Liu; Yifan Sun; Feng Zhu; Hongbin Pei; Yi Yang; Wenhui Li |
|
66b | Contrastive Regression for Domain Adaptation on Gaze Estimation |
Yaoming Wang; Yangzhou Jiang; Jin Li; Bingbing Ni; Wenrui Dai; Chenglin Li; Hongkai Xiong; Teng Li |
|
67b | Forward Compatible Training for Large-Scale Embedding Retrieval Systems |
Vivek Ramanujan; Pavan Kumar Anasosalu Vasu; Ali Farhadi; Oncel Tuzel; Hadi Pouransari |
|
68b | Improving Subgraph Recognition With Variational Graph Information Bottleneck | Junchi Yu; Jie Cao; Ran He | |
69b | Learning Soft Estimator of Keypoint Scale and Orientation With Probabilistic Covariant Loss |
Pei Yan; Yihua Tan; Shengzhou Xiong; Yuan Tai; Yansheng Li |
|
70b | Few-Shot Keypoint Detection With Uncertainty Learning for Unseen Species | Changsheng Lu; Piotr Koniusz | |
Scene Analysis and Understanding | 71b | Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation |
Xingning Dong; Tian Gan; Xuemeng Song; Jianlong Wu; Yuan Cheng; Liqiang Nie |
72b | Structured Sparse R-CNN for Direct Scene Graph Generation | Yao Teng; Limin Wang | |
73b | PPDL: Predicate Probability Distribution Based Loss for Unbiased Scene Graph Generation |
Wei Li; Haiwei Zhang; Qijie Bai; Guoqing Zhao; Ning Jiang; Xiaojie Yuan |
|
74b | RU-Net: Regularized Unrolling Network for Scene Graph Generation |
Xin Lin; Changxing Ding; Jing Zhang; Yibing Zhan; Dacheng Tao |
|
75b | Fine-Grained Predicates Learning for Scene Graph Generation |
Xinyu Lyu; Lianli Gao; Yuyu Guo; Zhou Zhao; Hao Huang; Heng Tao Shen; Jingkuan Song |
|
76b | HL-Net: Heterophily Learning Network for Scene Graph Generation |
Xin Lin; Changxing Ding; Yibing Zhan; Zijian Li; Dacheng Tao |
|
77b | SGTR: End-to-End Scene Graph Generation With Transformer | Rongjie Li; Songyang Zhang; Xuming He | |
78b | Classification-Then-Grounding: Reformulating Video Scene Graphs As Temporal Bipartite Graphs |
Kaifeng Gao; Long Chen; Yulei Niu; Jian Shao; Jun Xiao |
|
79b | RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition |
Jun Chen; Aniket Agarwal; Sherif Abdelkarim; Deyao Zhu; Mohamed Elhoseiny |
|
80b | Spatial Commonsense Graph for Object Localisation in Partial Scenes |
Francesco Giuliari; Geri Skenderi; Marco Cristani; Yiming Wang; Alessio Del Bue |
|
81b | “The Pedestrian Next to the Lamppost” Adaptive Object Graphs for Better Instantaneous Mapping |
Avishkar Saha; Oscar Mendez; Chris Russell; Richard Bowden |
|
82b | Category-Aware Transformer Network for Better Human-Object Interaction Detection |
Leizhen Dong; Zhimin Li; Kunlun Xu; Zhijun Zhang; Luxin Yan; Sheng Zhong; Xu Zou |
|
83b | Exploring Structure-Aware Transformer Over Interaction Proposals for Human-Object Interaction Detection |
Yong Zhang; Yingwei Pan; Ting Yao; Rui Huang; Tao Mei; Chang-Wen Chen |
|
84b | Distillation Using Oracle Queries for Transformer-Based Human-Object Interaction Detection |
Xian Qu; Changxing Ding; Xingao Li; Xubin Zhong; Dacheng Tao |
|
85b | Human-Object Interaction Detection via Disentangled Transformer |
Desen Zhou; Zhichao Liu; Jian Wang; Leshan Wang; Tao Hu; Errui Ding; Jingdong Wang |
|
86b | MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection |
Bumsoo Kim; Jonghwan Mun; Kyoung-Woon On; Minchul Shin; Junhyun Lee; Eun-Sol Kim |
|
87b | GaTector: A Unified Framework for Gaze Object Prediction |
Binglu Wang; Tao Hu; Baoshan Li; Xiaojuan Chen; Zhijie Zhang |
|
88b | STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes |
Peishan Cong; Xinge Zhu; Feng Qiao; Yiming Ren; Xidong Peng; Yuenan Hou; Lan Xu; Ruigang Yang; Dinesh Manocha; Yuexin Ma |
|
89b | Crowd Counting in the Frequency Domain |
Weibo Shu; Jia Wan; Kay Chen Tan; Sam Kwong; Antoni B. Chan |
|
90b | Boosting Crowd Counting via Multifaceted Attention |
Hui Lin; Zhiheng Ma; Rongrong Ji; Yaowei Wang; Xiaopeng Hong |
|
91b | Rethinking Spatial Invariance of Convolutional Networks for Object Counting |
Zhi-Qi Cheng; Qi Dai; Hong Li; Jingkuan Song; Xiao Wu; Alexander G. Hauptmann |
|
92b | Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing |
Xiaoxue Chen; Tianyu Liu; Hao Zhao; Guyue Zhou; Ya-Qin Zhang |
|
93b | Collaborative Transformers for Grounded Situation Recognition |
Junhyeong Cho; Youngseok Yoon; Suha Kwak |
|
Computational Photography | 94b | Deep Stereo Image Compression via Bi-Directional Coding |
Jianjun Lei; Xiangrui Liu; Bo Peng; Dengchao Jin; Wanqing Li; Jingxiao Gu |
95b | RFNet: Unsupervised Network for Mutually Reinforcing Multi-Modal Image Registration and Fusion |
Han Xu; Jiayi Ma; Jiteng Yuan; Zhuliang Le; Wei Liu |
|
96b | Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer |
Fushun Zhu; Shan Zhao; Peng Wang; Hao Wang; Hua Yan; Shuaicheng Liu |
|
97b | Semi-Supervised Learning of Semantic Correspondence With Pseudo-Labels |
Jiwon Kim; Kwangrok Ryoo; Junyoung Seo; Gyuseong Lee; Daehwan Kim; Hansang Cho; Seungryong Kim |
|
98b | SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization |
Yucheng Hang; Bin Xia; Wenming Yang; Qingmin Liao |
|
99b | Automatic Color Image Stitching Using Quaternion Rank-1 Alignment | Jiaxue Li; Yicong Zhou | |
100b | SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Color Editing |
Jing Shi; Ning Xu; Haitian Zheng; Alex Smith; Jiebo Luo; Chenliang Xu |
|
101b | Degree-of-Linear-Polarization-Based Color Constancy |
Taishi Ono; Yuhi Kondo; Legong Sun; Teppei Kurita; Yusuke Moriuchi |
|
102b | Point Cloud Color Constancy |
Xiaoyan Xing; Yanlin Qian; Sibo Feng; Yuhan Dong; Jiří Matas |
|
103b | Boosting View Synthesis With Residual Transfer |
Xuejian Rong; Jia-Bin Huang; Ayush Saraf; Changil Kim; Johannes Kopf |
|
104b | Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection |
Chunyu Li; Yusuke Monno; Masatoshi Okutomi |
|
105b | Quantization-Aware Deep Optics for Diffractive Snapshot Hyperspectral Imaging |
Lingen Li; Lizhi Wang; Weitao Song; Lei Zhang; Zhiwei Xiong; Hua Huang |
|
106b | PIE-Net: Photometric Invariant Edge Guided Network for Intrinsic Image Decomposition | Partha Das; Sezer Karaoglu; Theo Gevers | |
107b | Multimodal Material Segmentation |
Yupeng Liang; Ryosuke Wakaki; Shohei Nobuhara; Ko Nishino |
|
108b | Occlusion-Aware Cost Constructor for Light Field Depth Estimation |
Yingqian Wang; Longguang Wang; Zhengyu Liang; Jungang Yang; Wei An; Yulan Guo |
|
109b | Learning Neural Light Fields With Ray-Space Embedding |
Benjamin Attal; Jia-Bin Huang; Michael Zollhöfer; Johannes Kopf; Changil Kim |
|
110b | Acquiring a Dynamic Light Field Through a Single-Shot Coded Image |
Ryoya Mizuno; Keita Takahashi; Michitaka Yoshida; Chihiro Tsutake; Toshiaki Fujii; Hajime Nagahara |
|
111b | Gravitationally Lensed Black Hole Emission Tomography |
Aviad Levis; Pratul P. Srinivasan; Andrew A. Chael; Ren Ng; Katherine L. Bouman |
|
112b | Deep Saliency Prior for Reducing Visual Distraction |
Kfir Aberman; Junfeng He; Yossi Gandelsman; Inbar Mosseri; David E. Jacobs; Kai Kohlhoff; Yael Pritch; Michael Rubinstein |
|
113b | Personalized Image Aesthetics Assessment With Rich Attributes |
Yuzhe Yang; Liwu Xu; Leida Li; Nan Qie; Yaqian Li; Peng Zhang; Yandong Guo |
|
114b | Artistic Style Discovery With Independent Components |
Xin Xie; Yi Li; Huaibo Huang; Haiyan Fu; Wanwan Wang; Yanqing Guo |
|
Action and Event Recognition | 115b | Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos |
Muheng Li; Lei Chen; Yueqi Duan; Zhilan Hu; Jianjiang Feng; Jie Zhou; Jiwen Lu |
116b | SVIP: Sequence VerIfication for Procedures in Videos |
Yicheng Qian; Weixin Luo; Dongze Lian; Xu Tang; Peilin Zhao; Shenghua Gao |
|
117b | Set-Supervised Action Learning in Procedural Task Videos via Pairwise Order Consistency | Zijia Lu; Ehsan Elhamifar | |
118b | Exploring Denoised Cross-Video Contrast for Weakly-Supervised Temporal Action Localization |
Jingjing Li; Tianyu Yang; Wei Ji; Jue Wang; Li Cheng |
|
119b | GateHUB: Gated History Unit With Background Suppression for Online Action Detection |
Junwen Chen; Gaurav Mittal; Ye Yu; Yu Kong; Mei Chen |
|
120b | E2(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition |
Chiara Plizzari; Mirco Planamente; Gabriele Goletto; Marco Cannici; Emanuele Gusso; Matteo Matteucci; Barbara Caputo |
|
121b | Hybrid Relation Guided Set Matching for Few-Shot Action Recognition |
Xiang Wang; Shiwei Zhang; Zhiwu Qing; Mingqian Tang; Zhengrong Zuo; Changxin Gao; Rong Jin; Nong Sang |
|
122b | Spatio-Temporal Relation Modeling for Few-Shot Action Recognition |
Anirudh Thatipelli; Sanath Narayan; Salman Khan; Rao Muhammad Anwer; Fahad Shahbaz Khan; Bernard Ghanem |
|
123b | Alignment-Uniformity Aware Representation Learning for Zero-Shot Video Classification | Shi Pu; Kaili Zhao; Mao Zheng | |
124b | Cross-Modal Representation Learning for Zero-Shot Action Recognition |
Chung-Ching Lin; Kevin Lin; Lijuan Wang; Zicheng Liu; Linjie Li |
|
125b | Cross-Modal Background Suppression for Audio-Visual Event Localization | Yan Xia; Zhou Zhao | |
126b | Fine-Grained Temporal Contrastive Learning for Weakly-Supervised Temporal Action Localization |
Junyu Gao; Mengyuan Chen; Changsheng Xu |
|
127b | An Empirical Study of End-to-End Temporal Action Detection | Xiaolong Liu; Song Bai; Xiang Bai | |
128b | Everything at Once – Multi-Modal Fusion Transformer for Video Retrieval |
Nina Shvetsova; Brian Chen; Andrew Rouditchenko; Samuel Thomas; Brian Kingsbury; Rogerio S. Feris; David Harwath; James Glass; Hilde Kuehne |
|
129b | DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition |
Thanh-Dat Truong; Quoc-Huy Bui; Chi Nhan Duong; Han-Seok Seo; Son Lam Phung; Xin Li; Khoa Luu |
|
130b | MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection |
Rui Dai; Srijan Das; Kumara Kahatapitiya; Michael S. Ryoo; François Brémond |
|
131b | Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition | Hongji Guo; Hanjing Wang; Qiang Ji | |
132b | AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition |
Yulin Wang; Yang Yue; Yuanze Lin; Haojun Jiang; Zihang Lai; Victor Kulikov; Nikita Orlov; Humphrey Shi; Gao Huang |
|
133b | UBoCo: Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection |
Hyolim Kang; Jinwoo Kim; Taehyun Kim; Seon Joo Kim |
|
134b | Detector-Free Weakly Supervised Group Activity Recognition |
Dongkeun Kim; Jinsung Lee; Minsu Cho; Suha Kwak |
|
135b | Multi-Grained Spatio-Temporal Features Perceived Network for Event-Based Lip-Reading |
Ganchao Tan; Yang Wang; Han Han; Yang Cao; Feng Wu; Zheng-Jun Zha |
|
136b | Efficient Two-Stage Detection of Human-Object Interactions With a Novel Unary-Pairwise Transformer |
Frederic Z. Zhang; Dylan Campbell; Stephen Gould |
|
137b | Interactiveness Field in Human-Object Interactions |
Xinpeng Liu; Yong-Lu Li; Xiaoqian Wu; Yu-Wing Tai; Cewu Lu; Chi-Keung Tang |
|
138b | GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection |
Yue Liao; Aixi Zhang; Miao Lu; Yongliang Wang; Xiaobo Li; Si Liu |
|
139b | Object-Relation Reasoning Graph for Action Recognition | Yangjun Ou; Li Mi; Zhenzhong Chen | |
140b | UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection |
Andra Acsintoae; Andrei Florescu; Mariana-Iuliana Georgescu; Tudor Mare; Paul Sumedrea; Radu Tudor Ionescu; Fahad Shahbaz Khan; Mubarak Shah |
|
141b | Decoupling and Recoupling Spatiotemporal Representation for RGB-D-Based Motion Recognition |
Benjia Zhou; Pichao Wang; Jun Wan; Yanyan Liang; Fan Wang; Du Zhang; Zhen Lei; Hao Li; Rong Jin |
|
142b | SPAct: Self-Supervised Privacy Preservation for Action Recognition |
Ishan Rajendrakumar Dave; Chen Chen; Mubarak Shah |
|
143b | Unsupervised Action Segmentation by Joint Representation Learning and Online Clustering |
Sateesh Kumar; Sanjay Haresh; Awais Ahmed; Andrey Konin; M. Zeeshan Zia; Quoc-Huy Tran |
|
144b | InfoGCN: Representation Learning for Human Skeleton-Based Action Recognition |
Hyung-gun Chi; Myoung Hoon Ha; Seunggeun Chi; Sang Wan Lee; Qixing Huang; Karthik Ramani |
|
145b | Learning Video Representations of Human Motion From Synthetic Data |
Xi Guo; Wei Wu; Dongliang Wang; Jing Su; Haisheng Su; Weihao Gan; Jian Huang; Qin Yang |
|
146b | Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos | Saghir Alfasly; Jian Lu; Chen Xu; Yuru Zou | |
Biometrics | 147b | EyePAD++: A Distillation-Based Approach for Joint Eye Authentication and Presentation Attack Detection Using Periocular Images |
Prithviraj Dhar; Amit Kumar; Kirsten Kaplan; Khushi Gupta; Rakesh Ranjan; Rama Chellappa |
148b | Gait Recognition in the Wild With Dense 3D Representations and a Benchmark |
Jinkai Zheng; Xinchen Liu; Wu Liu; Lingxiao He; Chenggang Yan; Tao Mei |
|
149b | Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification |
Chao Wu; Wenhang Ge; Ancong Wu; Xiaobin Chang |
|
150b | Lagrange Motion Analysis and View Embeddings for Improved Gait Recognition |
Tianrui Chai; Annan Li; Shaoxiong Zhang; Zilong Li; Yunhong Wang |
|
151b | DeepFace-EMD: Re-Ranking Using Patch-Wise Earth Mover’s Distance Improves Out-of-Distribution Face Identification | Hai Phan; Anh Nguyen | |
152b | Learning Second Order Local Anomaly for General Face Forgery Detection |
Jianwei Fei; Yunshu Dai; Peipeng Yu; Tianrun Shen; Zhihua Xia; Jian Weng |
|
153b | PatchNet: A Simple Face Anti-Spoofing Framework via Fine-Grained Patch Recognition |
Chien-Yi Wang; Yu-Ding Lu; Shang-Ta Yang; Shang-Hong Lai |
|
154b | Face2Exp: Combating Data Biases for Facial Expression Recognition |
Dan Zeng; Zhiyuan Lin; Xiao Yan; Yuting Liu; Fei Wang; Bo Tang |
|
155b | Local-Adaptive Face Recognition via Graph-Based Meta-Clustering and Regularized Adaptation |
Wenbin Zhu; Chien-Yi Wang; Kuan-Lun Tseng; Shang-Hong Lai; Baoyuan Wang |
|
Face and Gestures | 156b | EMOCA: Emotion Driven Monocular Face Capture and Animation |
Radek Daněček; Michael J. Black; Timo Bolkart |
157b | Robust Egocentric Photo-Realistic Facial Expression Transfer for Virtual Reality |
Amin Jourabloo; Fernando De la Torre; Jason Saragih; Shih-En Wei; Stephen Lombardi; Te-Li Wang; Danielle Belko; Autumn Trimble; Hernan Badino |
|
158b | FaceVerse: A Fine-Grained and Detail-Controllable 3D Face Morphable Model From a Hybrid Dataset |
Lizhen Wang; Zhiyuan Chen; Tao Yu; Chenguang Ma; Liang Li; Yebin Liu |
|
159b | ImFace: A Nonlinear 3D Morphable Face Model With Implicit Neural Representations |
Mingwu Zheng; Hongyu Yang; Di Huang; Liming Chen |
|
160b | Physically-Guided Disentangled Implicit Rendering for 3D Face Modeling |
Zhenyu Zhang; Yanhao Ge; Ying Tai; Weijian Cao; Renwang Chen; Kunlin Liu; Hao Tang; Xiaoming Huang; Chengjie Wang; Zhifeng Xie; Dongjin Huang |
|
161b | RigNeRF: Fully Controllable Neural 3D Portraits |
ShahRukh Athar; Zexiang Xu; Kalyan Sunkavalli; Eli Shechtman; Zhixin Shu |
|
162b | HeadNeRF: A Real-Time NeRF-Based Parametric Head Model |
Yang Hong; Bo Peng; Haiyao Xiao; Ligang Liu; Juyong Zhang |
|
163b | Sparse to Dense Dynamic 3D Facial Expression Generation |
Naima Otberdout; Claudio Ferrari; Mohamed Daoudi; Stefano Berretti; Alberto Del Bimbo |
|
164b | Learning To Listen: Modeling Non-Deterministic Dyadic Facial Motion |
Evonne Ng; Hanbyul Joo; Liwen Hu; Hao Li; Trevor Darrell; Angjoo Kanazawa; Shiry Ginosar |
|
165b | Speech Driven Tongue Animation |
Salvador Medina; Denis Tome; Carsten Stoll; Mark Tiede; Kevin Munhall; Alexander G. Hauptmann; Iain Matthews |
|
166b | Knowledge-Driven Self-Supervised Representation Learning for Facial Action Unit Recognition | Yanan Chang; Shangfei Wang | |
167b | gDNA: Towards Generative Detailed Neural Avatars |
Xu Chen; Tianjian Jiang; Jie Song; Jinlong Yang; Michael J. Black; Andreas Geiger; Otmar Hilliges |
|
168b | GraFormer: Graph-Oriented Transformer for 3D Pose Estimation | Weixi Zhao; Weiqiang Wang; Yunjie Tian | |
169b | Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation |
Jogendra Nath Kundu; Siddharth Seth; Pradyumna YM; Varun Jampani; Anirban Chakraborty; R. Venkatesh Babu |
|
170b | Towards Diverse and Natural Scene-Aware 3D Human Motion Synthesis |
Jingbo Wang; Yu Rong; Jingyuan Liu; Sijie Yan; Dahua Lin; Bo Dai |
|
171b | PINA: Learning a Personalized Implicit Neural Avatar From a Single RGB-D Video Sequence |
Zijian Dong; Chen Guo; Jie Song; Xu Chen; Andreas Geiger; Otmar Hilliges |
|
172b | The Wanderings of Odysseus in 3D Scenes | Yan Zhang; Siyu Tang | |
173b | OSSO: Obtaining Skeletal Shape From Outside |
Marilyn Keller; Silvia Zuffi; Michael J. Black; Sergi Pujades |
|
174b | LiDARCap: Long-Range Marker-Less 3D Human Motion Capture With LiDAR Point Clouds |
Jialian Li; Jingyi Zhang; Zhiyong Wang; Siqi Shen; Chenglu Wen; Yuexin Ma; Lan Xu; Jingyi Yu; Cheng Wang |
|
175b | Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression |
Qiang Li; Jingjing Wang; Zhaoliang Yao; Yachun Li; Pengju Yang; Jingwei Yan; Chunmao Wang; Shiliang Pu |
|
176b | Spatial-Temporal Parallel Transformer for Arm-Hand Dynamic Estimation |
Shuying Liu; Wenbin Wu; Jiaxian Wu; Yue Lin |
|
177b | LISA: Learning Implicit Shape and Appearance of Hands |
Enric Corona; Tomas Hodan; Minh Vo; Francesc Moreno-Noguer; Chris Sweeney; Richard Newcombe; Lingni Ma |
|
178b | MobRecon: Mobile-Friendly Hand Mesh Reconstruction From Monocular Image |
Xingyu Chen; Yufeng Liu; Yajiao Dong; Xiong Zhang; Chongyang Ma; Yanmin Xiong; Yuan Zhang; Xiaoyan Guo |
|
179b | Mining Multi-View Information: A Strong Self-Supervised Framework for Depth-Based 3D Hand Pose and Mesh Estimation |
Pengfei Ren; Haifeng Sun; Jiachang Hao; Jingyu Wang; Qi Qi; Jianxin Liao |
|
180b | Low-Resource Adaptation for Personalized Co-Speech Gesture Generation |
Chaitanya Ahuja; Dong Won Lee; Louis-Philippe Morency |
|
181b | D-Grasp: Physically Plausible Dynamic Grasp Synthesis for Hand-Object Interactions |
Sammy Christen; Muhammed Kocabas; Emre Aksan; Jemin Hwangbo; Jie Song; Otmar Hilliges |
|
Medical, Biological and Cell Microscopy | 182b | Synthetic Generation of Face Videos With Plethysmograph Physiology |
Zhen Wang; Yunhao Ba; Pradyumna Chari; Oyku Deniz Bozkurt; Gianna Brown; Parth Patwa; Niranjan Vaddi; Laleh Jalilian; Achuta Kadambi |
183b | Contour-Hugging Heatmaps for Landmark Detection | James McCouat; Irina Voiculescu | |
184b | Which Images To Label for Few-Shot Medical Landmark Detection? |
Quan Quan; Qingsong Yao; Jun Li; S. Kevin Zhou |
|
185b | Self-Supervised Bulk Motion Artifact Removal in Optical Coherence Tomography Angiography |
Jiaxiang Ren; Kicheon Park; Yingtian Pan; Haibin Ling |
|
186b | Multi-Marginal Contrastive Learning for Multi-Label Subcellular Protein Localization | Ziyi Liu; Zengmao Wang; Bo Du | |
187b | Transformer-Empowered Multi-Scale Contextual Matching and Aggregation for Multi-Contrast MRI Super-Resolution |
Guangyuan Li; Jun Lv; Yapeng Tian; Qi Dou; Chengyan Wang; Chenliang Xu; Jing Qin |
|
188b | Harmony: A Generic Unsupervised Approach for Disentangling Semantic Content From Parameterized Transformations |
Mostofa Rafid Uddin; Gregory Howe; Xiangrui Zeng; Min Xu |
|
189b | Cross-Modal Clinical Graph Transformer for Ophthalmic Report Generation |
Mingjie Li; Wenjia Cai; Karin Verspoor; Shirui Pan; Xiaodan Liang; Xiaojun Chang |
|
190b | BoostMIS: Boosting Medical Image Semi-Supervised Learning With Adaptive Pseudo Labeling and Informative Active Annotation |
Wenqiao Zhang; Lei Zhu; James Hallinan; Shengyu Zhang; Andrew Makmur; Qingpeng Cai; Beng Chin Ooi |
|
191b | Incremental Cross-View Mutual Distillation for Self-Supervised Medical CT Synthesis |
Chaowei Fang; Liang Wang; Dingwen Zhang; Jun Xu; Yixuan Yuan; Junwei Han |
|
192b | Towards Low-Cost and Efficient Malaria Detection |
Waqas Sultani; Wajahat Nawaz; Syed Javed; Muhammad Sohail Danish; Asma Saadia; Mohsen Ali |
|
193b | ACPL: Anti-Curriculum Pseudo-Labelling for Semi-Supervised Medical Image Classification |
Fengbei Liu; Yu Tian; Yuanhong Chen; Yuyuan Liu; Vasileios Belagiannis; Gustavo Carneiro |
|
194b | Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification |
Zongbo Han; Fan Yang; Junzhou Huang; Changqing Zhang; Jianhua Yao |
|
195b | M3T: Three-Dimensional Medical Image Classifier Using Multi-Plane and Multi-Slice Transformer | Jinseong Jang; Dosik Hwang | |
196b | Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis |
Yucheng Tang; Dong Yang; Wenqi Li; Holger R. Roth; Bennett Landman; Daguang Xu; Vishwesh Nath; Ali Hatamizadeh |
|
197b | HyperSegNAS: Bridging One-Shot Neural Architecture Search With 3D Medical Image Segmentation Using HyperNet |
Cheng Peng; Andriy Myronenko; Ali Hatamizadeh; Vishwesh Nath; Md Mahfuzur Rahman Siddiquee; Yufan He; Daguang Xu; Rama Chellappa; Dong Yang |
|
198b | DArch: Dental Arch Prior-Assisted 3D Tooth Instance Segmentation With Weak Annotations |
Liangdong Qiu; Chongjie Ye; Pei Chen; Yunbi Liu; Xiaoguang Han; Shuguang Cui |
|
199b | Clean Implicit 3D Structure From Noisy 2D STEM Images |
Hannah Kniesel; Timo Ropinski; Tim Bergner; Kavitha Shaga Devan; Clarissa Read; Paul Walther; Tobias Ritschel; Pedro Hermosilla |
|
200b | Vox2Cortex: Fast Explicit Reconstruction of Cortical Surfaces From 3D MRI Scans With Geometric Deep Neural Networks |
Fabian Bongratz; Anne-Marie Rickmann; Sebastian Pölsterl; Christian Wachinger |
|
201b | Aladdin: Joint Atlas Building and Diffeomorphic Registration Learning With Pairwise Alignment | Zhipeng Ding; Marc Niethammer | |
202b | Learning Optimal K-Space Acquisition and Reconstruction Using Physics-Informed Neural Networks | Wei Peng; Li Feng; Guoying Zhao; Fang Liu | |
203b | NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration |
Yifan Wu; Tom Z. Jiahao; Jiancong Wang; Paul A. Yushkevich; M. Ani Hsieh; James C. Gee |
|
204b | SMPL-A: Modeling Person-Specific Deformable Anatomy |
Hengtao Guo; Benjamin Planche; Meng Zheng; Srikrishna Karanam; Terrence Chen; Ziyan Wu |
|
205b | DiRA: Discriminative, Restorative, and Adversarial Learning for Self-Supervised Medical Image Analysis |
Fatemeh Haghighi; Mohammad Reza Hosseinzadeh Taher; Michael B. Gotway; Jianming Liang |
|
206b | Affine Medical Image Registration With Coarse-To-Fine Vision Transformer | Tony C. W. Mok; Albert C. S. Chung | |
207b | Topology-Preserving Shape Reconstruction and Registration via Neural Diffeomorphic Flow |
Shanlin Sun; Kun Han; Deying Kong; Hao Tang; Xiangyi Yan; Xiaohui Xie |
|
208b | Generalizable Cross-Modality Medical Image Segmentation via Style Augmentation and Dual Normalization |
Ziqi Zhou; Lei Qi; Xin Yang; Dong Ni; Yinghuan Shi |
|
209b | Closing the Generalization Gap of Cross-Silo Federated Medical Image Segmentation |
An Xu; Wenqi Li; Pengfei Guo; Dong Yang; Holger R. Roth; Ali Hatamizadeh; Can Zhao; Daguang Xu; Heng Huang; Ziyue Xu |
|
210b | FIBA: Frequency-Injection Based Backdoor Attack in Medical Image Analysis |
Yu Feng; Benteng Ma; Jing Zhang; Shanshan Zhao; Yong Xia; Dacheng Tao |
|
211b | Surpassing the Human Accuracy: Detecting Gallbladder Cancer From USG Images With Curriculum Learning |
Soumen Basu; Mayank Gupta; Pratyaksha Rana; Pankaj Gupta; Chetan Arora |
|
212b | CellTypeGraph: A New Geometric Computer Vision Benchmark |
Lorenzo Cerrone; Athul Vijayan; Tejasvinee Mody; Kay Schneitz; Fred A. Hamprecht |
|
213b | ContIG: Self-Supervised Multimodal Contrastive Learning for Medical Imaging With Genetics |
Aiham Taleb; Matthias Kirchler; Remo Monti; Christoph Lippert |
|
Datasets and Evaluation | 214b | FERV39k: A Large-Scale Multi-Scene Dataset for Facial Expression Recognition in Videos |
Yan Wang; Yixuan Sun; Yiwen Huang; Zhongying Liu; Shuyong Gao; Wei Zhang; Weifeng Ge; Wenqiang Zhang |
215b | Multi-Dimensional, Nuanced and Subjective – Measuring the Perception of Facial Expressions |
De'Aira Bryant; Siqi Deng; Nashlie Sephus; Wei Xia; Pietro Perona |
|
216b | DAD-3DHeads: A Large-Scale Dense, Accurate and Diverse Dataset for 3D Head Alignment From a Single Image |
Tetiana Martyniuk; Orest Kupyn; Yana Kurlyak; Igor Krashenyi; Jiří Matas; Viktoriia Sharmanska |
|
217b | OakInk: A Large-Scale Knowledge Repository for Understanding Hand-Object Interaction |
Lixin Yang; Kailin Li; Xinyu Zhan; Fei Wu; Anran Xu; Liu Liu; Cewu Lu |
|
218b | PoseTrack21: A Dataset for Person Search, Multi-Object Tracking and Multi-Person Pose Tracking |
Andreas Döring; Di Chen; Shanshan Zhang; Bernt Schiele; Jürgen Gall |
|
219b | Learning Modal-Invariant and Temporal-Memory for Video-Based Visible-Infrared Person Re-Identification |
Xinyu Lin; Jinxing Li; Zeyu Ma; Huafeng Li; Shuang Li; Kaixiong Xu; Guangming Lu; David Zhang |
|
220b | JRDB-Act: A Large-Scale Dataset for Spatio-Temporal Action, Social Group and Activity Detection |
Mahsa Ehsanpour; Fatemeh Saleh; Silvio Savarese; Ian Reid; Hamid Rezatofighi |
|
221b | DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion |
Peize Sun; Jinkun Cao; Yi Jiang; Zehuan Yuan; Song Bai; Kris Kitani; Ping Luo |
|
222b | Egocentric Prediction of Action Target in 3D |
Yiming Li; Ziang Cao; Andrew Liang; Benjamin Liang; Luoyao Chen; Hang Zhao; Chen Feng |
|
223b | HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction |
Yunze Liu; Yun Liu; Che Jiang; Kangbo Lyu; Weikang Wan; Hao Shen; Boqiang Liang; Zhoujie Fu; He Wang; Li Yi |
|
224b | Amodal Panoptic Segmentation | Rohit Mohan; Abhinav Valada | |
225b | Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark |
Jiaxu Miao; Xiaohan Wang; Yu Wu; Wei Li; Xu Zhang; Yunchao Wei; Yi Yang |
|
226b | YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset |
Donglai Wei; Siddhant Kharbanda; Sarthak Arora; Roshan Roy; Nishant Jain; Akash Palrecha; Tanav Shah; Shray Mathur; Ritik Mathur; Abhijay Kemkar; Anirudh Chakravarthy; Zudi Lin; Won-Dong Jang; Yansong Tang; Song Bai; James Tompkin; Philip H.S. Torr; Hanspeter Pfister |
|
227b | The DEVIL Is in the Details: A Diagnostic Evaluation Benchmark for Video Inpainting | Ryan Szeto; Jason J. Corso | |
228b | 3MASSIV: Multilingual, Multimodal and Multi-Aspect Dataset of Social Media Short Videos |
Vikram Gupta; Trisha Mittal; Puneet Mathur; Vaibhav Mishra; Mayank Maheshwari; Aniket Bera; Debdoot Mukherjee; Dinesh Manocha |
|
229b | AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval |
Riku Togashi; Mayu Otani; Yuta Nakashima; Esa Rahtu; Janne Heikkilä; Tetsuya Sakai |
|
230b | A Large-Scale Comprehensive Dataset and Copy-Overlap Aware Evaluation Protocol for Segment-Level Video Copy Detection |
Sifeng He; Xudong Yang; Chen Jiang; Gang Liang; Wei Zhang; Tan Pan; Qing Wang; Furong Xu; Chunguang Li; JinXiong Liu; Hui Xu; Kaiming Huang; Yuan Cheng; Feng Qian; Xiaobo Zhang; Lei Yang |
|
231b | Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities |
Fadime Sener; Dibyadip Chatterjee; Daniel Shelepov; Kun He; Dipika Singhania; Robert Wang; Angela Yao |
|
232b | Optimal Correction Cost for Object Detection Evaluation |
Mayu Otani; Riku Togashi; Yuta Nakashima; Esa Rahtu; Janne Heikkilä; Shin'ichi Satoh |
|
233b | GrainSpace: A Large-Scale Dataset for Fine-Grained and Domain-Adaptive Recognition of Cereal Grains |
Lei Fan; Yiwen Ding; Dongdong Fan; Donglin Di; Maurice Pagnucco; Yang Song |
|
234b | ABO: Dataset and Benchmarks for Real-World 3D Object Understanding |
Jasmine Collins; Shubham Goel; Kenan Deng; Achleshwar Luthra; Leon Xu; Erhan Gundogdu; Xi Zhang; Tomas F. Yago Vicente; Thomas Dideriksen; Himanshu Arora; Matthieu Guillaumin; Jitendra Malik |
|
235b | Improving Segmentation of the Inferior Alveolar Nerve Through Deep Label Propagation |
Marco Cipriano; Stefano Allegretti; Federico Bolelli; Federico Pollastri; Costantino Grana |
|
236b | ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes |
Dina Bashkirova; Mohamed Abdelfattah; Ziliang Zhu; James Akl; Fadi Alladkani; Ping Hu; Vitaly Ablavsky; Berk Calli; Sarah Adel Bargal; Kate Saenko |
|
237b | DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation |
Aysim Toker; Lukas Kondmann; Mark Weber; Marvin Eisenberger; Andrés Camero; Jingliang Hu; Ariadna Pregel Hoderlein; Çağlar Şenaras; Timothy Davis; Daniel Cremers; Giovanni Marchisio; Xiao Xiang Zhu; Laura Leal-Taixé |
|
238b | Open Challenges in Deep Stereo: The Booster Dataset |
Pierluigi Zama Ramirez; Fabio Tosi; Matteo Poggi; Samuele Salti; Stefano Mattoccia; Luigi Di Stefano |
|
239b | No-Reference Point Cloud Quality Assessment via Domain Adaptation |
Qi Yang; Yipeng Liu; Siheng Chen; Yiling Xu; Jun Sun |
|
240b | Exploring Endogenous Shift for Cross-Domain Detection: A Large-Scale Benchmark and Perturbation Suppression Network |
Renshuai Tao; Hainan Li; Tianbo Wang; Yanlu Wei; Yifu Ding; Bowei Jin; Hongping Zhi; Xianglong Liu; Aishan Liu |
|
241b | How Good Is Aesthetic Ability of a Fashion Model? |
Xingxing Zou; Kaicheng Pang; Wen Zhang; Waikeung Wong |
|
242b | Instance-Wise Occlusion and Depth Orders in Natural Scenes | Hyunmin Lee; Jaesik Park | |
243b | PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation With Photometrically Challenging Objects |
Pengyuan Wang; HyunJun Jung; Yitong Li; Siyuan Shen; Rahul Parthasarathy Srikanth; Lorenzo Garattoni; Sven Meier; Nassir Navab; Benjamin Busam |
|
244b | Replacing Labeled Real-Image Datasets With Auto-Generated Contours |
Hirokatsu Kataoka; Ryo Hayamizu; Ryosuke Yamada; Kodai Nakashima; Sora Takashima; Xinyu Zhang; Edgar Josafat Martinez-Noriega; Nakamasa Inoue; Rio Yokota |
|
245b | V2C: Visual Voice Cloning |
Qi Chen; Mingkui Tan; Yuankai Qi; Jiaqiu Zhou; Yuanqing Li; Qi Wu |
|
246b | M5Product: Self-Harmonized Contrastive Learning for E-Commercial Multi-Modal Pretraining |
Xiao Dong; Xunlin Zhan; Yangxin Wu; Yunchao Wei; Michael C. Kampffmeyer; Xiaoyong Wei; Minlong Lu; Yaowei Wang; Xiaodan Liang |
|
247b | It Is Okay To Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection |
Youssef Mohamed; Faizan Farooq Khan; Kilichbek Haydarov; Mohamed Elhoseiny |
|
248b | From Representation to Reasoning: Towards Both Evidence and Commonsense Reasoning for Video Question-Answering | Jiangtong Li; Li Niu; Liqing Zhang | |
249b | Point Cloud Pre-Training With Natural 3D Structures |
Ryosuke Yamada; Hirokatsu Kataoka; Naoya Chiba; Yukiyasu Domae; Tetsuya Ogata |
|
250b | The Auto Arborist Dataset: A Large-Scale Benchmark for Multiview Urban Forest Monitoring Under Domain Shift |
Sara Beery; Guanhang Wu; Trevor Edwards; Filip Pavetic; Bo Majewski; Shreyasee Mukherjee; Stanley Chan; John Morgan; Vivek Rathod; Jonathan Huang |
|
251b | AutoMine: An Unmanned Mine Dataset |
Yuchen Li; Zixuan Li; Siyu Teng; Yu Zhang; Yuhang Zhou; Yuchang Zhu; Dongpu Cao; Bin Tian; Yunfeng Ai; Zhe Xuanyuan; Long Chen |
|
252b | SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis |
Anastasiia Kornilova; Marsel Faizullin; Konstantin Pakulev; Andrey Sadkov; Denis Kukushkin; Azat Akhmetyanov; Timur Akhtyamov; Hekmat Taherinejad; Gonzalo Ferrer |
|
253b | BigDatasetGAN: Synthesizing ImageNet With Pixel-Wise Annotations |
Daiqing Li; Huan Ling; Seung Wook Kim; Karsten Kreis; Sanja Fidler; Antonio Torralba |
|
254b | Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task |
Xiaoqing Ye; Mao Shu; Hanyu Li; Yifeng Shi; Yingying Li; Guangjie Wang; Xiao Tan; Errui Ding |
|
255b | Unifying Panoptic Segmentation for Autonomous Driving |
Oliver Zendel; Matthias Schörghuber; Bernhard Rainer; Markus Murschitz; Csaba Beleznai |
|
256b | DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection |
Haibao Yu; Yizhen Luo; Mao Shu; Yiyi Huo; Zebang Yang; Yifeng Shi; Zhenglong Guo; Hanyu Li; Xing Hu; Jirui Yuan; Zaiqing Nie |
|
257b | SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation |
Tao Sun; Mattia Segu; Janis Postels; Yuxuan Wang; Luc Van Gool; Bernt Schiele; Federico Tombari; Fisher Yu |
|
258b | Ithaca365: Dataset and Driving Perception Under Repeated and Challenging Weather Conditions |
Carlos A. Diaz-Ruiz; Youya Xia; Yurong You; Jose Nino; Junan Chen; Josephine Monica; Xiangyu Chen; Katie Luo; Yan Wang; Marc Emond; Wei-Lun Chao; Bharath Hariharan; Kilian Q. Weinberger; Mark Campbell |