Presentation Schedule
All times are in the Central time zone.
Date: Thursday, June 23, 2022 2:30PM – 5:00PM
Session Title | Poster ID | Title | Authors |
Video Analysis & Understanding | 46b | UnweaveNet: Unweaving Activity Stories | Will Price; Carl Vondrick; Dima Damen |
47b | Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos | Reza Ghoddoosian; Isht Dwivedi; Nakul Agarwal; Chiho Choi; Behzad Dariush |
48b | Audio-Adaptive Activity Recognition Across Video Domains | Yunhua Zhang; Hazel Doughty; Ling Shao; Cees G. M. Snoek |
49b | Frame-Wise Action Representations for Long Videos via Sequence Contrastive Learning | Minghao Chen; Fangyun Wei; Chong Li; Deng Cai |
50b | Image Based Reconstruction of Liquids From 2D Surface Detections | Florian Richter; Ryan K. Orosco; Michael C. Yip |
51b | Learning From Untrimmed Videos: Self-Supervised Video Representation Learning With Hierarchical Consistency | Zhiwu Qing; Shiwei Zhang; Ziyuan Huang; Yi Xu; Xiang Wang; Mingqian Tang; Changxin Gao; Rong Jin; Nong Sang |
52b | How Do You Do It? Fine-Grained Action Understanding With Pseudo-Adverbs | Hazel Doughty; Cees G. M. Snoek |
53b | Programmatic Concept Learning for Human Motion Description and Synthesis | Sumith Kulal; Jiayuan Mao; Alex Aiken; Jiajun Wu |
54b | Learning To Recognize Procedural Activities With Distant Supervision | Xudong Lin; Fabio Petroni; Gedas Bertasius; Marcus Rohrbach; Shih-Fu Chang; Lorenzo Torresani |
55b | Implicit Motion Handling for Video Camouflaged Object Detection | Xuelian Cheng; Huan Xiong; Deng-Ping Fan; Yiran Zhong; Mehrtash Harandi; Tom Drummond; Zongyuan Ge |
56b | Dynamic Scene Graph Generation via Anticipatory Pre-Training | Yiming Li; Xiaoshan Yang; Changsheng Xu |
57b | Learning To Refactor Action and Co-Occurrence Features for Temporal Action Localization | Kun Xia; Le Wang; Sanping Zhou; Nanning Zheng; Wei Tang |
58b | OCSampler: Compressing Videos to One Clip With Single-Step Sampling | Jintao Lin; Haodong Duan; Kai Chen; Dahua Lin; Limin Wang |
59b | A Hybrid Egocentric Activity Anticipation Framework via Memory-Augmented Recurrent and One-Shot Representation Forecasting | Tianshan Liu; Kin-Man Lam |
60b | TubeFormer-DeepLab: Video Mask Transformer | Dahun Kim; Jun Xie; Huiyu Wang; Siyuan Qiao; Qihang Yu; Hong-Seok Kim; Hartwig Adam; In So Kweon; Liang-Chieh Chen |
61b | ASM-Loc: Action-Aware Segment Modeling for Weakly-Supervised Temporal Action Localization | Bo He; Xitong Yang; Le Kang; Zhiyu Cheng; Xin Zhou; Abhinav Shrivastava |
62b | A Graph Matching Perspective With Transformers on Video Instance Segmentation | Zheyun Qin; Xiankai Lu; Xiushan Nie; Yilong Yin; Jianbing Shen |
63b | STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction | Zheng Chang; Xinfeng Zhang; Shanshe Wang; Siwei Ma; Wen Gao |
64b | Look for the Change: Learning Object States and State-Modifying Actions From Untrimmed Web Videos | Tomáš Souček; Jean-Baptiste Alayrac; Antoine Miech; Ivan Laptev; Josef Sivic |
65b | End-to-End Compressed Video Representation Learning for Generic Event Boundary Detection | Congcong Li; Xinyao Wang; Longyin Wen; Dexiang Hong; Tiejian Luo; Libo Zhang |
66b | Contextualized Spatio-Temporal Contrastive Learning With Self-Supervision | Liangzhe Yuan; Rui Qian; Yin Cui; Boqing Gong; Florian Schroff; Ming-Hsuan Yang; Hartwig Adam; Ting Liu |
67b | Deep Anomaly Discovery From Unlabeled Videos via Normality Advantage and Self-Paced Refinement | Guang Yu; Siqi Wang; Zhiping Cai; Xinwang Liu; Chuanfu Xu; Chengkun Wu |
68b | A Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information | Matthew Kowal; Mennatullah Siam; Md Amirul Islam; Neil D. B. Bruce; Richard P. Wildes; Konstantinos G. Derpanis |
69b | Long-Short Temporal Contrastive Learning of Video Transformers | Jue Wang; Gedas Bertasius; Du Tran; Lorenzo Torresani |
70b | Scene Consistency Representation Learning for Video Scene Segmentation | Haoqian Wu; Keyu Chen; Yanan Luo; Ruizhi Qiao; Bo Ren; Haozhe Liu; Weicheng Xie; Linlin Shen |
71b | Unsupervised Pre-Training for Temporal Action Localization Tasks | Can Zhang; Tianyu Yang; Junwu Weng; Meng Cao; Jue Wang; Yuexian Zou |
72b | Contrastive Learning for Unsupervised Video Highlight Detection | Taivanbat Badamdorj; Mrigank Rochan; Yang Wang; Li Cheng |
73b | Deformable Video Transformer | Jue Wang; Lorenzo Torresani |
74b | Recurring the Transformer for Video Action Recognition | Jiewen Yang; Xingbo Dong; Liujun Liu; Chao Zhang; Jiajun Shen; Dahai Yu |
Recognition: Detection, Categorization, Retrieval | 75b | Open-Vocabulary One-Stage Detection With Hierarchical Visual-Language Knowledge Distillation | Zongyang Ma; Guan Luo; Jin Gao; Liang Li; Yuxin Chen; Shaoru Wang; Congxuan Zhang; Weiming Hu |
76b | Learning To Prompt for Open-Vocabulary Object Detection With Vision-Language Model | Yu Du; Fangyun Wei; Zihe Zhang; Miaojing Shi; Yue Gao; Guoqi Li |
77b | Sign Language Video Retrieval With Free-Form Textual Queries | Amanda Duarte; Samuel Albanie; Xavier Giró-i-Nieto; Gül Varol |
78b | FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback | Sonam Goenka; Zhaoheng Zheng; Ayush Jaiswal; Rakesh Chada; Yue Wu; Varsha Hedau; Pradeep Natarajan |
79b | Pushing the Performance Limit of Scene Text Recognizer Without Human Annotation | Caiyuan Zheng; Hui Li; Seon-Min Rhee; Seungju Han; Jae-Joon Han; Peng Wang |
80b | ESCNet: Gaze Target Detection With the Understanding of 3D Scenes | Jun Bao; Buyu Liu; Jun Yu |
81b | Interactive Multi-Class Tiny-Object Detection | Chunggi Lee; Seonwook Park; Heon Song; Jeongun Ryu; Sanghoon Kim; Haejoon Kim; Sérgio Pereira; Donggeun Yoo |
82b | Weakly Supervised Rotation-Invariant Aerial Object Detection Network | Xiaoxu Feng; Xiwen Yao; Gong Cheng; Junwei Han |
83b | Large Loss Matters in Weakly Supervised Multi-Label Classification | Youngwook Kim; Jae Myung Kim; Zeynep Akata; Jungwoo Lee |
84b | MetaFSCIL: A Meta-Learning Approach for Few-Shot Class Incremental Learning | Zhixiang Chi; Li Gu; Huan Liu; Yang Wang; Yuanhao Yu; Jin Tang |
85b | FreeSOLO: Learning To Segment Objects Without Annotations | Xinlong Wang; Zhiding Yu; Shalini De Mello; Jan Kautz; Anima Anandkumar; Chunhua Shen; Jose M. Alvarez |
86b | Revisiting AP Loss for Dense Object Detection: Adaptive Ranking Pair Selection | Dongli Xu; Jinhong Deng; Wen Li |
87b | SIOD: Single Instance Annotated per Category per Image for Object Detection | Hanjun Li; Xingjia Pan; Ke Yan; Fan Tang; Wei-Shi Zheng |
88b | Towards Robust Adaptive Object Detection Under Noisy Annotations | Xinyu Liu; Wuyang Li; Qiushi Yang; Baopu Li; Yixuan Yuan |
89b | Task-Specific Inconsistency Alignment for Domain Adaptive Object Detection | Liang Zhao; Limin Wang |
90b | Salvage of Supervision in Weakly Supervised Object Detection | Lin Sui; Chen-Lin Zhang; Jianxin Wu |
91b | Label, Verify, Correct: A Simple Few Shot Object Detection Method | Prannay Kaul; Weidi Xie; Andrew Zisserman |
92b | Background Activation Suppression for Weakly Supervised Object Localization | Pingyu Wu; Wei Zhai; Yang Cao |
93b | Bridging the Gap Between Classification and Localization for Weakly Supervised Object Localization | Eunji Kim; Siwon Kim; Jungbeom Lee; Hyunwoo Kim; Sungroh Yoon |
94b | Divide and Conquer: Compositional Experts for Generalized Novel Class Discovery | Muli Yang; Yuehua Zhu; Jiaping Yu; Aming Wu; Cheng Deng |
95b | Cloth-Changing Person Re-Identification From a Single Image With Gait Prediction and Regularization | Xin Jin; Tianyu He; Kecheng Zheng; Zhiheng Yin; Xu Shen; Zhen Huang; Ruoyu Feng; Jianqiang Huang; Zhibo Chen; Xian-Sheng Hua |
96b | Lifelong Unsupervised Domain Adaptive Person Re-Identification With Coordinated Anti-Forgetting and Adaptation | Zhipeng Huang; Zhizheng Zhang; Cuiling Lan; Wenjun Zeng; Peng Chu; Quanzeng You; Jiang Wang; Zicheng Liu; Zheng-Jun Zha |
97b | Unleashing Potential of Unsupervised Pre-Training With Intra-Identity Regularization for Person Re-Identification | Zizheng Yang; Xin Jin; Kecheng Zheng; Feng Zhao |
98b | Learning With Twin Noisy Labels for Visible-Infrared Person Re-Identification | Mouxing Yang; Zhenyu Huang; Peng Hu; Taihao Li; Jiancheng Lv; Xi Peng |
99b | Towards Total Recall in Industrial Anomaly Detection | Karsten Roth; Latha Pemula; Joaquin Zepeda; Bernhard Schölkopf; Thomas Brox; Peter Gehler |
100b | H2FA R-CNN: Holistic and Hierarchical Feature Alignment for Cross-Domain Weakly Supervised Object Detection | Yunqiu Xu; Yifan Sun; Zongxin Yang; Jiaxu Miao; Yi Yang |
101b | Geometric and Textural Augmentation for Domain Gap Reduction | Xiao-Chang Liu; Yong-Liang Yang; Peter Hall |
102b | General Incremental Learning With Domain-Aware Categorical Representations | Jiangwei Xie; Shipeng Yan; Xuming He |
103b | DST: Dynamic Substitute Training for Data-Free Black-Box Attack | Wenxuan Wang; Xuelin Qian; Yanwei Fu; Xiangyang Xue |
104b | ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation | Ruibin Wang; Yibo Yang; Dacheng Tao |
Self-, Semi-, Meta-, & Unsupervised Learning | 105b | Label Matching Semi-Supervised Object Detection | Binbin Chen; Weijie Chen; Shicai Yang; Yunyi Xuan; Jie Song; Di Xie; Shiliang Pu; Mingli Song; Yueting Zhuang |
106b | Multidimensional Belief Quantification for Label-Efficient Meta-Learning | Deep Shankar Pandey; Qi Yu |
107b | Propagation Regularizer for Semi-Supervised Learning With Extremely Scarce Labeled Samples | Noo-ri Kim; Jee-Hyong Lee |
108b | Learning To Affiliate: Mutual Centralized Learning for Few-Shot Classification | Yang Liu; Weifeng Zhang; Chao Xiang; Tu Zheng; Deng Cai; Xiaofei He |
109b | Class-Aware Contrastive Semi-Supervised Learning | Fan Yang; Kai Wu; Shuyi Zhang; Guannan Jiang; Yong Liu; Feng Zheng; Wei Zhang; Chengjie Wang; Long Zeng |
110b | Exploring the Equivalence of Siamese Self-Supervised Learning via a Unified Gradient Framework | Chenxin Tao; Honghui Wang; Xizhou Zhu; Jiahua Dong; Shiji Song; Gao Huang; Jifeng Dai |
111b | Dual Temperature Helps Contrastive Learning Without Many Negative Samples: Towards Understanding and Simplifying MoCo | Chaoning Zhang; Kang Zhang; Trung X. Pham; Axi Niu; Zhinan Qiao; Chang D. Yoo; In So Kweon |
112b | Learning Where To Learn in Cross-View Self-Supervised Learning | Lang Huang; Shan You; Mingkai Zheng; Fei Wang; Chen Qian; Toshihiko Yamasaki |
113b | Dist-PU: Positive-Unlabeled Learning From a Label Distribution Perspective | Yunrui Zhao; Qianqian Xu; Yangbangyan Jiang; Peisong Wen; Qingming Huang |
114b | SimMatch: Semi-Supervised Learning With Similarity Matching | Mingkai Zheng; Shan You; Lang Huang; Fei Wang; Chen Qian; Chang Xu |
115b | Active Teacher for Semi-Supervised Object Detection | Peng Mi; Jianghang Lin; Yiyi Zhou; Yunhang Shen; Gen Luo; Xiaoshuai Sun; Liujuan Cao; Rongrong Fu; Qiang Xu; Rongrong Ji |
116b | Not All Labels Are Equal: Rationalizing the Labeling Costs for Training Object Detection | Ismail Elezi; Zhiding Yu; Anima Anandkumar; Laura Leal-Taixé; Jose M. Alvarez |
117b | Self-Supervised Learning of Object Parts for Semantic Segmentation | Adrian Ziegler; Yuki M. Asano |
118b | MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection | JongMok Kim; JooYoung Jang; Seunghyeon Seo; Jisoo Jeong; Jongkeun Na; Nojun Kwak |
119b | Scale-Equivalent Distillation for Semi-Supervised Object Detection | Qiushan Guo; Yao Mu; Jianyu Chen; Tianqi Wang; Yizhou Yu; Ping Luo |
120b | A Self-Supervised Descriptor for Image Copy Detection | Ed Pizzi; Sreya Dutta Roy; Sugosh Nagavara Ravindra; Priya Goyal; Matthijs Douze |
121b | Self-Supervised Transformers for Unsupervised Object Discovery Using Normalized Cut | Yangtao Wang; Xi Shen; Shell Xu Hu; Yuan Yuan; James L. Crowley; Dominique Vaufreydaz |
122b | CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification | Philip Chikontwe; Soopil Kim; Sang Hyun Park |
123b | Semi-Supervised Few-Shot Learning via Multi-Factor Clustering | Jie Ling; Lei Liao; Meng Yang; Jia Shuai |
124b | CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning | Yue Fan; Dengxin Dai; Anna Kukleva; Bernt Schiele |
125b | Safe-Student for Safe Deep Semi-Supervised Learning With Unseen-Class Unlabeled Data | Rundong He; Zhongyi Han; Xiankai Lu; Yilong Yin |
126b | A Simple Data Mixing Prior for Improving Self-Supervised Learning | Sucheng Ren; Huiyu Wang; Zhengqi Gao; Shengfeng He; Alan Yuille; Yuyin Zhou; Cihang Xie |
127b | DETReg: Unsupervised Pretraining With Region Priors for Object Detection | Amir Bar; Xin Wang; Vadim Kantorov; Colorado J. Reed; Roei Herzig; Gal Chechik; Anna Rohrbach; Trevor Darrell; Amir Globerson |
128b | Sound and Visual Representation Learning With Multiple Pretraining Tasks | Arun Balajee Vasudevan; Dengxin Dai; Luc Van Gool |
129b | UniVIP: A Unified Framework for Self-Supervised Visual Pre-Training | Zhaowen Li; Yousong Zhu; Fan Yang; Wei Li; Chaoyang Zhao; Yingying Chen; Zhiyang Chen; Jiahao Xie; Liwei Wu; Rui Zhao; Ming Tang; Jinqiao Wang |
130b | Weakly Supervised Object Localization As Domain Adaption | Lei Zhu; Qi She; Qian Chen; Yunfei You; Boyu Wang; Yanye Lu |
131b | Debiased Learning From Naturally Imbalanced Pseudo-Labels | Xudong Wang; Zhirong Wu; Long Lian; Stella X. Yu |
132b | Towards Discovering the Effectiveness of Moderately Confident Samples for Semi-Supervised Learning | Hui Tang; Kui Jia |
133b | Masked Feature Prediction for Self-Supervised Visual Pre-Training | Chen Wei; Haoqi Fan; Saining Xie; Chao-Yuan Wu; Alan Yuille; Christoph Feichtenhofer |
134b | Contrastive Learning for Space-Time Correspondence via Self-Cycle Consistency | Jeany Son |
135b | Id-Free Person Similarity Learning | Bing Shuai; Xinyu Li; Kaustav Kundu; Joseph Tighe |
136b | End-to-End Semi-Supervised Learning for Video Action Detection | Akash Kumar; Yogesh Singh Rawat |
137b | Probabilistic Representations for Video Contrastive Learning | Jungin Park; Jiyoung Lee; Ig-Jae Kim; Kwanghoon Sohn |
138b | Interact Before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition | Lijin Yang; Yifei Huang; Yusuke Sugano; Yoichi Sato |
139b | BEVT: BERT Pretraining of Video Transformers | Rui Wang; Dongdong Chen; Zuxuan Wu; Yinpeng Chen; Xiyang Dai; Mengchen Liu; Yu-Gang Jiang; Luowei Zhou; Lu Yuan |
140b | Generative Cooperative Learning for Unsupervised Video Anomaly Detection | M. Zaigham Zaheer; Arif Mahmood; M. Haris Khan; Mattia Segu; Fisher Yu; Seung-Ik Lee |
141b | When Does Contrastive Visual Representation Learning Work? | Elijah Cole; Xuan Yang; Kimberly Wilber; Oisin Mac Aodha; Serge Belongie |
142b | The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization | M. Jehanzeb Mirza; Jakub Micorek; Horst Possegger; Horst Bischof |
143b | What Matters for Meta-Learning Vision Regression Tasks? | Ning Gao; Hanna Ziesche; Ngo Anh Vien; Michael Volpp; Gerhard Neumann |
Robot Vision | 144b | IFOR: Iterative Flow Minimization for Robotic Object Rearrangement | Ankit Goyal; Arsalan Mousavian; Chris Paxton; Yu-Wei Chao; Brian Okorn; Jia Deng; Dieter Fox |
145b | TCTrack: Temporal Contexts for Aerial Tracking | Ziang Cao; Ziyuan Huang; Liang Pan; Shiwei Zhang; Ziwei Liu; Changhong Fu |
146b | AKB-48: A Real-World Articulated Object Knowledge Base | Liu Liu; Wenqiang Xu; Haoyuan Fu; Sucheng Qian; Qiaojun Yu; Yang Han; Cewu Lu |
147b | 3DAC: Learning Attribute Compression for Point Clouds | Guangchi Fang; Qingyong Hu; Hanyun Wang; Yiling Xu; Yulan Guo |
148b | Simple but Effective: CLIP Embeddings for Embodied AI | Apoorv Khandelwal; Luca Weihs; Roozbeh Mottaghi; Aniruddha Kembhavi |
149b | Multi-Robot Active Mapping via Neural Bipartite Graph Matching | Kai Ye; Siyan Dong; Qingnan Fan; He Wang; Li Yi; Fei Xia; Jue Wang; Baoquan Chen |
150b | Continuous Scene Representations for Embodied AI | Samir Yitzhak Gadre; Kiana Ehsani; Shuran Song; Roozbeh Mottaghi |
151b | Interactron: Embodied Adaptive Object Detection | Klemen Kotar; Roozbeh Mottaghi |
152b | Online Learning of Reusable Abstract Models for Object Goal Navigation | Tommaso Campari; Leonardo Lamanna; Paolo Traverso; Luciano Serafini; Lamberto Ballan |
153b | RNNPose: Recurrent 6-DoF Object Pose Refinement With Robust Correspondence Field Estimation and Pose Optimization | Yan Xu; Kwan-Yee Lin; Guofeng Zhang; Xiaogang Wang; Hongsheng Li |
154b | UDA-COPE: Unsupervised Domain Adaptation for Category-Level Object Pose Estimation | Taeyeop Lee; Byeong-Uk Lee; Inkyu Shin; Jaesung Choe; Ukcheol Shin; In So Kweon; Kuk-Jin Yoon |
155b | Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation | Nathaniel Merrill; Yuliang Guo; Xingxing Zuo; Xinyu Huang; Stefan Leutenegger; Xi Peng; Liu Ren; Guoquan Huang |
156b | Upright-Net: Learning Upright Orientation for 3D Point Cloud | Xufang Pang; Feng Li; Ning Ding; Xiaopin Zhong |
Computer Vision for Social Good | 157b | DeepFake Disrupter: The Detector of DeepFake Is My Friend | Xueyu Wang; Jiajun Huang; Siqi Ma; Surya Nepal; Chang Xu |
158b | HybridCR: Weakly-Supervised 3D Point Cloud Semantic Segmentation via Hybrid Contrastive Regularization | Mengtian Li; Yuan Xie; Yunhang Shen; Bo Ke; Ruizhi Qiao; Bo Ren; Shaohui Lin; Lizhuang Ma |
159b | Open-Domain, Content-Based, Multi-Modal Fact-Checking of Out-of-Context Images via Online Resources | Sahar Abdelnabi; Rakibul Hasan; Mario Fritz |
160b | Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection | Alexandros Haliassos; Rodrigo Mira; Stavros Petridis; Maja Pantic |
Adversarial Attack & Defense | 161b | Transferable Sparse Adversarial Attack | Ziwen He; Wei Wang; Jing Dong; Tieniu Tan |
162b | Segment and Complete: Defending Object Detectors Against Adversarial Patch Attacks With Robust Patch Detection | Jiang Liu; Alexander Levine; Chun Pong Lau; Rama Chellappa; Soheil Feizi |
163b | Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability | Yifeng Xiong; Jiadong Lin; Min Zhang; John E. Hopcroft; Kun He |
164b | Improving Adversarial Transferability via Neuron Attribution-Based Attacks | Jianping Zhang; Weibin Wu; Jen-tse Huang; Yizhan Huang; Wenxuan Wang; Yuxin Su; Michael R. Lyu |
165b | Complex Backdoor Detection by Symmetric Feature Differencing | Yingqi Liu; Guangyu Shen; Guanhong Tao; Zhenting Wang; Shiqing Ma; Xiangyu Zhang |
166b | Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-Robust Makeup Transfer | Shengshan Hu; Xiaogeng Liu; Yechao Zhang; Minghui Li; Leo Yu Zhang; Hai Jin; Libing Wu |
167b | Zero-Query Transfer Attacks on Context-Aware Object Detectors | Zikui Cai; Shantanu Rane; Alejandro E. Brito; Chengyu Song; Srikanth V. Krishnamurthy; Amit K. Roy-Chowdhury; M. Salman Asif |
168b | 360-Attack: Distortion-Aware Perturbations From Perspective-Views | Yunjian Zhang; Yanwei Liu; Jinxia Liu; Jingbo Miao; Antonios Argyriou; Liming Wang; Zhen Xu |
169b | Label-Only Model Inversion Attacks via Boundary Repulsion | Mostafa Kahla; Si Chen; Hoang Anh Just; Ruoxi Jia |
170b | Merry Go Round: Rotate a Frame and Fool a DNN | Daksh Thapar; Aditya Nigam; Chetan Arora |
171b | Cross-Modal Transferable Adversarial Attacks From Images to Videos | Zhipeng Wei; Jingjing Chen; Zuxuan Wu; Yu-Gang Jiang |
172b | BppAttack: Stealthy and Efficient Trojan Attacks Against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning | Zhenting Wang; Juan Zhai; Shiqing Ma |
173b | Investigating Top-k White-Box and Transferable Black-Box Attack | Chaoning Zhang; Philipp Benz; Adil Karjauv; Jae Won Cho; Kang Zhang; In So Kweon |
174b | Boosting Black-Box Attack With Partially Transferred Conditional Adversarial Distribution | Yan Feng; Baoyuan Wu; Yanbo Fan; Li Liu; Zhifeng Li; Shu-Tao Xia |
175b | Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack | Ye Liu; Yaya Cheng; Lianli Gao; Xianglong Liu; Qilong Zhang; Jingkuan Song |
176b | Towards Efficient Data Free Black-Box Adversarial Attack | Jie Zhang; Bo Li; Jianghe Xu; Shuang Wu; Shouhong Ding; Lei Zhang; Chao Wu |
177b | Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network | Byung-Kwan Lee; Junho Kim; Yong Man Ro |
178b | Certified Patch Robustness via Smoothed Vision Transformers | Hadi Salman; Saachi Jain; Eric Wong; Aleksander Madry |
179b | Towards Practical Certifiable Patch Defense With Vision Transformer | Zhaoyu Chen; Bo Li; Jianghe Xu; Shuang Wu; Shouhong Ding; Wenqiang Zhang |
180b | On Adversarial Robustness of Trajectory Prediction for Autonomous Vehicles | Qingzhao Zhang; Shengtuo Hu; Jiachen Sun; Qi Alfred Chen; Z. Morley Mao |
181b | 3DeformRS: Certifying Spatial Deformations on Point Clouds | Gabriel Pérez S.; Juan C. Pérez; Motasem Alfarra; Silvio Giancola; Bernard Ghanem |
182b | Stereoscopic Universal Perturbations Across Different Architectures and Datasets | Zachary Berger; Parth Agrawal; Tian Yu Liu; Stefano Soatto; Alex Wong |
183b | Aug-NeRF: Training Stronger Neural Radiance Fields With Triple-Level Physically-Grounded Augmentations | Tianlong Chen; Peihao Wang; Zhiwen Fan; Zhangyang Wang |
184b | Bounded Adversarial Attack on Deep Content Features | Qiuling Xu; Guanhong Tao; Xiangyu Zhang |
185b | DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints | Zhendong Zhao; Xiaojun Chen; Yuexin Xuan; Ye Dong; Dakui Wang; Kaitai Liang |
186b | Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart | Tianyu Pang; Huishuai Zhang; Di He; Yinpeng Dong; Hang Su; Wei Chen; Jun Zhu; Tie-Yan Liu |
187b | Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness | Giulio Lovisotto; Nicole Finnie; Mauricio Munoz; Chaithanya Kumar Mummadi; Jan Hendrik Metzen |
188b | Improving the Transferability of Targeted Adversarial Examples Through Object-Based Diverse Input | Junyoung Byun; Seungju Cho; Myung-Joon Kwon; Hee-Seon Kim; Changick Kim |
189b | Adversarial Eigen Attack on Black-Box Models | Linjun Zhou; Peng Cui; Xingxuan Zhang; Yinan Jiang; Shiqiang Yang |
190b | Appearance and Structure Aware Robust Deep Visual Graph Matching: Attack, Defense and Beyond | Qibing Ren; Qingquan Bao; Runzhong Wang; Junchi Yan |
191b | Enhancing Adversarial Training With Second-Order Statistics of Weights | Gaojie Jin; Xinping Yi; Wei Huang; Sven Schewe; Xiaowei Huang |
192b | Towards Data-Free Model Stealing in a Hard Label Setting | Sunandini Sanyal; Sravanti Addepalli; R. Venkatesh Babu |
193b | Robust Structured Declarative Classifiers for 3D Point Clouds: Defending Adversarial Attacks With Implicit Gradients | Kaidong Li; Ziming Zhang; Cuncong Zhong; Guanghui Wang |
194b | DTA: Physical Camouflage Attacks Using Differentiable Transformation Network | Naufal Suryanto; Yongsu Kim; Hyoeun Kang; Harashta Tatimma Larasati; Youngyeo Yun; Thi-Thu-Huong Le; Hunmin Yang; Se-Yoon Oh; Howon Kim |
195b | Frequency-Driven Imperceptible Adversarial Attack on Semantic Similarity | Cheng Luo; Qinliang Lin; Weicheng Xie; Bizhu Wu; Jinheng Xie; Linlin Shen |
196b | Enhancing Adversarial Robustness for Deep Metric Learning | Mo Zhou; Vishal M. Patel |
197b | Shape-Invariant 3D Adversarial Point Clouds | Qidong Huang; Xiaoyi Dong; Dongdong Chen; Hang Zhou; Weiming Zhang; Nenghai Yu |
198b | Shadows Can Be Dangerous: Stealthy and Effective Physical-World Adversarial Attack by Natural Phenomenon | Yiqi Zhong; Xianming Liu; Deming Zhai; Junjun Jiang; Xiangyang Ji |
199b | Exploring Effective Data for Surrogate Training Towards Black-Box Attack | Xuxiang Sun; Gong Cheng; Hongda Li; Lei Pei; Junwei Han |
200b | NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models | Simin Chen; Zihe Song; Mirazul Haque; Cong Liu; Wei Yang |
201b | Dual-Key Multimodal Backdoors for Visual Question Answering | Matthew Walmer; Karan Sikka; Indranil Sur; Abhinav Shrivastava; Susmit Jha |
202b | Proactive Image Manipulation Detection | Vishal Asnani; Xi Yin; Tal Hassner; Sijia Liu; Xiaoming Liu |
Vision & Language | 203b | ADAPT: Vision-Language Navigation With Modality-Aligned Action Prompts | Bingqian Lin; Yi Zhu; Zicong Chen; Xiwen Liang; Jianzhuang Liu; Xiaodan Liang |
204b | EnvEdit: Environment Editing for Vision-and-Language Navigation | Jialu Li; Hao Tan; Mohit Bansal |
205b | HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation | Yanyuan Qiao; Yuankai Qi; Yicong Hong; Zheng Yu; Peng Wang; Qi Wu |
206b | Less Is More: Generating Grounded Navigation Instructions From Landmarks | Su Wang; Ceslee Montgomery; Jordi Orbay; Vighnesh Birodkar; Aleksandra Faust; Izzeddin Gur; Natasha Jaques; Austin Waters; Jason Baldridge; Peter Anderson |
207b | Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation | Yicong Hong; Zun Wang; Qi Wu; Stephen Gould |
208b | Reinforced Structured State-Evolution for Vision-Language Navigation | Jinyu Chen; Chen Gao; Erli Meng; Qiong Zhang; Si Liu |
209b | Cross-Modal Map Learning for Vision and Language Navigation | Georgios Georgakis; Karl Schmeckpeper; Karan Wanchoo; Soham Dan; Eleni Miltsakaki; Dan Roth; Kostas Daniilidis |
210b | Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation | Hanqing Wang; Wei Liang; Jianbing Shen; Luc Van Gool; Wenguan Wang |
211b | One Step at a Time: Long-Horizon Vision-and-Language Navigation With Milestones | Chan Hee Song; Jihyung Kil; Tai-Yu Pan; Brian M. Sadler; Wei-Lun Chao; Yu Su |
212b | Expanding Large Pre-Trained Unimodal Models With Multimodal Information Injection for Image-Text Multimodal Classification | Tao Liang; Guosheng Lin; Mingyang Wan; Tianrui Li; Guojun Ma; Fengmao Lv |
213b | Shifting More Attention to Visual Backbone: Query-Modulated Refinement Networks for End-to-End Visual Grounding | Jiabo Ye; Junfeng Tian; Ming Yan; Xiaoshan Yang; Xuwu Wang; Ji Zhang; Liang He; Xin Lin |
214b | Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding | Haojun Jiang; Yuanze Lin; Dongchen Han; Shiji Song; Gao Huang |
215b | Multi-View Transformer for 3D Visual Grounding | Shijia Huang; Yilun Chen; Jiaya Jia; Liwei Wang |
216b | Multi-Modal Dynamic Graph Transformer for Visual Grounding | Sijia Chen; Baochun Li |
217b | Weakly-Supervised Generation and Grounding of Visual Descriptions With Conditional Generative Models | Effrosyni Mavroudi; René Vidal |
218b | Weakly Supervised Temporal Sentence Grounding With Gaussian-Based Contrastive Proposal Learning | Minghang Zheng; Yanjie Huang; Qingchao Chen; Yuxin Peng; Yang Liu |
219b | Visual Abductive Reasoning | Chen Liang; Wenguan Wang; Tianfei Zhou; Yi Yang |
220b | Query and Attention Augmentation for Knowledge-Based Explainable Reasoning | Yifeng Zhang; Ming Jiang; Qi Zhao |
221b | REX: Reasoning-Aware and Grounded Explanation | Shi Chen; Qi Zhao |
222b | Not All Relations Are Equal: Mining Informative Labels for Scene Graph Generation | Arushi Goel; Basura Fernando; Frank Keller; Hakan Bilen |
223b | Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs With Language Structures via Dependency Relationships | Chao Lou; Wenjuan Han; Yuhuan Lin; Zilong Zheng |
224b | Scene Graph Expansion for Semantics-Guided Image Outpainting | Chiao-An Yang; Cheng-Yo Tan; Wan-Cyuan Fan; Cheng-Fu Yang; Meng-Lin Wu; Yu-Chiang Frank Wang |
225b | VisualHow: Multimodal Problem Solving | Jinhui Yang; Xianyu Chen; Ming Jiang; Shi Chen; Louis Wang; Qi Zhao |
226b | FLAVA: A Foundational Language and Vision Alignment Model | Amanpreet Singh; Ronghang Hu; Vedanuj Goswami; Guillaume Couairon; Wojciech Galuba; Marcus Rohrbach; Douwe Kiela |
227b | Multi-Modal Alignment Using Representation Codebook | Jiali Duan; Liqun Chen; Son Tran; Jinyu Yang; Yi Xu; Belinda Zeng; Trishul Chilimbi |
228b | Negative-Aware Attention Framework for Image-Text Matching | Kun Zhang; Zhendong Mao; Quan Wang; Yongdong Zhang |
229b | Vision-Language Pre-Training With Triple Contrastive Learning | Jinyu Yang; Jiali Duan; Son Tran; Yi Xu; Sampath Chanda; Liqun Chen; Belinda Zeng; Trishul Chilimbi; Junzhou Huang |
230b | Vision-Language Pre-Training for Boosting Scene Text Detectors | Sibo Song; Jianqiang Wan; Zhibo Yang; Jun Tang; Wenqing Cheng; Xiang Bai; Cong Yao |
231b | COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | Haoyu Lu; Nanyi Fei; Yuqi Huo; Yizhao Gao; Zhiwu Lu; Ji-Rong Wen |
3D From Multi-View & Sensors | 232b | NeurMiPs: Neural Mixture of Planar Experts for View Synthesis | Zhi-Hao Lin; Wei-Chiu Ma; Hao-Yu Hsu; Yu-Chiang Frank Wang; Shenlong Wang |
233b | FWD: Real-Time Novel View Synthesis With Forward Warping and Depth | Ang Cao; Chris Rockwell; Justin Johnson |
234b | SOMSI: Spherical Novel View Synthesis With Soft Occlusion Multi-Sphere Images | Tewodros Habtegebrial; Christiano Gava; Marcel Rogge; Didier Stricker; Varun Jampani |
235b | Fast, Accurate and Memory-Efficient Partial Permutation Synchronization | Shaohan Li; Yunpeng Shi; Gilad Lerman |
236b | Learning To Find Good Models in RANSAC | Daniel Barath; Luca Cavalli; Marc Pollefeys |
237b | Optimizing Elimination Templates by Greedy Parameter Search | Evgeniy Martyushev; Jana Vráblíková; Tomas Pajdla |
238b | GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision | Chiang-Heng Chien; Hongyi Fan; Ahmad Abdelfattah; Elias Tsigaridas; Stanimire Tomov; Benjamin Kimia |
239b | HARA: A Hierarchical Approach for Robust Rotation Averaging | Seong Hun Lee; Javier Civera |
240b | RAGO: Recurrent Graph Optimizer for Multiple Rotation Averaging | Heng Li; Zhaopeng Cui; Shuaicheng Liu; Ping Tan |
241b | A Unified Model for Line Projections in Catadioptric Cameras With Rotationally Symmetric Mirrors | Pedro Miraldo; José Pedro Iglesias |
242b | ELSR: Efficient Line Segment Reconstruction With Planes and Points Guidance | Dong Wei; Yi Wan; Yongjun Zhang; Xinyi Liu; Bin Zhang; Xiqi Wang |
243b | Self-Supervised Neural Articulated Shape and Appearance Models | Fangyin Wei; Rohan Chabra; Lingni Ma; Christoph Lassner; Michael Zollhöfer; Szymon Rusinkiewicz; Chris Sweeney; Richard Newcombe; Mira Slavcheva |
244b | Virtual Elastic Objects | Hsiao-yu Chen; Edith Tretschk; Tuur Stuyck; Petr Kadlecek; Ladislav Kavan; Etienne Vouga; Christoph Lassner |
245b | Decoupling Makes Weakly Supervised Local Feature Better | Kunhong Li; Longguang Wang; Li Liu; Qing Ran; Kai Xu; Yulan Guo |
246b | JoinABLe: Learning Bottom-Up Assembly of Parametric CAD Joints | Karl D.D. Willis; Pradeep Kumar Jayaraman; Hang Chu; Yunsheng Tian; Yifei Li; Daniele Grandi; Aditya Sanghi; Linh Tran; Joseph G. Lambourne; Armando Solar-Lezama; Wojciech Matusik |
247b | ImplicitAtlas: Learning Deformable Shape Templates in Medical Imaging | Jiancheng Yang; Udaranga Wickramasinghe; Bingbing Ni; Pascal Fua |
248b | DoubleField: Bridging the Neural Surface and Radiance Fields for High-Fidelity Human Reconstruction and Rendering | Ruizhi Shao; Hongwen Zhang; He Zhang; Mingjia Chen; Yan-Pei Cao; Tao Yu; Yebin Liu |
249b | Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis | Tianhan Xu; Yasuhiro Fujita; Eiichi Matsumoto |
250b | Structured Local Radiance Fields for Human Avatar Modeling | Zerong Zheng; Han Huang; Tao Yu; Hongwen Zhang; Yandong Guo; Yebin Liu |
251b | High-Fidelity Human Avatars From a Single RGB Camera | Hao Zhao; Jinsong Zhang; Yu-Kun Lai; Zerong Zheng; Yingdi Xie; Yebin Liu; Kun Li |
252b | Forecasting Characteristic 3D Poses of Human Actions | Christian Diller; Thomas Funkhouser; Angela Dai |
253b | Virtual Correspondence: Humans as a Cue for Extreme-View Geometry | Wei-Chiu Ma; Anqi Joyce Yang; Shenlong Wang; Raquel Urtasun; Antonio Torralba |
254b | BEHAVE: Dataset and Method for Tracking Human Object Interactions | Bharat Lal Bhatnagar; Xianghui Xie; Ilya A. Petrov; Cristian Sminchisescu; Christian Theobalt; Gerard Pons-Moll |
255b | Primitive3D: 3D Object Dataset Synthesis From Randomly Assembled Primitives | Xinke Li; Henghui Ding; Zekun Tong; Yuwei Wu; Yeow Meng Chee |
256b | RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation | Fabio Tosi; Pierluigi Zama Ramirez; Matteo Poggi; Samuele Salti; Stefano Mattoccia; Luigi Di Stefano |
257b | NPBG++: Accelerating Neural Point-Based Graphics | Ruslan Rakhimov; Andrei-Timotei Ardelean; Victor Lempitsky; Evgeny Burnaev |
258b | Depth-Guided Sparse Structure-From-Motion for Movies and TV Shows | Sheng Liu; Xiaohan Nie; Raffay Hamid |
259b | Motion-From-Blur: 3D Shape and Motion Estimation of Motion-Blurred Objects in Videos | Denys Rozumnyi; Martin R. Oswald; Vittorio Ferrari; Marc Pollefeys |