Orals 6/24 AM

Presentation Schedule

All times are in the Central time zone.

Date: Friday, June 24, 2022   8:30AM – 10:18AM
Session Title: Representation Learning
Session Chairs: Jiajun Wu (Stanford Univ.), Pablo Arbelaez (Universidad de los Andes)

Poster ID Title Authors
1a Masked Autoencoders Are Scalable Vision Learners

Kaiming He; Xinlei Chen; Saining Xie; Yanghao Li; Piotr Dollár; Ross Girshick

2a Learning ABCs: Approximate Bijective Correspondence for Isolating Factors of Variation With Weak Supervision

Kieran A. Murphy; Varun Jampani; Srikumar Ramalingam; Ameesh Makadia

3a Bayesian Invariant Risk Minimization

Yong Lin; Hanze Dong; Hao Wang; Tong Zhang

4a Crafting Better Contrastive Views for Siamese Representation Learning

Xiangyu Peng; Kai Wang; Zheng Zhu; Mang Wang; Yang You

5a Rethinking Minimal Sufficient Representation in Contrastive Learning

Haoqing Wang; Xun Guo; Zhi-Hong Deng; Yan Lu

6a Multi-Level Feature Learning for Contrastive Multi-View Clustering

Jie Xu; Huayi Tang; Yazhou Ren; Liang Peng; Xiaofeng Zhu; Lifang He

7a Point-Level Region Contrast for Object Detection Pre-Training

Yutong Bai; Xinlei Chen; Alexander Kirillov; Alan Yuille; Alexander C. Berg

8a Class-Incremental Learning by Knowledge Distillation With Adaptive Feature Consolidation

Minsoo Kang; Jaeyoo Park; Bohyung Han

9a A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration

Ramya Hebbalaguppe; Jatin Prakash; Neelabh Madan; Chetan Arora

10a SLIC: Self-Supervised Learning With Iterative Clustering for Human Action Videos

Salar Hosseini Khorasgani; Yuxuan Chen; Florian Shkurti

11a Omnivore: A Single Model for Many Visual Modalities

Rohit Girdhar; Mannat Singh; Nikhila Ravi; Laurens van der Maaten; Armand Joulin; Ishan Misra

12a DPICT: Deep Progressive Image Compression Using Trit-Planes

Jae-Han Lee; Seungmin Jeon; Kwang Pyo Choi; Youngo Park; Chang-Su Kim

13a Efficient Geometry-Aware 3D Generative Adversarial Networks

Eric R. Chan; Connor Z. Lin; Matthew A. Chan; Koki Nagano; Boxiao Pan; Shalini De Mello; Orazio Gallo; Leonidas J. Guibas; Jonathan Tremblay; Sameh Khamis; Tero Karras; Gordon Wetzstein

14a Geometric Anchor Correspondence Mining With Uncertainty Modeling for Universal Domain Adaptation

Liang Chen; Yihang Lou; Jianzhong He; Tao Bai; Minghua Deng

15a Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning

Richard J. Chen; Chengkuan Chen; Yicong Li; Tiffany Y. Chen; Andrew D. Trister; Rahul G. Krishnan; Faisal Mahmood

16a Versatile Multi-Modal Pre-Training for Human-Centric Perception

Fangzhou Hong; Liang Pan; Zhongang Cai; Ziwei Liu

17a Bridging Video-Text Retrieval With Multiple Choice Questions

Yuying Ge; Yixiao Ge; Xihui Liu; Dian Li; Ying Shan; Xiaohu Qie; Ping Luo

18a Integrating Language Guidance Into Vision-Based Deep Metric Learning

Karsten Roth; Oriol Vinyals; Zeynep Akata

Date: Friday, June 24, 2022   8:30AM – 10:18AM
Session Title: Computational Photography
Session Chairs: Jinwei Ye (Louisiana State Univ.), Qi Shan (Apple)

Poster ID Title Authors
19a NeRF in the Dark: High Dynamic Range View Synthesis From Noisy Raw Images

Ben Mildenhall; Peter Hedman; Ricardo Martin-Brualla; Pratul P. Srinivasan; Jonathan T. Barron

20a DIVeR: Real-Time and Accurate Neural Radiance Fields With Deterministic Integration for Volume Rendering

Liwen Wu; Jae Yong Lee; Anand Bhattad; Yu-Xiong Wang; David Forsyth

21a HumanNeRF: Free-Viewpoint Rendering of Moving People From Monocular Video

Chung-Yi Weng; Brian Curless; Pratul P. Srinivasan; Jonathan T. Barron; Ira Kemelmacher-Shlizerman

22a Neural Reflectance for Shape Recovery With Shadow Handling

Junxuan Li; Hongdong Li

23a Visual Vibration Tomography: Estimating Interior Material Properties From Monocular Video

Berthy T. Feng; Alexander C. Ogren; Chiara Daraio; Katherine L. Bouman

24a Dancing Under the Stars: Video Denoising in Starlight

Kristina Monakhova; Stephan R. Richter; Laura Waller; Vladlen Koltun

25a BACON: Band-Limited Coordinate Networks for Multiscale Scene Representation

David B. Lindell; Dave Van Veen; Jeong Joon Park; Gordon Wetzstein

26a Practical Stereo Matching via Cascaded Recurrent Network With Adaptive Correlation

Jiankun Li; Peisen Wang; Pengfei Xiong; Tao Cai; Ziwei Yan; Lei Yang; Jiangyu Liu; Haoqiang Fan; Shuaicheng Liu

27a 3D Photo Stylization: Learning To Generate Stylized Novel Views From a Single Image

Fangzhou Mu; Jian Wang; Yicheng Wu; Yin Li

28a BokehMe: When Neural Rendering Meets Classical Rendering

Juewen Peng; Zhiguo Cao; Xianrui Luo; Hao Lu; Ke Xian; Jianming Zhang

29a Deblurring via Stochastic Refinement

Jay Whang; Mauricio Delbracio; Hossein Talebi; Chitwan Saharia; Alexandros G. Dimakis; Peyman Milanfar

30a Learning to Deblur Using Light Field Generated and Real Defocus Images

Lingyan Ruan; Bin Chen; Jizhou Li; Miuling Lam

31a Towards Layer-Wise Image Vectorization

Xu Ma; Yuqian Zhou; Xingqian Xu; Bin Sun; Valerii Filev; Nikita Orlov; Yun Fu; Humphrey Shi

32a Dual-Shutter Optical Vibration Sensing

Mark Sheinin; Dorian Chan; Matthew O'Toole; Srinivasa G. Narasimhan

33a Fisher Information Guidance for Learned Time-of-Flight Imaging

Jiaqu Li; Tao Yue; Sijie Zhao; Xuemei Hu

34a Autofocus for Event Cameras

Shijie Lin; Yinqiang Zhang; Lei Yu; Bin Zhou; Xiaowei Luo; Jia Pan

35a Adaptive Gating for Single-Photon 3D Imaging

Ryan Po; Adithya Pediredla; Ioannis Gkioulekas

36a LiDAR Snowfall Simulation for Robust 3D Object Detection

Martin Hahner; Christos Sakaridis; Mario Bijelic; Felix Heide; Fisher Yu; Dengxin Dai; Luc Van Gool

Date: Friday, June 24, 2022   8:30AM – 10:18AM
Session Title: Vision & Language
Session Chairs: Zicheng Liu (Microsoft), Gul Varol (Ecole des Ponts ParisTech)

Poster ID Title Authors
37a MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound

Rowan Zellers; Jiasen Lu; Ximing Lu; Youngjae Yu; Yanpeng Zhao; Mohammadreza Salehi; Aditya Kusupati; Jack Hessel; Ali Farhadi; Yejin Choi

38a Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer

Hao Jiang; Yadong Mu

39a Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture

Tanmay Gupta; Amita Kamath; Aniruddha Kembhavi; Derek Hoiem

40a Disentangling Visual and Written Concepts in CLIP

Joanna Materzyńska; Antonio Torralba; David Bau

41a CLIP-Event: Connecting Text and Images With Event Structures

Manling Li; Ruochen Xu; Shuohang Wang; Luowei Zhou; Xudong Lin; Chenguang Zhu; Michael Zeng; Heng Ji; Shih-Fu Chang

42a Robust Cross-Modal Representation Learning With Progressive Self-Distillation

Alex Andonian; Shixing Chen; Raffay Hamid

43a TubeDETR: Spatio-Temporal Video Grounding With Transformers

Antoine Yang; Antoine Miech; Josef Sivic; Ivan Laptev; Cordelia Schmid

44a 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection

Junyu Luo; Jiahui Fu; Xianghao Kong; Chen Gao; Haibing Ren; Hao Shen; Huaxia Xia; Si Liu

45a 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

Daigang Cai; Lichen Zhao; Jing Zhang; Lu Sheng; Dong Xu

46a Globetrotter: Connecting Languages by Connecting Images

Dídac Surís; Dave Epstein; Carl Vondrick

47a Unsupervised Vision-and-Language Pre-Training via Retrieval-Based Multi-Granular Alignment

Mingyang Zhou; Licheng Yu; Amanpreet Singh; Mengjiao Wang; Zhou Yu; Ning Zhang

48a WebQA: Multihop and Multimodal QA

Yingshan Chang; Mridu Narang; Hisami Suzuki; Guihong Cao; Jianfeng Gao; Yonatan Bisk

49a PartGlot: Learning Shape Part Segmentation From Language Reference Games

Juil Koo; Ian Huang; Panos Achlioptas; Leonidas J. Guibas; Minhyuk Sung

50a DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis

Ming Tao; Hao Tang; Fei Wu; Xiao-Yuan Jing; Bing-Kun Bao; Changsheng Xu

51a L-Verse: Bidirectional Generation Between Image and Text

Taehoon Kim; Gwangmo Song; Sihaeng Lee; Sangyun Kim; Yewon Seo; Soonyoung Lee; Seung Hwan Kim; Honglak Lee; Kyunghoon Bae

52a Think Global, Act Local: Dual-Scale Graph Transformer for Vision-and-Language Navigation

Shizhe Chen; Pierre-Louis Guhur; Makarand Tapaswi; Cordelia Schmid; Ivan Laptev

53a LaTr: Layout-Aware Transformer for Scene-Text VQA

Ali Furkan Biten; Ron Litman; Yusheng Xie; Srikar Appalaraju; R. Manmatha

54a Learning Program Representations for Food Images and Cooking Recipes

Dim P. Papadopoulos; Enrique Mora; Nadiia Chepurko; Kuan Wei Huang; Ferda Ofli; Antonio Torralba