ECCV2022-Paper-Code-Interpretation

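A collection of ECCV2022 papers, code, and interpretations, compiled by the 极市 team

ECCV2022 latest papers by category

Search link: https://arxiv.org/search/?query=ECCV2022&searchtype=all&source=header
Last updated: July 22, 2022

Related coverage: ECCV 2022 results are out! 1,629 papers accepted, acceptance rate below 20%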

1. ECCV2022 accepted papers/code by topic (continuously updated)

2. ECCV2022 Oral

3. ECCV2022 paper interpretation roundup

Updates:

2022/8/4: 11 papers added
2022/7/29: 54 papers added
2022/7/20: 54 papers added

ECCV2022 accepted papers/code by topic (continuously updated)

1. Detection

2D Object Detection
Video Object Detection
3D Object Detection
Human-Object Interaction (HOI) Detection
Camouflaged Object Detection
Rotated Object Detection
Salient Object Detection
Keypoint Detection
Lane Detection
Edge Detection
Vanishing Point Detection
Anomaly Detection

2. Segmentation

Image Segmentation
Panoptic Segmentation
Semantic Segmentation
Instance Segmentation
Superpixel
Video Object Segmentation
Matting
Dense Prediction

3. Image Processing

Super Resolution
Image Restoration/Image Enhancement/Image Reconstruction
Image Shadow Removal/Image Reflection Removal
Image Denoising/Deblurring/Deraining/Dehazing
Image Editing/Image Inpainting
Image Translation
Image Quality Assessment
Style Transfer

4. Video Processing

Video Editing
Video Inpainting
Video Deblurring
Video Generation/Video Synthesis
Video Super-Resolution

5. Image & Video Retrieval/Video Understanding

Action/Activity Recognition/Detection/Segmentation
Person Re-Identification/Detection
Image/Video Captioning
Video Understanding
Image/Video Retrieval

6. Estimation

Optical Flow/Motion Estimation
Depth Estimation
Human Parsing/Human Pose Estimation
Gesture Estimation

7. Face

Face Recognition/Detection
Face Generation/Synthesis/Reconstruction/Editing
Face Forgery/Face Anti-Spoofing

8. 3D Vision

Point Cloud
3D Reconstruction
Scene Reconstruction/View Synthesis/Novel View Synthesis

9. Object Tracking

10. Medical Imaging

11. Text Detection/Recognition/Understanding

12. Remote Sensing Image

13. GAN/Generative/Adversarial

14. Image Generation/Image Synthesis

15. Scene Graph

Scene Graph Generation
Scene Graph Prediction
Scene Graph Understanding

16. Visual Reasoning/VQA

17. Vision-based Prediction

18. Neural Network Structure Design

DNN
CNN
Transformer
Graph Neural Networks (GNN)
Neural Architecture Search (NAS)
MLP

19. Neural Network Interpretability

20. Dataset

21. Data Processing

Data Augmentation
Normalization/Regularization (Batch Normalization)
Image Clustering
Image Compression

22. Image Feature Extraction and Matching

23. Visual Representation Learning

24. Model Training/Generalization

Noisy Label
Long-Tailed Distribution

25. Model Compression

Knowledge Distillation
Pruning
Quantization

26. Model Evaluation

27. Image Classification

28. Image Counting

29. Robotics

30. Semi-supervised/Weakly-supervised/Unsupervised/Self-supervised Learning

31. Multi-Modal Learning

Audio-visual Learning
Vision-language

32. Active Learning

33. Few-shot/Zero-shot Learning

34. Continual Learning/Life-long Learning

35. Transfer Learning/Domain Adaptation

36. Metric Learning

37. Contrastive Learning

38. Incremental Learning

39. Reinforcement Learning

40. Meta Learning

41. Federated Learning

42. Imitation Learning

1. Detection

2D Object Detection

[4] Multimodal Object Detection via Probabilistic Ensembling (Oral)
paper | code

[3] Point-to-Box Network for Accurate Object Detection via Single Point Supervision
paper | code

[2] You Should Look at All Objects
paper | code

[1] Adversarially-Aware Robust Object Detector (Oral)
paper | code

3D Object Detection

[2] Densely Constrained Depth Estimator for Monocular 3D Object Detection
paper | code

[1] Rethinking IoU-based Optimization for Single-stage 3D Object Detection
paper

Video Object Detection

Human-Object Interaction (HOI) Detection

[2] Discovering Human-Object Interaction Concepts via Self-Compositional Learning
paper | [code](https://github.com/zhihou7/scl), [code](https://github.com/zhihou7/HOI-CL)

[1] Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection
paper | code

Salient Object Detection

[1] KD-SCFNet: Towards More Accurate and Efficient Salient Object Detection via Knowledge Distillation
paper | code

Camouflaged Object Detection

Image Anomaly Detection/Surface Defect Detection

[2] DSR -- A dual subspace re-projection network for surface anomaly detection
paper | code

[1] DICE: Leveraging Sparsification for Out-of-Distribution Detection
paper | code

Edge Detection

2. Segmentation

Image Segmentation

Instance Segmentation

[3] In Defense of Online Models for Video Instance Segmentation (Oral)
paper | code

[2] Box-supervised Instance Segmentation with Level Set Evolution
paper

[1] OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
paper | code

Semantic Segmentation

[1] 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
paper | code

Video Object Segmentation

[1] Learning Quality-aware Dynamic Memory for Video Object Segmentation
paper | code

Referring Image Segmentation

Dense Prediction

3. Image Processing

Super Resolution

[3] Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution
paper | code

[2] Efficient Meta-Tuning for Content-aware Neural Video Delivery
paper | code

[1] Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
paper | code

Image Restoration/Image Enhancement/Image Reconstruction

[8] Bringing Rolling Shutter Images Alive with Dual Reversed Distortion (Oral)
paper | code

[7] Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression
paper | code

[6] Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization
paper

[5] Geometry-aware Single-image Full-body Human Relighting
paper

[4] Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion
paper

[3] PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation
paper

[2] SESS: Saliency Enhancing with Scaling and Sliding
paper

[1] RigNet: Repetitive Image Guided Network for Depth Completion
paper

Image Shadow Removal/Image Reflection Removal

[1] Deep Portrait Delighting
paper

Image Denoising/Deblurring/Dehazing

[3] Perceiving and Modeling Density is All You Need for Image Dehazing (Oral)
paper | code

[2] Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance
paper | code

[1] Deep Semantic Statistics Matching (D2SM) Denoising Network
paper

Image Outpainting

[1] Outpainting by Queries
paper | code

Style Transfer

[1] CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer (Oral)
paper | code

4. Video Processing

Video Editing

[3] AlphaVC: High-Performance and Efficient Learned Video Compression
paper

[2] Improving the Perceptual Quality of 2D Animation Interpolation
paper | code

[1] Real-Time Intermediate Flow Estimation for Video Frame Interpolation
paper | code

Video Inpainting

[1] Error Compensation Framework for Flow-Guided Video Inpainting
paper

Video Deblurring

[2] Event-guided Deblurring of Unknown Exposure Time Videos (Oral)
paper

[1] Efficient Video Deblurring Guided by Motion Magnitude
paper | code

5. Image & Video Retrieval/Video Understanding

Action/Activity Recognition/Detection/Segmentation

[4] GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality
paper | code

[3] Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition
paper | code

[2] ReAct: Temporal Action Detection with Relational Queries
paper | code

[1] Hunting Group Clues with Transformers for Social Group Activity Recognition
paper

Person Re-Identification/Detection

[1] PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification
paper | code

Image/Video Captioning

Video Understanding

[1] GraphVid: It Only Takes a Few Nodes to Understand a Video (Oral)
paper

Image/Video Retrieval

[6] Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
paper | code

[5] Feature Representation Learning for Unsupervised Cross-domain Image Retrieval
paper | code

[4] LocVTP: Video-Text Pre-training for Temporal Localization
paper | code

[3] Deep Hash Distillation for Image Retrieval
paper | code

[2] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
paper | code

[1] Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval
paper

6. Estimation

Optical Flow/Motion Estimation

[1] Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion
paper

Visual Localization/Pose Estimation

[4] Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction
paper

[3] 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal
paper | code

[2] Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
[paper](https://arxiv.org/abs/2207.10447) | code

[1] Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks
paper | code

Depth Estimation

[1] Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches
paper

7. Face

Face Recognition/Detection

[1] Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation
paper | code

Face Generation/Synthesis/Reconstruction/Editing

[1] MoFaNeRF: Morphable Facial Neural Radiance Field
paper | code

Face Forgery/Face Anti-Spoofing

8. 3D Vision

3D Reconstruction

[1] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras
paper

Scene Reconstruction/View Synthesis/Novel View Synthesis

[1] Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
paper | code

9. Object Tracking

[2] Tracking Every Thing in the Wild
paper

[1] Towards Grand Unification of Object Tracking (Oral)
paper | code

10. Medical Imaging

11. Text Detection/Recognition/Understanding

[5] Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition (Oral)
paper | code

[4] Contextual Text Block Detection towards Scene Text Understanding
paper

[3] PromptDet: Towards Open-vocabulary Detection using Uncurated Images
paper | code

[2] End-to-End Video Text Spotting with Transformer (Oral)
paper | code

[1] Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting
paper | code

12. Remote Sensing Image

13. GAN/Generative/Adversarial

[7] Learning Energy-Based Models With Adversarial Training
paper | code

[6] Adaptive Image Transformations for Transfer-based Adversarial Attack
paper

[5] Generative Multiplane Images: Making a 2D GAN 3D-Aware
paper | code

[4] Eliminating Gradient Conflict in Reference-based Line-Art Colorization
paper | code

[3] WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation
paper | code

[2] FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs
paper | code

[1] UniCR: Universally Approximated Certified Robustness via Randomized Smoothing
paper

14. Image Generation/Image Synthesis

[1] PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
paper | code

15. Scene Graph

16. Visual Reasoning/VQA

17. Vision-based Prediction

[1] D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights
paper | code

18. Neural Network Structure Design

DNN

[1] Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
paper | code

CNN

[1] PalQuant: Accelerating High-precision Networks on Low-precision Accelerators
paper | code

Transformer

[5] Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
paper

[4] Improving Vision Transformers by Revisiting High-frequency Components
paper | code

[3] Transformer with Implicit Edges for Particle-based Physics Simulation
paper | code

[2] ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
paper | code

[1] Visual Prompt Tuning
paper | code

Graph Neural Networks (GNN)

Neural Architecture Search (NAS)

[3] ScaleNet: Searching for the Model to Scale
paper | code

[2] Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning
paper | code

[1] EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs
paper | code

MLP

19. Neural Network Interpretability

20. Dataset

21. Data Processing

Data Augmentation

Normalization/Regularization (Batch Normalization)

[1] Fine-grained Data Distribution Alignment for Post-Training Quantization (Oral)
paper | code

Image Clustering

Image Compression

[1] Content-Oriented Learned Image Compression
paper

22. Image Feature Extraction and Matching

[1] Unsupervised Deep Multi-Shape Matching
paper

23. Visual Representation Learning

[1] Object-Compositional Neural Implicit Surfaces
paper | code

24. Model Training/Generalization

Noisy Label

[1] Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection
paper

Long-Tailed Distribution

[2] Long-tailed Instance Segmentation using Gumbel Optimized Loss
paper | code

[1] Identifying Hard Noise in Long-Tailed Sample Distribution (Oral)
paper | code

25. Model Compression

Knowledge Distillation

[3] Prune Your Model Before Distill It
paper | code

[2] Efficient One Pass Self-distillation with Zipf's Label Smoothing
paper | code

[1] Knowledge Condensation Distillation
paper | code

Pruning

Quantization

26. Model Evaluation

[1] Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting
paper | code

27. Image Classification

[1] Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels
paper | code

28. Image Counting

29. Robotics

30. Semi-supervised/Weakly-supervised/Unsupervised/Self-supervised Learning

[8] Acknowledging the Unknown for Multi-label Learning with Single Positive Labels
paper | code

[7] W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection
paper | code

[6] CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation
paper | code

[5] FedX: Unsupervised Federated Learning with Cross Knowledge Distillation
paper

[4] Synergistic Self-supervised and Quantization Learning
paper | code

[3] Contrastive Deep Supervision
paper | code

[2] Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection
paper

[1] Image Coding for Machines with Omnipotent Feature Learning
paper

31. Multi-Modal Learning/Cross-Modal Learning

Audio-visual Learning

Vision-language

[2] Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting (Oral)
paper

[1] Contrastive Vision-Language Pre-training with Limited Resources
paper | code

Cross-modal

[1] Cross-modal Prototype Driven Network for Radiology Report Generation
paper | code

32. Active Learning

33. Few-shot/Zero-shot Learning

[2] Worst Case Matters for Few-Shot Recognition
paper | code

[1] Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning
paper

34. Continual Learning/Life-long Learning

[2] Balancing Stability and Plasticity through Advanced Null Space in Continual Learning (Oral)
paper

[1] Online Continual Learning with Contrastive Vision Transformer
paper

35. Transfer Learning/Domain Adaptation

[2] Factorizing Knowledge in Neural Networks
paper | code

[1] CycDA: Unsupervised Cycle Domain Adaptation from Image to Video
paper

36. Metric Learning

37. Contrastive Learning

38. Incremental Learning

39. Reinforcement Learning

[1] Target-absent Human Attention
paper | code

40. Meta Learning

41. Federated Learning

42. Imitation Learning

[1] Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction
paper

ECCV2022 Oral

[15] Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition (Oral)
paper | code

[14] Balancing Stability and Plasticity through Advanced Null Space in Continual Learning (Oral)
paper

[13] Event-guided Deblurring of Unknown Exposure Time Videos (Oral)
paper

[12] Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting (Oral)
paper

[11] Multimodal Object Detection via Probabilistic Ensembling (Oral)
paper | code

[10] Identifying Hard Noise in Long-Tailed Sample Distribution (Oral)
paper | code

[9] In Defense of Online Models for Video Instance Segmentation (Oral)
paper | code

[8] Perceiving and Modeling Density is All You Need for Image Dehazing (Oral)
paper | code

[7] Bringing Rolling Shutter Images Alive with Dual Reversed Distortion (Oral)
paper | code

[6] End-to-End Video Text Spotting with Transformer (Oral)
paper | code

[5] GraphVid: It Only Takes a Few Nodes to Understand a Video (Oral)
paper

[4] CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer (Oral)
paper | code

[3] Fine-grained Data Distribution Alignment for Post-Training Quantization (Oral)
paper | code