CVPR2024-Papers-with-Code
Welcome to share CVPR 2023 papers and code.
[Issue format] Paper name/title: Paper link: Code link:
Paper title: DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization Paper link: https://arxiv.org/abs/2212.06331 Code link: https://github.com/ai4ce/DeepMapping2
Paper title: VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion Paper link: https://arxiv.org/abs/2302.12251 Code link: https://github.com/NVlabs/VoxFormer
Paper title: PolyFormer: Referring Image Segmentation as Sequential Polygon Generation Paper link: https://arxiv.org/abs/2302.07387
[Update: May 12th, 2023] @amusi Could you please add the code link as well? https://github.com/amazon-science/polygon-transformer
Thank you!
Paper title: All in One: Exploring Unified Video-Language Pre-training Paper link: https://arxiv.org/abs/2203.07303 Code link: https://github.com/showlab/all-in-one
Paper title: Position-guided Text Prompt for Vision Language Pre-training Paper link: https://arxiv.org/abs/2212.09737 Code link: https://github.com/sail-sg/ptp
Paper title: GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis Paper link: https://arxiv.org/abs/2301.12959 Code link: https://github.com/tobran/GALIP
GALIP is a simple, fast, and high-quality text-to-image generative model that achieves comparable or better results than large pretrained autoregressive and diffusion models, with 120x faster synthesis. Whereas those models require hundreds of GPUs, 400M image-text pairs, and several weeks of pre-training, GALIP needs only 8 RTX 3090 GPUs, 12M image-text pairs, and 3 days. It also supports CPU-only generation. The code and pre-trained models have been released.
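Since GALIP's title ("Generative Adversarial CLIPs") points to building a GAN around a pretrained CLIP model, the sketch below illustrates that general pattern: a generator conditioned on frozen CLIP text features. The ToyCLIPGenerator class, its layer sizes, and the prompt are hypothetical illustrations only, not GALIP's actual architecture; see the linked repo for the real implementation.

```python
# Toy sketch of a CLIP-conditioned GAN generator (hypothetical; NOT
# GALIP's actual architecture). Requires: pip install torch and
# pip install git+https://github.com/openai/CLIP.git
import torch
import torch.nn as nn
import clip


class ToyCLIPGenerator(nn.Module):
    """Maps (noise, frozen CLIP text embedding) to a small RGB image."""

    def __init__(self, noise_dim=100, text_dim=512, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(noise_dim + text_dim, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, 3 * img_size * img_size),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z, text_emb):
        # Concatenate noise and text conditioning, decode to an image.
        x = torch.cat([z, text_emb], dim=1)
        return self.net(x).view(-1, 3, self.img_size, self.img_size)


# CPU-only inference, in the spirit of the paper's "no GPU needed" claim.
device = "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
tokens = clip.tokenize(["a photo of a red bird"]).to(device)
with torch.no_grad():
    text_emb = clip_model.encode_text(tokens).float()  # shape (1, 512)

generator = ToyCLIPGenerator()
z = torch.randn(1, 100)
image = generator(z, text_emb)  # shape (1, 3, 64, 64); weights untrained
```

In this family of models, much of the text-image alignment comes from the frozen CLIP weights rather than from scratch training, which is one plausible reason the pre-training budget can be so much smaller.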
Paper title: HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising Project page: https://aminshabani.github.io/housediffusion/ Paper link: https://arxiv.org/abs/2211.13287 Code link: https://github.com/aminshabani/house_diffusion
Paper title: Vision Transformers are Parameter-Efficient Audio-Visual Learners Project page: https://yanbo.ml/project_page/LAVISH/ Code link: https://github.com/GenjiB/LAVISH
Paper title: EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding Paper link: https://arxiv.org/abs/2209.14941 Code link: https://github.com/yanmin-wu/EDA
This paper can be categorized into "3D Visual and Language".
DeepMapping2 (submitted above) can be categorized into "3D Point Cloud".
Paper title: Generic-to-Specific Distillation of Masked Autoencoders Paper link: https://arxiv.org/abs/2302.14771 Code link: https://github.com/pengzhiliang/G2SD
This paper can be categorized into "Knowledge Distillation" or "Masked Autoencoders". Thank you!
Paper title: Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples Paper link: https://arxiv.org/abs/2301.01217 Code link: https://github.com/jiamingzhang94/Unlearnable-Clusters
Paper title: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation Paper link: https://arxiv.org/abs/2212.09478 Code link: https://github.com/researchmm/MM-Diffusion
Paper title: Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation Paper link: https://arxiv.org/abs/2211.13202 Code link: https://github.com/noahzn/Lite-Mono
Paper title: AdaptiveMix: Robust Feature Representation via Shrinking Feature Space Paper link: https://arxiv.org/pdf/2303.01559.pdf Code link: https://github.com/WentianZhang-ML/AdaptiveMix
Paper title: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting Paper link: https://arxiv.org/pdf/2211.10772v3.pdf Code link: https://github.com/ViTAE-Transformer/DeepSolo
Thank you!
Paper title: DepGraph: Towards Any Structural Pruning Paper link: https://arxiv.org/abs/2301.12900 Code link: https://github.com/VainF/Torch-Pruning
Thank you! This paper should be categorized as "Network Pruning".
Paper title: Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption Paper link: https://arxiv.org/abs/2207.03442 Code link: https://github.com/shiyegao/DDA
Thank you!
Paper title: 3D Video Loops from Asynchronous Input Paper link: https://arxiv.org/abs/2303.05312 Project page: https://limacv.github.io/VideoLoop3D_web/ Code link: https://github.com/limacv/VideoLoop3D
This paper should be in a new category named "Novel View Synthesis", which I believe is also a hot topic with many more papers. If no new section can be added, it can also be categorized under NeRF. Thank you!
Paper title: Super-Resolution Neural Operator Paper link: https://arxiv.org/abs/2303.02584 Code link: https://github.com/2y7c3/Super-Resolution-Neural-Operator
Paper name/title: Learning Transferable Spatiotemporal Representations from Natural Script Knowledge Paper link: https://arxiv.org/abs/2209.15280 Code link: https://github.com/TencentARC/TVTS
Paper name/title: DPE: Disentanglement of Pose and Expression for General Video Portrait Editing Paper link: https://arxiv.org/abs/2301.06281 Code link: https://carlyx.github.io/DPE/
Paper name/title: SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation Paper link: https://arxiv.org/abs/2211.12194 Code link: https://github.com/Winfredy/SadTalker
Paper name/title: DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network Paper link: https://arxiv.org/abs/2303.02165 Code link: https://github.com/alibaba/lightweight-neural-architecture-search
Please put it in the Backbone chapter of the README.md.
Paper title: DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation Paper link: https://arxiv.org/abs/2303.06285 Code link: https://github.com/Yueming6568/DeltaEdit
Thank you :) Please put it in the GAN / CLIP / image manipulation / image generation chapters.
The arXiv link for BiFormer is now available. Please update. Thanks!
Paper name/title: BiFormer: Vision Transformer with Bi-Level Routing Attention Paper link: https://arxiv.org/abs/2303.08810 Code link: https://github.com/rayleizhu/BiFormer
Paper title: TriDet: Temporal Action Detection with Relative Boundary Modeling Paper link: https://arxiv.org/pdf/2303.07347.pdf Code link: https://github.com/dingfengshi/TriDet
Maybe it can be put in Video Understanding, or in a new Action Detection chapter? Thank you!
Thanks for maintaining this list. Paper title: Causal-IR: Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective Paper link: https://arxiv.org/pdf/2303.06859.pdf Code link: https://github.com/lixinustc/Casual-IR-DIL
The code will be released soon.
Paper title: Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation Paper link: https://arxiv.org/abs/2303.11203 Code link: https://github.com/l1997i/lim3d
The code will be released soon. Thanks in advance!
Paper name/title: GFPose: Learning 3D Human Pose Prior with Gradient Fields Paper link: https://arxiv.org/pdf/2212.08641.pdf Code link: https://github.com/Embracing/GFPose
Thank you!
Paper name/title: Diversity-Aware Meta Visual Prompting Paper link: https://arxiv.org/abs/2303.08138 Code link: https://github.com/shikiw/DAM-VP
Thanks a lot!