chenjjcccc

Results: 20 issues of chenjjcccc

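CAT-Seg (CVPR'2023) model reproduction

### Problem description

# CAT-Seg (CVPR'2023) model reproduction

## Task description

### Background

- CAT-Seg is a state-of-the-art model for open-vocabulary semantic segmentation. It proposes a cost aggregation method that applies CLIP representations to pixel-level segmentation, and it reaches SOTA open-vocabulary segmentation results on multiple datasets.

### Steps

1. The data, model, and code are all open source.
2. Convert the network architecture and evaluation metrics based on the open-source code ([code link](https://github.com/KU-CVLAB/CAT-Seg)).
3. Following the [paper reproduction guide](https://github.com/PaddlePaddle/models/blob/release%2F2.2/tutorials/article-implementation/ArticleReproduction_CV.md) and the [reproduction guide (new)]()ppsigs/article-implementation/论文复现指南-新.pdf, perform forward and backward alignment and related steps, **reaching the metrics reported in Table 1 of the paper**.
4. Run TIPC validation of the lite train lite...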
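The forward-alignment step can be illustrated with a small check that compares the output of the reference implementation with the output of the ported one on the same input. This is a minimal sketch: the function name `check_alignment` and the tolerances are assumptions made for illustration, not values taken from the reproduction guide.

```python
import numpy as np


def check_alignment(reference, candidate, atol=1e-5, rtol=1e-5):
    """Compare two outputs; return (aligned, max_abs_diff).

    `reference` would be the original model's output dumped to NumPy and
    `candidate` the ported model's output for the same input. The tolerances
    are illustrative assumptions.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Largest element-wise deviation, useful for locating the first diverging layer.
    max_abs_diff = float(np.max(np.abs(reference - candidate))) if reference.size else 0.0
    aligned = bool(np.allclose(candidate, reference, atol=atol, rtol=rtol))
    return aligned, max_abs_diff
```

Running the same check on intermediate feature maps, layer by layer, usually narrows down where two implementations start to diverge.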

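Add sliding-window inference to MedicalSeg

### Problem description

# Add sliding-window inference to MedicalSeg

## Task description

### Background

- Sliding-window inference is currently missing for 3D medical images. Sliding-window inference can further improve the accuracy of any model.

### Steps

1. Referring to the [prediction code](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/tools/predict.py), write the prediction script predict.py for medicalseg.
2. Referring to the [dynamic-graph sliding-window inference code](https://github.com/PaddlePaddle/PaddleSeg/blob/d294aed526e415033f4cf3de9e9edc7ffa23b593/paddleseg/core/infer.py#L153), develop 3D sliding-window inference and add it to medicalseg's predict.
3. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.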

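Add early stopping

### Problem description

# Add early stopping

## Task description

### Background

- Early stopping is a regularization tool that can be used during optimization in model development. It should be added to paddleseg as a new feature.

### Steps

1. Referring to this early stopping [implementation](https://stackoverflow.com/questions/71998978/early-stopping-in-pytorch), add an early stopping argument to the training configuration, so that training stops once the loss has not dropped below its previous best for 1000 iters.
2. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Submit the changes to tools/train.py to PaddleSeg.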
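The stopping rule described in step 1 can be sketched as a small helper that the training loop calls once per iteration. The class name `EarlyStopping` and the `min_delta` parameter are illustrative assumptions. Only the 1000-iter patience comes from the task description.

```python
class EarlyStopping:
    """Signal a stop when the loss has not improved for `patience` iterations."""

    def __init__(self, patience=1000, min_delta=0.0):
        self.patience = patience      # iterations to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement (assumption)
        self.best_loss = float("inf")
        self.iters_without_improvement = 0

    def step(self, loss):
        """Record one iteration's loss; return True when training should stop."""
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss
            self.iters_without_improvement = 0
        else:
            self.iters_without_improvement += 1
        return self.iters_without_improvement >= self.patience
```

In a training loop this would be used as `if stopper.step(float(loss)): break`. Raw per-iteration losses are noisy, so a smoothed or validation loss may be a more robust signal.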

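Add class activation maps

### Problem description

# Add class activation maps

## Task description

### Background

- Class activation map visualization helps in understanding the decision process of a deep learning model. By observing the regions the model attends to, one can see how it makes classification decisions based on features from different regions, which makes this a meaningful and important feature.

### Steps

1. Add class activation map visualization, following [https://github.com/open-mmlab/mmsegmentation/pull/3324/](https://github.com/open-mmlab/mmsegmentation/pull/3324/).
2. Add the code to [https://github.com/PaddlePaddle/PaddleSeg/tree/develop/tools/data](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/tools/data).
3. Make this feature an optional part of predict.
4. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.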
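As an illustration of what a class activation map computes, the following NumPy sketch implements the core of Grad-CAM, one common CAM variant. It is not necessarily the method used in the referenced PR. It assumes the feature maps of a chosen layer and the gradients of a target score with respect to them have already been extracted. For segmentation, that score is often the sum of one class's logits over a region of interest.

```python
import numpy as np


def grad_cam(activations, gradients, eps=1e-8):
    """Compute a Grad-CAM heat map from precomputed tensors.

    activations: (K, H, W) feature maps from a chosen layer.
    gradients:   (K, H, W) gradients of the target score w.r.t. those maps.
    Returns an (H, W) map scaled to [0, 1].
    """
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))               # (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)    # (H, W)
    # Keep only regions that contribute positively to the target score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > eps else cam
```

The resulting map is normally resized to the input resolution and overlaid on the image as a heat map.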

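VPD model + downstream tasks (visual perception, image segmentation, depth estimation)

### Problem description

# VPD model + downstream tasks (visual perception, image segmentation, depth estimation)

## Task description

### Background

- VPD is an image-text pretrained model built on diffusion models. It can be applied widely to downstream tasks such as visual perception, image segmentation, and depth estimation, with good results on all of them. VPD can be integrated into PaddleSeg and applied to these downstream tasks.

1. The data, model, and code are all open source.
2. Convert the network architecture and evaluation metrics based on the open-source code ([code link](https://github.com/wl-zhao/VPD)).
3. Following the [paper reproduction guide](https://github.com/PaddlePaddle/models/blob/release%2F2.2/tutorials/article-implementation/ArticleReproduction_CV.md) and the [reproduction guide (new)]()ppsigs/article-implementation/论文复现指南-新.pdf, perform forward and backward alignment and related steps, **reaching the metrics reported in Table 1 of the paper**.
4. Run TIPC validation of the lite train lite infer chain.
5. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

###...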

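Add the X-GPT image-text dialogue model

### Problem description

# Add the X-GPT image-text dialogue model

## Task description

### Background

- X-Decoder integrates multiple image understanding tasks. Combined with GPT and SD-based generative models, it enables an all-in-one image-text dialogue agent. [Reference code](https://github.com/microsoft/X-Decoder/tree/xgpt).

### Steps

1. Reproduce X-Decoder on Paddle. It supports detection, segmentation, VQA, captioning, and other applications. (If training alignment runs into insurmountable problems, forward alignment alone is acceptable.)
2. Migrate the generative model using ppdiffuser, the base model in PaddleMix.
3. Combine it with an open-source dialogue model such as chatglm v2 or llama v2 to implement XGPT, and provide usage examples, documentation, and a UI. Submit to PaddleSeg/contrib/XGPT.
4. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. ...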

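Verify and improve the zero-shot segmentation accuracy of SAM+CLIP for semantic segmentation

### Problem description

# Verify and improve the zero-shot segmentation accuracy of SAM+CLIP for semantic segmentation

## Task description

### Background

- Vision tasks such as semantic segmentation generalize poorly: the model has to be retrained for every new dataset. Large models have greatly improved generalization by linking images and text, but a [recent paper](https://paperswithcode.com/paper/learning-mask-aware-clip-representations-for) on zero-shot segmentation shows that fully zero-shot segmentation accuracy is still low. We therefore adopt CLIP's definition of zero-shot, meaning unseen images rather than unseen categories, to examine the segmentation quality of the CLIP+SAM model (a definition that is also of practical value), and we borrow the ideas of the [same paper](https://paperswithcode.com/paper/learning-mask-aware-clip-representations-for) to further optimize the baseline. This work will verify and improve the generalization of semantic segmentation models to unseen data.

### Steps

1. Use the SegmentAnything code in PaddleSeg to segment cityscapes and ADE20k directly and check the evaluation accuracy.
2. Use a frozen CLIP model to filter and annotate SA-1B data with high-confidence labels.
3. Following the code of the [paper](https://paperswithcode.com/paper/learning-mask-aware-clip-representations-for), fine-tune CLIP on SA-1B and check the accuracy on cityscapes after training.
4. Survey the literature and apply optimizations, ultimately exceeding or becoming comparable to the accuracy of supervised models.
5. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.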
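One way to read the high-confidence filtering in step 2 is sketched below: each mask region gets a CLIP image embedding, which is compared against the text embeddings of the class names, and only masks whose top class probability exceeds a threshold keep their label. The function name, the 0.9 threshold, and the 0.01 temperature are all assumptions made for illustration. The embeddings are assumed to come from a frozen CLIP model that is not shown here.

```python
import numpy as np


def filter_high_confidence(mask_embeds, text_embeds, threshold=0.9, temperature=0.01):
    """Assign a class to each mask and flag the confident assignments.

    mask_embeds: (N, D) image embeddings, one per mask region.
    text_embeds: (C, D) text embeddings, one per class name.
    Returns (labels, confidence, keep), each of length N.
    """
    # Normalize so the dot product is cosine similarity.
    m = mask_embeds / np.maximum(np.linalg.norm(mask_embeds, axis=1, keepdims=True), 1e-12)
    t = text_embeds / np.maximum(np.linalg.norm(text_embeds, axis=1, keepdims=True), 1e-12)
    logits = (m @ t.T) / temperature
    # Numerically stable softmax over the class axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    return labels, confidence, keep
```

Masks with `keep == False` would be left unlabeled rather than given a noisy pseudo-label. The right threshold depends on the data and would need to be tuned.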

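[Bug Fix] humanseg GPU memory leak

### Problem description

# [Bug Fix] humanseg GPU memory leak

## Task description

### Background

- When PaddleSeg is used for [portrait segmentation](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.8/contrib/PP-HumanSeg) inference on large batches of data, memory is not fully released and accumulates, which triggers the Linux OOM mechanism and gets the program killed. See [#3486](https://github.com/PaddlePaddle/PaddleSeg/issues/3486).

### Steps

1. Follow Readme.md to reproduce the problem.
2. Investigate the problem in depth and apply an appropriate fix.
3. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.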
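When reproducing a leak like this, it can help to record memory after every batch and check whether it keeps growing. The sketch below uses Python's standard `tracemalloc` module. Note the limitation: it only sees allocations made through Python's allocator, so native or GPU memory held by the inference engine would need a different tool. `process_batch` is a hypothetical stand-in for the inference call.

```python
import gc
import tracemalloc


def track_python_memory(process_batch, batches):
    """Return the traced Python heap size in bytes after each batch."""
    tracemalloc.start()
    usage = []
    try:
        for batch in batches:
            process_batch(batch)
            gc.collect()  # drop collectable garbage so only retained memory is counted
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
    finally:
        tracemalloc.stop()
    return usage
```

A series that rises steadily across batches suggests that references to per-batch results are being retained somewhere. A flat series suggests the growth is outside the Python heap.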

[Bug Fix] modnet inference issue

### Problem description

# [Bug Fix] modnet inference issue

## Task description

### Background

- When modnet is used for image matting, converting it to a paddlelite-compatible model raises an error. See [#3477](https://github.com/PaddlePaddle/PaddleSeg/issues/3477) for details.

### Steps

1. Follow Readme.md to reproduce the problem.
2. Investigate the problem in depth and apply an appropriate fix.
3. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.

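Add visualization of training images, inference images, and label images

### Problem description

# Add visualization of training images, inference images, and label images

## Task description

### Background

- PaddlePaddle provides VisualDL, a powerful training visualization tool for recording and monitoring the training process. Each time the model is saved, visualizations of the training image, inference image, and label image can be added, giving a more direct view of how training is going.

### Steps

1. Learn the visualization tool VisualDL.
2. Each time the model is saved, add visualizations of the training image, inference image, and label image.
3. Follow the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/pr/pr/style_cn.md) and submit a code PR to [ppseg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop).

### Deliverables:

1. Code submitted to PaddleSeg.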
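Step 2 could look roughly like the sketch below, which logs the three images under separate tags at each checkpoint. The helper names and tag strings are assumptions. The only requirement on `writer` is an `add_image(tag, img, step)` method, which is the interface VisualDL's `LogWriter` provides. Keeping the writer as a parameter also lets the helper be exercised without VisualDL installed.

```python
import numpy as np


def to_uint8_image(array):
    """Scale a 2D map or an HWC array to a uint8 HWC image for display."""
    array = np.asarray(array, dtype=np.float64)
    if array.ndim == 2:
        # Label and prediction maps are single-channel; repeat to 3 channels.
        array = np.repeat(array[:, :, None], 3, axis=2)
    lo, hi = array.min(), array.max()
    scaled = (array - lo) / (hi - lo) if hi > lo else np.zeros_like(array)
    return (scaled * 255).astype(np.uint8)


def log_visuals(writer, step, image, prediction, label):
    """Log the training image, the predicted mask, and the label at one step."""
    for tag, data in (("train/image", image),
                      ("train/prediction", prediction),
                      ("train/label", label)):
        writer.add_image(tag=tag, img=to_uint8_image(data), step=step)
```

With VisualDL installed, the call site in the checkpoint-saving branch would be along the lines of `with LogWriter(logdir="./log") as writer: log_visuals(writer, iter_id, image, pred, label)`, where `iter_id`, `image`, `pred`, and `label` are placeholders for the training loop's own variables. Min-max scaling is only for display; a fixed color palette per class would give more readable masks.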