Paddle issues

Training the ppyolo network from the PaddleDetection repository yields loss=nan

4

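### Describe the Bug Using the Paddle develop branch with the PaddleDetection develop branch, as well as the Paddle release2.3 branch with the PaddleDetection release2.4 branch, training the ppyolo network hits loss=nan with roughly 50% probability, whether on 4 GPUs or 8 GPUs. The environment and hyperparameter changes are as follows: **The original 8-GPU parameters are (base_lr: 0.01, batch_size: 24)** - **8-GPU parameters and script that produce loss=nan** The parameters are (base_lr: 0.005, batch_size: 12), i.e. both the learning rate and the batch size are halved. The training script is: `python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py...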

qipengh

status/new-issue

type/bug-report

Remove calibration file path when deploy quantize model

1

### PR types Bug fixes ### PR changes Others ### Describe Remove calibration file path when deploy quantize model

yeliang2258

Why do the values differ on each GPU after all_reduce?

1

### Please ask your question I want to synchronize loss_buffer across all GPUs, syncing only one row at a time. After the synchronization, however, the variable printed on each GPU is different. Why is that? The code and printed log are as follows: `loss_buffer = paddle.zeros(shape=[100,4]) #[epoch,loss_num]` `paddle.distributed.barrier()` `paddle.distributed.all_reduce(loss_buffer[epoch_id])` `print("loss_buffer[epoch_id]",loss_buffer[epoch_id])` ![image](https://user-images.githubusercontent.com/31760851/191503594-4e6f162a-5e4c-41e0-8cdd-7a5810879e09.png)

bixiaopeng0

status/new-issue

type/question
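One possible explanation for the question above, offered only as a hypothesis and not confirmed in the thread: if `loss_buffer[epoch_id]` returns a temporary copy of the row rather than a view, an in-place `all_reduce` updates that temporary, and indexing the buffer again reads the old per-GPU values. The sketch below does not call Paddle at all. It models two ranks with plain Python lists, and `simulated_all_reduce`, `reduce_temporary_only`, `reduce_and_write_back`, and `make_buffers` are hypothetical names introduced for illustration.

```python
def simulated_all_reduce(per_rank_rows):
    """Stand-in for an in-place sum all_reduce across ranks (not a Paddle API)."""
    total = [sum(values) for values in zip(*per_rank_rows)]
    for row in per_rank_rows:
        row[:] = total  # update, in place, the exact object that was passed in


def reduce_temporary_only(buffers, epoch_id):
    """Pattern from the question, ASSUMING indexing yields a copy, not a view."""
    temporaries = [list(buf[epoch_id]) for buf in buffers]  # copies of the row
    simulated_all_reduce(temporaries)  # only the copies receive the sum
    return [buf[epoch_id] for buf in buffers]  # re-indexing reads old values


def reduce_and_write_back(buffers, epoch_id):
    """Reduce a temporary row, then store the result back into each buffer."""
    temporaries = [list(buf[epoch_id]) for buf in buffers]
    simulated_all_reduce(temporaries)
    for buf, reduced in zip(buffers, temporaries):
        buf[epoch_id] = reduced  # explicit write-back
    return [buf[epoch_id] for buf in buffers]


def make_buffers():
    # Two simulated ranks, each with a [2, 2] loss buffer and a different row 0.
    return [[[1.0, 2.0], [0.0, 0.0]], [[3.0, 4.0], [0.0, 0.0]]]


print(reduce_temporary_only(make_buffers(), 0))   # [[1.0, 2.0], [3.0, 4.0]]
print(reduce_and_write_back(make_buffers(), 0))   # [[4.0, 6.0], [4.0, 6.0]]
```

If the copy assumption holds for the Paddle version in use, the analogous change would be to keep the indexed row in a named variable, reduce that variable, and assign it back to `loss_buffer[epoch_id]`. Whether indexing copies or aliases depends on the framework version, so this should be checked rather than assumed.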

Fix the En docs (delete some expression like 'This OP')

4

### PR types Others ### PR changes Docs ### Describe 1. Delete some expressions like 'This Op' 2. Remove import numpy as np - cn docs pr: https://github.com/PaddlePaddle/docs/pull/5285

Liyulingyue

contributor

status: proposed

[Dygraph] Fix bugs of mp in eager mode

1

### PR types Bug fixes ### PR changes Others ### Describe [Dygraph] Fix bugs of mp in eager mode

haohongxiang

[BugFix]Fix pooling output_size bug if encounter list[Tensor]

1

### PR types Function optimization ### PR changes APIs ### Describe [Check]Enhance pooling output_size type check

Aurelius84

Remove code that used in yaml's invoke

1

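### PR types Others ### PR changes Others ### Describe The yaml configuration is now largely complete. This PR mainly removes the yaml entries and code that used the invoke configuration early on, when the mechanism was still immature.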

YuanRisheng

Support rsqrt_p

1

### PR types Others ### PR changes Others ### Describe This PR supports rsqrt_p in incubate

JiabinYang

[Sparse] Support static graph

1

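### PR types Others ### PR changes Others ### Describe Static graph support for sparse: 1. Add infer_meta: [#46016](https://github.com/PaddlePaddle/Paddle/pull/46016) 2. Add SparseCooTensor support to framework and pybind; SparseCsrTensor support will follow later 3. Add SparseCooTensor support to feed_op and fetch_op; SparseCsrTensor support will follow later 4. Add sparse_manual_op_sig.cc and sparse_manual_op.cc, for now mainly adding the operators used in 3D point cloud models: - sparse_coo_tensor, indices, values, to_dense, conv, relu, add 5. Only static graph inference is supported for now; backward ops will be added later.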

zkh2016

[CodeStyle] add pre-commit hook `remove-tabs` for python files

### PR types Others ### PR changes Others ### Describe - Flake8 tracking issue: #46039 Add the remove-tabs pre-commit hook to remove tabs automatically, and drop the W191 and E101 error codes from the ignore list in the Flake8 configuration. This PR...

SigureMo

contributor
