PaddleSeg
Bug occurs at runtime
Search before asking
- [X] I have searched the open and closed issues and found no similar bug report.
Describe the Bug
2024-08-08 06:16:46 [INFO]
------------Environment Information-------------
platform: Linux-5.4.0-139-generic-x86_64-with-debian-stretch-sid
Python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: Tesla V100-SXM2-32GB']
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
PaddleSeg: 2.7.0
PaddlePaddle: 2.3.2
OpenCV: 4.1.1
2024-08-08 06:16:46 [INFO]
---------------Config Information---------------
batch_size: 16
iters: 30000
loss:
  coef:
  - 1
  types:
  - coef:
    - 0.8
    - 0.2
    losses:
    - type: CrossEntropyLoss
    - type: LovaszSoftmaxLoss
    type: MixedLoss
lr_scheduler:
  learning_rate: 6.0e-05
  power: 1
  type: PolynomialDecay
model:
  align_corners: true
  backbone:
    in_channels: 1
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/backbone/mix_vision_transformer_b3.tar.gz
    type: MixVisionTransformer_B3
  embedding_dim: 768
  num_classes: 7
  type: SegFormer
optimizer:
  beta1: 0.9
  beta2: 0.999
  type: AdamW
  weight_decay: 0.01
train_dataset:
  dataset_root: /home/aistudio/data/src/
  img_channels: 1
  mode: train
  num_classes: 7
  train_path: /home/aistudio/data_split/train.txt
  transforms:
  - max_scale_factor: 1.25
    min_scale_factor: 0.75
    scale_step_size: 0.25
    type: ResizeStepScaling
  - type: RandomVerticalFlip
  - type: RandomBlur
  - type: RandomRotation
  - type: RandomHorizontalFlip
  - crop_size:
    - 512
    - 512
    type: RandomPaddingCrop
  - type: Normalize
  type: Dataset
val_dataset:
  dataset_root: /home/aistudio/data/src/
  img_channels: 1
  mode: val
  num_classes: 7
  transforms:
  - type: Normalize
  type: Dataset
  val_path: /home/aistudio/data_split/val.txt
W0808 06:16:46.936419 6406 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.2
W0808 06:16:46.936477 6406 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2024-08-08 06:16:48 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/backbone/mix_vision_transformer_b3.tar.gz
2024-08-08 06:16:48 [WARNING] [SKIP] Shape of pretrained params patch_embed1.proj.weight doesn't match.(Pretrained: [64, 3, 7, 7], Actual: [64, 1, 7, 7])
2024-08-08 06:16:48 [INFO] There are 571/572 variables loaded into MixVisionTransformer.
2024-08-08 06:16:48 [INFO] use AMP to train. AMP level = O1
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py:278: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.float16, the right dtype will convert to paddle.float32
format(lhs_dtype, rhs_dtype, lhs_dtype))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py:278: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float16, but right dtype is paddle.float32, the right dtype will convert to paddle.float16
format(lhs_dtype, rhs_dtype, lhs_dtype))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:654: UserWarning: When training, we now always track global mean and variance.
"When training, we now always track global mean and variance.")
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py:278: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.int64, the right dtype will convert to paddle.float32
format(lhs_dtype, rhs_dtype, lhs_dtype))
Traceback (most recent call last):
File "/home/aistudio/PaddleSeg/train.py", line 262, in
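Environment
paddlepaddle 2.7
I ran this program three days ago without any problems, and I have not modified it since. Today the error above suddenly appeared when I ran it, and I do not know what caused it. I now have a rough idea of where the problem lies: I am passing --precision fp16. If I remove that option, the program runs normally. But I was passing the same option a few days ago and it ran fine then. Did you update something?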
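Since the script itself was not changed between the two runs, one way to narrow this down is to compare the "Environment Information" block printed at the start of each run. The following is a minimal sketch in plain Python, assuming both logs contain one `key: value` pair per line; the helper names `parse_env_block` and `diff_env` are hypothetical, and the "earlier" values are invented purely for illustration (no log from the earlier run is available in this report).

```python
def parse_env_block(log_text):
    """Collect selected 'key: value' pairs from an Environment Information block."""
    wanted = {"Python", "NVCC", "cudnn", "PaddleSeg", "PaddlePaddle", "OpenCV"}
    env = {}
    for line in log_text.splitlines():
        # Split on the first ': ' only, so values that contain colons stay intact.
        key, sep, value = line.strip().partition(": ")
        if sep and key in wanted:
            env[key] = value.strip()
    return env


def diff_env(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# "today" uses values from the log in this report.
today = """\
Python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
cudnn: 8.2
PaddleSeg: 2.7.0
PaddlePaddle: 2.3.2
OpenCV: 4.1.1
"""
# Hypothetical earlier log: the changed version is made up to show a difference.
earlier = today.replace("PaddleSeg: 2.7.0", "PaddleSeg: 2.6.0")

print(diff_env(parse_env_block(earlier), parse_env_block(today)))
```

An empty result would suggest the reported package versions are identical in both runs, in which case the difference is more likely to come from the data, the config, or the command-line arguments.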
Bug description confirmation
- [X] I confirm that the bug reproduction steps, a description of code changes, and environment information have been provided, and that the problem can be reproduced.
Are you willing to submit a PR?
- [X] I'd like to help by submitting a PR!