Kim Yann
Further investigation shows the GPU OOM may occur during the eval phase. With `eval_during_train: False` set in https://github.com/PaddlePaddle/PaddleClas/blob/f820473d1d4d5174e57a5a6b08a42f672eb13390/ppcls/configs/ImageNet/ResNet/ResNet50.yaml#L8, the OOM is no longer observed.
@qili93 Intel GPU support for Paddle is paused due to policy and market changes. I guess we can't be sure before the next-gen Falcon Shores GPU.
The main framework uses flat_hash_map, and the deallocate_data it relies on (https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/utils/flat_hash_map.h#L743) is illegal under LLVM.
Are you running on pure CPU, on GPU, or on a custom device?
> It looks like memory grows every time eval runs after a training epoch, and it is never released. I ran into the same problem here: https://github.com/PaddlePaddle/PaddleCustomDevice/issues/670 You can add a VLOG(), as in https://github.com/PaddlePaddle/PaddleCustomDevice/blob/develop/backends/intel_gpu/runtime/runtime.cc#L250, to capture the CustomDevice Allocate/Deallocate calls.
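Once the Allocate/Deallocate calls are being logged, a small script can pair them up to see which allocations are never released. The following is only a sketch: the log line format (`Allocate ptr=... size=...`) and the helper name `live_allocations` are assumptions for illustration, not what the runtime actually prints, so the regex would need adjusting to the VLOG message you add.

```python
import re

# Hypothetical log format (an assumption, adapt to your own VLOG message):
#   "... Allocate ptr=0x1000 size=4096"
#   "... Deallocate ptr=0x1000 size=4096"
LINE_RE = re.compile(
    r"\b(Allocate|Deallocate)\b.*?ptr=(0x[0-9a-fA-F]+).*?size=(\d+)"
)


def live_allocations(lines):
    """Return {ptr: size} for every Allocate with no later Deallocate."""
    live = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an allocator log line
        op, ptr, size = match.group(1), match.group(2), int(match.group(3))
        if op == "Allocate":
            live[ptr] = size
        else:
            live.pop(ptr, None)  # ignore frees of unknown pointers
    return live


# Made-up sample lines, only to show the pairing logic.
sample = [
    "Allocate ptr=0x1000 size=4096",
    "Allocate ptr=0x2000 size=8192",
    "Deallocate ptr=0x1000 size=4096",
]
leaked = live_allocations(sample)
print(leaked, sum(leaked.values()))
```

Running this over the log of one train epoch plus one eval pass, and comparing the total before and after eval, should show whether the growth comes from buffers that are allocated during eval and never freed.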
lazy mode w/o graph:

```bash
python text_to_image_generation.py \
    --model_name_or_path black-forest-labs/FLUX.1-schnell \
    --prompts "A cat holding a sign that says hello world" \
    --num_images_per_prompt 1 \
    --batch_size 1 \
    --num_inference_steps 28...
```
graph mode:

```bash
python text_to_image_generation.py \
    --model_name_or_path black-forest-labs/FLUX.1-schnell \
    --prompts "A cat holding a sign that says hello world" \
    --num_images_per_prompt 1 \
    --batch_size 1 \
    --num_inference_steps 28 \
    --image_save_dir...
```
eager:

```bash
PT_HPU_LAZY_MODE=0 \
python text_to_image_generation.py \
    --model_name_or_path black-forest-labs/FLUX.1-schnell \
    --prompts "A cat holding a sign that says hello world" \
    --num_images_per_prompt 1 \
    --batch_size 1 \
    --num_inference_steps 28 \
    ...
```
Performance With Batching Enabled:

|Device|Mode |Prompts|Images Per Prompt|BS|Steps|FPS  |
|------|-----|-------|-----------------|--|-----|-----|
|G2H   |Graph|1      |4                |4 |28   |0.113|
|G2H   |Graph|5      |1                |5 |28   |0.113|
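To read these numbers as wall-clock time, the FPS column can be converted to seconds per batch. This is a rough sketch that assumes FPS means generated images per second over the whole run, which is not stated above; the helper name `seconds_per_batch` is made up for illustration.

```python
def seconds_per_batch(fps, batch_size):
    """Time to produce one batch, assuming fps = images per second."""
    return batch_size / fps


# (prompts, images_per_prompt, batch_size, fps) taken from the table above.
rows = [
    (1, 4, 4, 0.113),
    (5, 1, 5, 0.113),
]
for prompts, images_per_prompt, batch_size, fps in rows:
    total_images = prompts * images_per_prompt
    print(total_images, round(seconds_per_batch(fps, batch_size), 1))
```

Under that assumption, both configurations produce all their images in a single batch, and the identical FPS means the per-image cost is the same whether the batch comes from one prompt with four images or from five prompts with one image each.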
> Please delete measure_all_500, measure_all, etc.; binary files like npz needn't be uploaded.

@ssarkar2 Removed, please review again.