yiakwy-xpu-ml-framework-team comments

Results 18 comments of yiakwy-xpu-ml-framework-team

fix dataset problems release 2.7

By the way, I found this line while scanning the dataset file: > resized_image = resized_image.astype('float32') The image can be handled effectively as uint8, which allows an 8-bit IO workload and...
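To put a rough number on the uint8 versus float32 point, here is a minimal sketch using only the Python standard library. It is not taken from the dataset file; `buffer_nbytes` is a hypothetical helper, and the 3-channel image shape is an assumption for illustration.

```python
from array import array


def buffer_nbytes(typecode, height, width, channels=3):
    """Return the byte size of a flat image buffer of the given element type.

    'B' is an unsigned 8-bit integer (like uint8) and 'f' is a
    single-precision float (like float32).
    """
    # Allocate a zero-filled buffer with one element per pixel channel.
    buf = array(typecode, [0]) * (height * width * channels)
    return buf.itemsize * len(buf)


if __name__ == "__main__":
    # Hypothetical image size, chosen only for illustration.
    h, w = 640, 640
    print("uint8 bytes:  ", buffer_nbytes("B", h, w))
    print("float32 bytes:", buffer_nbytes("f", h, w))
```

Under these assumptions the float buffer is four times the size of the 8-bit one, which is the extra IO cost that an early `astype('float32')` cast introduces.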

fix dataset problems release 2.7

> @yiakwy-xpu-ml-framework-team you'll need to create a PR targeting the dygraph branch. We decided not to merge this into release/2.7. I guess I can cherry-pick this commit onto the dygraph branch....

0.00000 combineloss when using ch_PP-OCRv4_det_cml.yaml

@tink2123 det_cml is just teacher-student (x2) distillation (with a KL loss that makes the smaller students mimic the teacher output). So you can simply use det_student as the baseline. The...
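The teacher-student idea described above can be sketched as follows. This is a minimal pure-Python illustration of a KL-based distillation loss with one teacher and two students, not PaddleOCR's actual loss implementation; the function names and the example logits are assumptions made up for the sketch.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits):
    """Penalty that shrinks as the student output approaches the teacher's."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))


def total_distillation_loss(teacher_logits, students_logits):
    """Sum of the per-student losses, e.g. for two students (x2)."""
    return sum(distillation_loss(teacher_logits, s) for s in students_logits)


if __name__ == "__main__":
    # Hypothetical logits, for illustration only.
    teacher = [2.0, 0.0, -1.0]
    students = [[1.5, 0.2, -0.8], [0.0, 0.0, 0.0]]
    print(total_distillation_loss(teacher, students))
```

The loss is zero when a student reproduces the teacher distribution exactly and grows as the two diverge, which is why a plain student model trained without this term can serve as a baseline.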

BFloat16 support in multi_tensor_*

@zhengwy888 @yuvalkirstain BFloat16 is already supported in the apex aten::Tensor API: https://github.com/NVIDIA/apex/pull/1407/files Hence I believe this issue should be closed?

CUDA error: out of memory

> Ok, I've done some debugging. This function is literally designed to allocate as much GPU memory as possible (up to what's set in `gpu_memory_utilization`). What's the reasoning behind...

CUDA error: out of memory

Hello, the mailbox owner will read this carefully! Thank you for your attention.

add Dockerfile

Could you update gcc and CUDA, with gcc at gcc-9 (10 and 11 are also welcome)? Details can be found in the Dockerfile in the TensorRT OSS repository. gcc-6 is definitely not the right version for Ubuntu-18.04,...

add Dockerfile

Hello, the mailbox owner will read this carefully! Thank you for your attention.

[BUG] text generation not working for --position-embedding-type rope

@1049451037 This is simply because you were using an m-core model (the m-core model has bugs). You can switch to the legacy model with --use-legacy-model (m-core is now the default).

[backend]Ensure coalescing and saturation for every memop in a slice.

@htyu @ThomasRaoux Is this optimization still on the menu? I am just learning how to automatically coalesce global accesses into SMEM (to make sure data is loaded and stored contiguously). Is there...