Pegessi

Results 4 comments of Pegessi

> It seems that the model is loaded to the device during `transformers.AutoModelForCausalLM.from_pretrained`, and the num_beams error is caused by 'inf', 'nan' or ele

I believe this work is remarkable for its combination of memory and parallelism, and it is great for achieving higher throughput. However, one weak point is the experiments with Megatron-Deepspeed as a baseline...

I have manually integrated GMLake into torch2.1.0. This code cannot be used directly to replace the file in pytorch2.1.0 because of interface changes in cudacachingallocator.h&cpp. Although...

It seems that you have not built libtaso_runtime.so.
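
One quick way to confirm whether a shared library is missing is to try loading it. This is a minimal sketch using only the Python standard library. The helper name `shared_library_loads` is hypothetical and not part of any project mentioned here. A freshly built library that is not on the system search path will not be found by bare name, so pass its file path instead.

```python
import ctypes
import ctypes.util
import os


def shared_library_loads(name_or_path: str) -> bool:
    """Return True if the shared library can be located and loaded."""
    if os.path.isfile(name_or_path):
        # An explicit file path is loaded directly.
        candidate = os.path.abspath(name_or_path)
    else:
        # A bare name such as "taso_runtime" is resolved through the
        # system's library search mechanism; None means it was not found.
        candidate = ctypes.util.find_library(name_or_path)
        if candidate is None:
            return False
    try:
        ctypes.CDLL(candidate)
    except OSError:
        # The file exists but cannot be loaded (wrong architecture,
        # unresolved dependencies, and so on).
        return False
    return True


if __name__ == "__main__":
    # Bare name assumed from the file name libtaso_runtime.so.
    print(shared_library_loads("taso_runtime"))
```

If this prints `False`, the library was either never built or is not on the loader's search path.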