wintersurvival

Results 9 issues of wintersurvival

Thanks for the great work and for sharing! When top_k = 1 and top_p < 0 are set, the image generated by ruDALL-E Malevich (XL) is bad: text: 'радуга на фоне ночного города' ("rainbow against the background of a night city")...

When training with 8 GPUs, the throughput printed by DeepSpeed is much lower than the throughput calculated by the training code: deepspeed SamplesPerSec=505, sample_per_sec: 50120. It seems that the throughput calculated by...
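Two throughput figures can differ simply because they count different things, for example a per-GPU micro-batch versus the global batch, or because they time different windows. The sketch below only illustrates that arithmetic. It is not DeepSpeed's or the training code's actual calculation, and every name and number in it is hypothetical.

```python
def samples_per_sec(samples_per_step, num_steps, elapsed_seconds):
    """Throughput as total samples processed divided by wall-clock time."""
    return samples_per_step * num_steps / elapsed_seconds


# Hypothetical values, chosen only to make the arithmetic easy to follow.
micro_batch_per_gpu = 32
num_gpus = 8
num_steps = 100
elapsed_seconds = 50.0

# Counting only one GPU's micro-batch per step.
per_gpu_rate = samples_per_sec(micro_batch_per_gpu, num_steps, elapsed_seconds)

# Counting the global batch (all GPUs) per step.
global_rate = samples_per_sec(micro_batch_per_gpu * num_gpus, num_steps, elapsed_seconds)

print(per_gpu_rate, global_rate)
```

Comparing the two definitions on the same run is a quick way to check whether a gap comes from what is counted rather than from actual speed.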

When top_k in generate.py is set to 1.0, it often generates blank pictures. In my understanding, it will select the image token with the maximum probability when top_k=1.0. Why does this happen?
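The expectation stated above, that top_k = 1 reduces sampling to picking the most probable token, can be illustrated with a minimal pure-Python sketch. This is not the code in generate.py. The function names and logits are hypothetical, and it assumes a conventional top-k filter that masks everything below the k-th largest logit.

```python
import math


def top_k_filter(logits, k):
    """Keep the k largest logits and mask the rest with -inf (ties at the threshold are kept)."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]


def softmax(logits):
    """Numerically stable softmax; masked (-inf) entries get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for four hypothetical image tokens.
logits = [1.2, 3.4, 0.5, 2.8]

probs = softmax(top_k_filter(logits, k=1))
greedy_token = max(range(len(logits)), key=logits.__getitem__)

print(probs, greedy_token)
```

Under that assumption all probability mass lands on a single token, so generation becomes deterministic and any blank output would come from the greedy path itself rather than from sampling randomness.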

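When running fashionbert on multiple GPUs, I found that the performance with 8 GPUs is about the same as with 4 GPUs. Is this caused by the code that reads data with BundleCSVReader? Has this code been verified in a multi-GPU setting?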

Running the following in the wide_deep directory: python -u ../../../tools/trainer.py config_gpups.yaml raises an error: ValueError: (InvalidArgument) Variable value (input) of OP(fluid.layers.embedding) expected >= 0 and < 1024, but got 737395. Please check input value. [Hint: Expected ids[i] < row_number,...
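The error above reports that an embedding table with 1024 rows received the id 737395. A minimal sketch of scanning a batch for out-of-range ids before the lookup is shown below. The values 1024 and 737395 come from the error message, while the helper name and the other ids are hypothetical and not part of the trainer.

```python
def find_out_of_range_ids(ids, num_rows):
    """Return (position, id) pairs for ids outside the valid range [0, num_rows)."""
    return [(pos, value) for pos, value in enumerate(ids) if not 0 <= value < num_rows]


num_rows = 1024                      # table size reported in the error message
batch_ids = [3, 1023, 737395, 17]    # hypothetical batch containing the offending id

bad = find_out_of_range_ids(batch_ids, num_rows)
print(bad)
```

Whether the right fix is a larger embedding table or a remapping of the input ids depends on the data and config, which the issue text does not state.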

Sometimes an ONNX file has Cast ops that cast the data type to INT64. This case needs to be handled as well, by just modifying the attribute of the Cast ops to...

When running the transformer, there is no bias in selfAttention, whereas mesh_tensorflow/bert has a bias in selfAttention. What is the meaning of relative_attention_type in transformer_layer.SelfAttention? How can I get the bias in transformer_layer.SelfAttention?