SamuelSiu issues

Results: 7 issues by SamuelSiu

lightseq.inference has no attribute 'bert'

I tried to install the fp32 version by building from source. After installing cmake, hdf5 and the other dependencies, I ran the command 'PATH=/usr/local/hdf5/:$PATH ENABLE_FP32=1 ENABLE_DEBUG=1 pip3 install...

The convert_to_ids function in the gallic2022 baseline is very slow

Compared with bert4keras, tokenization and ID conversion here feel slow. Is there something that has not been optimized properly?

Support for gradient-checkpointing

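Hello! Thank you very much for writing this torch framework. Gradient checkpointing is a training method that saves GPU memory, which helps considerably when training large models on limited resources. It is also covered on Su Shen's (苏神) blog, and huggingface's transformers has built-in support for it. Could this feature be added in a later release?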
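To illustrate the request, here is a minimal sketch of gradient checkpointing in plain PyTorch using `torch.utils.checkpoint.checkpoint`. The `Block` and `Encoder` classes and the `use_checkpoint` flag are hypothetical stand-ins, not part of the framework discussed in the issue, and the `use_reentrant=False` argument assumes a reasonably recent PyTorch release.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical residual feed-forward block standing in for a transformer layer."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)


class Encoder(nn.Module):
    """Hypothetical layer stack with an opt-in checkpointing flag."""

    def __init__(self, dim: int = 16, depth: int = 4, use_checkpoint: bool = False) -> None:
        super().__init__()
        self.layers = nn.ModuleList([Block(dim) for _ in range(depth)])
        self.use_checkpoint = use_checkpoint

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            if self.use_checkpoint and self.training:
                # Do not keep this layer's intermediate activations;
                # they are recomputed during the backward pass.
                x = checkpoint(layer, x, use_reentrant=False)
            else:
                x = layer(x)
        return x


torch.manual_seed(0)
model = Encoder(use_checkpoint=True).train()
x = torch.randn(2, 8, 16, requires_grad=True)
model(x).sum().backward()  # gradients flow through the recomputed layers
```

The trade-off is extra compute for less memory: each checkpointed layer runs its forward pass a second time during backpropagation, while the gradients themselves should match the non-checkpointed run.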

Support for gradient checkpointing seems to be broken

Hello! When fine-tuning roformer v2 with gradient checkpointing enabled, the following error is raised:
  File "/root/conda/envs/highbase/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/conda/envs/highbase/lib/python3.7/site-packages/roformer/modeling_roformer.py", line 1120, in forward
    return_dict=return_dict,
  File "/root/conda/envs/highbase/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input,...

The transformation between repetition_penalty and presence_penalty seems to be incorrect

https://github.com/huggingface/text-generation-inference/blob/ed72e9212620d4de10fbe476f0b7af2ab94e4cd7/router/src/server.rs#L1006 When repetition_penalty=1.0 or presence_penalty=0.0, no repetition penalty is applied during generation. However, the transformation in the code sets repetition_penalty to 2 when presence_penalty=0, which breaks that equivalence.
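The mismatch can be sketched in a few lines of Python. The two penalty functions below follow the commonly used definitions (a multiplicative repetition penalty and a subtractive presence penalty) and are not copied from text-generation-inference. `reported_mapping` is a hypothetical helper: the additive form with an offset of 2.0 is an assumption chosen only because it reproduces the single data point stated in the issue (presence_penalty=0 gives repetition_penalty=2).

```python
from typing import List, Set


def apply_repetition_penalty(logits: List[float], seen: Set[int], penalty: float) -> List[float]:
    """Multiplicative penalty: shrink positive logits of seen tokens, push negative ones lower."""
    out = list(logits)
    for token in seen:
        out[token] = out[token] / penalty if out[token] > 0 else out[token] * penalty
    return out


def apply_presence_penalty(logits: List[float], seen: Set[int], penalty: float) -> List[float]:
    """Subtractive penalty: lower the logit of every token that has already appeared."""
    out = list(logits)
    for token in seen:
        out[token] -= penalty
    return out


def reported_mapping(presence_penalty: float) -> float:
    # Assumption: additive form with offset 2.0, matching only the behaviour
    # described in the issue (presence_penalty=0 -> repetition_penalty=2).
    return presence_penalty + 2.0


logits = [2.0, -1.0, 0.5]
seen = {0, 1}  # token ids that have already been generated

neutral_repetition = apply_repetition_penalty(logits, seen, 1.0)
neutral_presence = apply_presence_penalty(logits, seen, 0.0)
after_mapping = apply_repetition_penalty(logits, seen, reported_mapping(0.0))

print(neutral_repetition)  # unchanged logits
print(neutral_presence)    # unchanged logits
print(after_mapping)       # seen tokens are penalized although presence_penalty is 0
```

Under these definitions the neutral values are 1.0 and 0.0 respectively, so a mapping that sends 0.0 to 2.0 turns a request for "no penalty" into a fairly strong one.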

The settings of top_k, typical_p, do_sample in the request do not affect the generation?

https://github.com/huggingface/text-generation-inference/blob/ed72e9212620d4de10fbe476f0b7af2ab94e4cd7/router/src/server.rs#L1053 According to the code here, GenerateParameters only uses req.temperature, req.frequency_penalty and req.top_logprobs, so it seems that we cannot set top_k or do_sample through the request with the...

The implementation of stage 2 with axolotl

Thanks for the wonderful work. I am trying to improve the performance with medusa2. But when I start the training of stage 2 based on the model from stage 1,...