SamuelSiu
I tried to install the fp32 version by building from source. After installing cmake, hdf5 and the other dependencies, I ran the command 'PATH=/usr/local/hdf5/:$PATH ENABLE_FP32=1 ENABLE_DEBUG=1 pip3 install...
Compared with bert4keras, tokenization and conversion to ids feel slow here. Is there something that has not been optimized?
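Hello! Thank you very much for writing this torch framework. Gradient checkpointing is a training method that saves GPU memory, which helps a lot when training large models on limited resources. It is also covered on Su Jianlin's (苏神) blog, and huggingface's transformers has built-in support for it. Could this feature be added in a later release?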
Hello! When fine-tuning roformer v2 with gradient checkpointing enabled, I get the following error: File "/root/conda/envs/highbase/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/root/conda/envs/highbase/lib/python3.7/site-packages/roformer/modeling_roformer.py", line 1120, in forward return_dict=return_dict, File "/root/conda/envs/highbase/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input,...
https://github.com/huggingface/text-generation-inference/blob/ed72e9212620d4de10fbe476f0b7af2ab94e4cd7/router/src/server.rs#L1006 When repetition_penalty=1.0 or presence_penalty=0.0, no repetition penalty should be applied during generation. However, the transformation in the code sets repetition_penalty to 2 when presence_penalty=0, which violates this.
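To make the reported mismatch concrete, here is a minimal Python sketch, not the router's actual Rust code. It assumes the common multiplicative repetition-penalty rule (positive scores are divided by the penalty, negative scores are multiplied by it) and assumes the mapping is "presence_penalty plus a fixed offset", with the offset of 2.0 taken only from the description above. The function names `apply_repetition_penalty` and `presence_to_repetition` are hypothetical.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Return a copy of `logits` with a multiplicative penalty on already-seen tokens."""
    out = list(logits)
    for token_id in set(seen_token_ids):
        score = out[token_id]
        # Positive scores shrink, negative scores grow more negative.
        out[token_id] = score / penalty if score > 0 else score * penalty
    return out


def presence_to_repetition(presence_penalty, offset):
    """Hypothetical mapping from an OpenAI-style presence_penalty to a repetition_penalty."""
    return presence_penalty + offset


logits = [2.0, -1.0, 0.5]
seen = [0, 1]  # tokens 0 and 1 were already generated

# A penalty of 1.0 is neutral: the logits are unchanged.
neutral = apply_repetition_penalty(logits, seen, 1.0)

# Assumed offset of 2.0 (per the report): presence_penalty=0.0 still penalizes.
shifted = apply_repetition_penalty(logits, seen, presence_to_repetition(0.0, 2.0))

print(neutral)
print(shifted)
```

Under these assumptions, a request that asks for no presence penalty still has the scores of seen tokens changed, which is the behavior the report objects to.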
https://github.com/huggingface/text-generation-inference/blob/ed72e9212620d4de10fbe476f0b7af2ab94e4cd7/router/src/server.rs#L1053 According to the code here, GenerateParameters only uses req.temperature, req.frequency_penalty and req.top_logprobs, so it seems that we cannot set top_k or do_sample through a request with the...
Thanks for the wonderful work. I am trying to improve the performance with medusa2. But when I start the training of stage 2 based on the model from stage 1,...