SCAUapc comments

Results: 7 comments by SCAUapc

Is the value different from the nvidia-smi?

I found that nvidia-smi uses MiB and your tool uses MB. Maybe that is the difference? I will go and check.
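The size of the gap between the two units can be checked with a few lines of Python. This is only a sketch: it assumes the tool's "MB" means decimal megabytes (1,000,000 bytes), which the thread does not confirm, and the helper names `mib_to_mb` and `mb_to_mib` are made up for illustration.

```python
# nvidia-smi reports memory in MiB (binary unit).
# Assumption: the tool's "MB" is the decimal unit; the thread does not confirm this.
BYTES_PER_MIB = 1024 ** 2  # 1,048,576 bytes
BYTES_PER_MB = 1000 ** 2   # 1,000,000 bytes


def mib_to_mb(mib):
    """Convert mebibytes (binary) to megabytes (decimal)."""
    return mib * BYTES_PER_MIB / BYTES_PER_MB


def mb_to_mib(mb):
    """Convert megabytes (decimal) to mebibytes (binary)."""
    return mb * BYTES_PER_MB / BYTES_PER_MIB


# The same amount of memory, expressed in both units.
print(mib_to_mb(8192))
```

Under that assumption the same memory reads about 4.9% higher in MB than in MiB, so a small mismatch against nvidia-smi would be expected from the units alone.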

Is the value different from the nvidia-smi?

> Thanks!

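Unable to generate an article with a specified number of characters?

The tokenizer does not split text character by character; it uses SentencePiece. I don't think this can be done accurately, because the model itself does not know how many characters each token contains.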
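A toy illustration of why a token budget does not map onto a character count. The segmentation below is hypothetical and written by hand, not the output of the model's real SentencePiece vocabulary. The only SentencePiece convention relied on is the `▁` (U+2581) marker that stands in for whitespace.

```python
# Hypothetical subword pieces for one sentence (hand-written, not real tokenizer output).
pieces = ["▁机器", "学习", "是", "人工智能", "的", "一个", "分支"]

WHITESPACE_MARKER = "\u2581"  # SentencePiece's meta symbol for a space


def char_count(pieces, marker=WHITESPACE_MARKER):
    """Count the visible characters covered by a list of subword pieces."""
    return sum(len(piece.replace(marker, "")) for piece in pieces)


# Each piece covers a different number of characters.
lengths = [len(piece.replace(WHITESPACE_MARKER, "")) for piece in pieces]
print(len(pieces), char_count(pieces), lengths)
```

Because each piece covers between one and four characters here, limiting generation to N tokens only bounds the character count loosely.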

Why can't I choose which GPU to use after quantization? It still runs on GPU 0.

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'

@zxcvbn114514
> > try this, add them to the top of your code
> >
> > import os
> > os.environ['CUDA_VISIBLE_DEVICES'] = '1'
> >
> > Is there any...
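A slightly fuller sketch of the same suggestion. The helper name `select_gpus` is made up for illustration. The variable has to be set before the framework initializes CUDA, so the import is shown after the call. Inside the process the visible devices are renumbered from 0, so physical GPU 1 would appear as device 0 to the framework.

```python
import os


def select_gpus(ids):
    """Restrict this process to the given physical GPU indices.

    Must be called before any library initializes CUDA, otherwise it has no effect.
    """
    value = ",".join(str(i) for i in ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


select_gpus([1, 2])

# Import the deep learning framework only after the variable is set, e.g.:
# import torch
# torch.cuda.device_count()  # would count only the visible devices
```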

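I see in the config that the vocab has about 150,000 entries, which is very large. I once used BERT in a large-scale classification project, and because the final layer had many output classes (several thousand), computing the probability of each class, including the softmax step, turned out to be slow and computationally expensive. For Chinese-only use, I think the vocabulary could be compressed based on a vocabulary like BERT's, and the first-layer token embedding layer rewritten accordingly (drop the unneeded token embeddings and update the vocabulary to match). That should reduce some of the latency.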
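The vocabulary compression idea can be sketched in plain Python. Everything here is a toy assumption: `prune_vocab`, the five-token vocabulary, and the two-dimensional embeddings are invented for illustration. A real model would also need its output projection and tokenizer rebuilt to match, and the idea only works if the kept tokens can still tokenize the target text.

```python
def prune_vocab(vocab, embeddings, keep_tokens):
    """Keep only selected tokens and re-index the embedding rows to match.

    vocab:       dict mapping token -> old row index
    embeddings:  list of embedding rows, indexed by old row index
    keep_tokens: iterable of tokens to keep
    """
    keep = set(keep_tokens)
    new_vocab = {}
    new_embeddings = []
    # Walk tokens in their original id order so relative order is preserved.
    for token, old_id in sorted(vocab.items(), key=lambda item: item[1]):
        if token in keep:
            new_vocab[token] = len(new_embeddings)
            new_embeddings.append(embeddings[old_id])
    return new_vocab, new_embeddings


# Toy data (hypothetical): drop the English tokens for a Chinese-only setting.
vocab = {"[UNK]": 0, "hello": 1, "你": 2, "好": 3, "world": 4}
embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

small_vocab, small_embeddings = prune_vocab(vocab, embeddings, ["[UNK]", "你", "好"])
print(small_vocab, len(small_embeddings))
```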

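Inference is a bit slow. Is there a good way to speed it up?

> I see in the config that the vocab has about 150,000 entries, which is very large. I once used BERT in a large-scale classification project, and because the final layer had many output classes (several thousand), computing the probability of each class, including the softmax step, turned out to be slow and computationally expensive. For Chinese-only use, I think the vocabulary could be compressed based on a vocabulary like BERT's, and the first-layer token embedding layer rewritten accordingly (drop the unneeded token embeddings and update the vocabulary to match). That should reduce some of the latency.

OK, I looked into it today. The tokenizer does not split text into single characters the way BERT and GPT models used to; it does real word-level segmentation, so this approach won't work.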

issue about the conv2d operation

I have the same problem as you, and I agree with you. I hope the author will reply.