fastllm

Bug report: Tokenizer

Open lonelywm opened this issue 1 year ago • 4 comments

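ChatGLM's official implementation tokenizes with the sentencepiece library (import sentencepiece as spm). Its call self.sp.EncodeAsPieces(text) turns an English word such as "hello" into "▁hello". Note that the leading character is U+2581 (▁), not an ASCII underscore. This should be the most standard approach, and this project does not seem to do anything similar.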

lonelywm, May 30 '23 09:05
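To make the "▁hello" behaviour concrete, here is a minimal sketch of SentencePiece-style whitespace marking. It is a simplified emulation, not the sentencepiece library: it only splits at spaces, whereas a real model would further split words into subword pieces according to its vocabulary. The function names are hypothetical and chosen for illustration.

```python
from typing import List

# U+2581 LOWER ONE EIGHTH BLOCK, the whitespace marker; it is not "_".
WS_MARK = "\u2581"


def mark_whitespace(text: str) -> str:
    """Prefix the text with the marker and replace every space with it."""
    return WS_MARK + text.replace(" ", WS_MARK)


def split_into_word_pieces(text: str) -> List[str]:
    """Split marked text so that each piece starts with its marker.

    Assumption: word-level splitting only; a trained model would
    usually produce smaller subword pieces.
    """
    pieces: List[str] = []
    current = ""
    for ch in mark_whitespace(text):
        # A marker starts a new piece, so flush whatever was collected.
        if ch == WS_MARK and current:
            pieces.append(current)
            current = ""
        current += ch
    if current:
        pieces.append(current)
    return pieces


if __name__ == "__main__":
    print(split_into_word_pieces("hello world"))
```

Because the space is stored inside the pieces themselves, a tokenizer that ignores the marker on the encode side will look up different vocabulary entries than the official one.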

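Yes.. this is a bit troublesome and has not been done yet, although Chinese models still work reasonably well. I will consider adding it later, or exposing a py interface so that the native tokenizer can be used.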

ztxz16, Jun 16 '23 09:06

@ztxz16 Could you please take a look at this small problem? Testing LLaMA on a V100: [ user: "Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

Building a website can be done in 10 simple steps.

Response:", model: "Buildingawebsitecanbeacrazyeasyprocess.First,youneedtopickawebsitebuilder.Therearemanymanytoppickfrom.Someofthemarefree,someofthemarepaid.Youneedtopickonethatfitsyourneeds.Next,youneedtopickawordpresstheme.Thiswillbeyourwebsite'sdesign.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpressplugin.Thiswilladdextrafunctionalitytoyourwebsite.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpresstemplate.Thiswillbeyourwebsite'slayout.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpresspage.Thiswillbeyourwebsite'scontent.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpresscategory.Thiswillorganizeyourwebsite'scontent.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpresspost.Thiswilladdnewcontenttoyourwebsite.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpressuser.Thiswillallowyouandotherstomanageyourwebsite.Choosethatwhichbestsuitsyourneeds.Finally,youneedtopublishyourwebsite.Choosethatwhichbestsuitsyourneeds.Andthat'sit!Buildingawebsiteisasimpleasthrowingapartyforfriends."]

batch: 1 output 336 tokens use 18.232920 s speed = 18.428205 tokens / s

Why are the words in the model's response not separated by spaces? Is this a bug in std::string Tokenizer::DecodeTokens(const std::vector &tokens)?

authwork, Jul 10 '23 06:07

It is a bug.. For now you can try using the Python tokenizer.

ztxz16, Jul 10 '23 07:07
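Output with every space missing is what one would see if a decoder discarded the U+2581 marker instead of mapping it back to a space. The sketch below contrasts the two behaviours. It is an illustration under that assumption only: the thread does not say what the actual cause inside DecodeTokens is, and both function names are hypothetical.

```python
from typing import List

# U+2581, the SentencePiece-style whitespace marker.
WS_MARK = "\u2581"


def decode_pieces(pieces: List[str]) -> str:
    """Join pieces, map each marker back to a space, drop the leading space."""
    text = "".join(pieces).replace(WS_MARK, " ")
    return text[1:] if text.startswith(" ") else text


def decode_pieces_dropping_marker(pieces: List[str]) -> str:
    """Hypothetical faulty decoder: throws the marker away entirely."""
    return "".join(pieces).replace(WS_MARK, "")


if __name__ == "__main__":
    pieces = ["\u2581Building", "\u2581a", "\u2581website"]
    print(decode_pieces(pieces))
    print(decode_pieces_dropping_marker(pieces))
```

The second function reproduces the run-together style of the response quoted above, which is why checking how the marker is handled is a reasonable first step when debugging.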

@ztxz16 Testing on the V100 also shows a small problem. With the same prompt, "Building a website can be done in 10 simple steps.":

i8, batch=2 batch: 2 output 670 tokens (normal) use 43.564709 s speed = 15.379421 tokens / s

f16, batch=1 batch: 1 output 336 tokens (normal) use 17.873089 s speed = 18.799213 tokens / s

f16, batch=2 batch: 2 output 14 tokens (stopped abruptly) use 1.104086 s speed = 12.680171 tokens / s

Also, i8 performance is about the same as f16. Is that because of the GPU model? And is there a test script for benchmarking through the Python interface?

authwork, Jul 10 '23 09:07
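The speed figures in these log lines are the total number of output tokens divided by the elapsed seconds, so a batch=2 figure is an aggregate over both sequences. A small sketch that recomputes the value from a log line follows. The regular expression is an assumption based only on the lines quoted in this thread (including the optional parenthetical note added by the commenter), and the function name is hypothetical.

```python
import re

# Matches e.g. "batch: 1 output 336 tokens use 18.232920 s ...".
# An optional "(...)" note between "tokens" and "use" is tolerated.
LOG_PATTERN = re.compile(
    r"batch:\s*(\d+)\s+output\s+(\d+)\s+tokens\s+(?:\([^)]*\)\s+)?use\s+([\d.]+)\s*s"
)


def tokens_per_second(log_line: str) -> float:
    """Recompute throughput as output tokens divided by elapsed seconds."""
    match = LOG_PATTERN.search(log_line)
    if match is None:
        raise ValueError("unrecognized log line: %r" % log_line)
    tokens = int(match.group(2))
    seconds = float(match.group(3))
    return tokens / seconds


if __name__ == "__main__":
    line = "batch: 1 output 336 tokens use 18.232920 s speed = 18.428205 tokens / s"
    print(round(tokens_per_second(line), 4))
```

Recomputing the number this way makes it easier to compare runs with different batch sizes on the same basis.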

Hi, how should the Python tokenizer be used? Is there an example?

levishen, Jul 18 '23 09:07