fastllm: Bug report about the Tokenizer

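In the official ChatGLM implementation, tokenization uses the sentencepiece library (import sentencepiece as spm). In that library, the call self.sp.EncodeAsPieces(text) turns an English word such as "hello" into "▁hello". Note that the leading character is "▁" (U+2581), not an underscore. This appears to be the standard approach, and this project does not seem to do any similar processing.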
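To make the convention described above concrete, here is a minimal sketch in plain Python. It does not call sentencepiece and is not fastllm code; the helper name escape_whitespace is hypothetical. It only shows the whitespace escaping step (spaces become U+2581, with a marker prepended to the text, which is SentencePiece's default add_dummy_prefix behaviour). How the escaped text is then split into pieces depends on the trained model.

```python
# Sketch of the SentencePiece whitespace convention (not the library itself).
WS_MARK = "\u2581"  # "▁", LOWER ONE EIGHTH BLOCK; not the ASCII underscore "_"


def escape_whitespace(text: str, add_dummy_prefix: bool = True) -> str:
    """Replace every space with the marker, optionally prepending one marker."""
    if add_dummy_prefix:
        text = " " + text
    return text.replace(" ", WS_MARK)


print(escape_whitespace("hello"))        # ▁hello
print(escape_whitespace("hello world"))  # ▁hello▁world
print(WS_MARK == "_")                    # False
```

A tokenizer that looks up "hello" instead of "▁hello" in the vocabulary would pick different token ids than the reference implementation, which is the mismatch this report is about.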

May 30 '23 09:05 lonelywm

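Yes.. this one is a bit of a hassle, so it has not been done yet. Results with Chinese models are still decent, though. We will consider adding it later, or providing a py interface so that the native tokenizer can be used.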

Jun 16 '23 09:06 ztxz16

@ztxz16 Could you please take a look at this small issue? Testing LLaMA on a V100: [ user: "Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

Building a website can be done in 10 simple steps.

Response:", model: "Buildingawebsitecanbeacrazyeasyprocess.First,youneedtopickawebsitebuilder.Therearemanymanytoppickfrom.Someofthemarefree,someofthemarepaid.Youneedtopickonethatfitsyourneeds.Next,youneedtopickawordpresstheme.Thiswillbeyourwebsite'sdesign.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpressplugin.Thiswilladdextrafunctionalitytoyourwebsite.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpresstemplate.Thiswillbeyourwebsite'slayout.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpresspage.Thiswillbeyourwebsite'scontent.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpresscategory.Thiswillorganizeyourwebsite'scontent.Choosethatwhichbestsuitsyourneeds.Then,youneedtopickawordpresspost.Thiswilladdnewcontenttoyourwebsite.Choosethatwhichbestsuitsyourneeds.Next,youneedtopickawordpressuser.Thiswillallowyouandotherstomanageyourwebsite.Choosethatwhichbestsuitsyourneeds.Finally,youneedtopublishyourwebsite.Choosethatwhichbestsuitsyourneeds.Andthat'sit!Buildingawebsiteisasimpleasthrowingapartyforfriends."]

batch: 1 output 336 tokens use 18.232920 s speed = 18.428205 tokens / s

Why is the model's Response not separated by spaces? Is this a bug in std::string Tokenizer::DecodeTokens(const std::vector &tokens)?
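For reference, a hedged sketch of what a SentencePiece-style decode step is expected to do, and how output like the above could arise if the "▁" marker is dropped instead of being turned back into a space. This is an illustration in Python, not the code of Tokenizer::DecodeTokens; the piece split below is made up for the example, and whether this is the actual cause in fastllm is an assumption.

```python
WS_MARK = "\u2581"  # "▁", the SentencePiece whitespace marker


def decode_pieces(pieces):
    """Join pieces, map the marker back to a space, drop the leading space."""
    text = "".join(pieces).replace(WS_MARK, " ")
    return text[1:] if text.startswith(" ") else text


# Hypothetical piece sequence; a real model may split the words differently.
pieces = ["▁Building", "▁a", "▁web", "site", "▁can", "▁be"]

print(decode_pieces(pieces))
# Building a website can be

# Dropping the marker instead of mapping it to a space loses every word break.
naive = "".join(p.replace(WS_MARK, "") for p in pieces)
print(naive)
# Buildingawebsitecanbe
```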

Jul 10 '23 06:07 authwork

It is a bug.. For now you can try using the Python tokenizer.

Jul 10 '23 07:07 ztxz16

@ztxz16 There are also a few small problems when testing on the V100, with the same prompt "Building a website can be done in 10 simple steps.":

i8, batch=2 batch: 2 output 670 tokens (normal) use 43.564709 s speed = 15.379421 tokens / s

f16, batch=1 batch: 1 output 336 tokens (normal) use 17.873089 s speed = 18.799213 tokens / s

f16, batch=2 batch: 2 output 14 tokens (stopped abruptly) use 1.104086 s speed = 12.680171 tokens / s

Also, i8 performance is about the same as f16. Is that because of the GPU model? In addition, is there a benchmark file for testing performance through the Python interface?
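The speed figures above are the total number of output tokens divided by the wall-clock time (for example 336 / 18.232920 ≈ 18.43 tokens / s). A minimal timing sketch in Python follows. It is not a fastllm benchmark file: measure_throughput and fake_generate are hypothetical names, and fake_generate is a stand-in that would have to be replaced with a real call into whatever Python interface is being measured.

```python
import time


def measure_throughput(generate, prompts):
    """Time one batched call; return (total_tokens, seconds, tokens_per_second)."""
    start = time.perf_counter()
    outputs = generate(prompts)  # expected: one list of token ids per prompt
    elapsed = time.perf_counter() - start
    total_tokens = sum(len(tokens) for tokens in outputs)
    return total_tokens, elapsed, total_tokens / elapsed


def fake_generate(prompts):
    """Stand-in for a real model call: 8 dummy token ids per prompt."""
    time.sleep(0.01)
    return [[0] * 8 for _ in prompts]


total, elapsed, speed = measure_throughput(fake_generate, ["prompt a", "prompt b"])
print(f"batch: 2 output {total} tokens use {elapsed:.6f} s speed = {speed:.6f} tokens / s")
```

Counting tokens from the returned ids, not from the decoded string, keeps the figure independent of any decoding problem such as the missing spaces reported earlier in this thread.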

Jul 10 '23 09:07 authwork

Hello, how should the Python tokenizer be used? Is there an example?

Jul 18 '23 09:07 levishen
