regexp doesn't support ?!
panic: regexp: Compile("(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1}| ?[^\s\p{L}\p{N}\r\n]+|\s*[\r\n]+|\s+(?!\S)|\s+"): error parsing regexp: invalid or unsupported Perl syntax: (?! [recovered]
panic: regexp: Compile("(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1}| ?[^\s\p{L}\p{N}\r\n]+|\s*[\r\n]+|\s+(?!\S)|\s+"): error parsing regexp: invalid or unsupported Perl syntax: (?!
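For anyone landing here, the failing piece appears to be the negative lookahead `(?!\S)` rather than the `(?i:...)` group: Go's standard `regexp` package is RE2-based and has no lookahead support. A minimal sketch to confirm this locally (the helper name `compiles` is made up for illustration and is not part of any library):

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper: it returns the error (if any)
// that Go's standard regexp package reports for the given pattern.
func compiles(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	// A case-insensitive flag group is accepted by Go's regexp.
	fmt.Println("flag group:", compiles(`(?i:'s|'t|'re|'ve|'m|'ll|'d)`))

	// A negative lookahead is rejected with a non-nil parse error.
	fmt.Println("lookahead: ", compiles(`\s+(?!\S)`))
}
```

The first call should report no error, and the second should report the same "invalid or unsupported Perl syntax" error as in the panic above.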
Is anyone watching this? I got the same problem.
@shibingli I have merged your repo and fixed some 'import/go.mod' errors. Now it works. https://github.com/whitezhang/tokenizer
I'm not able to count tokens for gpt-4 and gpt-3.5-turbo; I'm getting the same error:
--- FAIL: TestGetTokenCountSugarMe (0.10s)
panic: regexp: Compile(`(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+`): error parsing regexp: invalid or unsupported Perl syntax: `(?!` [recovered]
panic: regexp: Compile(`(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+`): error parsing regexp: invalid or unsupported Perl syntax: `(?!`
model used: "Xenova/gpt-3.5-turbo", "Xenova/gpt-4",
Can anyone help? I have no bandwidth at the moment. Thanks.
Hi @sugarme, I have drafted PR #60 to fix this. It would be awesome if you could give it a look.
Thanks for the fascinating project!
For package users, if you want to test the fix locally:
go mod edit -replace=github.com/sugarme/tokenizer=github.com/nanmu42/go-tokenizer@master
go mod tidy
Cheers.
Hi @sugarme, thanks for your lib. I've encountered the same issue. Could you please let me know when it will be fixed? @nanmu42 this repo does not exist (github.com/nanmu42/go-tokenizer@master). Could anyone solve this problem?
this repo does not exist (github.com/nanmu42/go-tokenizer@master). Could anyone solve this problem?
It's still at https://github.com/nanmu42/go-tokenizer and no changes were made.
Try:
go mod edit -replace=github.com/sugarme/tokenizer=github.com/nanmu42/go-tokenizer@master
go mod tidy