GPT3-Tokenizer

GPT-4 Support

Open collinhundley opened this issue 1 year ago • 4 comments

What would it take to support GPT-4 encoding with the cl100k_base vocab? I haven't found any Swift libraries that support this yet.

collinhundley avatar May 15 '23 21:05 collinhundley

Hi! I'm working on it. I have a port of Tiktoken written in Swift, but it's not ready yet. When I finish it, I will publish it under my GitHub account.

This thread will be closed when the library is ready.

Thanks for your interest in my project.

aespinilla avatar May 16 '23 07:05 aespinilla

@aespinilla that’s great news! Any estimate on how long it will be until you publish it?

collinhundley avatar May 16 '23 14:05 collinhundley

I have a proof of concept that still has some errors, only with the cl100k_base vocab, and some performance issues. I think I will be able to have a beta version next week.

aespinilla avatar May 16 '23 17:05 aespinilla

Hi @collinhundley! I have published a very basic port of OpenAI's tiktoken. You can check it out in this repository: https://github.com/aespinilla/Tiktoken

Yes, it supports GPT-4 and other vocabs 😉

aespinilla avatar May 18 '23 09:05 aespinilla