tokenizer topic

List tokenizer repositories

gpt3-tokenizer

172 Stars, 19 Forks

Isomorphic JavaScript/TypeScript Tokenizer for GPT-3 and Codex Models by OpenAI.

tiktoken-rs

209 Stars, 39 Forks

Ready-made tokenizer library for working with GPT and tiktoken

SharpToken

249 Stars, 17 Forks, 249 Watchers

SharpToken is a C# library for tokenizing natural language text. It's based on the tiktoken Python library and designed to be fast and accurate.

Cledev.OpenAI

111 Stars, 20 Forks

.NET 7 SDK for OpenAI with a Blazor Server playground

openai-tools

95 Stars, 13 Forks

A collection of tools for working with OpenAI

go-gpt-3-encoder

78 Stars, 20 Forks

Go BPE tokenizer (Encoder+Decoder) for GPT-2 and GPT-3

GPTEncoder

76 Stars, 20 Forks

Swift BPE Encoder/Decoder for OpenAI GPT Models. A programmatic interface for tokenizing text for the OpenAI ChatGPT API.

Roy_VnTokenizer

53 Stars, 36 Forks

Vietnamese tokenizer (Maximum Matching and CRF)

talismane

48 Stars, 14 Forks

NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser

alm

47 Stars, 5 Forks

Smart Language Model