outlines-core
Vocabulary/GPT2: Bad interpretation of tokenId = 216
Describe the issue as clearly as possible:
TokenId(216) of the GPT2 alphabet, which has the value "\u011c", has only the single byte 28 (0x1C) in its Vec, rather than the two bytes of the UTF-8 encoding of that character.
Steps/code to reproduce the bug:
//
Expected result:
TokenId(216) = vec![0xC4, 0x9C];
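To make the two readings of "\u011c" concrete, here is a minimal sketch that does not use outlines-core at all. It assumes the GPT2 alphabet follows the byte-to-character table published with GPT-2 (printable bytes map to themselves, the remaining bytes are shifted to U+0100 and above). The function names `gpt2_byte_to_char`, `gpt2_char_to_byte`, and `utf8_bytes` are hypothetical helpers for illustration, not part of any library.

```rust
/// Sketch of GPT-2's byte-to-character table (the "GPT2 alphabet").
/// Printable bytes map to themselves; all other bytes are assigned
/// consecutive code points starting at U+0100.
fn gpt2_byte_to_char(byte: u8) -> char {
    let is_printable = |b: u8| matches!(b, 0x21..=0x7E | 0xA1..=0xAC | 0xAE..=0xFF);
    if is_printable(byte) {
        return char::from(byte);
    }
    // Number of non-printable bytes that come before this one.
    let offset = (0..byte).filter(|&b| !is_printable(b)).count() as u32;
    char::from_u32(0x100 + offset).expect("offset is at most 67, so this is a valid code point")
}

/// Inverse lookup: the raw byte that an alphabet character stands for, if any.
fn gpt2_char_to_byte(c: char) -> Option<u8> {
    (0..=u8::MAX).find(|&b| gpt2_byte_to_char(b) == c)
}

/// The other reading: the UTF-8 encoding of the character itself.
fn utf8_bytes(c: char) -> Vec<u8> {
    c.to_string().into_bytes()
}
```

Under these assumptions, `gpt2_char_to_byte('\u{11C}')` yields `Some(28)`, which matches the single byte currently stored, while `utf8_bytes('\u{11C}')` yields `[0xC4, 0x9C]`, which matches the expected result above. The question for this issue is which of the two the vocabulary is meant to store.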
Error message:
Outlines/Python version information:
Version information
```
(command output here)
```
Context for the issue:
No response