tiktoken icon indicating copy to clipboard operation
tiktoken copied to clipboard

Add Terminal-Based Visualization Tool for Tokenized Data Points in Tiktoken Tokenizer

Open LVivona opened this issue 1 year ago • 0 comments
trafficstars

Key Features

  • Token Visualization: Display token, and their positions in the input text.
  • Interactive Interface: Allows users to input text and see the tokenized output in real-time.
  • Enhanced Debugging: Facilitates better understanding of tokenizer behavior and helps in identifying issues.

Benefits

  • Improved Usability: Provides a straightforward way to visualize and analyze tokenized data.
  • Immediate Feedback: Users can quickly see the effects of changes to the tokenizer.
  • Educational: Helps users grasp model information and the tokenization process more effectively.
import tiktoken
enc = tiktoken.get_encoding("gpt2")
enc.environment()
Screenshot 2024-06-18 at 12 17 01 PM

LVivona avatar Jun 18 '24 17:06 LVivona