
NPM package is very large

Open anzemur opened this issue 1 year ago • 6 comments

I just noticed that this package is around 13 MB when unpacked, which pushed me over my AWS Lambda package size limit. That is far too big for a serverless deployment.

So my questions are:

  • Why does the package contain saved encoder files, and in three different formats, each up to 1 MB in size? Is this really necessary? Does the Python version handle encoders the same way, or does it always download them from the repository?
  • Which of the three formats (js, cjs, or json) is actually used for encoding, so that I can manually remove the others from my build?
  • Would it be possible to publish a smaller package that contains only the encoders the user needs?
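
For anyone who wants to check the numbers on their own install, here is a rough sketch that sums file sizes per extension inside a package directory. The `node_modules/tiktoken` path and the `sizeByExtension` helper name are assumptions made for illustration; adjust the path to your project layout.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively sum file sizes (in bytes) under `dir`, grouped by file extension.
// `sizeByExtension` is a hypothetical helper, not part of any package.
function sizeByExtension(dir, totals = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      sizeByExtension(fullPath, totals);
    } else if (entry.isFile()) {
      // Files without an extension are grouped under "(none)".
      const ext = path.extname(entry.name) || "(none)";
      totals[ext] = (totals[ext] || 0) + fs.statSync(fullPath).size;
    }
  }
  return totals;
}

// Assumed install location; nothing is printed if the directory does not exist.
const packageDir = path.join(process.cwd(), "node_modules", "tiktoken");
if (fs.existsSync(packageDir)) {
  console.table(sizeByExtension(packageDir));
}
```

Comparing the `.js`, `.cjs`, and `.json` totals should show how much of the unpacked size comes from each format.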

anzemur, Aug 29 '23 12:08