Jim Schwoebel
Definitely thinking about a next version of the book here and hopefully will be able to add those examples 👍
Working on this now.
Sorry - been totally afk here on this project due to work priorities. I may get back to this a bit later this year.
Note this is mostly on MacBook Pros / macOS - trying this on Linux now.
This is a known issue - working on this for future versions. Note that for now I recommend setting only 1 or 2 model trainers.
Train new vocabularies and tokenize, using today's most used tokenizers. Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB...
This can be solved by using IPv6 instead of IPv4 and editing the hosts file so that you have access to youtube.com, and I use the ping command to confirm...
Hey Sachindra - I could totally do this at some point - I've been heads-down in work so haven't had a lot of time to work on this, but will...
Thanks @Path-A for the PR here. Going to keep this open as most folks are on Mac. But agree yt-dlp is probably a better library to use moving forward for...
Absolutely - thanks for getting back so quickly. I'll let you know how it goes.