
A question about "6. The tokenizers library"

Open catsled opened this issue 1 year ago • 0 comments

While studying "6. The tokenizers library: Unigram tokenization", I couldn't understand why the following image gives P("pu") = 5/210. Shouldn't it be 17/210? After all, P("g") = 20/210 follows directly from the frequency of "g" being 20, so I would expect P("pu") to use the frequency of "pu" in the same way.
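To make the arithmetic concrete, here is a minimal sketch of how I understand the unigram probability: the token's frequency divided by the sum of all token frequencies. It uses only the numbers quoted above. The count of 17 for "pu" is my own reading of the frequency table, not something the image confirms, and `unigram_prob` is just a hypothetical helper name.

```python
from fractions import Fraction

# Sum of all token frequencies, as quoted in the question.
TOTAL = 210

# Frequencies quoted in the question. The value for "pu" is the
# count the question assumes (17), not the 5 shown in the image.
freq = {"g": 20, "pu": 17}


def unigram_prob(token: str) -> Fraction:
    """Unigram probability: token frequency over the total frequency."""
    return Fraction(freq[token], TOTAL)


p_g = unigram_prob("g")
p_pu = unigram_prob("pu")

# Probability of the segmentation ["pu", "g"] under a unigram model:
# the product of the individual token probabilities.
p_pu_g = p_pu * p_g

print("P(g)       =", p_g)
print("P(pu)      =", p_pu)
print("P(pu, g)   =", p_pu_g, "~", float(p_pu_g))
```

If the course really intends 5/210 for "pu", then the probability is not being computed from the token frequency this way, and that is the part I would like explained.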

catsled, Apr 11 '23 23:04