course issues

Results 169 course issues


a question about the "6.tokenizers library"

When I studied "6. The tokenizers library -- Unigram tokenization", I couldn't understand why P("pu") = 5/210. Shouldn't it be 17/210, because...

catsled

Mistranslation in Chinese (simplified) zh-CN version in chapter 3-2.mdx

Hi, thanks for your excellent course and translation. Recently, I found one (possible) mistranslation while working through the course. In [2.mdx of chapter 3 in zh-CN](https://github.com/huggingface/course/blob/main/chapters/zh-CN/chapter3/2.mdx), I found that the word 'datasets' was misspelled...

wwzhuang01

docs(zh-cn): Reviewed 36_slice-and-dice-a-dataset-🔪.srt

docs(zh-cn): Reviewed 36_slice-and-dice-a-dataset-🔪.srt #390

tyisme614

araic setup

maljefairi

Fix Korean typo

Fix typo.

partrita

Changing `tokenized_dataset` to `tokenized_datasets`

The variable `tokenized_dataset` isn't initialized in the code, which leads to an exception when the cell is reached. The variable `tokenized_datasets`, on the other hand, does exist. It was probably intended to be...

jpodivin

Incorrect variable name `tokenized_dataset` in Chapter 7 section 6 of course

The second portion of the chapter, which focuses on the `accelerate` library, uses the `tokenized_dataset` variable. However, this variable doesn't exist, leading to an error if a person following the guide attempts to execute the...

jpodivin

Fixing the translation Zh-CN/chapter2/

Make it easier to understand

buqieryul

Helpful tip

This PR adds a helpful tip for users to ensure proper setup before using the MRPC dataset. The tip reminds users to check if the `datasets` package is installed by...

tal7aouy

Imprecise description about removing token "pu" in section Unigram tokenization

https://huggingface.co/learn/nlp-course/chapter6/7?fw=pt says: > In this (very) particular case, we had two equivalent tokenizations of all the words: as we saw earlier, for example, "pug" could be tokenized ["p", "ug"] with...

yaojingguo
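The two Unigram issues in this list both turn on how a token's probability is derived from corpus counts. Here is a minimal sketch of that arithmetic. The five-word toy corpus and the counting rule (a token's frequency is the summed count of the words containing it, over all strict substrings) are assumptions for illustration and are not quoted in the issue text shown here. Under those assumptions, "pu" accounts for 17 of 210 total occurrences, the figure the first question proposes.

```python
from fractions import Fraction

# Assumed toy corpus (word -> count); hypothetical, not taken from the issue text.
word_freqs = {"hug": 10, "pug": 5, "pun": 12, "bun": 4, "hugs": 5}


def strict_substrings(word):
    """Return every contiguous substring of `word` except the word itself."""
    n = len(word)
    subs = {word[i:j] for i in range(n) for j in range(i + 1, n + 1)}
    subs.discard(word)
    return subs


# Candidate vocabulary: every strict substring of at least one corpus word.
vocab = set()
for word in word_freqs:
    vocab |= strict_substrings(word)

# Assumed counting rule: a token's frequency is the summed count of
# the corpus words that contain it.
token_freqs = {
    token: sum(freq for word, freq in word_freqs.items() if token in word)
    for token in vocab
}

total = sum(token_freqs.values())
p_pu = Fraction(token_freqs["pu"], total)

print(f'freq("pu") = {token_freqs["pu"]}, total = {total}, P("pu") = {p_pu}')
```

Both "pug" (5) and "pun" (12) contain "pu", which is where the 17 comes from. A value of 5 would correspond to counting "pug" alone.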
