🌐 [i18n-KO] Translating docs to Korean
Hi!
Let's bring the documentation to the whole Korean-speaking community 🌐 (currently 9 out of 77 complete).
Would you like to translate? Please follow the 🤗 TRANSLATING guide. Here is a list of the files ready for translation. Let us know in this issue if you'd like to translate any, and we'll add your name to the list.
Some notes:
- Please translate using an informal tone (imagine you are talking with a friend about transformers 🤗).
- Please translate in a gender-neutral way.
- Add your translations to the folder called `ko` inside the `source` folder.
- Register your translation in ko/_toctree.yml; please follow the order of the English version.
- Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue. Please ping @ArthurZucker, @sgugger and @eunseojo for review.
- If you'd like others to help you with the translation, you can also post in the 🤗 forums.
- With the HuggingFace Documentation l10n initiative of Pseudo Lab, full translation will be done even faster. Please give us your support! Cheers to our team: @0525hhgus, @KIHOON71, @gabrielwithappy, @jungnerd, @sim-so, @HanNayeoniee, @wonhyeongseo
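Since the `ko/_toctree.yml` registration step trips up many first-time contributors, here is a minimal sketch of what an entry might look like. The section titles and file names below are illustrative placeholders, not the real Korean ToC; mirror the order and nesting of the actual English `_toctree.yml` when adding real entries.

```yml
# docs/source/ko/_toctree.yml (illustrative fragment only)
- sections:
  - local: index        # corresponds to ko/index.mdx
    title: 🤗 Transformers
  - local: quicktour    # corresponds to ko/quicktour.mdx
    title: (translated "Quick tour" title goes here)
  title: (translated "Get started" section title goes here)
```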
Hello!
Let's make the docs readable for everyone who speaks Korean 🙂 (currently 9 out of 77 documents complete).
Would you like to join the translation effort? Please read the 🤗 translation guide first. The files that need translating are listed below. If there is a file you'd like to work on, let us know briefly in this issue, and we will mark it as in progress so that work is not duplicated.
Notes:
- These are technical documents, but they should read easily (as if you were explaining to a friend). Please write in polite speech.
- Grammatical gender only applies to some languages (Spanish, French, etc.); for Korean, after using a machine translator, please check that punctuation, particles, and the like fit each sentence.
- Put your translation in the `ko` folder under the `source` folder.
- Please update the table of contents (`ko/_toctree.yml`) as well; the order must match the English version.
- Once you have finished everything, open a PR and reference this issue (#20179) in the description so that tracking stays smooth. Please request reviews from @ArthurZucker, @sgugger, and @eunseojo.
- Please spread the word in the community! Posting on the 🤗 forums is great, too.
- With the localization initiative of Pseudo Lab, translation is expected to move even faster. Please give us your support! Cheers to our team: @0525hhgus, @KIHOON71, @gabrielwithappy, @jungnerd, @sim-so, @HanNayeoniee, @wonhyeongseo
GET STARTED
- [x] 🤗 Transformers https://github.com/huggingface/transformers/pull/20180
- [x] Quick tour https://github.com/huggingface/transformers/pull/20946
- [x] Installation https://github.com/huggingface/transformers/pull/20948
TUTORIAL
- [x] Pipelines for inference https://github.com/huggingface/transformers/pull/22508
- [x] Load pretrained instances with an AutoClass https://github.com/huggingface/transformers/pull/22533
- [x] Preprocess https://github.com/huggingface/transformers/pull/22578
- [x] Fine-tune a pretrained model https://github.com/huggingface/transformers/pull/22670
- [x] Distributed training with 🤗 Accelerate https://github.com/huggingface/transformers/pull/22830
- [ ] Share a model
HOW-TO GUIDES
GENERAL USAGE
- [x] Create a custom architecture https://github.com/huggingface/transformers/pull/22754
- [x] Sharing custom models https://github.com/huggingface/transformers/pull/22534
- [x] Train with a script https://github.com/huggingface/transformers/pull/22793
- [x] Run training on Amazon SageMaker https://github.com/huggingface/transformers/pull/22509
- [ ] Converting from TensorFlow checkpoints
- [x] Export to ONNX https://github.com/huggingface/transformers/pull/22806
- [ ] Export to TorchScript
- [ ] Troubleshoot
NATURAL LANGUAGE PROCESSING
- [x] Use tokenizers from 🤗 Tokenizers https://github.com/huggingface/transformers/pull/22956
- [ ] Inference for multilingual models
- [ ] Text generation strategies
TASK GUIDES
- [x] Text classification https://github.com/huggingface/transformers/pull/22655
- [x] Token classification https://github.com/huggingface/transformers/pull/22945
- [ ] Question answering
- [ ] Causal language modeling
- [x] Masked language modeling https://github.com/huggingface/transformers/pull/22838
- [x] Translation https://github.com/huggingface/transformers/pull/22805
- [x] Summarization https://github.com/huggingface/transformers/pull/22783
- [ ] Multiple choice
AUDIO
- [ ] Audio classification
- [ ] Automatic speech recognition
COMPUTER VISION
- [ ] Image classification
- [ ] Semantic segmentation
- [ ] Video classification
- [ ] Object detection
- [ ] Zero-shot object detection
- [ ] Zero-shot image classification
- [ ] Depth estimation
MULTIMODAL
- [x] Image captioning https://github.com/huggingface/transformers/pull/22943
- [ ] Document Question Answering
PERFORMANCE AND SCALABILITY
- [ ] Overview
- [ ] Training on one GPU
- [ ] Training on many GPUs
- [ ] Training on CPU
- [ ] Training on many CPUs
- [ ] Training on TPUs
- [ ] Training on TPU with TensorFlow
- [ ] Training on Specialized Hardware
- [ ] Inference on CPU
- [ ] Inference on one GPU
- [ ] Inference on many GPUs
- [ ] Inference on Specialized Hardware
- [ ] Custom hardware for training
- [ ] Instantiating a big model
- [ ] Debugging
- [ ] Hyperparameter Search using Trainer API
- [ ] XLA Integration for TensorFlow Models
CONTRIBUTE
- [ ] How to contribute to transformers?
- [ ] How to add a model to 🤗 Transformers?
- [ ] How to convert a 🤗 Transformers model to TensorFlow?
- [ ] How to add a pipeline to 🤗 Transformers?
- [ ] Testing
- [ ] Checks on a Pull Request
- [ ] 🤗 Transformers Notebooks
- [ ] Community resources
- [ ] Benchmarks
- [ ] Migrating from previous packages
CONCEPTUAL GUIDES
- [ ] Philosophy
- [ ] Glossary
- [ ] What 🤗 Transformers can do
- [ ] How 🤗 Transformers solve tasks
- [ ] The Transformer model family
- [ ] Summary of the tokenizers
- [ ] Attention mechanisms
- [ ] Padding and truncation
- [ ] BERTology
- [ ] Perplexity of fixed-length models
- [ ] Pipelines for webserver inference
Other relevant PRs along the way
- Enable easy Table of Contents editing https://github.com/huggingface/transformers/pull/22581
- Added forgotten internal English anchors for sagemaker.mdx https://github.com/huggingface/transformers/pull/22549
- Fixed anchor links for auto_class, training https://github.com/huggingface/transformers/pull/22796
- Update ToC from upstream https://github.com/huggingface/transformers/pull/23112
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hello @sgugger, could you please add the WIP tag to this issue? Thank you so much.
For contributors and PseudoLab team members, please see a PR template gist (raw) that could ease your first PR experience. @0525hhgus, @KIHOON71, @gabrielwithappy, @jungnerd, @sim-so, @HanNayeoniee, @wonhyeongseo
Dear @sgugger, would you add the document label to this issue?
I think the other translation issues have a document label.
Thank you in advance!
@wonhyeongseo
I updated my PR with the new PR template. Would you change
Load pretrained instances with an AutoClass to [WIP]🌐[i18n-KO] Translate autoclass_tutorial to Korean and Fix the typo of quicktour #22533?
@sgugger wow! Thank you a million! :-)
@sgugger Dear HuggingFace Team,
I hope you are doing well. My name is Wonhyeong Seo from the Pseudo Lab team. As you may know, we are actively working on localizing the huggingface/transformers repository documentation into Korean. Our goal is to make this valuable resource more accessible to Korean-speaking users, thereby promoting the development of NLP and machine learning in Korea and beyond.
We are currently in the process of applying for government sponsorship to support our localization efforts. To strengthen our application, we kindly request your permission to use the documentation's Google Analytics data to include in our reports. This data will help us demonstrate the impact of our work and the potential benefits of localizing the documentation.
Additionally, we would be grateful for any feedback or suggestions from the HuggingFace team regarding our localization project. Your insights will be invaluable in ensuring our efforts align with your vision and standards, and in fostering a successful collaboration.
Thank you for considering our request. We look forward to your response and the opportunity to work together to expand the reach of the huggingface/transformers repository.
Best regards, Hyunseo Yun, Kihoon Son, Gabriel Yang, Sohyun Sim, Nayeon Han, Woojun Jung, Wonhyeong Seo The Localization Initiative members of Pseudo Lab
Hey @wonhyeongseo, thanks for all your work on translating the documentation to Korean!
Do you mind contacting me at lysandre at hf.co so we may see how best to help you?
Welcome to a simple guide on how to use ChatGPT to speed up the translation process. By following these guidelines, you can create a first draft in less than an hour. Please note that it is essential to proofread your work thoroughly before sharing it with your colleagues.
(Optional) If you want to extract only the content without code blocks, tables, and redundant new lines, you can use the command sed '/```/,/```/d' file.md | sed '/^|.*|$/d' | sed '/^$/N;/^\n$/D'. In case you are using a mobile device, you can check the link https://sed.js.org/ for using sed online.
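To see that optional sed pipeline in action, the snippet below runs it on a tiny made-up Markdown file (`sample.md` and its contents are invented for illustration): the first sed deletes fenced code blocks, the second deletes table rows, and the third squeezes runs of blank lines.

````shell
# Build a tiny Markdown sample: prose, a fenced code block, a table row,
# and a run of blank lines. (File name and contents are placeholders.)
printf '%s\n' 'Intro text.' '```py' 'print("hi")' '```' '| a | b |' '' '' 'Outro text.' > sample.md

# 1) drop everything between ``` fences (inclusive)
# 2) drop Markdown table rows (lines that start and end with |)
# 3) squeeze consecutive blank lines down to one
sed '/```/,/```/d' sample.md | sed '/^|.*|$/d' | sed '/^$/N;/^\n$/D' > prose-only.md

cat prose-only.md   # keeps only the prose lines
````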
To initiate the translation process, you need to provide your sentences as input to ChatGPT. Your first prompt should look like this:

What do these sentences about Hugging Face Transformers (a machine learning library) mean in Korean? Please do not translate the word after a 🤗 emoji as it is a product name.

```md
<your sentences>
```

After submitting the first prompt, you can use the following prefix for the next ten prompts:

```md
next-part
<your sentences>
```
Note that after ten prompts, you must remind ChatGPT of the task if you are not using LangChain.
By following these guidelines, you can create a first draft of your translation in a shorter time frame. However, it is crucial to emphasize that the quality of the final output depends on the accuracy of the input and the proofreading process.
PS: Please note that we do not have a Korean LLM that can automate the proofreading process at the moment. However, in July, Naver plans to launch their HyperCLOVA Korean LLM model, which might automate the entire process. We are optimistic that our government proposal will be accepted, allowing us to increase our talent pool and work towards achieving a more automated translation process with them.
Dear @LysandreJik ,
I hope you are doing well. I wanted to inform you that I have sent an email with the subject line "[i18n-KO] Request for Collaboration: Hugging Face Mentorship Program." Whenever you have a moment, please take a look and respond. Thank you so much for your interest in this collaboration. If you have any questions, please don't hesitate to contact me.
Best regards, Wonhyeong Seo
@gabrielwithappy @sim-so @jungnerd @HanNayeoniee @0525hhgus @KIHOON71
From this merge of model_sharing.mdx #22991, I learned that we don't have to `git rebase -i` as other open source libraries mandate. Therefore, I propose we commit in 4 steps like this:
- `docs: ko: <file-name>`: as we always do for the first commit. Copy the initial English file under `ko` and edit the ToC: both external and (soon-to-be-automated) internal.
From this point forward, you may need to squash commits in each step.
- `feat: [nmt|manual] draft`: machine-translate the entire file with dedicated translators, prompts, or any kind of automation. You may choose to translate manually, and that is OK as long as you specify it in the commit message.
- `fix: manual edits`: proofread the draft thoroughly.
- `fix: resolve suggestions`: get reviews and resolve suggestions.
With this, it will be easier for collaborators to see the original English and your changes side by side. Not to mention, we can use diffs as pre-training data for the in-house rlhf translation model.
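As a concrete sketch of the proposed four-step history, the sequence below builds a throwaway repo with a placeholder file (`accelerate.mdx`, its contents, and the `ko-demo` directory are assumptions for illustration, not the real repo):

```shell
set -e
# Throwaway demo repo; file names and contents are placeholders.
rm -rf ko-demo && git init -q ko-demo && cd ko-demo
git config user.email demo@example.com
git config user.name demo
mkdir -p docs/source/en docs/source/ko
echo '# Distributed training' > docs/source/en/accelerate.mdx

# Step 1: copy the English file under ko/ and register it in the ToC.
cp docs/source/en/accelerate.mdx docs/source/ko/accelerate.mdx
echo '- local: accelerate' > docs/source/ko/_toctree.yml
git add . && git commit -qm 'docs: ko: accelerate.mdx'

# Step 2: commit the machine (or manual) first draft as one commit.
echo '# 분산 학습' > docs/source/ko/accelerate.mdx
git commit -aqm 'feat: nmt draft'

# Step 3: commit your proofreading pass.
echo '# 🤗 Accelerate로 분산 학습하기' > docs/source/ko/accelerate.mdx
git commit -aqm 'fix: manual edits'

# Step 4: commit the edits that resolve review suggestions.
echo '# 🤗 Accelerate를 활용한 분산 학습' > docs/source/ko/accelerate.mdx
git commit -aqm 'fix: resolve suggestions'

git log --reverse --format=%s   # the four step messages, oldest first
```

Each step then shows up as its own line in the PR's commit list, which is what makes the English-to-draft and draft-to-proofread diffs easy to review side by side.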
@ArthurZucker @sgugger, when merging a PR, how is the main commit message decided if there are multiple commits? Do you have to write it manually, or is the first commit message of the PR selected? Thank you for your insights and continued support. Much love from Korea 🇰🇷
The main commit message is the title of the PR.
Hey all! As some people were interested in a place to discuss translations, we opened a category in the HF Discord server for internationalization and translation efforts, including a Korean channel!
Hi Pseudo Lab friends! I just wanted to provide a quick update on where the translation progress currently stands:
- 73% done ✅
- 6 PRs pending review; once merged, you'll be up to 81%
- 15 files left to translate before ✨ 100% ✨
Great work, and big thanks again for all your contributions to fully translate the 🤗 Transformers documentation.
Hi all! I would personally like to participate in the translation (especially the text generation part).
Once a first draft is done, I will open a PR and let you know!
I mistakenly mentioned the huggingface_hub docs as transformers. It has been fixed now; please ignore the comment immediately above. Sorry!