Translation to Persian
Hi there 👋
Let's translate the course to Persian so that the whole community can benefit from this resource 🌎!
Before you get started please take a minute to read the latest version of our evolving translation guidelines (T). It is important that we maintain a common tone in our collective work, while contributing with our separate creative voices.
We have a glossary page (G) where we store our latest choice of Persian equivalents for words. This page may be subject to change with every PR and its review discussion. If there are changes, we will mention here that the glossary file has been updated. We need to retroactively apply the changes to our sections.
Check here for general instructions on contributing.
Here's the workflow for contributions:
- Please fork the Hugging Face course to your profile.
- Clone your fork to your local machine.
- Use this issue page for general discussion on word choices and whatnot.
- Fetch frequently from upstream to your fork and keep your local working tree updated.
- It is perfectly fine to link to your fork on this page for discussions.
- When you have the first draft of one or more pages done, commit back to your fork and open a PR for those pages on the Hugging Face course repo. (Hugging Face course/main branch <- Your fork/main or whatever branch you have)
- Ask someone to help you review the page(s) there. Commit the changes back to your fork and they will automatically be appended to the PR.
- If you have updates to the glossary, try to include the stakeholders in the discussion (check the commit history), and when done mention the changes on this page so we can all apply them retroactively to our sections.
- When done with the review, ask @lewtun to merge.
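The "fetch frequently from upstream" step above can be sketched as a small helper. This is only an illustration, not an official script: the upstream URL is inferred from the links in this thread, the remote name `upstream` and the branch `main` are assumptions, and the helper prints the commands by default instead of running them.

```python
import subprocess

# Assumed upstream URL, inferred from the issue links in this thread.
UPSTREAM_URL = "https://github.com/huggingface/course.git"


def sync_commands(upstream_url=UPSTREAM_URL, branch="main"):
    """Return the git commands that refresh a local branch from upstream."""
    return [
        # Only needed once; git reports an error if the remote already exists.
        ["git", "remote", "add", "upstream", upstream_url],
        ["git", "fetch", "upstream"],
        ["git", "checkout", branch],
        ["git", "merge", f"upstream/{branch}"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(sync_commands())
```

Run it from inside your local clone with `dry_run=False` once the printed commands look right for your setup.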
Below are the chapters and files that need translating - let us know here if you'd like to translate any and we'll add your name to the list. Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue.
🙋 If you'd like others to help you with the translation, you can also post in our forums or tag @_lewtun on Twitter to gain some visibility.
Chapters
0 - Setup
1 - Transformer models
- [x] 1.mdx @schoobani
- [ ] 2.mdx @schoobani
- [ ] 3.mdx
- [ ] 4.mdx
- [ ] 5.mdx
- [ ] 6.mdx
- [ ] 7.mdx
- [ ] 8.mdx
- [ ] 9.mdx
- [ ] 10.mdx
2 - Using 🤗 Transformers
- [x] 1.mdx @jowharshamshiri
- [x] 2.mdx @jowharshamshiri
- [x] 3.mdx @jowharshamshiri
- [ ] 4.mdx WIP @jowharshamshiri
- [ ] 5.mdx
- [ ] 6.mdx
- [ ] 7.mdx
- [ ] 8.mdx
3 - Fine-tuning a pretrained model
4 - Sharing models and tokenizers
- [x] 1.mdx @hamedonline
- [x] 2.mdx @hamedonline
- [ ] 3.mdx WIP @hamedonline
- [ ] 4.mdx
- [ ] 5.mdx
- [ ] 6.mdx
5 - The 🤗 Datasets library
6 - The 🤗 Tokenizers library
- [ ] 1.mdx
- [ ] 2.mdx
- [ ] 3.mdx
- [ ] 3b.mdx
- [ ] 4.mdx
- [ ] 5.mdx
- [ ] 6.mdx
- [ ] 7.mdx
- [ ] 8.mdx
- [ ] 9.mdx
- [ ] 10.mdx
7 - Main NLP tasks
8 - How to ask for help
Events
- [ ] 1.mdx
Hey, I would like to help with the project and translate Chapter 1. I will do my best to finalise it by the end of April.
@jowharshamshiri per, fas and fa can all be language codes for Persian. I suggest using fa.
Hey, Nice to get acquainted with you Saeid jan and thanks for your time. Agree with you on the lang code--that's what Huggingface uses, I think.
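For reference, the three codes mentioned above come from different parts of ISO 639. A minimal sketch of how they relate, where the `chapters_dir` helper is hypothetical and only mirrors the `chapters/fa` layout used in this thread:

```python
# ISO 639 codes for Persian, keyed by the standard that defines them.
PERSIAN_CODES = {
    "ISO 639-1": "fa",     # two-letter code
    "ISO 639-2/B": "per",  # bibliographic three-letter code
    "ISO 639-2/T": "fas",  # terminological three-letter code
    "ISO 639-3": "fas",
}


def chapters_dir(lang_code):
    """Hypothetical helper: directory for a translation of the course."""
    return f"chapters/{lang_code}"


print(chapters_dir(PERSIAN_CODES["ISO 639-1"]))  # prints chapters/fa
```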
I am also delighted to contribute.
I forked and added the fa directory and the _toctree.yml file. You can track my updates here:
https://github.com/schoobani/course/tree/main/chapters/fa
Awesome, thanks!
Hi @schoobani and @jowharshamshiri thank you so much for this initiative and for offering to help! I've added your name @schoobani to the Chapter 1 list. In case it's easier, you don't have to translate the whole chapter in one go - feel free to just add the sections as you work through them!
@schoobani I propose we use this Persian style guide to ensure uniform style across all of our work. It looks intimidating but has a lot of whitespace and is the best I could find. You don't need to read it, just refer to it for common Persian writing guidelines like prefixes and suffixes and spacing and such. https://apll.ir/wp-content/uploads/2018/10/D-1394.pdf
@schoobani I propose that in Persian text we replace the Hugging Face emoji with the Persian transliteration of 'Huggingface' (هاگینگفیس). What do you think? Emojis don't lend themselves easily to Persian text. @lewtun I don't know if that is against Hugging Face policy; I'd love to hear your thoughts on this.
@lewtun I added my own name to the Chapter 2 list.
Thanks for raising this interesting point! If emojis aren't a good fit for Persian then I think it's perfectly fine to transliterate "Hugging Face" or "Huggingface" if the latter is easier :)
Hello, @schoobani has a nice PR here: https://github.com/huggingface/course/pull/71
Would a Persian speaker here be willing to do a quick review, as I unfortunately don't speak Farsi 🙈 ?
@jowharshamshiri I think the 🤗 emoji is compatible with Persian text. For me, seeing the emoji in a document signals a connection to Hugging Face features. We can decide about it after finalizing the translations.
I think we should make these choices as soon as possible, to minimize editing at the end. This one seems like a simple replace, but there will be so many little details like it that when taken together, will make for a quality translation and let us avoid a lot of unnecessary headache. That being said, these are my reasons for transliteration:
- This is a substitute for a brand name. We should transliterate brand names, except when we have a specific reason not to. I do think your argument is valid that this is an immediate visual reference to a Hugging Face-related feature, but is this reason enough? I don't think so.
- Mixing non-Persian characters into Persian makes for a fragmentary and incomprehensible text. It should be avoided where at all possible and tolerated where necessary. Programming namespaces are one example that we have to tolerate. The effect might not be noticeable in this small choice, but I promise you it is very noticeable for a new reader. It's like when I look at a painting and can't really describe why I like it. Art historians can put words to my feelings, but I do feel them nevertheless. Persian text should flow naturally, and this breaks that flow.
Happy to hear your thoughts on this. :)
I went with 'Huggingface' (one word) to demonstrate our use of a half-space in Persian transliteration to you. The English is ofc Hugging Face (two words), but when transliterated to Persian, this is one semantic unit and a loan word that we are introducing to Persian, which is reflected in the half-space. You don't expect people to use each transliterated word elsewhere in Persian on its own. :)
Sure, we should use the single-word structure, and I will apply it in my translations. We must treat the correct usage of the half-space as a general rule in all translated documents, not just in the name.
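For anyone unsure what the half-space is at the character level: it is the zero-width non-joiner (ZWNJ, U+200C), which breaks the cursive connection between two letters without adding a visible gap. A minimal sketch, with hypothetical helper names, of building and checking a half-spaced compound:

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, the Persian half-space


def join_with_half_space(*parts):
    """Join the parts of a compound word with a half-space (ZWNJ)."""
    return ZWNJ.join(parts)


def has_half_space(text):
    """Report whether the text contains at least one half-space."""
    return ZWNJ in text


# The transliterated brand name as one semantic unit.
brand = join_with_half_space("هاگینگ", "فیس")

print(has_half_space(brand))   # the compound contains a ZWNJ
print(" " in brand)            # and no regular space
print([f"U+{ord(ch):04X}" for ch in brand])  # inspect the code points
```

Printing the code points is a quick way to confirm that an editor has not silently replaced the half-space with a regular space or dropped it.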
@lewtun I added my name to chapter 0 and also chapter 2/page 2.
@jowharshamshiri I mistakenly translated Chapter 0. If you have already done it, feel free to ignore my work. As soon as you let me know, I will send a pull request so we can see the first previews of the RTL format on the Hugging Face docs.
@jowharshamshiri I am not sure how the translated numbers will be shown in the docs and which is the best approach.
- title: 0. نصب
  sections:
  - local: chapter0/1
    title: مقدمه
Or?
- title: ۰. نصب
  sections:
  - local: chapter0/1
    title: مقدمه
No worries brother. I just checked the commit in your fork and I very much like the tone of your translation. It sounds natural and shows that you've spent time to refine it, and that's valuable work. My PRs have "first draft" in the name for this very reason. We need to collaborate more swiftly by sharing drafts of our work and learning from each other. With smaller PRs and more frequent PR reviews, we can keep our working trees updated with the latest changes. I will spend some time tonight to create a second draft by mixing mine with yours, and I will also add that empty file we need for chapter 1. I'll mention you in the PR when I'm done so you can review the result and it can be merged.
I think this works, but let's wait and see the live previews.
- title: ۲- بکارگیری ترنسفورمرهای هاگینگفیس
  sections:
  - local: chapter2/1
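If we settle on Persian digits in titles, the conversion is mechanical. A minimal sketch, assuming that only the `title` values should change and that `local` paths stay in ASCII because they point at files; the flat dictionary entry is a simplified, hypothetical stand-in for a _toctree.yml item:

```python
# Persian digits occupy U+06F0..U+06F9; build them from their code points.
PERSIAN_DIGITS = "".join(chr(0x06F0 + i) for i in range(10))
TO_PERSIAN = str.maketrans("0123456789", PERSIAN_DIGITS)


def to_persian_digits(text):
    """Replace ASCII digits with their Persian counterparts."""
    return text.translate(TO_PERSIAN)


def localize_entry(entry):
    """Convert digits in 'title' only; 'local' is left untouched."""
    localized = dict(entry)
    localized["title"] = to_persian_digits(entry["title"])
    return localized


# Hypothetical flattened entry, for illustration only.
entry = {"title": "0. نصب", "local": "chapter0/1"}
print(localize_entry(entry))
```

Whether the docs render these digits well is still something to confirm in the live previews.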
Hi @lewtun. @schoobani suggested in my local fork that we should move the guidelines for translating into Persian, out of the glossary and to a new file. I'm confused as to how best to do that. Should I create a new file under the glossary folder? That seems counterintuitive. Maybe we can temporarily put it there? or maybe we should host the file somewhere else? Please tell me how to proceed, I want to include the change in my incoming PR. Same ofc would later go for the build instructions you talked about in the first PR. These need to be accessible to the contributors and referenced on this issue page but won't make it into the final work.
Hi @jowharshamshiri I suppose one idea would be to include a TRANSLATING.txt file in chapters/fa - it can't be a .md file because doc-builder will try to build it as part of the website.
This way we keep the glossary clean on the website. What do you think?
That's perfect thank you. Will do.
@jowharshamshiri I added 1.mdx and 2.mdx for Chapter 1, and they are ready for peer review.
https://github.com/schoobani/course/tree/main/chapters/fa/chapter1
I used brackets to mark phrases that don't appear in the glossary and that I think will come up again in the text. Here is a list of them with my initial translations.
- Fine-tune (فاینتون)
- Research Engineer (مهندس محقق)
- Naive Bayes (بیز ساده)
- Task (مسئله)
- Demo (دمو)
- Prompt (متن اولیهی پرامپت)
I also ran into the issue that beginning an RTL line with an English phrase breaks the layout of the whole line on GitHub. I guess it would be the same on Hugging Face, so we should avoid doing so.
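One likely cause is that renderers which detect direction automatically pick the paragraph direction from the first strongly directional character, so a line that opens with a Latin word is laid out left-to-right. A minimal sketch, with hypothetical function names, for flagging such lines before a review:

```python
import unicodedata


def first_strong_direction(line):
    """Return 'ltr' or 'rtl' for the first strong character, else None."""
    for ch in line:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "ltr"
        if bidi in ("R", "AL"):
            return "rtl"
    return None  # only digits, spaces, punctuation, or an empty line


def starts_ltr_in_rtl_text(line):
    """Flag lines that contain RTL letters but open with an LTR letter."""
    has_rtl = any(unicodedata.bidirectional(c) in ("R", "AL") for c in line)
    return has_rtl and first_strong_direction(line) == "ltr"


sample = "Transformers مدل"
print(first_strong_direction(sample))
print(starts_ltr_in_rtl_text(sample))
```

Lines flagged this way can usually be rephrased so that a Persian word comes first.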
@jowharshamshiri @lewtun I think we should consider using another font for Persian/Arabic text on Hugging Face. In my opinion, the current font is very difficult to read and follow.
Hey @schoobani. Great progress! Was planning on doing this tonight but have been swamped lately and honestly am having trouble keeping my eyes open. Will get back to you as soon as I can. :)
Hi, I'd like to contribute to this project. Please let me know what I can do. I saw someone has already taken chapter 1. Can I start with Ch2?
Hi @kambizG! chapter1 is work in progress by me and chapter2 will be done by @jowharshamshiri. We can ask @lewtun to assign chapter3 Fine-tuning a pretrained model to you.
Hello and very happy to have you joining us @kambizG jan. Thanks for your precious time. Please have a look at the glossary before you start and tell us how you think we could do better.
Hi everyone! Great to see the Persian community here 🤗. I would also like to contribute by taking over chapter4, if you will. Let's get it done ;)