[WIP] Automatic i18n Translations with Huggingface 🤗 📖

Open martinb-ai opened this issue 2 years ago • 9 comments

The goal is to gain more traction with users all across the globe. Let's get the ball rolling on this initiative. It's always easier to add or edit files once they already exist.

Supports translation from English into the following languages:

  • Spanish
  • French
  • Italian
  • Portuguese (Brazilian)
  • Mandarin
  • Japanese
  • Russian
  • Arabic
  • Hindi
  • Indonesian

Next steps:

  • [ ] Build an automatic checker that flags new entries needing auto-translation
  • [ ] Double-check all translations with experts in the respective languages
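A rough sketch of what the automatic checker could look like, assuming the i18n files have already been loaded into nested dictionaries (for example from YAML). The function names and the sample entries are hypothetical and not taken from the actual PR; the idea is only to list entries that exist in the English source but are missing from a translation.

```python
from typing import Any, Dict, List


def flatten_keys(tree: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted paths for every leaf entry in a nested dict."""
    paths: List[str] = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-sections such as a menu or command group.
            paths.extend(flatten_keys(value, path))
        else:
            paths.append(path)
    return paths


def missing_entries(source: Dict[str, Any], target: Dict[str, Any]) -> List[str]:
    """List entries present in the English source but absent from a translation."""
    target_keys = set(flatten_keys(target))
    return [key for key in flatten_keys(source) if key not in target_keys]


# Hypothetical sample data, not taken from the real i18n files.
english = {"stocks": {"load": "load a ticker", "quote": "view the current price"}}
spanish = {"stocks": {"load": "cargar un ticker"}}

print(missing_entries(english, spanish))  # ['stocks.quote']
```

Anything this reports would be a candidate for a new auto-translation pass, followed by the human review.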

Note: This problem is challenging because some commands are a single word, and some words have no direct translation without prior context. This is why each file will need a 30 min review from someone who is an expert in the language. You will see that some files contain pure English; this is where the transformer model did not generate useful output. Those entries all need to be translated into their respective languages.
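To help reviewers find the entries that were left in English, something like the following could work. This is a minimal sketch, assuming each file is available as a flat mapping from entry key to text; the function name and sample entries are made up for illustration. It simply flags entries whose translated text is still identical to the English source.

```python
from typing import Dict, List


def untranslated_entries(source: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return keys whose translated text is still identical to the English text."""
    return [
        key
        for key, text in source.items()
        # Compare case-insensitively and ignore surrounding whitespace.
        if key in target and target[key].strip().lower() == text.strip().lower()
    ]


# Hypothetical sample data, not taken from the real i18n files.
english_flat = {
    "stocks.load": "load a ticker",
    "stocks.quote": "view the current price",
}
french_flat = {
    "stocks.load": "charger un ticker",
    "stocks.quote": "view the current price",  # model output left in English
}

print(untranslated_entries(english_flat, french_flat))  # ['stocks.quote']
```

Some words are legitimately the same in both languages, so this is only a hint for the reviewer, not a replacement for the review.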

martinb-ai avatar Jan 04 '23 14:01 martinb-ai

Wow this is crazy.

Just a note: the pt one is not pt, it's pt-br.

jose-donato avatar Jan 05 '23 01:01 jose-donato

@jose-donato Oh so this is Brazilian Portuguese?

martinb-ai avatar Jan 05 '23 02:01 martinb-ai

The translation you have in the PR, yes.

jose-donato avatar Jan 05 '23 10:01 jose-donato

This is dope!!

DidierRLopes avatar Jan 05 '23 11:01 DidierRLopes

We'd like to chip in and provide OpenBB with a UI for contributors to add translations with inlang/inlang. See https://inlang.com/editor/github.com/inlang/example for an example.

The only requirement is a yaml plugin. We can build one.

samuelstroschein avatar Jan 26 '23 23:01 samuelstroschein

Hey! That sounds awesome! What steps would be needed?

jmaslek avatar Jan 26 '23 23:01 jmaslek

@jmaslek From your side, nothing! But wups. I just saw that your repo is 1GB large!

We need to modify our git implementation to make inlang x OpenBB work. That's part of our next git thesis though, and therefore exactly what we want to do.

I started a project https://github.com/orgs/inlang/projects/9. Can't give a concrete timeline yet.

samuelstroschein avatar Jan 27 '23 00:01 samuelstroschein

No worries! We look forward to it!

jmaslek avatar Jan 27 '23 00:01 jmaslek

  • I would remove Hindi because India has many more languages than just Hindi: either we support more of them, or we stick with the language that is still the most dominant in India (English, that is).
  • I would remove Arabic because it is written right-to-left and we have not looked into RTL support.

P.S. @soggyomelette do you speak Arabic?

Unfortunately I don't, but I agree with sticking with English for India. Any communication online is almost always done in English.

soggyomelette avatar Feb 06 '23 21:02 soggyomelette