Allow more languages to be maintained by the project
Right now, localizations are stored like this: https://github.com/microsoft/edit/blob/c13b8ab2f6f76288dd4682e24fba239c19e383b7/src/bin/edit/localization.rs#L244-L257
In other words, it's a matrix where the outer index is the string ID and the inner index is the language ID. Because of this, it's impossible to toggle languages on and off. What we need is ideally something like this:
const S_LANG_LUT: [[&str; _]; _] = generate_lut();
const fn generate_lut() -> [[&str; _]; _] {
    let mut lut = [[""; _]; _];
    #[cfg(feature = "lang-en")]
    lut[NewFile][en] = "New File…";
    #[cfg(feature = "lang-de")]
    lut[NewFile][de] = "Neue Datei…";
    #[cfg(feature = "lang-es")]
    lut[NewFile][es] = "Nuevo archivo…";
    #[cfg(feature = "lang-fr")]
    lut[NewFile][fr] = "Nouveau fichier…";
    #[cfg(feature = "lang-it")]
    lut[NewFile][it] = "Nuovo file…";
    #[cfg(feature = "lang-ja")]
    lut[NewFile][ja] = "新規ファイル…";
    #[cfg(feature = "lang-ko")]
    lut[NewFile][ko] = "새 파일…";
    #[cfg(feature = "lang-pt_br")]
    lut[NewFile][pt_br] = "Novo arquivo…";
    #[cfg(feature = "lang-ru")]
    lut[NewFile][ru] = "Новый файл…";
    #[cfg(feature = "lang-zh_hans")]
    lut[NewFile][zh_hans] = "新建文件…";
    #[cfg(feature = "lang-zh_hant")]
    lut[NewFile][zh_hant] = "新增檔案…";
    lut
}
Perhaps this can be abstracted away with a macro?
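As a rough idea of what such a macro could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the `lang_lut!` macro, the `StringId`/`LangId` enums, the `loc` helper and the explicit size constants are hypothetical names, not what the repository uses. It also assumes English is always compiled in (rather than gated behind `lang-en`) so that a disabled language can fall back to it.

```rust
// Hypothetical IDs; the real project defines its own.
#[derive(Clone, Copy)]
pub enum StringId {
    NewFile,
}
const NUM_STRINGS: usize = 1;

#[derive(Clone, Copy)]
pub enum LangId {
    En,
    De,
    Fr,
}
const NUM_LANGS: usize = 3;

// Expands to a `const fn` that fills the table. Every non-English entry
// is wrapped in a block gated on the Cargo feature named next to it.
macro_rules! lang_lut {
    ($( $id:ident { En = $en:literal; $( $feat:tt $lang:ident = $text:literal; )* } )*) => {
        const fn generate_lut() -> [[&'static str; NUM_LANGS]; NUM_STRINGS] {
            let mut lut = [[""; NUM_LANGS]; NUM_STRINGS];
            $(
                // Assumption: English is always present.
                lut[StringId::$id as usize][LangId::En as usize] = $en;
                $(
                    #[cfg(feature = $feat)]
                    {
                        lut[StringId::$id as usize][LangId::$lang as usize] = $text;
                    }
                )*
            )*
            lut
        }
    };
}

lang_lut! {
    NewFile {
        En = "New File…";
        "lang-de" De = "Neue Datei…";
        "lang-fr" Fr = "Nouveau fichier…";
    }
}

pub const S_LANG_LUT: [[&str; NUM_LANGS]; NUM_STRINGS] = generate_lut();

/// Looks up a string, falling back to English when the requested
/// language was compiled out (its slot is still the empty string).
pub fn loc(id: StringId, lang: LangId) -> &'static str {
    let s = S_LANG_LUT[id as usize][lang as usize];
    if s.is_empty() {
        S_LANG_LUT[id as usize][LangId::En as usize]
    } else {
        s
    }
}
```

With no features enabled, `loc(StringId::NewFile, LangId::De)` would return the English string; building with `--features lang-de` would return the German one instead. Disabled languages still occupy an empty slot per string in this sketch, so it trims the string data but not the table shape.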
This then allows us to add more languages and toggle them on and off as needed. Someone who wants the smallest possible editor, for instance, may disable everything except lang-en. This is absolutely not a problem yet: the current localizations are only about 10 kB. Rather, I think this may otherwise grow into an issue long-term, given that there are a lot of languages out there.
Can I suggest using external language packs?
E.g., on Windows, use resource/message DLLs. String tables in resources can be tagged directly with the language/locale identifiers used by Windows, which are derived directly from GetThreadLocale.
On POSIX, I suggest using either nl_types.h or libintl.h to hold message translations in catalogs, and using uselocale and similar functions to select the language automatically.
Use tools such as gencat or msgfmt to package the language translations.
As an example, I maintain a master set of languages in string tables in edlin.rc. I then use a tool to generate the source for creating the POSIX message catalogs with either gencat or msgfmt for each language.
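To make the generation step a bit more concrete, here is a small sketch of what emitting gencat message source for one language might look like. This is not the tool mentioned above; `to_gencat_source` and `escape` are hypothetical names, and the sketch assumes the usual message source layout that gencat reads (a `$set` directive followed by numbered message lines) and escapes only backslashes and newlines.

```rust
/// Escapes the characters this sketch handles: backslash and newline.
fn escape(msg: &str) -> String {
    let mut out = String::new();
    for c in msg.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            _ => out.push(c),
        }
    }
    out
}

/// Renders one language's strings as gencat message source text.
/// Message numbers are 1-based and follow the order of `messages`.
pub fn to_gencat_source(set: u32, messages: &[&str]) -> String {
    let mut out = format!("$set {}\n", set);
    for (i, msg) in messages.iter().enumerate() {
        out.push_str(&format!("{} {}\n", i + 1, escape(msg)));
    }
    out
}
```

The resulting text would be written to one file per language and passed to gencat; an msgfmt variant would emit `msgid`/`msgstr` pairs instead.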
I'm somewhat hesitant to do this, as keeping the translations in Rust simplifies the build process quite a bit.
That may appear to be the case at the moment. My understanding is that this is intended to be a component of Windows, which currently supports 88 languages, so you will need to maintain 88 versions of every string directly in your code.
Traditionally, a program is code plus data, where code can be the compiled application along with libraries, and data can be static data from the code, files read by the program, and data entered by the user.
Resources are to data as shared libraries are to code, so you can build a program that only works in English and then add the language resources at runtime or application link time.
Eventually your build process will include building MSI, MSIX, Debian packages, RPM, OSX packages, so what you think is your build process now will be dwarfed by all that you need to release a professional program.
Traditionally applications export their strings into a spreadsheet and then language specialists, who know nothing about programming, translate the strings directly in the spreadsheet. You then take the spreadsheet and extract the values into your resource files.
Only having to compile and test the application in English first and then adding the other languages at build/release time is easier in the long run.
> 88
For what it's worth, developer tools on Windows are often only localized into a subset of ~11 languages identified to be in common use by developers.
> Debian ... RPM ... OSX
It is currently out of the scope of this repository to provide packaging services for every platform and distribution combination. Even MSIX is a stretch[^1].
[^1]: It causes more problems than it solves.
My understanding is that when Microsoft builds region-specific releases or language packs, it uses the existing common binaries and then adds the new language translations as additional resource DLLs. So the developers don't even see the languages that are eventually published. Yes, I am not a fan of MSIX at all; I was giving the same list that PowerShell already does.
MSIX is a very bad idea for a console application. Even worse for something that should work over remoting. Just don't.
@schilive I notice you downvoted my suggestion to use string resources. I suggest reading the link I posted just above to "A brief and also incomplete history of Windows localization". My understanding is that this program will be packaged with future versions of Windows. String resources are how Windows applications are localized.
But the rule prohibiting code changes remains in effect. Changing any code resets escrow, which means that the ship countdown clock gets reset back to its original value and all the testing performed up until that point needs to be redone in order to verify that the change did not affect them.
@rhubarb-geek-nz, I think you're right, since it's a Windows app. As long as English support is built in, which I think it always is, using string resources is best.
I was worried about the user being somehow at the mercy of Windows shenanigans and bugs, considering how protected language packs are. However, Edit is for IT people, I think, and I'm Brazilian, and every Brazilian IT professional is assumed to know English at least at a functional level. And this is the assumption of the industry. Using things in your own language is usually more of a shiny feature than a really useful one. A lot of the time, people prefer the original English over Portuguese. So I think I was overreacting, and the standard solution is best. I'm sorry, and I will remove the downvote.