bug: kmdecomp.exe produces a .kmn file with a BOM
Describe the bug
I am using Keyman Developer 18.0.244 on Windows 11.
I ran `kmdecomp yidheb2k.kmx yidheb2k.kmn`.
The resulting .kmn file indicates it is UTF-16 and it contains a BOM.
When I committed the updated keyboard to the repo it built just fine, but some of the Keyman tools had a problem with it. The resulting discussion indicated that we should make sure keyboards don't include the BOM.
- The tool should be updated to not include a BOM. I'm also not sure if we prefer UTF-8 instead?
- Keyman Developer should be updated to complain about the BOM or be able to save without the BOM.
Files attached.
Reproduce the bug
No response
Expected behavior
No response
Related issues
No response
Keyman apps
- [ ] Keyman for Android
- [ ] Keyman for iPhone and iPad
- [ ] Keyman for Linux
- [ ] Keyman for macOS
- [ ] Keyman for Windows
- [x] Keyman Developer
- [ ] KeymanWeb
- [ ] Other - give details at bottom of form
Keyman version
18.0.244
Operating system
Windows 11
Device
No response
Target application
No response
Browser
No response
Keyboard name
yidheb2k.kmx
Keyboard version
No response
Language name
Yiddish
Additional context
No response
> but some of the Keyman tools had a problem with it.
Can you let me know which tools?
> - The tool should be updated to not include a BOM. I'm also not sure if we prefer UTF-8 instead?
> - Keyman Developer should be updated to complain about the BOM or be able to save without the BOM.
We continue to support UTF-16 with BOM as an alternative, legacy format. While I agree that kmdecomp should be updated to emit UTF-8 (probably without BOM), we need to be able to consume UTF-16 (only with BOM) as a backward-compatibility option.
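
To make the rule above concrete, here is a minimal sketch of BOM-based decoding. It is not code from the Keyman repository: the names `decodeKmnSource` and `toUtf8WithoutBom` are hypothetical, and treating a file with no BOM as UTF-8 is an assumption drawn from the "UTF-16 only with BOM" policy.

```typescript
// Hypothetical helpers for illustration only; not the actual Keyman implementation.
type KmnEncoding = 'utf-8' | 'utf-16le' | 'utf-16be';

interface DecodedSource {
  encoding: KmnEncoding; // encoding chosen from the BOM (or the UTF-8 fallback)
  hadBom: boolean;       // true if a BOM was found and removed
  text: string;          // decoded text, without the BOM
}

function decodeKmnSource(bytes: Uint8Array): DecodedSource {
  // True if the buffer begins with the given byte signature.
  const startsWith = (...sig: number[]): boolean =>
    sig.every((b, i) => bytes[i] === b);

  // ignoreBOM: true because the BOM bytes are sliced off by hand below.
  const utf8 = new TextDecoder('utf-8', { ignoreBOM: true });
  const utf16le = new TextDecoder('utf-16le', { ignoreBOM: true });

  if (startsWith(0xef, 0xbb, 0xbf)) {
    return { encoding: 'utf-8', hadBom: true, text: utf8.decode(bytes.subarray(3)) };
  }
  if (startsWith(0xff, 0xfe)) {
    return { encoding: 'utf-16le', hadBom: true, text: utf16le.decode(bytes.subarray(2)) };
  }
  if (startsWith(0xfe, 0xff)) {
    // Swap each byte pair so the little-endian decoder can be reused.
    const body = bytes.subarray(2);
    const swapped = new Uint8Array(body.length - (body.length % 2));
    for (let i = 0; i + 1 < body.length; i += 2) {
      swapped[i] = body[i + 1];
      swapped[i + 1] = body[i];
    }
    return { encoding: 'utf-16be', hadBom: true, text: utf16le.decode(swapped) };
  }
  // Assumption: no BOM means UTF-8, since UTF-16 is only accepted with a BOM.
  return { encoding: 'utf-8', hadBom: false, text: utf8.decode(bytes) };
}

// Re-encode any accepted input as UTF-8 without a BOM.
function toUtf8WithoutBom(bytes: Uint8Array): Uint8Array {
  return new TextEncoder().encode(decodeKmnSource(bytes).text);
}
```

A tool could use `hadBom` to emit a warning, and `toUtf8WithoutBom` to normalize a legacy file before committing it.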
> Can you let me know which tools?
I don't know. I think @markcsinclair could say.
I am currently writing a replacement for the kmc-kmn compiler, called the next generation compiler or just ng-compiler (see #13553). It is not yet integrated with the kmc tooling and is only being run through test code, which was inadequate for handling the yiddish-hebrew.kmn file (see keyboards/#3807). Anyway, with @mcdurdin's help, a new file reading function has been written for my tests during compiler development (see buffertoString()), and something similar will be included in the final tool.