bug: kmdecomp.exe decompiles to a .kmn file with a BOM

Open LornaSIL opened this issue 2 weeks ago • 1 comments

Describe the bug

I am using Keyman Developer 18.0.244 on Windows 11.

I ran kmdecomp yidheb2k.kmx yidheb2k.kmn

The resulting .kmn file indicates it is UTF-16 and it contains a BOM.

When I committed the updated keyboard to the repo, it built just fine, but some of the Keyman tools had a problem with it. The resulting discussions indicate that we should make sure keyboards don't include the BOM.

  • The tool should be updated to not include a BOM. I'm also not sure if we prefer UTF-8 instead?
  • Keyman Developer should be updated to complain about the BOM or be able to save without the BOM.

Files attached.

yidheb2k.zip

Reproduce the bug

No response

Expected behavior

No response

Related issues

No response

Keyman apps

  • [ ] Keyman for Android
  • [ ] Keyman for iPhone and iPad
  • [ ] Keyman for Linux
  • [ ] Keyman for macOS
  • [ ] Keyman for Windows
  • [x] Keyman Developer
  • [ ] KeymanWeb
  • [ ] Other - give details at bottom of form

Keyman version

18.0.244

Operating system

Windows 11

Device

No response

Target application

No response

Browser

No response

Keyboard name

yidheb2k.kmx

Keyboard version

No response

Language name

Yiddish

Additional context

No response

LornaSIL, Dec 10 '25 15:12
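
The BOM described in this report can be checked for by inspecting the first bytes of the file. Below is a minimal TypeScript sketch, assuming the file has already been read into a byte array. `detectBom` and `BomKind` are hypothetical names used for illustration only and are not part of any Keyman tool.

```typescript
// Hypothetical helper for illustration; not a Keyman API.
// UTF-32 BOMs are not considered in this sketch.
type BomKind = "utf-8" | "utf-16le" | "utf-16be" | null;

function detectBom(bytes: Uint8Array): BomKind {
  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  // UTF-16 little-endian BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  // UTF-16 big-endian BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  // No BOM found; the caller decides how to treat the file.
  return null;
}
```

A tool that wants to complain about a BOM could call such a function on the raw bytes before decoding and warn when the result is not `null`.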

> but some of the Keyman tools had a problem with it.

Can you let me know which tools?

> • The tool should be updated to not include a BOM. I'm also not sure if we prefer UTF-8 instead?
> • Keyman Developer should be updated to complain about the BOM or be able to save without the BOM.

We continue to support UTF-16 with BOM as an alternative, legacy format. While I agree that kmdecomp should be updated to emit UTF-8 (probably without BOM), we need to be able to consume UTF-16 (only with BOM) as a backward-compatibility option.

mcdurdin, Dec 11 '25 04:12

> Can you let me know which tools?

I don't know. I think @markcsinclair could say.

LornaSIL, Dec 16 '25 17:12

I am currently writing a replacement for the kmc-kmn compiler, called the next generation compiler or just ng-compiler (see #13553). It is not yet integrated with the kmc tooling and is only run through test code, which was inadequate for handling the yiddish-hebrew.kmn file (see keyboards/#3807). Anyway, with @mcdurdin's help, a new file-reading function has been written for my tests during compiler development (see buffertoString()), and something similar will be included in the final tool.

markcsinclair, Dec 17 '25 16:12
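
The reading rule discussed in this thread (accept UTF-16 only when a BOM is present, otherwise treat the file as UTF-8) and the suggestion to emit UTF-8 without a BOM can be sketched together. This is a minimal TypeScript illustration under stated assumptions: `decodeKmnSource` and `toUtf8WithoutBom` are hypothetical names, not the ng-compiler's actual buffertoString() or any Keyman API, and only little-endian UTF-16 is handled.

```typescript
// Hypothetical helpers for illustration; not the actual ng-compiler code.

// Decode .kmn source bytes: UTF-16LE only when its BOM is present,
// otherwise UTF-8 (with or without a BOM).
function decodeKmnSource(bytes: Uint8Array): string {
  // UTF-16 little-endian BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return new TextDecoder("utf-16le").decode(bytes.subarray(2));
  }
  // UTF-16 big-endian BOM: FE FF (assumption: left unsupported in this sketch)
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    throw new Error("UTF-16BE input is not handled in this sketch");
  }
  // UTF-8 BOM: EF BB BF; skip it so it never reaches the decoded text
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return new TextDecoder("utf-8").decode(bytes.subarray(3));
  }
  // No BOM: assume UTF-8
  return new TextDecoder("utf-8").decode(bytes);
}

// Re-encode any accepted input as UTF-8 without a BOM.
// TextEncoder always produces UTF-8 and never writes a BOM.
function toUtf8WithoutBom(bytes: Uint8Array): Uint8Array {
  return new TextEncoder().encode(decodeKmnSource(bytes));
}
```

With helpers along these lines, a legacy UTF-16 file with a BOM would still be readable, while anything written back out would be plain UTF-8.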