MetaIO

Assumed string encoding for .mhd differs on platforms

Open codeling opened this issue 6 years ago • 7 comments

Looking at the .mhd documentation, I see no explicit mention of any assumed encoding (or a setting for it).

Yet when ElementDataFile points to a filename containing special characters, the character encoding is highly relevant for MetaIO to find the actual file. In my experiments (using the MetaIO included in ITK), an encoding of cp1252 is assumed on Windows, while utf-8 is expected on Linux. This means that when the filename given under ElementDataFile contains special characters, a separate .mhd file is required for Linux and for Windows (and potentially more for other platforms I have not tested). To make this consistent (and to make .mhd files with special characters transferable between platforms), I think one of two options needs to be implemented:

  1. That .mhd files are required to have a specific encoding (utf-8 seems to be the logical choice), or
  2. To have a separate entry specifying the encoding of the .mhd file

Or am I missing something here, is there an encoding specification somewhere already?
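
To illustrate the mismatch described above: the letter "ä" is the single byte 0xE4 in cp1252 but the two bytes 0xC3 0xA4 in utf-8, so a reader that expects utf-8 cannot interpret a cp1252-encoded filename. The following is a minimal sketch of a well-formedness check, not MetaIO code; `is_valid_utf8` is a hypothetical helper name chosen for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of MetaIO): returns true if `s` is a
// well-formed UTF-8 byte sequence.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) {  // plain ASCII byte
            ++i;
            continue;
        }
        std::size_t extra = 0;     // continuation bytes expected
        unsigned long cp = 0;      // decoded code point
        unsigned long min_cp = 0;  // smallest value allowed for this length
        if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F; min_cp = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min_cp = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07; min_cp = 0x10000;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (i + extra >= n) {
            return false;  // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false;  // not a continuation byte
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        // Reject overlong forms, surrogates, and values beyond U+10FFFF.
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With this sketch, `is_valid_utf8("B\xC3\xA4r.raw")` accepts the utf-8 spelling of "Bär.raw", while `is_valid_utf8("B\xE4r.raw")` rejects the cp1252 spelling, because 0xE4 looks like the start of a three-byte sequence that never follows.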

codeling avatar Jan 10 '19 15:01 codeling

Option 1 (utf-8) seems more logical. It will require a bit of extra coding for Windows, see SO 30829364.
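
As a rough idea of what the extra coding for Windows involves: the narrow file APIs there interpret filenames in the active code page, so a utf-8 filename typically has to be converted to UTF-16 and passed to a wide-character API (for example `_wfopen`, after copying into a `std::wstring`). The sketch below does the conversion by hand so that it stays portable; `utf8_to_utf16` is a hypothetical name, and on Windows the system function `MultiByteToWideChar` with `CP_UTF8` would be the more usual choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of MetaIO): decodes UTF-8 into UTF-16.
// It rejects invalid lead bytes, truncated sequences and bad continuation
// bytes, but for brevity does not reject overlong encodings.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected
        if (c < 0x80) {
            cp = c;
        } else if ((c & 0xE0) == 0xC0) {
            cp = c & 0x1F; extra = 1;
        } else if ((c & 0xF0) == 0xE0) {
            cp = c & 0x0F; extra = 2;
        } else if ((c & 0xF8) == 0xF0) {
            cp = c & 0x07; extra = 3;
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }
        if (i + extra >= in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Linux and macOS no such step is normally needed, since filenames are passed to the file system as bytes and utf-8 is the common convention.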

dzenanz avatar Jan 10 '19 16:01 dzenanz

+1 for Option 1 to keep things simple.

thewtex avatar Jan 10 '19 19:01 thewtex

Option 1 would be my preferred choice too. I'll probably look into it, though it might take a while. Note that for Windows users, this change will break backward compatibility: they will have to convert any .mhd files with special characters that were created before the change.
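
Such a one-time conversion of existing files could look roughly like the sketch below, which re-encodes a cp1252 byte string (for instance, the text of an old .mhd header) as utf-8. `cp1252_to_utf8` and `append_utf8` are hypothetical names for this example, and mapping the five bytes that cp1252 leaves undefined to the same-numbered control code points is an assumption of the sketch.

```cpp
#include <string>

// Code points for cp1252 bytes 0x80..0x9F; 0 marks the undefined bytes.
static const char16_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178};

// Appends one code point below U+10000 to `out` as UTF-8.
static void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper (not part of MetaIO): re-encodes cp1252 text as UTF-8.
std::string cp1252_to_utf8(const std::string& in) {
    std::string out;
    for (char ch : in) {
        const unsigned char c = static_cast<unsigned char>(ch);
        char32_t cp = c;  // 0x00..0x7F and 0xA0..0xFF map to the same code point
        if (c >= 0x80 && c <= 0x9F) {
            const char16_t mapped = kCp1252High[c - 0x80];
            if (mapped != 0) {
                cp = mapped;
            }
            // Assumption: undefined bytes keep their numeric value.
        }
        append_utf8(out, cp);
    }
    return out;
}
```

This should only be applied to files known to be cp1252; running it on a file that is already utf-8 would double-encode every non-ASCII character.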

codeling avatar Jan 10 '19 19:01 codeling

@codeling awesome!

thewtex avatar Jan 10 '19 19:01 thewtex

I wonder if any function from https://gitlab.kitware.com/cmake/cmake/tree/master/Source/kwsys could be used?

@bradking Is there any code in CMake already doing this that could be factored out?

jcfr avatar Jan 10 '19 20:01 jcfr

See KWSys Encoding.hxx and Encoding.h.

bradking avatar Jan 11 '19 12:01 bradking

@codeling Please see how this is handled in VTK, where the Unicode policy is UTF-8 everywhere.

todoooo avatar Feb 24 '21 22:02 todoooo

Resolved with commit https://github.com/Kitware/MetaIO/commit/4108676ae83cda04a0f2a820fe982d6db33585bf

aylward avatar May 29 '24 21:05 aylward