MetaIO
MetaIO copied to clipboard
Make ElementDataFileName always use UTF-8 encoding
Attempt to fix https://github.com/Kitware/MetaIO/issues/68.
Limitations:
- Only handles the
ElementDataFileName
field in datasets of type metaImage; I have seen the same field being used in e.g. metaArray and metaFEMObject as well, but I cannot test these at the moment; as for other MET_STRING fields, I have no idea if any might need such handling as well, and if so what consequences this might have, therefore I didn't touch them. - Not sure if all possible cases are handled correctly; I only have limited ability to test, and I don't know all possibilites of how the code is used from VTK/ITK. There is for example a logic of determining the ElementDataFileName from the FileName in the ::Write method, which I don't fully get; or rather I don't get why it intermittently sets the ElementDataFileName to the full data file path, and only later cuts away the path again; because this makes it a bit tricky to handle if the path itself also contains special characters - one should not encode the full path immediately in utf-8, since later, the path is extracted again from the ElementDataFileName, and a comparisons to the non-utf-8 encoded version of the file path is done....
- The file name itself still has to be specified in local encoding (on Windows).
- I considered adding a test case to testMeta3Image.cxx, but couldn't figure out how to set the file name properly, as source files are by default stored in utf-8 (but we would need to provide an ANSI filename on Windows)
Maybe a solution using vtksys iostreams (https://gitlab.kitware.com/paraview/paraview/-/merge_requests/3850) would be preferrable? Though this would probably break backwards-compatibility, since all file names would then be expected to be in UTF-8 format I guess?
Sorry but I think introducing the MET_FromUtf8ToLocalEncoding/MET_FromUtf8ToLocalEncoding macros is a terrible idea.
Please don't redefine ToWide()/ToNarrow() in the METAIO source as they are already defined in KWSYS. The best way to handle stream and file opening is by using vtksys input/output streams throughout METAIO as has already been demonstrated in VTK (https://gitlab.kitware.com/vtk/vtk/-/merge_requests/6301) and Paraview.
UTF-8 encoding is identical for plain English (ANSI) text so backwards compatibility is not an issue.
I agree that duplication is bad; from comments on https://github.com/Kitware/MetaIO/issues/68 I was under the impression though that the idea was to introduce just the required functionality from KWSys, not a dependency on that library (since MetaIO in general has very few dependencies, only zlib I think at the moment?). Regarding vtksys streams - correct me if I'm wrong, but wouldn't making MetaIO depend on vtksys introduce circular depencies (since currently VTK depends on MetaIO? I'd be glad if somebody comes up with a better solution of course; I just wanted to give it a quick go since nothing has happened on the topic in 3 years ;). Also I'm not sure the vtksys fstream would fix this issue - it's about the encoding of the .mhd file content, not about that of the file name directly.
correct me if I'm wrong, but wouldn't making MetaIO depend on vtksys introduce circular depencies (since currently VTK depends on MetaIO?
Since MetaIO and VTK are separate projects I don't believe that is a problem. The CMake builds in each will determine the location of vtksys.
Also I'm not sure the vtksys fstream would fix this issue - it's about the encoding of the .mhd file content, not about the file name
Please re-read the issue. https://github.com/Kitware/MetaIO/issues/68 is about the filename encoding not the file contents.
vtksys
and itksys
are a rename of kwsys. So within VTK, KWSys is compiled under the name vtksys
. Library content remains the same.
Please re-read the issue. https://github.com/Kitware/MetaIO/issues/68 is about the filename encoding not the file contents.
I wrote #68 ;) - maybe it's not fully clear from the description there, but the issue (which this pull request tries to address) is that the content of an .mhd file is not consistently encoded; explicitly the ElementDataFile
entry; and this is a problem when accessing the same .mhd/.raw file from both Linux and Windows.
Of course this is closely linked to the encoding in which the file name is passed to MetaIO - which is what the introduction of vtksys would be about, right? That's why I mentiond that one, also maybe a bit confusingly.
In the end we ideally of course would have UTF-8 encoding everywhere; and making all filenames use UTF-8 encoding might implicitly fix the issue with an inconsistent encoding in ElementDataFile as well...
I wrote https://github.com/Kitware/MetaIO/issues/68 ;) - maybe it's not fully clear from the description there, but the issue (which this pull request tries to address) is that the content of an .mhd file is not consistently encoded; explicitly the ElementDataFile entry; and this is a problem when accessing the same .mhd/.raw file from both Linux and Windows.
I think you're barking up the wrong tree. The header/data names passed to the Meta read/write methods should be utf-8 encoded already, so your solution does nothing to address the problem. What's the failing test case?
The solution here is to replace all occurrences of std::ifstream/std::ofstream in the MetaIO source with vtksys::ifstream/vtksys::ofstream and include the vtksys/FStream.hxx dependency. The vtksys streams convert utf8 path and filenames to wide strings for the Windows API calls.
You can copy what has already been done in VTK.
As @dzenanz pointed out, itksys and vtksys are just name mangled versions of kwsys. So, perhaps a better solution is to use a MET_FSTREAM preprocessor variable and if MetaIO is being compiled with ITK, then MET_FSTREAM:: becomes itksys::; and if MetaIO is being compiled with VTK, then MET_FSTREAM:: becomes vtksys::, and if MetaIO is being compiled independently, then MET_FSTREAM becomes either std:: or kwsys:: as determined by a cmake option (with the default being kwsys and with the developer responsible for providing KWSys via a find_package). The good news is that there is already logic for these three use cases in the namespace mangling routines in the CMakeLists.txt and metaIOConfig.h.in files in the src dir of MetaIO.
Of course, there are probably better names than MET_FSTREAM :)
Of course, there are probably better names than MET_FSTREAM :)
METAIO_STREAM https://github.com/Kitware/MetaIO/issues/106#issuecomment-1210615022
Wish METAIO_STREAM hadn't been removed. @hjmjohnson - do you recall the motivation for removing it?
This is a 5 minute fix. Is it going to languish for another 18 months?
Hi all, I ran into the original issue that this PR is trying to fix which I posted about over on the Slicer forum https://discourse.slicer.org/t/failures-saving-a-volume-to-path-with-sign/25657. Is there a path forward to help move along a solution to solve this issue in MetaIO?
Replaced by https://github.com/Kitware/MetaIO/pull/126