RapidCRC-Unicode
bug: v0.3.22: file paths with special characters, like German "umlaut", are marked as Error or File not found
When verifying files using an existing .md5 file, if there is a German Umlaut in a file path (directory or file name) the file is marked as "Error", or "File not found". Program version 0.3.22
I have attached
- a screenshot of two instances of the program: the upper instance was used to create the md5 file, and the green rectangle marks the correctly displayed umlaut; the lower instance shows the errors, and the red rectangle marks the incorrectly read file names.
- the directory with the files themselves, and the md5 file
The md5 file itself looks good, but the program seems to have a problem with the umlaut when reading it. Note: there is only one umlaut (German ae = ä) in the file path. Note: it seems that not every umlaut causes a problem. I tried a simpler directory name with the same umlaut, and that caused no problem.
Here is the option page, in case this is caused by an invalid mix of options :-).
One more finding: it seems this issue is caused by the program not correctly recognizing its own UTF-8 generated md5 files. I checked the option "General / Default to codepage when opening / UTF-8", because I also have the option "File creation / Create Unicode Files / UTF-8" activated.
I still think this is a bug, as the generated file header says clearly that this md5 file IS UTF-8. Thanks. Klaus
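For readers following along, the symptom described above is what you would expect when UTF-8 bytes are decoded with a single-byte codepage. The following is a minimal sketch only: the file name is hypothetical (the real path is in the attachments, not in this thread), and Windows-1252 is assumed as the local codepage.

```python
# Hypothetical file name; the actual path from the report is not shown in the thread.
name = "Bär.txt"

# Bytes as a UTF-8 encoded .md5 file would store the name.
utf8_bytes = name.encode("utf-8")

# Assumption: the reader wrongly decodes those bytes as Windows-1252.
misread = utf8_bytes.decode("cp1252")

print(misread)          # the name a verifier would then look for on disk
print(misread == name)  # False: the lookup cannot match the real file
```

Because the misread name does not exist on disk, a verifier working from it would plausibly report "File not found".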
There isn't much RCRC can do in this case. UTF-8 files usually have no byte order mark at the beginning and cannot be distinguished from files in your local codepage. In that case RCRC uses Windows functions to "guess" which encoding the file is in, and here the guess is clearly wrong. This doesn't happen with UTF-16 files, since they start with a byte order mark that can be detected.
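To illustrate the difference between detecting a byte order mark and guessing, here is a rough Python sketch. It is not RCRC's code: RCRC relies on Windows functions for the guess, whereas this sketch substitutes a different heuristic (try strict UTF-8 first, then fall back). The function names and the `cp1252` fallback are assumptions made for the example.

```python
import codecs


def detect_bom(data: bytes):
    """Return a codec name implied by a leading byte order mark, or None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to pick the byte order
    return None


def decode_hash_file(data: bytes, fallback: str = "cp1252") -> str:
    """Decode hash-file bytes; 'fallback' stands in for the local codepage."""
    encoding = detect_bom(data)
    if encoding is not None:
        return data.decode(encoding)  # BOM present: no guessing needed
    try:
        # No BOM: this is only a heuristic, not RCRC's actual method.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback, errors="replace")
```

With a BOM the encoding is known for certain; without one, any approach is a guess. Strict UTF-8 validation tends to work because text in a single-byte codepage that contains umlauts is rarely valid UTF-8, but it is still not a guarantee.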
Understood, I overlooked that aspect. Would it be possible, then, to avoid this pitfall for users like me, who are not so aware of this issue, to set the default for file creation and reading to UTF-16? Thanks. Klaus