dir_monitor icon indicating copy to clipboard operation
dir_monitor copied to clipboard

Removed unnecessary converting to utf8

Open Vizor opened this issue 8 years ago • 18 comments

The boost path used wchar_t on Windows. On Czech Windows there was an error: the resulting utf8 string has odd chars, that was incorrectly converted back to wchar in boost path class.

Vizor avatar Mar 30 '16 13:03 Vizor

please could you provide an example of directory naming in your case?

sbelsky avatar Mar 30 '16 14:03 sbelsky

Of course, there is: template – kopie - the long dash is the cause of problems.

Vizor avatar Mar 30 '16 14:03 Vizor

Does it mean that helper::to_utf8 is working incorrectly?

The transformation should be reversible.

berkus avatar Mar 30 '16 16:03 berkus

Of course in case of modern Windows there shouldn't be any problem in using just std::wstring, as that's the primary string type (althought it's UTF-16 not plain UCS-2).

berkus avatar Mar 30 '16 16:03 berkus

Merged this to develop, keeping open for discussion though.

berkus avatar Mar 30 '16 16:03 berkus

hmmm i hadn't such problems with my russian clients (although they very like to name directories in native language). I think this issue requires additional investigation.

sbelsky avatar Mar 30 '16 17:03 sbelsky

boost::filesystem::path use boost::filesystem::path_traits::convert() to convert std::string back to std::wstring, that use internally. I don't know what it use undercover but I think, it is a different method based on std::codecvt template. But why convert the utf16 to utf8 and then back if it is unnecessary? Independently of whether there is an error or not?

Vizor avatar Mar 31 '16 01:03 Vizor

sorry, i haven't yet fully migrate to current implementation of dir_monitor. and my local work_thread() use: impl->pushback_event(dir_monitor_event(ck->dirname, to_utf8(fni->FileName, fni->FileNameLength / sizeof(WCHAR)), type)); to process result of monitoring. so @Vizor you are right and it's my misunderstanding :)

sbelsky avatar Mar 31 '16 08:03 sbelsky

It's ok, but isn't it the same function?

Vizor avatar Mar 31 '16 10:03 Vizor

no, dir_monitor_event definition has been changed since https://github.com/berkus/dir_monitor/commit/e23eea204f992df94f2f62ba2329338574df50a7.

By the way, looks like we can remove helper::to_utf8 at all.

sbelsky avatar Mar 31 '16 10:03 sbelsky

@Vizor after some investigation i have another portion of questions. mainly: how did you reproduce current issue? how did you pass you Czech string into add_directory function? like in my test case?

sbelsky avatar Apr 01 '16 09:04 sbelsky

I didn´t. It come from en event, when somthing changes in monitored directory. I have recursive monitoring enabled and then a create subdirectory with the long dash in name (in monitored directory). I use async monitoring, so the wrong utf8 name come into the handler in dir_monitor_event::path.

Vizor avatar Apr 04 '16 10:04 Vizor

heh next time please describe you reprodution steps right away. it will help a lot :)

@berkus i think you can close this issue.

sbelsky avatar Apr 04 '16 11:04 sbelsky

So the previously invalid name is now handled properly by our utf16 conversion?

berkus avatar Apr 04 '16 14:04 berkus

previous name is a valid utf8 name...

i didn't test especially Vizor's case, but i believe it must be also handled by my PR https://github.com/berkus/dir_monitor/pull/39 changes.

sbelsky avatar Apr 04 '16 14:04 sbelsky

although.. current PR actually solves case mentioned by Vizor (appearance of subdirectories of monitored directory with non-ascii names) and my PR 39 solves case with passing non-ascii utf8-encoded string to add_directory.

sbelsky avatar Apr 04 '16 14:04 sbelsky

Ok, sorry. I thought it was obvious from the modified code.

Vizor avatar Apr 04 '16 20:04 Vizor

Sorry guys, I was busy for a while but will get back and review the PRs.

Is there anything changed here or the code seems good to go to you?

berkus avatar Jan 30 '17 08:01 berkus