String (and path) encoding, terminal in/out
Channel
C++Weekly
Topics
In the Unix world, all is well: your terminal uses UTF-8 encoding and std::filesystem::path uses char.
On Windows things go sideways real quick: if you're reading command-line parameters with non-ASCII chars, you probably need to use wmain, then narrow to std::string. The <codecvt> header was deprecated in C++17, and googling/Stack Overflow is not helpful because the vast majority of what you find still relies on it. fs::path uses wchar_t too.
Length
Probably long form, 10-20 min.
Note: should this be picked up as a strong candidate, I'm willing to contribute to the episode preparation. I can provide an example use case, some test code, potential solutions, etc.
On Windows with NTFS, file and directory names are just a series of 16-bit integers. There is no requirement for them to be valid Unicode/UTF-16, so it's not always possible to correctly round-trip them through UTF-8. This is why std::filesystem::path uses wchar_t on Windows as its native representation.
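To make the round-trip problem concrete, here is a minimal sketch of a check for whether a name is well-formed UTF-16. The function name is_well_formed_utf16 is hypothetical (not a standard or Win32 API), and char16_t is used instead of wchar_t so the sketch stays portable; on Windows the same logic would apply to the 16-bit wchar_t units of a native path.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: returns true if every surrogate code unit in `name`
// belongs to a valid high/low pair. Names that fail this check cannot be
// represented in strict UTF-8, so they cannot round-trip through it.
bool is_well_formed_utf16(std::u16string_view name) {
    for (std::size_t i = 0; i < name.size(); ++i) {
        const char16_t unit = name[i];
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            // High surrogate: the next unit must be a low surrogate.
            if (i + 1 == name.size() ||
                name[i + 1] < 0xDC00 || name[i + 1] > 0xDFFF) {
                return false;
            }
            ++i;  // skip the low surrogate we just validated
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            return false;
        }
    }
    return true;
}
```

An ordinary name such as u"report.txt" passes, while a name containing a lone 0xD800 unit does not, even though NTFS would accept both.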
However, file and directory names that can't be converted to UTF-8 are rare, and could be considered a bug if ever generated. Microsoft seems to be taking this approach, since they now encourage developers to use UTF-8 and the A APIs instead of the W APIs (this requires the process's active code page to be UTF-8, which can be set via the application manifest on Windows 10 version 1903 and later). Internally, the A APIs just convert automatically (and probably with more optimized code than you could write yourself), and in practice they don't error on invalid Unicode; they just convert it in a lossy fashion.
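As an illustration of what a lossy conversion looks like without <codecvt>, here is a hand-rolled sketch of UTF-16 to UTF-8 narrowing. It is not how the Windows APIs are implemented; the function name utf16_to_utf8 is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption about what "lossy" means here. It uses char16_t to stay portable; on Windows the input would be the wchar_t data from wmain or fs::path.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: converts UTF-16 to UTF-8. Unpaired surrogates are
// replaced with U+FFFD (assumed policy), so ill-formed input is converted
// lossily instead of raising an error.
std::string utf16_to_utf8(std::u16string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Valid surrogate pair: combine into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired low surrogate
        }
        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With this policy, two different ill-formed names (say, one containing a lone 0xD800 and another containing a lone 0xDC00) narrow to the same UTF-8 bytes, which is exactly why the original 16-bit name cannot be recovered afterwards.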