cfgpath Support Windows Unicode

This update introduces a type, cfgpathchar_t. When building for Windows with the constant UNICODE defined, it is wchar_t and the wchar_t string functions are used; otherwise it is char and the usual char functions from the C standard library are used.

The macro CFGPATHTEXT is also introduced so that string constants are prefixed with L when UNICODE is defined in a Windows build, and left unchanged otherwise.
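As a rough illustration of the type and macro described above, the selection could look something like the sketch below. This is not copied from cfgpath.h: the _WIN32 check, the cfgpath__strlen name (modelled on the cfgpath__x naming) and the appname_length helper are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Sketch only: the real cfgpath.h may test different macros. */
#if defined(_WIN32) && defined(UNICODE)
typedef wchar_t cfgpathchar_t;
#define CFGPATHTEXT(s) L##s      /* "myapp" becomes L"myapp" */
#define cfgpath__strlen wcslen   /* hypothetical cfgpath__x-style name */
#else
typedef char cfgpathchar_t;
#define CFGPATHTEXT(s) s         /* literal is left untouched */
#define cfgpath__strlen strlen
#endif

/* Hypothetical helper: length of an application name in characters,
 * written once and compiled for either character type. */
static size_t appname_length(const cfgpathchar_t *name)
{
    return cfgpath__strlen(name);
}
```

Calling appname_length(CFGPATHTEXT("myapp")) compiles unchanged in both modes, which is the point of routing literals through the macro.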

Additionally, the usage examples now pass sizeof(buf)/sizeof(buf[0]) for the maxlen arguments, so that the value is an element count rather than a byte count. This matters because cfgpathchar_t is not guaranteed to have a size of 1 (it is 2 when building with UNICODE defined on Windows).
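To make the difference between the two counts concrete, here is a small sketch. ARRAY_LEN, widechar_t and the three helper functions are names invented for this example; widechar_t just stands in for cfgpathchar_t in a Unicode build.

```c
#include <stddef.h>
#include <wchar.h>

/* Element count of a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Stand-in for cfgpathchar_t when UNICODE is defined on Windows. */
typedef wchar_t widechar_t;

/* With char, the element count and sizeof happen to agree. */
static size_t narrow_capacity(void)
{
    char buf[260];
    return ARRAY_LEN(buf);
}

/* With a wide type, the element count is still the array length... */
static size_t wide_capacity(void)
{
    widechar_t buf[260];
    return ARRAY_LEN(buf);
}

/* ...but sizeof reports bytes, which overstates the capacity. */
static size_t wide_bytes(void)
{
    widechar_t buf[260];
    return sizeof(buf);
}
```

Passing wide_bytes() where a character count is expected would tell the callee the buffer is larger than it really is, which is why the examples divide by sizeof(buf[0]).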

With this change, a path that contains more than just standard ASCII is still returned in a usable form on Windows. Consider the configuration file path C:\Documents and Settings\Name\Application Data\myapp.ini, where Name contains characters such as ò or ü. As long as the user has #define UNICODE before #include "cfgpath.h", the resulting cfgpathchar_t path can be passed to _wfopen.

I have tested on Windows (both with and without UNICODE) and Linux to confirm everything is still working.

Feb 16 '19 11:02 Nuk510

It looks good, except it's a bit of a pain having to redefine all the string handling functions to have cfgpath-specific variants. That will likely make it harder to integrate with other code, since callers won't know which function to call.

It's been a while since I've used these functions on Windows, but don't they automatically get replaced with wide variants if you are compiling in Unicode mode? Taking advantage of something like that would be ideal so that strcat, printf and the other functions don't need to be treated specially.

Feb 17 '19 05:02 Malvineous

Even when compiling in Unicode mode, the C standard library functions strcat, printf, etc. keep their char signatures and remain available for any code that needs the char versions. The wchar_t counterparts (wcscat, wprintf, etc.) are a separate family declared in <wchar.h>, and _wfopen is a Windows-only function, so unfortunately there is no way around selecting the wchar_t versions of these functions at compile time. (Microsoft Docs)

The good news is, old code using plain char and "standard string literals" is still compatible with the updated cfgpath.h. The only place where old code will cause errors is when compiling for Windows in Unicode mode, and the compiler is sure to let the user know of the datatype mismatches so they know what's wrong and where.

The bad news is having to introduce an additional check in projects that are going to support Unicode Windows. Even if cfgpath.h were to always return a char datatype by running a function like wchar_to_utf8 on Unicode Windows, the programmer would still have to follow up with a utf8_to_wchar and still have to use _wfopen to actually open the configuration file in question. Defining the variables as type cfgpathchar_t and using cfgpath__x functions reduces the conversion overhead and I think it is the lesser of the two evils.

Consider this custom cfgpath__fopen as an example of supporting Unicode Windows in addition to Linux/Mac/non-Unicode Windows:

FILE *cfgpath__fopen(const cfgpathchar_t *fn, const cfgpathchar_t *mode) {
	#if defined(WIN32) && defined(UNICODE)
		return _wfopen(fn, mode);
	#else
		return fopen(fn, mode);
	#endif
}

void test(void) {
	FILE *fp;
	cfgpathchar_t buffer[MAX_PATH];

	// write config file test
	get_user_config_file(buffer, sizeof(buffer)/sizeof(buffer[0]), CFGPATHTEXT("myapp"));
	if (buffer[0] == 0) {
		printf("get_user_config_file() fail\n");
		return;
	}
	fp = cfgpath__fopen(buffer, CFGPATHTEXT("wb"));
	if(fp==NULL) {
		cfgpath__printf(CFGPATHTEXT("fopen(%s) fail\n"), buffer);
		return;
	}
	fclose(fp);
	cfgpath__printf(CFGPATHTEXT("wrote '%s'!\n"), buffer);

	// write test file in config folder
	get_user_config_folder(buffer, sizeof(buffer)/sizeof(buffer[0]), CFGPATHTEXT("myapp"));
	if (buffer[0] == 0) {
		printf("get_user_config_folder() fail\n");
		return;
	}
	cfgpath__strcat(buffer, CFGPATHTEXT("test.txt"));
	fp = cfgpath__fopen(buffer, CFGPATHTEXT("wb"));
	if(fp==NULL) {
		cfgpath__printf(CFGPATHTEXT("fopen(%s) fail\n"), buffer);
		return;
	}
	fclose(fp);
	cfgpath__printf(CFGPATHTEXT("wrote '%s'!\n"), buffer);
}

Tested both with and without #define UNICODE with success. There may be a better way to do this, but this is the best fully-portable solution I'm able to come up with for now. Let me know what you think!

Feb 17 '19 12:02 Nuk510

Nuk510:master

Oct 26 '22 00:10 maheralbashek3