
X3 lacks boolean parser for unicode char type

Open saki7 opened this issue 5 years ago • 8 comments

I think we could provide simple typedefs in <boost/spirit/home/x3/numeric/bool.hpp>

EDIT: bool_policies must be updated too; however, there are some hard-coded string literals in bool_policies.hpp which simply do not match the Unicode char type.

saki7 avatar Feb 17 '20 06:02 saki7

Do you expect it to match synoglyphs/homoglyphs, duplicate characters in Unicode and other Unicode quirks that I am not aware of?

Kojoley avatar Mar 04 '20 21:03 Kojoley

No. It seems like the potentially required workarounds to handle those cases are beyond the scope of Spirit.

For the 'other Unicode quirks' you mentioned: the one I came up with is normalization (NFKC, NFKD, etc.). I currently normalize the script in my application as a pre-processing step before passing it to X3. I think we could leave that to the user.

I think it would be nice to have documentation for Unicode support in X3 that describes real-world use cases for building and using a Unicode AST. I'm using it because my app relies on the code point count of UTF-32 strings. By passing std::u32string::iterator to X3, I was able to cut down on string conversions. (I think this is getting off-topic; I'd love to hear more opinions about Unicode though)
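To illustrate the code point count remark, here is a minimal standard-library sketch (no X3 involved). The helper names are hypothetical and the UTF-8 counter assumes well-formed input; it only shows why UTF-32 makes the count trivial.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by skipping
// continuation bytes (10xxxxxx). Assumes the input is valid UTF-8.
inline std::size_t utf8_code_point_count(std::string const& s)
{
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// With UTF-32, every element is one code point, so size() is the count.
inline std::size_t utf32_code_point_count(std::u32string const& s)
{
    return s.size();
}

inline void code_point_count_demo()
{
    // Three code points (U+65E5 U+672C U+8A9E), three bytes each in UTF-8.
    std::string const utf8 = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";
    std::u32string const utf32 = U"\u65E5\u672C\u8A9E";

    assert(utf8.size() == 9);                   // bytes, not code points
    assert(utf8_code_point_count(utf8) == 3);   // needs a scan
    assert(utf32_code_point_count(utf32) == 3); // O(1)
}
```

Note that a code point is still not the same thing as a user-perceived character; combining sequences are outside what this sketch covers.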

saki7 avatar Mar 04 '20 22:03 saki7

(I think this is getting off-topic; I'd love to hear more opinions about Unicode though)

Definitely not off-topic. I'd love to see more work done in this area.

djowel avatar Mar 04 '20 23:03 djowel

For my original use-case, I'll be satisfied if U"true" gets parsed into true, for instance.

I can't imagine more complex cases for now but I really like X3's recent Unicode support so I agree with @djowel. I'll try to report more issues about Unicode soon.
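The expected behavior could be sketched roughly as follows. This is a hand-written matcher over char32_t iterators using only the standard library, not X3's actual implementation; match_literal and parse_bool_u32 are hypothetical names, and there is no case-insensitive matching or skipper.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of X3: tries to match `lit` at [first, last).
// On success, advances `first` past the match and returns true.
template <typename Iterator>
bool match_literal(Iterator& first, Iterator last, std::u32string const& lit)
{
    Iterator it = first;
    for (char32_t ch : lit) {
        if (it == last || *it != ch) return false;
        ++it;
    }
    first = it;
    return true;
}

// Hypothetical helper: parses U"true" or U"false" into `attr`.
template <typename Iterator>
bool parse_bool_u32(Iterator& first, Iterator last, bool& attr)
{
    if (match_literal(first, last, U"true"))  { attr = true;  return true; }
    if (match_literal(first, last, U"false")) { attr = false; return true; }
    return false; // `first` and `attr` are left untouched on failure
}

inline void parse_bool_u32_demo()
{
    std::u32string const input = U"true";
    auto first = input.begin();
    bool value = false;
    bool const ok = parse_bool_u32(first, input.end(), value);
    assert(ok && value && first == input.end());
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}
```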

saki7 avatar Mar 05 '20 00:03 saki7

My suggestion is changing bool_policies from

template <typename T = bool>
struct bool_policies { /* ... */ };

to

template <typename CharT, typename T = bool>
struct bool_policies; // declaration only

template <typename T> // partial specialization; default template arguments are not allowed here
struct bool_policies<char32_t, T> { /* ... */ };

(A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

If we can reach consensus, I'll try to submit a PR.
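A rough C++17 sketch of the if constexpr variant, assuming the only thing that differs per character type is the spelling of the literals. bool_literal is a hypothetical helper, not something that exists in X3, and how it would be wired into bool_policies is left open.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>

// Hypothetical helper (not X3 code): returns the "true"/"false" literal in
// the encoding that matches CharT. Discarded branches are never instantiated,
// so each literal only has to convert to its own string_view type.
template <typename CharT>
constexpr std::basic_string_view<CharT> bool_literal(bool value)
{
    if constexpr (std::is_same_v<CharT, char>)
        return value ? "true" : "false";
    else if constexpr (std::is_same_v<CharT, wchar_t>)
        return value ? L"true" : L"false";
    else if constexpr (std::is_same_v<CharT, char16_t>)
        return value ? u"true" : u"false";
    else {
        static_assert(std::is_same_v<CharT, char32_t>,
                      "unsupported character type");
        return value ? U"true" : U"false";
    }
}

// The selection happens entirely at compile time.
static_assert(bool_literal<char32_t>(true).size() == 4, "");
static_assert(bool_literal<char>(false).size() == 5, "");

inline void bool_literal_demo()
{
    assert(bool_literal<char32_t>(true) == std::u32string_view(U"true"));
    assert(bool_literal<char>(false) == std::string_view("false"));
}
```

Compared with one partial specialization per character type, this keeps a single template body, at the cost of requiring C++17.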

saki7 avatar Mar 05 '20 01:03 saki7

(A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

I'm absolutely for a move to C++17! There will have to be some minor doc changes, as I recall the requirement was C++14. It's 2020 now and we have C++20. Time to move on.

djowel avatar Mar 05 '20 02:03 djowel

(A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

Oh, and a lot of the infrastructure code in X3 can be simplified with C++17. Your thoughts, @Kojoley?

djowel avatar Mar 05 '20 02:03 djowel