AVRO-3532: Naming in C
AVRO-3532: Enlarge naming rules for C. In Java, the implemented naming rules accept accented letters and Chinese characters ... The aim of the JIRA issue is to officially accept this implementation; the aim of this PR is to accept it for the C language.
A compromise could be to let the application provide a callback function that replaces is_avro_id. Then the application developer would be responsible for any locale or ICU dependencies. The library could provide a default function that implements the traditional A-Za-z0-9_ rules. An application that wants to minimize its dependencies and trusts that the schemata are valid could define a trivial callback that just accepts everything.
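To make the callback idea concrete, here is a minimal sketch in C. Everything except the name is_avro_id mentioned above is hypothetical: the typedef name_validator_fn, the functions set_name_validator and check_name, and the two validators are illustrative names, not part of the Avro C API. The default validator assumes the traditional rule that a name starts with A-Za-z_ and continues with A-Za-z0-9_.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true if name[0..len) is acceptable. */
typedef bool (*name_validator_fn)(const char *name, size_t len);

/* Default rule: first character in [A-Za-z_], the rest in [A-Za-z0-9_]. */
static bool default_name_validator(const char *name, size_t len)
{
    if (name == NULL || len == 0) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        bool alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
        bool digit = (c >= '0' && c <= '9');
        /* Digits are allowed everywhere except in the first position. */
        if (!(alpha || (digit && i > 0))) {
            return false;
        }
    }
    return true;
}

/* Trivial callback for applications that trust their schemata. */
static bool accept_all_validator(const char *name, size_t len)
{
    (void)name;
    (void)len;
    return true;
}

/* The validator currently in use; starts out as the default rule. */
static name_validator_fn current_validator = default_name_validator;

/* Install an application-supplied validator; NULL restores the default. */
static void set_name_validator(name_validator_fn fn)
{
    current_validator = (fn != NULL) ? fn : default_name_validator;
}

/* What the schema parser would call instead of a hard-coded check. */
static bool check_name(const char *name, size_t len)
{
    return current_validator(name, len);
}
```

With this shape, the library itself carries no locale or ICU dependency. An application that wants Unicode-aware names would pass its own function to set_name_validator and take on that dependency itself.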
On Windows, another pain point is that iswalpha cannot recognize supplementary characters, because wchar_t there holds a single 16-bit UTF-16 code unit and a supplementary character needs two of them. To fix that, I think one needs to use ICU instead. Microsoft integrated ICU into Windows 10 Creators Update, so if the Avro tools use ICU and target at least this version of Windows, then they don't need to get ICU libraries from elsewhere.
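The UTF-16 limitation can be illustrated with a small, self-contained sketch. The helper names (utf16_encode, utf16_decode, is_surrogate) are hypothetical and exist only for this example. It shows that a code point above U+FFFF becomes two surrogate code units, so a classifier that looks at one 16-bit unit at a time never sees the actual character. A code-point-aware function, such as ICU's u_isalpha, which takes a 32-bit UChar32, would be handed the decoded value instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the 16-bit unit is one half of a surrogate pair. */
static bool is_surrogate(uint16_t unit)
{
    return unit >= 0xD800 && unit <= 0xDFFF;
}

/* Encode one code point as UTF-16.
 * Returns the number of code units written (1 or 2), or 0 if cp is invalid. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Decode the code point starting at units[*i] and advance *i past it.
 * The caller must ensure *i < len. A lone surrogate yields U+FFFD. */
static uint32_t utf16_decode(const uint16_t *units, size_t len, size_t *i)
{
    uint16_t hi = units[(*i)++];
    if (!is_surrogate(hi)) {
        return hi;
    }
    if (hi <= 0xDBFF && *i < len) {
        uint16_t lo = units[*i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            (*i)++;
            return 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
        }
    }
    return 0xFFFD;
}
```

For example, U+20000 (a CJK Extension B ideograph) encodes as the pair 0xD840 0xDC00. Each half on its own is a surrogate, not a letter, so a per-unit check rejects both halves. Decoding the pair first recovers the code point that a Unicode-aware classifier can then examine.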
So I used ICU and added the possibility to provide a callback function in place of is_avro_id. (With the callback, users have to be very careful, as it can lead to errors.)
I took the liberty of changing these two PRs (https://github.com/apache/avro/pull/1787 and https://github.com/apache/avro/pull/1798) to draft status, just to prevent any accidents!
These behaviour changes should be merged only after a change to the specification, and we really should have a stronger consensus on whether this is the right thing to do before changing the spec.