
AVRO-3532: Naming in c

Open clesaec opened this issue 1 year ago • 4 comments

AVRO-3532: Enlarge naming rules for C. In Java, the implemented naming rules accept accented letters or Chinese characters ... The aim of the JIRA issue is to officially accept this implementation; the aim of this PR is to accept it for the C language.

clesaec, Aug 01 '22 07:08

A compromise could be to let the application provide a callback function that replaces is_avro_id. Then the application developer would be responsible for any locale or ICU dependencies. The library could provide a default function that implements the traditional A-Za-z0-9_ rules. An application that wants to minimize its dependencies and trusts that the schemata are valid could define a trivial callback that just accepts everything.
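
To make the callback idea concrete, here is a rough sketch, not the code from the PR. Every name in it (`example_name_validator_fn`, `example_set_name_validator`, `example_check_name`, and the two validators) is hypothetical and chosen only for illustration; the real library would plug the callback in wherever `is_avro_id` is called today. The default validator uses explicit ASCII range checks rather than `isalpha`, so it does not depend on the current locale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns nonzero if name[0..len) is acceptable. */
typedef int (*example_name_validator_fn)(const char *name, size_t len);

static int example_is_ascii_letter(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Default: the traditional rule, [A-Za-z_] first, then [A-Za-z0-9_]. */
static int example_default_name_validator(const char *name, size_t len)
{
    size_t i;
    unsigned char c;

    if (len == 0)
        return 0;
    c = (unsigned char)name[0];
    if (!example_is_ascii_letter(c) && c != '_')
        return 0;
    for (i = 1; i < len; i++) {
        c = (unsigned char)name[i];
        if (!example_is_ascii_letter(c) && !(c >= '0' && c <= '9') && c != '_')
            return 0;
    }
    return 1;
}

/* Trivial callback for applications that trust their schemata. */
static int example_accept_any_name(const char *name, size_t len)
{
    (void)name;
    (void)len;
    return 1;
}

/* Currently installed validator; starts out as the default. */
static example_name_validator_fn example_current_validator =
    example_default_name_validator;

/* Install an application-supplied validator; NULL restores the default. */
static void example_set_name_validator(example_name_validator_fn fn)
{
    example_current_validator = fn ? fn : example_default_name_validator;
}

/* What the schema parser would call instead of a hard-coded check. */
static int example_check_name(const char *name)
{
    return example_current_validator(name, strlen(name)) != 0;
}
```

With this shape, an application that wants Unicode-aware rules would pass its own ICU-based function to `example_set_name_validator`, and the library itself would carry no ICU or locale dependency.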

KalleOlaviNiemitalo, Aug 01 '22 09:08

On Windows, another pain point is that iswalpha cannot recognize supplementary characters because wchar_t is UTF-16. To fix that, I think one needs to use ICU instead. Microsoft integrated ICU into Windows 10 Creators Update, so if the Avro tools use ICU and target at least this version of Windows, then they don't need to get ICU libraries from elsewhere.
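
To illustrate the UTF-16 problem: a supplementary character such as U+20000 (a CJK ideograph) is stored as two 16-bit code units, 0xD840 and 0xDC00, and a classifier that sees one `wchar_t` at a time only ever gets a lone surrogate. The units have to be combined into a full code point first, which can then be handed to a function that accepts 32-bit code points (ICU's `u_isalpha` takes a `UChar32`). The sketch below shows only the combining step in plain C; the function name `example_next_code_point` is made up for this example, and returning U+FFFD for an unpaired surrogate is my assumption, not something the PR specifies.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one code point from UTF-16 code units, starting at units[*pos].
 * Precondition: *pos < len. Advances *pos past the units consumed.
 * An unpaired surrogate yields U+FFFD (assumed policy for this sketch).
 */
static uint32_t example_next_code_point(const uint16_t *units, size_t len,
                                        size_t *pos)
{
    uint16_t u = units[(*pos)++];

    /* High surrogate followed by a low surrogate: combine the pair. */
    if (u >= 0xD800 && u <= 0xDBFF && *pos < len) {
        uint16_t lo = units[*pos];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            (*pos)++;
            return 0x10000u + (((uint32_t)(u - 0xD800) << 10)
                               | (uint32_t)(lo - 0xDC00));
        }
    }

    /* Any surrogate that reaches this point is unpaired. */
    if (u >= 0xD800 && u <= 0xDFFF)
        return 0xFFFD;

    /* Ordinary BMP character: the code unit is the code point. */
    return u;
}
```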

KalleOlaviNiemitalo, Aug 01 '22 09:08

So, I used ICU and added the possibility to provide a callback function instead of is_avro_id. (For the latter, users have to be very careful, as it can lead to errors.)

clesaec, Aug 01 '22 15:08

I took the liberty of changing these two PRs (https://github.com/apache/avro/pull/1787 and https://github.com/apache/avro/pull/1798) to draft status, just to prevent any accidents!

These behaviour changes should be merged only after a change to the specification, and we really should have a stronger consensus on whether this is the right thing to do before changing the spec.

RyanSkraba, Nov 04 '22 17:11