Problems caused by encoding
mavlink\pymavlink\generator\mavgen.py +200
with open(xmlfile, 'r') as f:
change
with open(xmlfile, 'r', encoding="utf-8") as f:
@peterbarker What do you make of this suggestion? The existing code does not break the standard messages, so I suspect someone wants to have non-ascii characters in their dialect.
In theory it is nice to allow unicode values, and Python runs smoother if you enforce unicode encoding. Are there any risks to this? I.e. will our generators or generated languages break because they assume ascii?
@peterbarker What do you make of this suggestion? The existing code does not break the standard messages, so I suspect someone wants to have non-ascii characters in their dialect.
In theory it is nice to allow unicode values, and Python runs smoother if you enforce unicode encoding. Are there any risks to this? I.e. will our generators or generated languages break because they assume ascii?
I think it's entirely reasonable to allow non-ascii in things like xml comments and field descriptions.
I'd be very leery of allowing utf-8 in things like field names and the like, however - I could see a lot of things breaking there!
I'm not sure we could allow the above patch unless we were to add some constraints around the more technical aspects of the XML document - or if they're there that we know they work :-) My reasoning is that we might find the above patch doesn't cause issues for our current CI but someone does use a unicode character in a field name and it works for a subset of mavlink stacks.
It would be nice to know exactly what problem is being solves with the patch - we're just guessing that it's i18n support...
@peterbarker my suggestion that it might be related to windows, because nowadays most of linuxes use utf-8.
Maybe it'll be better to open file in binary mode (rb) and then let xml library to decide charset header?
@vooon Thanks, you might be right. @peterbarker is correct - we need to ask why they need this.
It might be that we should validate against this anyway. We could match on a pattern like <xsd:pattern value="{IsBasicLatin}*"/>. Thing is we'd need to be pretty careful to make sure we got all the cases we needed to and didn't block any we wanted. What we'd really like to do is say "globally ascii" except for these exceptions - but you'd need to be better at XSD than I am.
@qq854051086 Can you please outline what you were trying to do when you ran into the problem for which this was the response?
Suggest close - should be against pymavlink
Closing. @qq854051086 if you want this, please repost in https://github.com/ArduPilot/pymavlink with an explanation of the specific problem you are trying to solve.