vosk-api
JSON formatting broken in locale with comma separator
Hello, first off, thank you for this piece of software. It runs very well and is easy to use. It seems, though, that I have run into a bug (sorry if I'm wrong). My locale is fr_FR, so floating-point numbers are formatted with a comma (i.e. 1,3 and not 1.3 as in English). When I set vosk_recognizer_set_max_alternatives to more than one, vosk_recognizer_result and vosk_recognizer_final_result return a JSON string containing several alternatives, each with a confidence level expressed as a floating-point number. It turns out that this number is formatted according to the locale (with a comma), which makes the string unreadable by common JSON parsers, since commas are what JSON uses to separate values. We get things like:
"alternatives" : [{ "confidence" : 361,788818, "text" : "" }]
If I save the locale, change it to en_US, call one of the above functions and then restore the original locale, it works (floating-point numbers are formatted with a dot). I suspect the same problem occurs with vosk_recognizer_set_words (I didn't try, though). Thank you.
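For anyone needing the same workaround, here is a minimal sketch of the save/switch/restore idea in C++. It is an illustration under stated assumptions, not code from vosk-api: ScopedCNumericLocale and format_confidence are hypothetical names, the guard switches only LC_NUMERIC to the "C" locale (rather than a full en_US locale), and format_confidence merely stands in for a library call that formats a number printf-style.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of vosk-api): forces LC_NUMERIC to "C"
// for the lifetime of the object, then restores the previous setting.
class ScopedCNumericLocale {
public:
    ScopedCNumericLocale() {
        // Passing nullptr only queries the current locale. Copy the result,
        // because the returned buffer may be overwritten by a later call.
        const char* current = std::setlocale(LC_NUMERIC, nullptr);
        if (current) saved_ = current;
        std::setlocale(LC_NUMERIC, "C");
    }
    ~ScopedCNumericLocale() {
        if (!saved_.empty()) std::setlocale(LC_NUMERIC, saved_.c_str());
    }
    ScopedCNumericLocale(const ScopedCNumericLocale&) = delete;
    ScopedCNumericLocale& operator=(const ScopedCNumericLocale&) = delete;

private:
    std::string saved_;  // locale name that was active before the switch
};

// Stand-in (assumption) for a library call that formats a number with
// printf-style code and therefore follows the current LC_NUMERIC setting.
std::string format_confidence(double value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%f", value);
    return buf;
}
```

In real use the guard would be placed in a small scope around the call to vosk_recognizer_result or vosk_recognizer_final_result. Note that setlocale changes the locale for the whole process, so this is risky if other threads format or parse numbers at the same time.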
Yeah, it might be a problem. Let me check
Thanks for this great library. Just to let you know we were also affected by this issue (reference above).
Oh yeah, in 20 years of C++ development they haven't come up with a way to convert a float to a string in a locale-independent way. There is std::format in C++20, but I don't want to require such a new compiler, and I suppose it is not fully supported in many places. Maybe we can just force the conversion from ',' to '.'.
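One option that might avoid both C++20 and post-processing the string is a stream imbued with the classic "C" locale. The sketch below only illustrates that idea. It is not the fix adopted in vosk-api, and format_float_classic and its default precision of 6 are hypothetical choices made for the example.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not from vosk-api): formats a number using the
// classic "C" locale, so the decimal separator is always '.', whatever
// global C or C++ locale the application has set.
std::string format_float_classic(double value, int precision = 6) {
    std::ostringstream out;
    // Override the stream's locale only; the global locale is untouched.
    out.imbue(std::locale::classic());
    out << std::fixed << std::setprecision(precision) << value;
    return out.str();
}
```

Because only the local stream is imbued, this does not touch process-wide state, unlike switching the locale with setlocale around each call.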
Do you have any idea when the above fix will get pulled into the tree? I'd love to strip my current workaround from my GStreamer plugin so that it might, hopefully, get a bit quicker. Thanks!
Also https://github.com/alphacep/vosk-api/pull/1055