Add form feed and vertical tab whitespace to isWhiteSpace()

Open jamesgiu opened this issue 5 years ago • 4 comments

I ran into an issue where these characters were not detected as whitespace, which led to false-positive isBase64() results. It would be good to have them counted as whitespace.

For reference, the C/C++ isspace() function treats these characters as whitespace: http://www.cplusplus.com/reference/cctype/isspace/ https://www.oreilly.com/library/view/c-in-a/0596006977/re129.html
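A minimal sketch of the kind of change being proposed, assuming the existing check accepts only space, line feed, carriage return, and tab. The class `WhiteSpaceSketch` and its method are hypothetical stand-ins for illustration, not the commons-codec source.

```java
// Hypothetical stand-in for a byte-level whitespace check; not the library code.
final class WhiteSpaceSketch {

    // Assumed baseline set: space, LF, CR, TAB. The two added cases are
    // vertical tab (0x0B) and form feed (0x0C), matching C's isspace().
    public static boolean isWhiteSpace(final byte byteToCheck) {
        switch (byteToCheck) {
            case ' ':
            case '\n':
            case '\r':
            case '\t':
            case 0x0B: // vertical tab; Java has no '\v' escape
            case '\f': // form feed
                return true;
            default:
                return false;
        }
    }

    public static void main(final String[] args) {
        System.out.println("VT:  " + isWhiteSpace((byte) 0x0B));
        System.out.println("FF:  " + isWhiteSpace((byte) '\f'));
        System.out.println("'A': " + isWhiteSpace((byte) 'A'));
    }
}
```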

Cheers!

jamesgiu avatar Oct 13 '20 14:10 jamesgiu

Coverage Status

Coverage remained the same at 94.308% when pulling 90248547061e7708c6e2e2807e60feacfcb12680 on jamesgiu:VT-and-FF-in-BNCodec into 475910a521bd23dbba26091be505ad2a71d3e901 on apache:master.

coveralls avatar Oct 13 '20 14:10 coveralls

Hi all, any chance this will be looked at? Cheers :)

jamesgiu avatar Feb 16 '21 00:02 jamesgiu

This method or PR is not great because we are defining our own whitespace instead of reusing Character.isWhitespace or Character.isSpaceChar.

Shouldn't we be using Character.isSomething?
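To make the comparison concrete, here is a small sketch of how the two JDK methods classify the characters under discussion. The class `CharacterCheckSketch` and its helper are hypothetical names for illustration; only `Character.isWhitespace(int)` and `Character.isSpaceChar(int)` are JDK calls.

```java
// Hypothetical helper comparing the two JDK predicates on a few code points.
final class CharacterCheckSketch {

    // Formats one line per code point with the result of both JDK checks.
    public static String describe(final int codePoint) {
        return String.format("U+%04X isWhitespace=%b isSpaceChar=%b",
                codePoint,
                Character.isWhitespace(codePoint),
                Character.isSpaceChar(codePoint));
    }

    public static void main(final String[] args) {
        // TAB, LF, VT, FF, CR, SPACE, and NO-BREAK SPACE for contrast.
        final int[] codePoints = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0xA0};
        for (final int cp : codePoints) {
            System.out.println(describe(cp));
        }
    }
}
```

Of the two, only Character.isWhitespace reports vertical tab and form feed as whitespace; Character.isSpaceChar covers Unicode space separators only, so it rejects both. Note that Character.isWhitespace also accepts U+001C through U+001F, which is broader than the C isspace() set, so it may not be a drop-in match for what the codec should skip.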

garydgregory avatar Feb 16 '21 01:02 garydgregory

Ping. Also, how does this square with the RFCs for Base64, Base32, and Base16 (all the subclasses we implement)? IOW, what C/C++ does is completely irrelevant; it's the RFCs that matter.

garydgregory avatar Feb 15 '22 14:02 garydgregory