Windows Unicode Character Prop fix, update appveyor, travis

Open MSP-Greg opened this issue 7 years ago • 7 comments

I ended up here while trying to test the rdoc repo under windows. It uses kpeg to build two files.

There is a quirk in Ruby regexps on Windows: a Unicode character property like \p{**} works in a hard-coded regexp literal, but a regexp containing one can't be created from a string unless that string's encoding is UTF-8. This commit fixes that, along with several minor fixes:

  1. .travis.yml - added current 2.2 thru 2.4.

  2. appveyor.yml - added it, passes 2.0 thru trunk. See here. Going forward, if you don't want to set it up but would like a check, ping me and I can run one.

  3. lib/kpeg/format_parser.rb - commented out two unused variables.

  4. lib/kpeg/grammar.rb - this contains the windows patch re encoding. Tried to keep the constraint on it pretty tight.

  5. test/test_kpeg.rb, test/test_kpeg_code_generator.rb - changed some tests over to assert_nil from assert_equal.

  6. test/test_kpeg_string_escape.rb - ends with two blank lines, removed one...
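
The regexp quirk described at the top of this comment can be reproduced in isolation. The following is a minimal sketch, not code from kpeg: it assumes a UTF-8 source file, and it uses US-ASCII as a stand-in for whatever non-UTF-8 encoding the platform assigns to the grammar text. The variable names are illustrative only.

```ruby
# Pattern text tagged with a non-UTF-8 encoding, as it might arrive when a
# grammar file is read under a non-UTF-8 default encoding (assumption:
# US-ASCII stands in for the platform's encoding).
pattern = '\p{Zl}'.dup.force_encoding(Encoding::US_ASCII)

# A hard-coded literal in a UTF-8 source file compiles without trouble.
literal = /\p{Zl}/

# Building the same regexp from the non-UTF-8 string is expected to raise.
dynamic_error =
  begin
    Regexp.new(pattern)
    nil
  rescue RegexpError => e
    e.message
  end

puts "literal encoding: #{literal.encoding}"
puts "Regexp.new error: #{dynamic_error.inspect}"
```

The literal and the string contain the same characters; only the encoding tag on the string differs.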

Thanks, Greg

MSP-Greg avatar Dec 03 '17 03:12 MSP-Greg

Update - with this PR, RDoc builds on Windows for all versions tested (2.2 - trunk). Without the patch, all versions fail.

MSP-Greg avatar Dec 03 '17 15:12 MSP-Greg

@evanphx This pull request is very important for RDoc because its tests on AppVeyor are failing. Please review it.

aycabta avatar Dec 27 '17 16:12 aycabta

This is actually a wider issue that does not affect only Windows. Trying to use kpeg on Linux with a non-UTF-8 locale results in similar issues. This breaks building the rdoc literals. Downstream bug report: https://bugs.gentoo.org/640150

Simple reproducer based on the rdoc source code:

LANG=C ruby -S kpeg -fsv -o lib/rdoc/markdown/literals.rb lib/rdoc/markdown/literals.kpeg

/usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `initialize': invalid character property name {Zl}: /\n|\r\n?|\p{Zl}|\p{Zp}/ (RegexpError)
	from /usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `new'
	from /usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `initialize'

I have generalized the patch to exclude the windows test and instead force encoding when it is not Encoding::UTF_8. This works as expected and allows things to work with LANG=C.
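
As a rough illustration of that approach (a sketch only, not the actual patch): the helper name `unicode_regexp` below is hypothetical, and the pattern string is the one from the error message above, tagged as US-ASCII to imitate what LANG=C produces.

```ruby
# Hypothetical helper, not part of kpeg: retag the pattern source as UTF-8
# whenever it is not already UTF-8, then compile it.
def unicode_regexp(source)
  unless source.encoding == Encoding::UTF_8
    # dup first so the caller's string (possibly frozen) is left untouched
    source = source.dup.force_encoding(Encoding::UTF_8)
  end
  Regexp.new(source)
end

# Same pattern as in the error message, tagged US-ASCII to mimic LANG=C.
ascii_source = '\n|\r\n?|\p{Zl}|\p{Zp}'.dup.force_encoding(Encoding::US_ASCII)

line_break = unicode_regexp(ascii_source)
puts line_break.encoding
puts (line_break =~ "a\u2029b").inspect
```

Note that `force_encoding` only relabels the bytes and does not transcode them. That is harmless for an ASCII-only pattern like this one, but a pattern containing non-ASCII bytes in some other encoding could become invalid UTF-8.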

graaff avatar May 20 '18 07:05 graaff

I realize it's been absolutely forever since this PR was opened. I'm happy to fix it up if we still need it, please let me know @MSP-Greg.

evanphx avatar Nov 01 '22 23:11 evanphx

No problem. Obviously, GitHub provides a pretty good UI for reviewing one's open PRs, and I need to use it.

This came out of running CI for Windows in RDoc, which is being done currently. Let me look into it. If it's still needed, maybe change to GitHub Actions?

MSP-Greg avatar Nov 02 '22 00:11 MSP-Greg

Let me know, happy to help figure it out.

evanphx avatar Nov 02 '22 00:11 evanphx

I've just checked that my reproducer mentioned earlier still causes the same issue with kpeg 1.3.2 and the rdoc 6.4.0 sources on ruby 3.0.

graaff avatar Nov 04 '22 11:11 graaff