URL detected as "mixed"
PHP
'xmlns:excerpt' => "http://wordpress.org/export/{$this->wxr_version}/excerpt/",
https://raw.githubusercontent.com/wp-cli/wp-cli/master/php/export/class-wp-export-wxr-formatter.php
thx for that bug report!
The reason is that the regular expression /\/{2}/, used to find single-line comments, also matches inside a string such as "I'm not a comment // I'm a string" (see getCommentExpressions and countMixed in sloc.coffee).
http://regexr.com/3amvs
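To make the failure concrete, here is a minimal sketch that applies the same pattern to the PHP line from the report. The helper name `looksLikeLineComment` is made up for illustration and is not part of sloc.

```javascript
// The single-line comment pattern quoted above.
const lineComment = /\/{2}/;

// The PHP line from the bug report: it contains a URL, not a comment.
const phpLine =
  "'xmlns:excerpt' => \"http://wordpress.org/export/{$this->wxr_version}/excerpt/\",";

// Hypothetical helper (not from sloc): true if the pattern matches anywhere in the line.
function looksLikeLineComment(line) {
  return lineComment.test(line);
}

console.log(looksLikeLineComment(phpLine));        // true, because of the "//" in "http://"
console.log(looksLikeLineComment("$x = 1; // hi")); // true, a real comment
console.log(looksLikeLineComment("$x = 1;"));       // false
```

The pattern has no notion of being inside a string, so the URL line and the genuinely commented line look the same to it.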
Do you have an idea how to fix it? This is a general bug that applies to most languages. As far as I can see, we need to build a better regular expression. That could lead to something like:
(['"])(?:(?!\1|\\).|\\.)*\1|
\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|
\/\/[^\n]*(?:\n|$)|
\/\*(?:[^*]|\*(?!\/))*\*\/
that is not so obvious ;-)
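A rough sketch of how an alternation like this could be used: let the regex consume strings and regex literals as whole tokens, and keep only the matches that start with `//` or `/*`. The function name `findComments` is hypothetical, and the regex-literal branch can still misfire on plain division such as `a / b / c`, so treat this as an illustration rather than a fix.

```javascript
// The four alternatives from above, joined into one global regex:
// string literal | regex literal | line comment | block comment.
const token = /(['"])(?:(?!\1|\\).|\\.)*\1|\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|\/\/[^\n]*(?:\n|$)|\/\*(?:[^*]|\*(?!\/))*\*\//g;

// Hypothetical helper (not from sloc): returns the text of every comment found.
function findComments(source) {
  const comments = [];
  for (const match of source.matchAll(token)) {
    const text = match[0];
    // Strings and regex literals are matched too, but only so that
    // their contents are skipped; they are not reported.
    if (text.startsWith("//") || text.startsWith("/*")) {
      comments.push(text.trim());
    }
  }
  return comments;
}

console.log(findComments('var s = "a // b"; // real comment'));
console.log(findComments('"http://example.org/",'));
```

In the first call the `//` inside the string is swallowed by the string alternative, so only the trailing comment is reported. In the second call nothing is reported.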
I am not even sure this can be solved with a regular expression. (At least, I suspect you can't solve it with a single regular expression in JS without a library, as JS is missing some features.) I don't currently have time to look at the code, but I guess switching from plain RegExps to functions for finding particular parts should be feasible (and, independent of this issue, much more powerful, though it also means a lot of refactoring). Maybe you need 'lookbehind', which is missing in JS?
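For what it's worth, the function-based idea could look roughly like the sketch below: a small character scanner that tracks whether it is inside a quoted string and only reports `//` outside of one. The name `lineCommentStart` is invented here, and the sketch assumes single-line strings delimited by `'` or `"` with backslash escapes. It ignores block comments, regex literals, and multi-line strings.

```javascript
// Hypothetical helper (not from sloc): index at which a "//" comment starts
// outside of any string literal, or -1 if the line has no such comment.
function lineCommentStart(line) {
  let quote = null; // quote character of the string we are inside, or null

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];

    if (quote !== null) {
      if (ch === "\\") {
        i++; // skip the escaped character
      } else if (ch === quote) {
        quote = null; // the string ends here
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // a string starts here
    } else if (ch === "/" && line[i + 1] === "/") {
      return i; // "//" outside of a string: a real comment
    }
  }
  return -1;
}

console.log(lineCommentStart('s = "a // b"; // c')); // 14
console.log(lineCommentStart('"http://example.org/",')); // -1
```

No lookbehind is needed this way, since the scanner carries the "inside a string" state along explicitly.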