Illegal character "'" at 1:35518 after LexToken(COMMA,',',1,35516)

Open fletchowns opened this issue 11 years ago • 3 comments

slimit seems to have trouble minifying bootstrap-datepicker.js:

$ wget https://raw.github.com/eternicode/bootstrap-datepicker/511c1b0241eb9804892df6f9388e0afd00107253/js/bootstrap-datepicker.js
$ slimit bootstrap-datepicker.js

Results in:

Illegal character "'" at 1:35518 after LexToken(COMMA,',',1,35516)
Illegal character '\\' at 1:35519 after LexToken(COMMA,',',1,35516)
Illegal character '\\' at 1:35531 after LexToken(STRING,"').split('",1,35521)
Traceback (most recent call last):
  File "/home/fletch/my-venv/bin/slimit", line 9, in <module>
    load_entry_point('slimit==0.8.1', 'console_scripts', 'slimit')()
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/minifier.py", line 69, in main
    text, mangle=options.mangle, mangle_toplevel=options.mangle_toplevel)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/minifier.py", line 38, in minify
    tree = parser.parse(text)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 93, in parse
    return self.parser.parse(text, lexer=self.lexer, debug=debug)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/ply/yacc.py", line 265, in parse
    return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/ply/yacc.py", line 1047, in parseopt_notrack
    tok = self.errorfunc(errtoken)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 116, in p_error
    self._raise_syntax_error(token)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 89, in _raise_syntax_error
    self.lexer.prev_token, self.lexer.token())
SyntaxError: Unexpected token (STRING, "').split('") at 1:35521 between LexToken(NUMBER,'0',1,35520) and LexToken(NUMBER,'0',1,35532)

fletchowns, Sep 03 '13 23:09

I agree and support this request.

Arvi3d, Mar 27 '14 19:03

I just looked at this: slimit appears to be unable to handle a single-character escape sequence such as \0 when it is part of a function call. If you replace

var separators = format.replace(this.validParts, '\0').split('\0'), parts = format.match(this.validParts);

with

var escape_null = String.fromCharCode(0)[0];
var separators = format.replace(this.validParts, escape_null), parts = format.match(this.validParts);
separators = separators.split(escape_null);

then it does the right thing. This seems like an easy enough bug to fix, but my Python isn't good enough to figure out where in the lexer things are going wrong.

fomojola, Jun 18 '14 07:06

For what it's worth, I use slimit in js2xml, and I had to add a simple OR alternative, \\\d{1,}, to the lexer's string literal regexes.

It's probably not accurate (see, for example, http://mathiasbynens.be/notes/javascript-escapes#octal), but it works for me. The original regex:

    string = r"""
    (?:
        # double quoted string
        (?:"                               # opening double quote
            (?: [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ")                                 # closing double quote
        |
        # single quoted string
        (?:'                               # opening single quote
            (?: [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ')                                 # closing single quote
    )
    """ 

became:

    string = r"""
    (?:
        # double quoted string
        (?:"                               # opening double quote
            (?: [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ")                                 # closing double quote
        |
        # single quoted string
        (?:'                               # opening single quote
            (?: [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ')                                 # closing single quote
    )
    """ 

https://github.com/redapple/js2xml/blob/master/js2xml/lexer.py

redapple, Jun 18 '14 08:06
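
To make the effect of the added alternative easier to see, here is a minimal sketch in Python. It is not slimit's lexer: it rebuilds only the single-quoted branch of the pattern quoted above, drops the multiline continuation part, and uses hypothetical names (`build`, `original`, `patched`). `re.findall` scans the text rather than tokenizing it, so this only approximates what the PLY lexer does, but under those assumptions it reproduces the stray `').split('` string from the error message.

```python
import re

# Pieces copied from the string regex quoted in this thread.
ESCAPES = r"\\[a-zA-Z!-\/:-@\[-`{-~]"   # escaped characters (digits are NOT in this class)
HEX = r"\\x[0-9a-fA-F]{2}"              # hex escape sequence
UNI = r"\\u[0-9a-fA-F]{4}"              # unicode escape sequence
DIGITS = r"\\\d{1,}"                    # the alternative added in js2xml


def build(alternatives):
    """Build a simplified single-quoted string pattern (no multiline part)."""
    body = "|".join([r"[^'\\\n\r]"] + alternatives)
    return re.compile(r"'(?:%s)*?'" % body)


original = build([ESCAPES, HEX, UNI])
patched = build([ESCAPES, DIGITS, HEX, UNI])

# The JavaScript literal '\0' as raw text: quote, backslash, zero, quote.
sample = r"'\0'"
print(original.fullmatch(sample))   # None: a backslash followed by a digit is rejected
print(patched.fullmatch(sample))    # a match object

# The offending statement from bootstrap-datepicker.js.
line = (r"var separators = format.replace(this.validParts, '\0')"
        r".split('\0'), parts = format.match(this.validParts);")
print(original.findall(line))       # the text between the two '\0' literals is taken as a string
print(patched.findall(line))        # both '\0' literals are found
```

As noted above, `\\\d{1,}` is looser than what JavaScript actually allows, so this shows why the workaround helps rather than what a fully correct pattern would look like.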