instaparse

Amplify the recommendation to use resource files whenever escaping is needed

Open matanox opened this issue 9 years ago • 0 comments

This is an excellent implementation with great documentation. One thought: building rules inside a Clojure source file can be a fun challenge, but it becomes tedious and, more importantly, unreadable during later maintenance whenever characters need escaping. Consider the definition below: even the comment inside the grammar requires escaping, not just the quote characters and backslashes. It might be good for the README to recommend more explicitly, as a rule of thumb, switching to resource files as soon as the need to escape anything arises.

(def wikiextractor-parser
  "a parser for the output of wikiextractor (https://github.com/attardi/wikiextractor)"
  (parser
    "
      S = Entry*
      Entry = <Header> ContentAsText <Trailer> <OptionalPadding*>
      Header = '<doc' (' ' HeaderProp)* '>'
      HeaderProp = #'[^=]*' '=' '\"' #'[^\"]*' '\"' (* e.g. id=\"4030\" *)
      ContentAsText = Anychar*
      Anychar = #'(?sm).'
      Trailer = '</doc>'
      OptionalPadding = #'\\s'
    "))
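
For comparison, here is a rough sketch of the same parser with the grammar moved to a resource file, where nothing needs escaping. The file name `wikiextractor.bnf`, its location under `resources/`, and the namespace are assumptions made for illustration. It relies on `insta/parser` accepting the URL returned by `clojure.java.io/resource`, which the instaparse README describes.

```clojure
(ns wikiextractor.parser                      ; hypothetical namespace
  (:require [clojure.java.io :as io]
            [instaparse.core :as insta]))

;; Assumed contents of resources/wikiextractor.bnf (hypothetical file).
;; The quotes, the backslash and the grammar comment appear unescaped:
;;
;;   S = Entry*
;;   Entry = <Header> ContentAsText <Trailer> <OptionalPadding*>
;;   Header = '<doc' (' ' HeaderProp)* '>'
;;   HeaderProp = #'[^=]*' '=' '"' #'[^"]*' '"' (* e.g. id="4030" *)
;;   ContentAsText = Anychar*
;;   Anychar = #'(?sm).'
;;   Trailer = '</doc>'
;;   OptionalPadding = #'\s'

(def wikiextractor-parser
  "A parser for the output of wikiextractor (https://github.com/attardi/wikiextractor)."
  ;; io/resource looks the file up on the classpath and returns a URL
  ;; (or nil if the file is missing); the parser is built from that URL.
  (insta/parser (io/resource "wikiextractor.bnf")))
```

The grammar file then reads exactly as the grammar is meant, and the Clojure source shrinks to a single call.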

matanox · Oct 29 '16 15:10