esrap
esrap copied to clipboard
Common Lisp packrat parser
#+TITLE: ESRAP -- a packrat parser for Common Lisp
-
Introduction
In addition to regular Packrat / Parsing Grammar / TDPL features ESRAP supports:
- dynamic redefinition of nonterminals
- inline grammars
- semantic predicates
- introspective facilities (describing grammars, tracing, setting breaks)
- left-recursive grammars
- functions as terminals
- accurate, customizable parse error reports
Homepage & Documentation
https://scymtym.github.io/esrap/
#+ATTR_HTML: :alt "build status image" :title Build Status :align right [[https://travis-ci.org/scymtym/esrap][https://travis-ci.org/scymtym/esrap.svg]]
References
-
Bryan Ford, 2002, "Packrat Parsing: a Practical Linear Time Algorithm with Backtracking".
http://pdos.csail.mit.edu/~baford/packrat/thesis/
-
A. Warth et al, 2008, "Packrat Parsers Can Support Left Recursion".
http://www.vpri.org/pdf/tr2007002_packrat.pdf
License
#+BEGIN_EXAMPLE Copyright (c) 2007-2013 Nikodemus Siivola [email protected] Copyright (c) 2012-2019 Jan Moringen [email protected]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #+END_EXAMPLE
-
Syntax Overview
#+BEGIN_EXAMPLE
-- case-sensitive terminal (~ ) -- case-insensitive terminal character -- any single character (string ) -- any string of length (character-ranges ) -- character ranges ( ) -- semantic parsing (function ) -- call to parse some text (not
) -- complement of expression (and &rest ) -- sequence (or &rest ) -- ordered-choices (*
) -- greedy-repetition (+ ) -- greedy-positive-repetition (? ) -- optional (& ) -- followed-by; does not consume (! ) -- not-followed-by; does not consume (< ) -- lookbehind characters; does not consume (> ) -- lookahead characters; does not consume #+END_EXAMPLE -
Trivial Examples
#+BEGIN_SRC lisp :results silent :exports both :session "doc" (ql:quickload :esrap) #+END_SRC
The =parse= function takes an expression:
#+BEGIN_SRC lisp :results value :exports both :session "doc" (multiple-value-list (esrap:parse '(or "foo" "bar") "foo")) #+END_SRC
#+RESULTS: #+BEGIN_SRC lisp ("foo" NIL T) #+END_SRC
New rules can be added.
Normally you'd use the declarative =defrule= interface to define new rules, but everything it does can be done directly by building instances of the =rule= class and using =add-rule= to activate them.
#+BEGIN_SRC lisp :results value :exports both :session "doc" (progn (esrap:add-rule 'foo+ (make-instance 'esrap:rule :expression '(+ "foo")))
(multiple-value-list (esrap:parse 'foo+ "foofoofoo")))
#+END_SRC
#+RESULTS: #+BEGIN_SRC lisp (("foo" "foo" "foo") NIL T) #+END_SRC
The equivalent =defrule= form is
#+BEGIN_SRC lisp :results silent :exports code :session "doc" (esrap:defrule foo+ (+ "foo")) #+END_SRC
Note that rules can be redefined, i.e. this =defrule= form replaces the previous definition of the =foo+= rule.
Rules can transform their matches:
#+BEGIN_SRC lisp :results silent :exports code :session "doc" (esrap:add-rule 'decimal (make-instance 'esrap:rule :expression '(+ (or "0" "1" "2" "3" "4" "5" "6" "7" "8" "9")) :transform (lambda (list start end) (declare (ignore start end)) (parse-integer (format nil "~{~A~}" list))))) #+END_SRC
or using =defrule=
#+BEGIN_SRC lisp :results silent :exports both :session "doc" (esrap:defrule decimal (+ (or "0" "1" "2" "3" "4" "5" "6" "7" "8" "9")) (:lambda (list) (parse-integer (format nil "~{~A~}" list)))) #+END_SRC
Any lisp function can be used as a semantic predicate:
#+BEGIN_SRC lisp :results value :exports both :session "doc" (list (multiple-value-list (esrap:parse '(oddp decimal) "123")) (multiple-value-list (esrap:parse '(evenp decimal) "123" :junk-allowed t))) #+END_SRC
#+RESULTS: #+BEGIN_SRC lisp
((123 NIL T) (NIL 0)) #+END_SRC
-
Example Files
More complete examples can be found in the following self-contained example files:
- [[file:examples/sexp.lisp]]: complete sample grammar and usage
- [[file:examples/symbol-table.lisp]]: grammar with lexical scope
- [[file:examples/left-recursion.lisp]]: multiple grammars with left recursion
- [[file:examples/function-terminals.lisp]]: grammars with functions as terminals