"bad syntax" errors with `ssax:make-parser`

Open t0mpr1c3 opened this issue 2 years ago • 5 comments

Any attempt to manipulate the procedure ssax:make-parser results in a "bad syntax" error.

e.g.

(print ssax:xml->sxml) ; OK
(print ssax:make-parser) ; not OK

t0mpr1c3 avatar Mar 03 '23 01:03 t0mpr1c3

Right, that's expected. That's because "ssax:make-parser" is a macro, not a procedure. I just took a look at the definition of make-parser, and I see that it's doing some interesting and somewhat sketchy stuff in order to simulate keyword arguments in a cross-implementation way. It might make sense to provide a version that uses Racket keywords, so you could write (e.g.)

(ssax:make-parser #:new-level-seed remove-markup-nls
                  #:finish-element remove-markup-fe
                  #:char-data-handler remove-markup-cdh)

... and make-parser would indeed be a procedure. FWIW, there's apparently a "make-parser/positional" procedure, if you prefer.
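To illustrate the macro/procedure distinction in general terms, here is a minimal sketch that does not use sxml at all. The names add-one and add-one-proc are hypothetical stand-ins, not part of any library. It shows that a macro can only appear in application position, and that wrapping the macro call in a procedure yields a value that can be passed around.

```racket
#lang racket

;; Hypothetical macro standing in for something like ssax:make-parser.
(define-syntax-rule (add-one x) (+ x 1))

;; Fine: the macro is used in application position.
(add-one 41)

;; Not fine: a bare reference to the macro is rejected at expansion time
;; with "add-one: bad syntax", so this line is left commented out.
;; (print add-one)

;; Wrapping the macro call in a procedure gives an ordinary first-class value.
(define (add-one-proc x) (add-one x))

(map add-one-proc '(1 2 3))
```

The same wrapping idea should apply to any macro whose arguments can be supplied inside a procedure body, though whether it suits a particular macro depends on how that macro treats its arguments.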

jbclements avatar Mar 03 '23 05:03 jbclements

Yes, I think that reimplementation would be an improvement. The particular problem for me is that since ssax:make-parser is not a procedure, it cannot be imported into Typed Racket using require/typed.

t0mpr1c3 avatar Mar 03 '23 20:03 t0mpr1c3

Is make-parser/positional not provided? Seems like you could use that instead?

jbclements avatar Mar 06 '23 02:03 jbclements

I see ssax:make-parser/positional-args in SSAX-code.rkt, but it is not provided by main.rkt. It would be easy enough for me to fork it and check it out though -- thanks. If it turns out to be useful I'll make a PR and maybe add some documentation.

t0mpr1c3 avatar Mar 06 '23 05:03 t0mpr1c3

It turns out that ssax:make-parser/positional-args is also defined as a macro, and is an even less convenient form. FWIW I did get the following example to work:

#lang racket

(require racket/string sxml)
 
(define (remove-markup xml-port)
  (let* ([parser
          (ssax:make-parser/positional-args
           (λ (port docname systemid internal-subset? seed)
             (values #f null null seed)) ;; handler-DOCTYPE (default)
           (λ (elem-gi seed)
             (values #f null null seed)) ;; handler-UNDECL-ROOT (default)
           (λ (elem-gi seed) seed)       ;; handler-DECL-ROOT (default)
           remove-markup-nls             ;; handler-NEW-LEVEL-SEED (required)
           remove-markup-fe              ;; handler-FINISH-ELEMENT (required)
           remove-markup-cdh             ;; handler-CHAR-DATA-HANDLER (required)
           ()                            ;; handler-PI (default)
           )]
         [strings (parser xml-port null)])
    (string-join (reverse strings) "")))
 
(define (remove-markup-nls gi attributes namespaces expected-content
                           seed)
  seed)
 
(define (remove-markup-fe gi attributes namespaces parent-seed seed)
  seed)
 
(define (remove-markup-cdh string-1 string-2 seed)
  (let ([seed (cons string-1 seed)])
    (if (non-empty-string? string-2)
        (cons string-2 seed)
        seed)))
 
(remove-markup
 (open-input-string
  "<foo>Hell<bar>o, world!</bar></foo>"))

t0mpr1c3 avatar Mar 06 '23 06:03 t0mpr1c3