rebellion
Regular expression types?
Something like this:
(define-regular-expression-type memory-usage-row
#px"\\s*(\\S+):\\s+(\\d+)\\s+(\\d+)"
(usage-type current-usage cumulative-usage))
(define memory-dump
#<<END
<variable-code>: 3307 105824
<application-code>: 14761 1007952
<unary-application-c: 18140 580480
<binary-application-: 19174 766960
END
)
> (match-memory-usage-row memory-dump)
(memory-usage-row "<variable-code>" 3307 105824)
> (sequence->list (in-matched-memory-usage-rows memory-dump))
(list (memory-usage-row "<variable-code>" 3307 105824)
(memory-usage-row "<application-code>" 14761 1007952)
(memory-usage-row "<unary-application-c" 18140 580480)
(memory-usage-row "<binary-application-" 19174 766960))
Use case came up when I was trying to parse some output from `(dump-memory-stats)`.
Note: needs to be smart enough to parse `(\\d+)` into a `number?` instead of a string of digits.
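For comparison, here is a hand-written sketch of roughly what the generated definitions might look like, using only `racket/base`. This is an assumption about the expansion, not an existing rebellion API; `groups->row` and `matched-memory-usage-rows` are hypothetical names, and the second one returns a list rather than a sequence.

```racket
#lang racket/base

;; Hand-written stand-in for what the proposed form might generate.
(struct memory-usage-row (usage-type current-usage cumulative-usage)
  #:transparent)

(define memory-usage-row-pattern
  #px"\\s*(\\S+):\\s+(\\d+)\\s+(\\d+)")

;; Converts the captured groups (without the whole-match string) to a row,
;; parsing the two digit groups into numbers.
(define (groups->row groups)
  (memory-usage-row (car groups)
                    (string->number (cadr groups))
                    (string->number (caddr groups))))

;; Returns a row for the first match in str, or #f if nothing matches.
(define (match-memory-usage-row str)
  (define groups (regexp-match memory-usage-row-pattern str))
  (and groups (groups->row (cdr groups))))

;; Returns a list with one row per match in str.
(define (matched-memory-usage-rows str)
  (for/list ([groups (in-list (regexp-match* memory-usage-row-pattern str
                                             #:match-select cdr))])
    (groups->row groups)))
```

Writing this out by hand for every pattern is the boilerplate the proposed form would remove.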
I like this idea, but having to hack apart regular expressions is annoying. It would be easier if there were already an SRE-like layer and if Racket regexps supported named subpatterns.
To handle numbers (and other cases), perhaps modify your initial example to:
(define-regular-expression-type memory-usage-row
#px"\\s*(\\S+):\\s+(\\d+)\\s+(\\d+)"
(usage-type [current-usage string->number] [cumulative-usage string->number]))
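A minimal sketch of how a macro with per-field converters could be written, assuming `syntax/parse` and `racket/syntax` are acceptable dependencies. Only the `match-<name>` function is generated here, a failed match returns `#f`, and fields without a converter keep the matched string; all of that is an assumption for illustration, not a settled design.

```racket
#lang racket/base

(require (for-syntax racket/base racket/syntax syntax/parse))

(begin-for-syntax
  ;; A field is either a bare name or [name converter].
  ;; Bare names default to `values`, so the matched string is kept as is.
  (define-syntax-class field
    #:attributes (name converter)
    (pattern name:id #:with converter #'values)
    (pattern [name:id converter:expr])))

(define-syntax (define-regular-expression-type stx)
  (syntax-parse stx
    [(_ name:id rx-expr:expr (f:field ...))
     #:with matcher (format-id #'name "match-~a" #'name)
     #'(begin
         (struct name (f.name ...) #:transparent)
         (define rx rx-expr)
         ;; Returns an instance of the struct, or #f when str does not match.
         (define (matcher str)
           (define groups (regexp-match rx str))
           (and groups
                (apply name
                       (for/list ([convert (in-list (list f.converter ...))]
                                  [group (in-list (cdr groups))])
                         (convert group))))))]))

(define-regular-expression-type memory-usage-row
  #px"\\s*(\\S+):\\s+(\\d+)\\s+(\\d+)"
  (usage-type [current-usage string->number]
              [cumulative-usage string->number]))

(match-memory-usage-row "<variable-code>: 3307 105824")
```

The sketch does not check that the number of fields equals the number of capture groups; a real implementation would probably want to report that mismatch at compile time.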
Being able to generate a record (from `rebellion/collection/record`) would be nice as well.
API needs to define the behavior when a pattern match fails. Ideas:
- error
- use `present` and `absent`
- return `#f` (or `#f` element for sequence generating)
- use a failure thunk similar to `hash-ref` and friends
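As one illustration of the options above, here is a sketch of the failure-thunk variant, which also covers the "error" and "return `#f`" cases. The function `match-row` is hypothetical and returns the raw capture groups to keep the example short; the `failure-result` argument follows the `hash-ref` convention (a procedure is called, any other value is returned as is).

```racket
#lang racket/base

(define row-pattern #px"\\s*(\\S+):\\s+(\\d+)\\s+(\\d+)")

;; Returns the list of captured groups, or falls back to failure-result.
;; By default a failed match raises an error that includes the input.
(define (match-row str
                   [failure-result
                    (lambda ()
                      (raise-arguments-error 'match-row
                                             "input does not match pattern"
                                             "input" str))])
  (define groups (regexp-match row-pattern str))
  (cond
    [groups (cdr groups)]
    [(procedure? failure-result) (failure-result)]
    [else failure-result]))

(match-row "<variable-code>: 3307 105824") ; matches, returns the groups
(match-row "garbage" #f)                   ; failure value returned as is
(match-row "garbage" (lambda () 'no-match)) ; failure thunk is called
```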
On failure, would you want information about why the match failed? Maybe using `result` objects instead of `present` and `absent` would be the way to go.
I think using a `result` is also a valid choice on its face, but Racket's regex engine (like most others) provides little failure information beyond "the match failed".
That said, I could see a case where the failure branch carries the input that failed to match in the error, which could avoid a lot of threading acrobatics. Example:
(define-regular-expression-type
the-stuff-i-want <pat> <fields>)
(define (log-failures a-result)
(result-case
a-result
#:success (lambda (v) #t)
#:failure (lambda (e) (log e) #f)))
(transduce (in-lines data-in)
(mapping match-the-stuff-i-want)
(filtering log-failures)
...)