Support backreferences

Open jjttjj opened this issue 3 years ago • 1 comments

As discussed on the lambdaisland channel of the clojurians slack (https://clojurians.slack.com/archives/C1DV21X9P/p1643329544995089)

I'm trying to replicate this in regal, using a backreference to match three or more consecutive occurrences of the same character:

```
;;=> ["1111" "1"]
```

But can't seem to get it:
```
(regal/regex
  [:cat [:capture :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\\1")})

;;=> #"(.)(?:\\1){3,}"

(regal/regex
  [:capture [:capture :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\1")})

;;=> #"((.){3,})"

(regal/regex
  [:capture [:capture :any] [:repeat ::_ 3 nil]]
  {:resolver (fn [x] "\\\\1")})

;;=> #"((.)(?:\\\\1){3,})"
```

Any tips?

@plexus responded:

```
(ns repl-sessions.poke
  (:require [lambdaisland.regal :as regal]
            [lambdaisland.regal.parse :as regal-parse]))

(regal-parse/parse #"(.)\1{3,}")
;; => [:cat
;;     [:capture :any]
;;     [:repeat [:lambdaisland.regal.parse/not-implemented [:BackReference "1"]] 3]]
```

Backreferences aren't implemented, but they seem like a common enough feature that they should be. Would you mind creating a ticket?

Here's a workaround you can do yourself:

```
(defmethod regal/-regal->ir [:ref :common] [[op idx] opts]
  `^::regal/grouped ("\\" ~(str idx)))

(regal/regex
 [:cat
  [:capture :any]
  [:repeat [:ref 1] 3 nil]])
;; => #"(.)\1{3,}"
```

The question also came up whether all engines (Java, ECMA, RE2) support this. It looks like the first two do, but RE2 doesn't: https://github.com/google/re2/issues/101
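
As a quick check of the Java side, here is a minimal sketch that uses only `clojure.core/re-find` on the JVM, with no regal involved. The sample strings are made up for illustration. One detail worth noting: `\1{3,}` requires three further repetitions after the captured character, so it matches runs of four or more, whereas `\1{2,}` is the form that matches runs of three or more.

```clojure
;; Plain java.util.regex via clojure.core/re-find; regal is not required here.
;; Group 1 captures a single character; \1 must then repeat that same character.

;; Three or more of the same character in a row: the capture plus at least 2 repeats.
(re-find #"(.)\1{2,}" "xaaay")
;; => ["aaa" "a"]

;; The pattern from the workaround needs the capture plus at least 3 repeats,
;; so a run of only three characters does not match.
(re-find #"(.)\1{3,}" "xaaay")
;; => nil

;; A run of four characters does match.
(re-find #"(.)\1{3,}" "1111")
;; => ["1111" "1"]
```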

jjttjj avatar Jan 29 '22 17:01 jjttjj

Unlike the other missing features, I think these are relatively frequently useful, so I'm going to keep this open.

alysbrooks avatar Jul 08 '23 03:07 alysbrooks