instagenerate icon indicating copy to clipboard operation
instagenerate copied to clipboard

Negative lookahead seems non relational but possible

Open zmaril opened this issue 9 years ago • 7 comments

I've taken a few passes at implementing negative lookahead and thought I should report back on some of the difficulties I've had. My main focus has been modifying the code for positive lookahead to see if it would work for negative lookahead somehow. I was trying to model negative lookahead as the negation of positive lookahead and so I quickly got dragged into the nonrelational parts of core.logic. I'm no expert on either logic programming/constaint logic programming nor how instagenerate has been implemented so I'll try to keep from theorizing about the following:

Interesting definition for negative lookahead

(defmethod combinator-parseo :neg
  [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree]
  (let [comb (assoc comb :hide true)]
    (fresh [fake-remaining-strings]
      (!= strings remaining-strings)
      (!= parse-tree remaining-parse-tree)
      (partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree))))

Simple grammar with interesting results:

(def grammar
  (insta/parser
    "S = !('B') ('A' | 'B' | 'C')"))

(run* [input]
  (fresh [abc]
    (instaparseo grammar input
                 [:S abc])))
;; (((B . _0) :- (!= (("A") (_1)))) ((B . _0) :- (!= (("C") (_1)))))

So in some weird sense instagenerate can do negative lookahead but you would have to carefully rip apart all the cons to get the correct results. It seems like if I understand how this all tied together it would be straightforward to pull these results out in the right way but so far it has escaped me.

In addition to this, I tried to project strings and check that the negative lookahead's parsers didn't successfully parse the value of strings. I don't believe I ever got strings to be ground though and so I nevere got anything but exceptions trying this. Also, I tried using nafc but it created similiar output as above that was far more verbose while being effectively the same.

zmaril avatar Oct 27 '14 19:10 zmaril

Thanks for taking a stab at this. I notice right off the bat that your "!=" statements should actually stay "==". Those statements are intended to state "we're constraining stuff about the input, but we're not actually eating characters while doing so." Thus, the strings == remaining strings, and the parse tree == remaining parse tree. This is why I have "fake remaining strings", so the parser can pretend it's eating characters but in reality it isn't doing any damage to the input. As for the parse tree, I just set :hide to true, and presumably it will already take care of not affecting the parse tree.

Because negative look ahead ALSO doesn't eat characters (or parse tree), it doesn't make sense to me to simply negate those statements. Somehow we have to negate that "partial-parseo" statement instead.

On Monday, October 27, 2014, Citizen [email protected] wrote:

I've taken a few passes at implementing negative lookahead and thought I should report back on some of the difficulties I've had. My main focus has been modifying the code for positive lookahead to see if it would work for negative lookahead somehow. I was trying to model negative lookahead as the negation of positive lookahead and so I quickly got dragged into the nonrelational parts of core.logic. I'm no expert on either logic programming/constaint logic programming nor how instagenerate has been implemented so I'll try to keep from theorizing about the following:

Interesting definition for negative lookahead

(defmethod combinator-parseo :neg [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29] %28fresh [fake-remaining-strings] %28!= strings remaining-strings%29 %28!= parse-tree remaining-parse-tree%29 %28partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree%29%29))

Simple grammar with interesting results:

(def grammar (insta/parser "S = !('B') ('A' | 'B' | 'C')")) (run* [input](fresh [abc] %28instaparseo grammar input [:S abc]%29));; (((B . _0) :- (!= (("A") (_1)))) ((B . _0) :- (!= (("C") (_1)))))

So in some weird sense instagenerate can do negative lookahead but you would have to carefully rip apart all the cons to get the correct results. It seems like if I understand how this all tied together it would be straightforward to pull these results out in the right way but so far it has escaped me.

In addition to this, I tried to project strings and check that the negative lookahead's parsers didn't successfully parse the value of strings. I don't believe I ever got strings to be ground though and so I nevere got anything but exceptions trying this. Also, I tried using nafc but it created similiar output as above that was far more verbose while being effectively the same.

— Reply to this email directly or view it on GitHub https://github.com/aengelberg/instagenerate/issues/1.

--Alex

aengelberg avatar Oct 27 '14 20:10 aengelberg

Yeah I understood a little bit better what I had been doing about twenty minutes later. The above code gives the complement of the set of results that we want. If we negate all the rules that I had above, or equivalently negate only the last constraint, then we should get what we are looking for. I'll look more into what's possible on this front then.

zmaril avatar Oct 27 '14 21:10 zmaril

Here is something that doesn't work but has the right spirit:

(defmethod combinator-parseo :neg
  [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree]
  (let [comb (assoc comb :hide true)]
    (fresh [fake-remaining-strings]
      (== strings remaining-strings)
      (== parse-tree remaining-parse-tree)
      (nafc partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree))))

nafc stands for "negation as failure constraint", I believe, and should be similiar to what we want. If you look at the source for nafc then you'll see that it is only runnable when all the arguments are ground (line 2692). Additionally, after throwing in some println's, it sems that nafc only checks for runnablitly once. The previous example produces ((\A) (\B) (\C)), which is the same output as if the constraint wasn't run at all. I'm not sure if this isn't a bug or if it's the intended result. At the time it checks, comb and grammar are ground while the rest of the arguments are not. nafc is supposed to be delayed if any of the arguments aren't ground, but I'm not sure if delayed means it will be retried later (no evidence this is the case in the places I've looked so far) or if delayed means forgotten entirely.

Prayer to @swannodette: could you please clarify how nafc should work if the terms supplied to the constraint aren't ground?

zmaril avatar Oct 27 '14 22:10 zmaril

In my toy examples at the REPL, nafc seemed to work as expected. If the nafc docstring is true, in this case it would never activate because "fake-remaining-strings" is always fresh, so the constraint would never be triggered.

Here is my version, which works sometimes but not always: (defmethod combinator-parseo :neg [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29] %28all %28== strings remaining-strings%29 %28== parse-tree remaining-parse-tree%29 %28let [helper %28fn [strings] %28fresh [fake-remaining-strings] %28partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree%29%29%29] %28nafc helper strings%29%29%29))

I'm using a closure there because I'm willing to execute the partial-parseo check as soon as strings is ground. In reality, I would like to short-circuit even earlier (e.g. if strings is '(\a . _0) and I'm doing a negative lookahead on "a") but this is a reasonable compromise as I predict it would work for most use cases.

This case doesn't work: => (run* [in out](instaparseo %28insta/parser) in out)) ([(\a) (:S "a")])

But this works: => (run* [in out](instaparseo %28insta/parser) (seq "a") out)) () Even more surprisingly, this doesn't work: => (run* [in out](instaparseo %28insta/parser) in out) (== in (seq "a"))) ([(\a) (:S "a")])

--Alex

On Mon, Oct 27, 2014 at 3:09 PM, Citizen [email protected] wrote:

Here is something that doesn't work but has the right spirit:

(defmethod combinator-parseo :neg [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29] %28fresh [fake-remaining-strings] %28== strings remaining-strings%29 %28== parse-tree remaining-parse-tree%29 %28nafc partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree%29%29))

nafc stands for "negation as failure constraint", I believe, and should be similiar to what we want. If you look at the source for nafc https://github.com/clojure/core.logic/blob/fa9451ed57ba9647399e1e6e1d5a723e422d7c6b/src/main/clojure/clojure/core/logic.clj#L2680 then you'll see that it is only runnable when all the arguments are ground (line 2692). Additionally, after throwing in some println's, it sems that nafc only checks for runnablitly once. The previous example produces ((\A) (\B) (\C)), which is the same output as if the constraint wasn't run at all. I'm not sure if this isn't a bug or if it's the intended result. At the time it checks, comb and grammar are ground while the rest of the arguments are not. nafc is supposed to be delayed if any of the arguments aren't ground, but I'm not sure if delayed means it will be retried later (no evidence this is the case in the places I've looked so far) or if delayed means forgotten entirely.

Prayer to @swannodette https://github.com/swannodette: could you please clarify how nafc should work if the terms supplied to the constraint aren't ground?

— Reply to this email directly or view it on GitHub https://github.com/aengelberg/instagenerate/issues/1#issuecomment-60678325 .

aengelberg avatar Oct 28 '14 00:10 aengelberg

Here's a version of nafc that I've found useful:

(defn -my-nafc
  ([c args]
     (reify
       IConstraintStep
       (-step [this s]
         (reify
           clojure.lang.IFn
           (invoke [_ s]
             (when-not (tramp ((apply c args) s))
               ((remcg this) s)))
           IRunnable
           (-runnable? [_]
             (println "RUNNABLE?")
             (println (map #(ground-term? % s) args))
             (every? #(ground-term? % s) args))))
       IConstraintOp
       (-rator [_]
         `my-nafc)
       (-rands [_]
         (vec (concat [c] args)))
       IReifiableConstraint
       (-reifyc [_ v r s]
         `(my-nafc ~c ~@(-reify s args r)))
       IConstraintWatchedStores
       (-watched-stores [this] #{::subst}))))

(defn my-nafc
  "EXPERIMENTAL: negation as failure constraint. All arguments to the goal c
must be ground. If some argument is not ground the execution of this constraint
will be delayed."
  [c & args]
  (cgoal (-my-nafc c args)))

Nice thinking with the closure. In all the cases that didn't work, the constraint wasn't runnable because strings wasn't ground. This runs though:

(run* [in out]
  (== in (seq "a"))
  (instaparseo (insta/parser "S = !'a' 'a'") in out))

It looks like strings becomes ground via the first constraint and allows the nafc to be evaluated. Interesting that the order of the constraints matters here.

zmaril avatar Oct 28 '14 14:10 zmaril

I'm surprised that strings was not ground in those cases. Wasn't I setting the value of strings when binding "in" to (seq "a")? I was setting it after the constraint was established, however.

At this point I'm probably exhausting my own expertise of core.logic, since I haven't gotten into the whole "constraint" side of things. I will need further guidance to understand when constraints can trigger. (Maybe it is a known limitation that constraints are doomed if the logic program goes out of scope of the constraint?)

--Ale ​x​ On Tue, Oct 28, 2014 at 7:38 AM, Citizen [email protected] wrote:

Here's a version of nafc that I've found useful:

(defn -my-nafc ([c args](reify IConstraintStep %28-step [this s] %28reify clojure.lang.IFn %28invoke [_ s] %28when-not %28tramp %28%28apply c args%29 s%29%29 %28%28remcg this%29 s%29%29%29 IRunnable %28-runnable? [] %28println) (println (map #(ground-term? % s) args)) (every? #(ground-term? % s) args)))) IConstraintOp (-rator [] my-nafc) (-rands [_](vec %28concat [c] args%29)) IReifiableConstraint (-reifyc [_ v r s] (my-nafc ~c ~@(-reify s args r))) IConstraintWatchedStores (-watched-stores [this] #{::subst})))) (defn my-nafc "EXPERIMENTAL: negation as failure constraint. All arguments to the goal cmust be ground. If some argument is not ground the execution of this constraintwill be delayed." [c & args](cgoal %28-my-nafc c args%29))

Nice thinking with the closure. In all the cases that didn't work, the constraint wasn't runnable because strings wasn't ground. This runs though:

(run* [in out](== in %28seq)) (instaparseo (insta/parser "S = !'a' 'a'") in out))

It looks like strings becomes ground via the first constraint and allows the nafc to be evaluated. Interesting that the order of the constraints matters here.

— Reply to this email directly or view it on GitHub https://github.com/aengelberg/instagenerate/issues/1#issuecomment-60765504 .

aengelberg avatar Oct 28 '14 16:10 aengelberg

I'm still trying to figure the terminology out but I think nafc is non-relational. core.logic is all about creating graphs of equality (relations). By introducing nonequality into the programs, we are no longer purely relational and so we run into situations where the semi-guarantees of relational programming (like the order of the constraints supposedly not changing the results) no longer hold. But yeah, I'm way outside my expertise already so far. I'll see how far I can get with just this for now.

zmaril avatar Oct 30 '14 17:10 zmaril