rewrite-clj icon indicating copy to clipboard operation
rewrite-clj copied to clipboard

S-expr-ed tagged literal results in unqualified reference to read-string

Open borkdude opened this issue 10 months ago • 3 comments

Sexpr-ing a tagged literal results in an expression like (read-string "#foo/bar 1) . Imo this should be fully qualified to prevent clashes with referred vars from elsewhere.

PR incoming.

borkdude avatar Jan 31 '25 15:01 borkdude

I think I get the rationale, but I need more words to explore/verify. Lemme know if I'm on the right track here!

Background

Rewrite-clj emits (read-string ...) for sexpr when it feels it cannot do better for a given node. This includes tagged literals.

(require '[rewrite-clj.node :as n]
         '[rewrite-clj.parser :as p])

(-> "#inst \"2018-03-28T10:48:00.000\""
    p/parse-string-all
    n/sexpr)
;; => (read-string "#inst \"2018-03-28T10:48:00.000\"")

Problem

The implication is that the user might, in turn, eval this (read-string ...).

(-> "#inst \"2018-03-28T10:48:00.000\""
    p/parse-string-all
    n/sexpr
    eval
    ((juxt type str)))
;; => [java.util.Date "Wed Mar 28 06:48:00 EDT 2018"]

And that because read-string is unqualified, the eval might use an unintended read-string. Also clojure.core/read-string is not safe.

Snag

But read-string varies by platform. It is in clojure.core for clojure but not in core for cljs.

So we probably want to emit a qualifier that makes sense across platforms.

Ideas:

  1. One idea from https://github.com/clj-commons/rewrite-clj/pull/332#issuecomment-2628083487: clojure.edn/read-string. This is safe and available across platforms. The ns clojure.edn ns does not technically need to be required to be used (oh, that maybe is not true for cljs).

  2. Continue to emit read-string by default, but allow user to override through some option. Details TBD.

  3. If the same read-string does not work for all platforms, emit a platform-specific read-string. This is not inconsistent with sexpr and differences in clojure platforms.

Related

Not scoped in this issue but worth mentioning:

sexpr also emits read-string for reader conditionals

(-> "#?(:clj x :cljs y)"
    p/parse-string-all
    n/sexpr)
;; => (read-string "#?(:clj x :cljs y)")

Not sure how to handle this one. Would need :read-cond to actually eval, .i.e, from a clj jvm repl:

user=> (clojure.core/read-string {:read-cond :allow} "#?(:clj x :cljs y)")
x

Maybe handle reader conditionals better/differently with sexpr?

lread avatar Feb 12 '25 21:02 lread

Yeah I think clojure.edn/read-string makes more sense!

borkdude avatar Feb 12 '25 21:02 borkdude

Ok, I think that sounds good. It won't deal with reader conditionals, but that would have to be a separate issue.

lread avatar Feb 12 '25 22:02 lread

Ok. Finally getting back to this issue. Rewrite-clj currently returns (read-string "<str of some parsed reader macro>") for sexpr on reader macro nodes it can't (or doesn't yet) do anything better with.

This is currently tagged literals and reader conditionals.

Rewrite-clj currently doesn't distinguish between a tagged literal and a reader conditional. Is that correct? My memory does get fuzzy, let's verify in the REPL (and also use the internal node-type to check if there is some internal knowledge):

(require
  '[rewrite-clj.node :as n]
  '[rewrite-clj.node.protocols :as proto]
  '[rewrite-clj.parser :as p])

((juxt n/tag proto/node-type n/sexpr) (p/parse-string "#inst \"2018-03-28T10:48:00.000\""))
;; => [:reader-macro :reader-macro (read-string "#inst \"2018-03-28T10:48:00.000\"")]

((juxt n/tag proto/node-type n/sexpr) (p/parse-string "#foo/bar 1"))
;; => [:reader-macro :reader-macro (read-string "#foo/bar 1")]

((juxt n/tag proto/node-type n/sexpr) (p/parse-string "#?(:clj x :cljs y)"))
;; => [:reader-macro :reader-macro (read-string "#?(:clj x :cljs y)")]

Right. So rewrite-clj doesn't distinguish.

Ok, so while clojure.edn/read-string is probably is a good fit for tagged literals, it is not great for reader conditionals, which clojure.edn/read-string does not handle.

So what to do?

Option 0

Do nothing.

Cons:

  • does not solve the raised issue of var name clashes with read-string.

Pros:

  • does not break current libraries that depend on current behaviour (zprint and a umschreiben-clj test).

Option 1

Add more documentation. Suggest the user adapt the return from sexpr as they see fit.

Same as option 0 +

Pros:

  • gives some advice to users.

Option 2

Use clojure.core/read-string. Document, and let the user adapt the return from sexpr as they see fit.

Cons:

  • not great for cljs, which does not have clojure.core/read-string
  • breaks a couple of libs

Option 3

Use clojure.edn/read-string. Document and let the user adapt the return from sexpr as they see fit.

Cons:

  • not a great fit for reader conditionals.
  • breaks a couple of libs

Option 4

Start distinguishing reader-conditionals (at least internally), then use clojure.core/read-string for reader-conditionals and clojure.edn/read-string for tagged literals.

Cons:

  • not great for cljs which does not have clojure.core/read-string
  • breaks a couple of libs

Option 5

Let the user decide how to convert unhandled reader macros via new opt fn on sexpr. The default might be the current behaviour.

To be useful, the user would likely want to reliably and easily know if the reader-macro is a tagged literal or a reader-conditional.

Pros:

  • User can do what they want via rewrite-clj API instead of post-processing.

Cons:

  • More complex API.
  • Default behaviour does not address raised issue

Thoughts

I'm not sure.

I do see that clj-kondo went with option 2.

lread avatar Nov 19 '25 15:11 lread