leiningen icon indicating copy to clipboard operation
leiningen copied to clipboard

initial pass at adding spec validation to leiningen

Open bhauman opened this issue 9 years ago • 28 comments

This is just a proposal and food for thought ahead of the Conj. The hardest part of this is making the schema correct. The way this works is it only validates top level project keys if they are “similar” to keys that are defined in the schema.

The initial schema can be backed off to a small group of keys that are easily defined and validated and then over time expanded to more keys.

bhauman avatar Nov 29 '16 17:11 bhauman

Yeah it doesn't pass the tests I wanted to put this out there first. But I am looking at the tests now.

bhauman avatar Nov 29 '16 17:11 bhauman

Woah, this is impressive work and really appreciated. Definitely want something like this in when I've grokked it.

I'm packing my bags for my flight tomorrow morning, but I will have two 1.5 hour layovers on the trip to Austin. Will also download dependencies and whatnot for the 11 hour trip over the pond.

Looking forward to hear and talk more about it! =)

hypirion avatar Nov 29 '16 22:11 hypirion

Wow, this is beautiful. Well done, @bhauman

magnars avatar Nov 30 '16 06:11 magnars

@bhauman is there a difference between strictly-specking-standalone and strictly-specking? I see the former on clojars but can't seem to find the source code for it.

cprice404 avatar Dec 10 '16 21:12 cprice404

I played with this some today, mostly with the trapperkeeper project. It seemed to work great! That project uses lein-parent and lots of managed-dependencies stuff, and it worked fine. The fields it looked like it was validating were: (:description :pedantic? :plugins :aliases :main :dependencies :min-lein-version); I broke each of them individually and confirmed that this patch gave me a nice, explanatory error message.

Seems great so far! I'll try to grab some other libraries and test it with them as well.

cprice404 avatar Dec 10 '16 22:12 cprice404

I've tested against:

compojure
instaparse
liberator
ring-core
ring-devel
ring-jetty-adapter
ring-servlet
http-kit
timbre
clj-http
hiccup
cheshire
clj-time
puppetserver
puppetdb

Things I found:

  • throws an error for an empty vector value for jvm-opts, which seems like it should probably be legal (puppetdb)
  • fails on string "scm:git:https://github.com/clojure-liberator/liberator.git", for {:scm {:connection ...}} (liberator)

Otherwise all the projects seemed to work fine!

cprice404 avatar Dec 10 '16 23:12 cprice404

@cprice strictly-specking-standalone is just a transformation of strictly-specking that includes a vendor-ed and namespaced version of spec.

https://github.com/bhauman/strictly-specking/blob/master/scripts/standalone

bhauman avatar Dec 10 '16 23:12 bhauman

@bhauman ah, gotcha. The clojars page for the standalone artifact has a link to a github repo for strictly-specking-standalone that shows up as a 404 for me. Might be worth fixing that link to point to something in the real repo and/or adding a note in the README if this patch is on its way in to leiningen (which I am in favor of!).

cprice404 avatar Dec 12 '16 17:12 cprice404

I've tested against all the Clojure toolbox projects, and there are some false positives and a visual (..textual?) bug I've detected. Will write them down on this issue during the weekend.

@bhauman: Did you want to verify project names and versions as well, or would you like to leave that to another PR?

hypirion avatar Dec 16 '16 10:12 hypirion

@hyPiRion yeah let me know what the problems are and I'll fix them in a jiffy. Might be worth it to share the problem projects so that I can reproduce.

I will refactor error handling and so that they are exceptions caught in main.

I will add validation of the project name and the version while I'm in there.

bhauman avatar Dec 17 '16 20:12 bhauman

Also we can do a simple check on the top level forms to insure its an even number and to maybe check that the keys are distinct.

In the future I'd would like to write a more sophisticated reader parser to present helpful errors for arbitrary unbalanced maps and unbalanced parens in all read config files.

bhauman avatar Dec 17 '16 20:12 bhauman

So I made a small script which scrapes the clojure-toolbox projects, downloads them all, then runs deps to fetch dependencies and detect errors in their configuration:

#!/usr/bin/env bash

lst="$(curl -s http://www.clojure-toolbox.com/ \
            | sed -n 's/.*href="\([^"]*\).*/\1/p' \
            | grep -F 'https://github.com' | sort | uniq)"

dir="$PWD"

mkdir -p toolbox-projects
cd toolbox-projects

for p in $lst; do
    echo "checking out $p"
    pname="$(echo $p | sed 's%.*/%%')"
    gitRepo="$(echo $p | sed 's%^https://github\.com/%git@github\.com:%')"
    if [ ! -d "$pname" ]; then
        git clone --depth=1 "$gitRepo" "$pname"
    fi
done

rm -f "$dir/output.txt"

for d in */; do
    cd "$d"
    if [ -f project.clj ]; then
        echo "testing $d" 2>&1 | tee -a "$dir/output.txt"
        lein-master deps 2>&1 | tee -a "$dir/output.txt"
    fi
    cd ..
done

It assumes lein-master is linked to this PR branch.

I found some configuration errors in the project themselves, but also found a couple of errors in the code which we may want to fix:

------ Leiningen Configuration Error ------

Found unrecognized key :creds at path (:repositories 0 1)
Must be one of: :password, :sign-releases, :username, :update, :checksum, :releases, :snapshots or :url

/home/jeannikl/tmp/toolbox-projects/aggregate/project.clj:21:29
  16                                   [com.h2database/h2 "1.4.190"]
  17                                   [org.postgresql/postgresql "9.3-1101-jdbc4"]]}}
  18   :scm {:name "git"
  19         :url "https://github.com/friemen/aggregate"}
  20   :repositories [["clojars" {:url "https://clojars.org/repo"
  21                              :creds :gpg}]])
                                  ^---  The key :creds is unrecognized

-------------------------------------------

https://github.com/technomancy/leiningen/blob/master/doc/DEPLOY.md#gpg gives some information around the :creds keyword.

------ Leiningen Configuration Error ------

The Vector at (:aliases "test") contains a non-conforming value: "with-profile"
It should satisfy #{"do"}

  {:aliases
   {"test"
    ["with-profile"
     ^---- The value at key 0 doesn't conform
     "test"
     "do"
     ...
     ...]}}

-- Docs for key :aliases --
...
-------------------------------------------

Vector arguments are allowed and works just like "curried" things. e.g. the entry "foo" ["bar" "baz"] will expand lein foo quux into lein foo bar baz quux.

------ Leiningen Configuration Error ------

The Vector at (:dependencies 5) contains a non-conforming value: "0.7.2"
It should satisfy one of: 
[#{:optional}
 #{:scope}
 #{:classifier}
 #{:native-prefix}
 #{:extension}
 #{:exclusions}]


/home/jeannikl/tmp/toolbox-projects/clojurebot/project.clj:11:27
   6   :dependencies [[org.clojure/clojure "1.2.1"]
   7                  [org.clojure/tools.logging "0.2.6"]
   8                  [conduit-irc "2.0.1-SNAPSHOT"]
   9                  [org.ccil.cowan.tagsoup/tagsoup "1.2"]
  10                  [cheshire "5.0.2"]
  11                  [clj-http "0.7.2"
                                ^---  The value at key 1 doesn't conform
  12                   :exclude [commons-logging]]
  13                  [swank-clojure "1.3.2"]
  14                  [com.thelastcitadel/apropos "0.0.1"]
  15                  [ring "1.1.8"]
  16                  [compojure "1.1.5"]

Not entirely sure what's going on here – except that :exclude should be :exclusions.

------ Leiningen Configuration Error ------

The key :username at (:repositories 0 1) has a non-conforming value: :env
It should be a NonBlankString

/home/jeannikl/tmp/toolbox-projects/Clojush/project.clj:27:30
  22           :namespaces [#"^(?!clojush\.problems)"]
  23           :output-path "doc"
  24           :metadata {:doc/format :markdown}}
  25   :dev-dependencies [[lein-ccw "1.2.0"][lein-midje "3.1.3"]]
  26   :profiles {:dev {:dependencies [[midje "1.7.0"]]}}
  27   :repositories [["releases" {:url "https://clojars.org/repo"
                                   ^---  The value at key :username doesn't conform
  28                               :username :env
  29                               :sign-releases false
  30                               :password :env}]]
  31   :release-tasks [["vcs" "assert-committed"]
  32                   ["change" "version" "leiningen.release/bump-version"]

-------------------------------------------

Using :env is alright, but the rules can accept even more combinations. See https://github.com/technomancy/leiningen/blob/master/doc/DEPLOY.md#credentials-in-the-environment for more information.

Also, for some reason the error message has an off-by-one bug.

------ Leiningen Configuration Error ------

The value [] at key :jvm-opts should not be empty.
It should have at least 1 value

It's okay for :jvm-opts to be empty.

------ Leiningen Configuration Error ------

The key "test" at (:aliases) has a non-conforming value: "expectations"
It should satisfy one of: 
[coll?
 (cat
   :doo
   #{"do"}
   :rest
   (+
     (alt
       :com-el
       :leiningen.core.project-schema/command-element
       :sub-vec
       (every
         :leiningen.core.project-schema/command-element
         :min-count
         1))))]


  {:aliases
   {"test" "expectations"
    ^---- The value at key test doesn't conform
    }}

IIRC aliases can map to another command, in which case it could be a single string like this. (But I think we could break it for 3.x, as "foo" isn't that much harder to type than ["foo"] )

------ Leiningen Configuration Error ------

The value [] at key :source-paths should not be empty.
It should have at least 1 value

:source-paths could be empty – I use it in Java projects quite often actually.


------ Leiningen Configuration Error ------

Found unrecognized key :year at path (:license)
Must be one of: :name, :comments, :url or :distribution

I think it would be fine to have the licence map open for now.


------ Leiningen Configuration Error ------

The Vector at (:dependencies 7) contains a non-conforming value: "1.9.0"
It should satisfy one of: 
[#{:optional}
 #{:scope}
 #{:classifier}
 #{:native-prefix}
 #{:extension}
 #{:exclusions}]


/home/jeannikl/tmp/toolbox-projects/lein-ancient/project.clj:19:48
  14                                :exclusions [com.amazonaws/aws-java-sdk-s3
  15                                             clj-http]]
  16 
  17                  ;; S3 dependency is pinned because of conflicts of newer
  18                  ;; versions with Leiningen's precompiled dependencies.
  19                  [com.amazonaws/aws-java-sdk-s3 "1.9.0"
                 The value at key 1 doesn't conform  ^---
  20                   :exclusions [joda-time]
  21                   :upgrade? false]]
  22   :license {:name "MIT License"
  23             :url "https://opensource.org/licenses/MIT"
  24             :year 2013

I am guessing this is related to the one where :exclusions was misspelt, because here we got :upgrade? false which seems to catch the checker off guard. I'm also guessing it's fine to keep this map open, as upgrade? is lein-ancient-specific.

------ Leiningen Configuration Error ------

The key :pedantic? at [] has a non-conforming value: :warn
It should satisfy #{:abort true false :ranges}

I've seen this quite a lot of times, and since :warn is equivalent to true in this situation, I'd think we should allow :warn as well. (Perhaps it's better to remove true as it's not evident what that would do, although that's a different discussion for a later time)


All in all, I think this works very well. Since this will likely cause some false positives right after it's shipped, I think we should inform users that there is a way to turn this off. I would suggest commenting that you can turn off project.clj validation by adding :validate false to the project.clj, right after the actual error message.

Additionally, I think we should have a look at some of the maps. I think that most of the maps should be open in the beginning, to avoid errors like :year not recognized or :upgrade? not recognized.

Apart from that, this PR is pretty solid and adds extremely valuable feedback to users IMO.

hypirion avatar Jan 21 '17 21:01 hypirion

This would be extremely cool if it used spec.

kenrestivo avatar Feb 05 '17 19:02 kenrestivo

This is really interesting work; thanks.

Can you explain why an additional dependency is needed? Is it specifically there in order to avoid depending on a prerelease version of Clojure? I am very hesitant to add additional dependencies right now because of the work happening packaging Leiningen: https://wiki.debian.org/Clojure/Leiningen

technomancy avatar Mar 06 '17 17:03 technomancy

@technomancy yeah, well there is are a lot of reasons unfortunately, one is that spec is needed and I'm using a vendored spec so that it's internals are stable bc there is significant functionality layered on top of it so I can search the spec tree.

We could theoretically vendor this code inside of lein but ...

bhauman avatar Mar 07 '17 23:03 bhauman

After merging the changes by @Rovanion this PR is in better shape but it still needs a bit of work.

bhauman avatar Mar 07 '17 23:03 bhauman

Thanks. I am not saying we can't pull in a new dependency; just that I want to be sure to weigh the options.

technomancy avatar Mar 08 '17 00:03 technomancy

If we were to push spec validation into a plugin for the time being, would we need to make any changes to Leiningen to expose more things?

One thought I had was that while obviously having a specific "look for problems where my project.clj does not conform" task would be helpful, it might also be helpful to run that conformity check whenever there is an error, so that likely configuration problems could be found more easily.

What if we added the ability to load error handlers out of plugins just like you can currently do with hooks?

technomancy avatar May 22 '17 20:05 technomancy

In order for this to be a plugin we would need a hook to process the project.clj before anything else happens to it.

In a sense we would need to hook the defproject macro the way things are set up now.

The most interesting point here is that as a plugin it would be of little use to the people who really need it. Whereas, I think if this is embedded into Leiningen it might actually make a really big improvement in the first time Clojure experience.

I'm not of the opinion that there need to be a central hook for validation... As folks can include the library and validate the data that gets passed to them.

I'm sorry I haven't pursued this any further. Been pretty busy... And honestly this code is still a delicate flower that only works well if you set things up just right. You kinda need to know the underlying implementation and write the specs in a certain way to get the most helpful messages out of it.

So for now I'd like to call this a proof of concept. Unless someone really wants to steward it or if you are really interested in the completion of it.

PS. BTW I'm sorry I missed you at ClojureWest. Both Tobias and I were pretty sick for most of it.

On Mon, May 22, 2017 at 1:11 PM, Phil Hagelberg [email protected] wrote:

If we were to push spec validation into a plugin for the time being, would we need to make any changes to Leiningen to expose more things?

One thought I had was that while obviously having a specific "look for problems where my project.clj does not conform" task would be helpful, it might also be helpful to run that conformity check whenever there is an error, so that likely configuration problems could be found more easily.

What if we added the ability to load error handlers out of plugins just like you can currently do with hooks?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/technomancy/leiningen/pull/2223#issuecomment-303206966, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAKQMwm5VDkRmvH6PHbu14d8nsjH09Nks5r8evrgaJpZM4K_PFt .

bhauman avatar May 23 '17 13:05 bhauman

I have to agree with Bruce. For this work to be useful to those who really need it, those who are new to Clojure as a whole, it has to be included by default in Leiningen.

Rovanion avatar May 23 '17 14:05 Rovanion

Sure--I would like it included in Leiningen eventually. It's just that in its current state it is unlikely to stabilize in time for inclusion in the next release, and I want to make sure it can be distributed as a plugin first. That way we can gather feedback about it and evolve it independently of the Leiningen release cycle.

technomancy avatar May 23 '17 16:05 technomancy

That makes a lot of sense.

On Tue, May 23, 2017 at 9:05 AM, Phil Hagelberg [email protected] wrote:

Sure--I would like it included in Leiningen eventually. It's just that in its current state it is unlikely to stabilize in time for inclusion in the next release, and I want to make sure it can be distributed as a plugin first. That way we can gather feedback about it and evolve it independently of the Leiningen release cycle.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/technomancy/leiningen/pull/2223#issuecomment-303448555, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAKQMuCsuaVqwKtqL1OMM-LIssQrVY1ks5r8wPYgaJpZM4K_PFt .

bhauman avatar May 23 '17 16:05 bhauman

In order for this to be a plugin we would need a hook to process the project.clj before anything else happens to it.

Leiningen already provides an option for plugins to offer middleware which get passed the project and can make changes to it if desired; I wonder if that would do the job for this to allow it to evolve in a plugin.

technomancy avatar Aug 18 '17 04:08 technomancy

Hello, any news ?

marco-m avatar Jan 07 '18 16:01 marco-m

Since Clojure 1.9 is released now, we are no longer blocked on this if someone wants to take it up.

technomancy avatar Jan 07 '18 17:01 technomancy

So ... Integrating this is still a non-trivial amount of work. This is why it hasn't been done yet.

The strictly specking library, while it works pretty well for figwheel, is still very very "hacky", meaning that you have to write specs in a "certain way" to have a specific outcome. I.E. you need to understand the internals to get optimal feedback for a particular error.

This is why I have stalled on this for so long. The strictly-specking library is pretty much a mess, which is fine for figwheel but not so fine for Leiningen especially since specs have to evolve with the capabilities of Leiningen and if they evolve badly we will have a hard time.

So basically, I'm saying that I'm not quite up to spending the 1-2 months time it would take to make this serviceable.

Rich mentioned in his last talk that the next version of spec would be more "programmable", I'm wondering if a fresh simpler homegrown approach to this problem that isn't as strong and flawed as strictly-specking would be a good idea.

One approach that occurred to me is that you could preprocess the config map and prepend all the keys with namespaces that indicate their position in the project map.

You could now write specs for these keys:

:leinigen/mailinglist (map-of keyword? identity) :leinigen.mailinglist/name string?

This could be very serviceable solution because when you are dealing with spec, getting an actual path is very difficult for nested specs.

Also if you keep the predicates simple and bounded your explanations of the predicate failures will also be simple and bounded. You may even write custom specs for each type of predicate and override the explain method so that it includes an explanation string.

Basically for feedback simplicity you want a path to an explainable error.

Nested complex specs are very difficult to explain in a human way.

Also, if there is a homegrown solution you can balance the simplicity (i.e. maintainability) - feedback trade-off.

If anyone is interested in ping-ponging this back and forth with me I'd be happy to. I just don't want to take on this project alone as writing strickly-specking was way too intense of an endeavor.

bhauman avatar Jan 09 '18 16:01 bhauman

@bhauman thanks for the detailed explanation!

marco-m avatar Jan 09 '18 19:01 marco-m

I'd be interested in helping steward some form of this to completion, if you have time to download exactly where we are status wise with the project, @bhauman .

LoganGirard avatar Dec 31 '18 07:12 LoganGirard