
Report writing as a single concern

Open Vaguery opened this issue 9 years ago • 12 comments

I see some activity on particular aspects of "reporting", for instance in #102. But inside the pushgp loop is a lot of intermingled and overlapping reportage.

I'd like to sketch a (big) refactoring, which aims to extract all of reporting from the search function. Completely.

As I watch the various reports pile up in a file or in STDOUT, most look like fossilized one-off experiments: stuff that was written for a kind of visible feedback while certain parts of the search functionality were being written and reconceived, but which have never been removed or shifted to a self-contained function.

My refactoring roadmap feels something like this, depending on your feedback about Clojure idioms and stuff:

  1. review the current "reports" and classify them into "feedback", "annotations" and "data"
    • feedback will be "reports" aimed at a watching user, including warnings, faults, running counts, and general progress indicators
    • annotations are things like "last Github hash" and "problem name" (I'm hoping that's already in there!), which should technically be assumed to be metadata about a given run; these will probably appear repeatedly as headers or part of a template in most other reports
    • data is the stuff you're used to puking to screen or saving to files, which might reasonably be expected to persist and explain: fitnesses, answer scripts, times, that sort of thing
  2. create a push-report map, much like argmap or make-push-state (as I understand them): it will scour and merge the command line arguments, config file settings, and defaults to generate a standardized suite of reports
  3. [actual functionality change] tease apart the notion of a "report" so that a report is a function applied to a dataset. Some of the current "reports" print data and summary information together, some use a lot of extra verbiage where a word would do, some try to include the kitchen sink. Instead, I would rather see push-report create both the content and the view for every report: that is, if data collection is the goal, that data will be saved in a machine-readable format that can be shared between reports, and a "view document" will also be created when the "report" is made, which can be used to visualize or summarize the data in the desired way. The model in my mind is CouchDB at the moment, but that's not important: think of every invocation of push-report as going through this sort of cascade:
for each active report definition (based on config state, at this moment):
  write the view document if it doesn't exist
  create the datastore if it doesn't exist
  write the cached data
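
To make the idea concrete, here is a purely hypothetical sketch of what such a push-report map might look like in Clojure. None of these keys or report names exist in Clojush today; they are placeholders illustrating the feedback/annotations/data split and the per-report datastore/view pairing described above:

```clojure
;; Hypothetical sketch only: a push-report map merged from defaults,
;; config-file settings, and command-line arguments. All key names and
;; report names are invented for illustration.
(def push-report
  {:annotations {:problem-name "odd"          ; run metadata, repeated as a
                 :git-hash     "abcd123"}     ; header in every other report
   :reports [{:name      :best-individual
              :kind      :data               ; persistent, machine-readable
              :datastore "logs/best.edn"     ; created if it doesn't exist
              :view      "views/best.html"}  ; view document, written once
             {:name :progress
              :kind :feedback                ; transient, for a watching user
              :view :stdout}]})
```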

The point of separating the concerns of reporting from search should be obvious. The point of separating the concerns of data collection from visualization is to better foster flexibility and reuse, and to permit "live" viewing by an engaged researcher. A generic "view document" could be anything from a web page that exactly duplicates the current STDOUT puke, to a dashboard that shows how many evaluations have happened and how well they're doing, to a detailed exploratory tool that does calculations on the current data store when opened, displaying the desired visualizations.

And yes, you should think of a "view document" as a "web page" running a trivial d3 widget or something like that.

consequences

The only negative consequence I can see is that when producing a "report" of the sort currently extant, you'd have to actually use the "view document" framework instead of println. In return, you'd have the ability to define generic reports, re-use reports between projects, build new functionality (like new search operators or selection schemes) with a mind to actually seeing what has changed instead of just sort of letting the muse tell you, and so on.

Also: you'd be able to monitor a run remotely, in realtime, without a bunch of scrolly words: with decent, humane communicative charts.

Also also: this is the camel's nose under the tent for a web-based project setup.

Vaguery avatar Feb 20 '15 23:02 Vaguery

When I say "view document" I'm basically talking about little boilerplate HTML and an embedded js script like http://fastly.github.io/epoch/real-time/, which could be totally static, or have a periodic AJAX reload.

Vaguery avatar Feb 21 '15 00:02 Vaguery

There's a lot of stuff here. Some of it sounds amazing (like live monitoring of stats of runs, which we've discussed for years and never convinced someone to do). Some of it sounds confusing. Some of it doesn't match my knowledge of Clojush (there's relatively little reporting that happens outside of report.clj, and most of that is outside for a reason).

One thing that you'll have to keep in mind if you take this on is that your version of useful is often different from our version of useful. We (or at least I) find the current printing of reports very useful. I have many scripts set up to scrape data from them, and they print things I need to know. Sure, some of that functionality could be taken over by pretty web pages etc., but until that functionality is Rock Solid, I'd ask that the functionality of the reports not change significantly. I need things printed to standard output, and right now we have things printed there that are useful.

The other thing to keep in mind, which knowing you you probably are, is that any changes should make it more friendly for a new user, not less. If new students want to try out Clojush and get a run going and can see things printed to their terminal, they know it's working. If they have to setup some database and generate web pages etc., it better be easier than viewing text in a terminal.

thelmuth avatar Feb 21 '15 16:02 thelmuth

Unless there were strong consensus, what I'd do is add a new report-generator like the one I'm describing, not remove the legacy code, of course.

I like the way you say "setup some database and generate web pages" like that was harder than reading and writing raw Clojure code and bash scripts. :)

Actually, on that note: Why are the scripts not in the repository? They really should be, if you use them for important tasks, shouldn't they?

Vaguery avatar Feb 21 '15 17:02 Vaguery

The scripts (which are actually in Python) are specific for scraping data from mass-runs (often performed on the cluster). I often have offered to share them with people, but thus far no one has been interested. I could see putting some cleaned up versions in the repo if people are interested.

thelmuth avatar Feb 21 '15 20:02 thelmuth

FWIW I have a rag-tag collection of tools to pull stuff out of logs, but most of it is shell scripts that use grep and awk and, believe it or not, sometimes Common Lisp in the same scripts. Works for me but it's hard to imagine anyone else wanting to deal with this stuff.

lspector avatar Feb 21 '15 20:02 lspector

Sounds as though neither of you feels those private scripts should be part of the public release, but that the "raw" data dumped into reports needs those unreleased tools to be parsed and understood, and that because the reporting-and-parsing habits you have are pretty much "mission critical" you can't afford to change that situation....

So yeah, back to my stated plan :)

Vaguery avatar Feb 21 '15 20:02 Vaguery

I would like to understand all the things you've got in those scripts. Having them tucked away offline somewhere makes it much harder for a newcomer to understand the intention of each report thingie (where "thingie" is either a line of reported numbers, or some feedback, or some rows of hyphens... &c)

Vaguery avatar Feb 22 '15 11:02 Vaguery

This could be a handful of non-imperative lines of code, if it were refactored intelligently.

As it stands it's a huge risk for anybody actually wanting to use this codebase, meaning you: it's a bottleneck (because it's crucial), it's a dozen things piled on top of one another without boundaries, and I can see at least five utility functions jammed there right inline, right in the middle. (Defining quartile inline? really?)

Can I help with this please???

(defn report-and-check-for-success
  "Reports on the specified generation of a pushgp run. Returns the best
   individual of the generation."
  [population generation
   {:keys [error-function report-simplifications
           error-threshold max-generations population-size
           print-errors print-history print-cosmos-data print-timings
           problem-specific-report total-error-method
           parent-selection print-homology-data max-point-evaluations
           print-error-frequencies-by-case normalization
           ;; The following are for CSV or JSON logs
           print-csv-logs print-json-logs csv-log-filename json-log-filename
           log-fitnesses-for-all-cases json-log-program-strings
           ]
    :as argmap}]
  (println)
  (println ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;")
  (println ";; -*- Report at generation" generation)
  (let [point-evaluations-before-report @point-evaluations-count
        err-fn (if (= total-error-method :rmse) :weighted-error :total-error)
        sorted (sort-by err-fn < population)
        err-fn-best (first sorted)
        psr-best (problem-specific-report err-fn-best population generation error-function report-simplifications)
        best (if (= (type psr-best) clojush.individual.individual)
               psr-best
               err-fn-best)
        average (fn [nums]
                  (if (zero? (count nums))
                    "Cannot find average of zero numbers."
                    (float (/ (apply +' nums) (count nums)))))
        standard-deviation (fn [nums]
                             (if (<= (count nums) 1)
                                (str "Cannot find standard deviation of " (count nums) " numbers. Must have at least 2.")
                               (let [mean (average nums)]
                                 (Math/sqrt (/ (apply +' (map #(* (- % mean) (- % mean))
                                                              nums))
                                               (dec (count nums)))))))
        median (fn [nums]
                 (if (zero? (count nums))
                   "Cannot find median of zero numbers."
                   (let [sorted (sort nums)]
                     (if (odd? (count nums))
                       (nth sorted
                            (truncate (/ (count nums) 2)))
                       (/ (+' (nth sorted
                                   (/ (count nums) 2))
                              (nth sorted
                                   (dec (/ (count nums) 2))))
                          2.0)))))
        quartiles (fn [nums]
                    (if (zero? (count nums))
                      "Cannot find quartiles of zero numbers."
                      (let [sorted (sort nums)]
                        (vector (nth sorted
                                     (truncate (/ (count nums) 4)))
                                (nth sorted
                                     (truncate (/ (count nums) 2)))
                                (nth sorted
                                     (truncate (/ (* 3 (count nums)) 4)))))))
        ]
    (when print-error-frequencies-by-case
      (println "Error frequencies by case:" (doall (map frequencies (apply map vector (map :errors population))))))
    (when (some #{parent-selection} #{:lexicase :elitegroup-lexicase}) (lexicase-report population argmap))
    (when (= total-error-method :ifs) (implicit-fitness-sharing-report population argmap))
    (println (format "--- Best Program (%s) Statistics ---" (str "based on " (name err-fn))))
    (println "Best genome:" (pr-str (not-lazy (:genome best))))
    (println "Best program:" (pr-str (not-lazy (:program best))))
    (when (> report-simplifications 0)
      (println "Partial simplification:"
               (pr-str (not-lazy (:program (auto-simplify best error-function report-simplifications false 1000))))))
    (when print-errors (println "Errors:" (not-lazy (:errors best))))
    (println "Total:" (:total-error best))
    (println "Mean:" (float (/ (:total-error best)
                               (count (:errors best)))))
    (when (not= normalization :none)
      (println "Normalized error:" (:normalized-error best)))
    (case total-error-method
      :hah (println "HAH-error:" (:weighted-error best))
      :rmse (println "RMS-error:" (:weighted-error best))
      :ifs (println "IFS-error:" (:weighted-error best))
      nil)
    (when print-history (println "History:" (not-lazy (:history best))))
    (println "Genome size:" (count (:genome best)))
    (println "Size:" (count-points (:program best)))
    (printf "Percent parens: %.3f\n" (double (/ (count-parens (:program best)) (count-points (:program best))))) ;Number of (open) parens / points
    (println "--- Population Statistics ---")
    (when print-cosmos-data
      (println "Cosmos Data:" (let [quants (config/quantiles (count population))]
                                (zipmap quants (map #(:total-error (nth (sort-by :total-error population) %)) quants)))))
    (println "Average total errors in population:"
             (*' 1.0 (/ (reduce +' (map :total-error sorted)) (count population))))
    (println "Median total errors in population:"
             (:total-error (nth sorted (truncate (/ (count sorted) 2)))))
    (when print-errors (println "Error averages by case:"
                                (apply map (fn [& args] (*' 1.0 (/ (reduce +' args) (count args))))
                                       (map :errors population))))
    (when print-errors (println "Error minima by case:"
                                (apply map (fn [& args] (apply min args))
                                       (map :errors population))))
    (println "Average genome size in population (length):"
             (*' 1.0 (/ (reduce +' (map count (map :genome sorted)))
                        (count population))))
    (println "Average program size in population (points):"
             (*' 1.0 (/ (reduce +' (map count-points (map :program sorted)))
                        (count population))))
    (printf "Average percent parens in population: %.3f\n" (/ (apply + (map #(double (/ (count-parens (:program %)) (count-points (:program %)))) sorted))
                                                              (count population)))
    (let [frequency-map (frequencies (map :program population))]
      (println "Number of unique programs in population:" (count frequency-map))
      (println "Max copy number of one program:" (apply max (vals frequency-map)))
      (println "Min copy number of one program:" (apply min (vals frequency-map)))
      (println "Median copy number:" (nth (sort (vals frequency-map)) (Math/floor (/ (count frequency-map) 2)))))
    (when @global-print-behavioral-diversity
      (swap! population-behaviors #(take-last population-size %)) ; Only use behaviors during evaluation, not those during simplification
      (println "Behavioral diversity:" (behavioral-diversity))
      ;(println "Number of behaviors:" (count @population-behaviors))
      (reset! population-behaviors ()))
    (when print-homology-data
      (let [num-samples 1000
            sample-1 (sample-population-edit-distance population num-samples)
            [first-quart-1 median-1 third-quart-1] (quartiles sample-1)]
        (println "--- Population Homology Statistics (all stats reference the sampled population edit distance of programs) ---")
        (println "Number of homology samples:" num-samples)
        (println "Average:            " (average sample-1))
        (println "Standard deviation: " (standard-deviation sample-1))
        (println "First quartile: " first-quart-1)
        (println "Median:         " median-1)
        (println "Third quartile: " third-quart-1)
        ))
    (println "Number of program evaluations used so far:" @evaluations-count)
    (println "Number of point (instruction) evaluations so far:" point-evaluations-before-report)
    (reset! point-evaluations-count point-evaluations-before-report)
    (println "--- Timings ---")
    (println "Current time:" (System/currentTimeMillis) "milliseconds")
    (when print-timings
      (let [total-time (apply + (vals @timing-map))
            init (get @timing-map :initialization)
            reproduction (get @timing-map :reproduction)
            fitness (get @timing-map :fitness)
            report-time (get @timing-map :report)
            other (get @timing-map :other)]
        (printf "Total Time:      %8.1f seconds\n" (/ total-time 1000.0))
        (printf "Initialization:  %8.1f seconds, %4.1f%%\n" (/ init 1000.0) (* 100.0 (/ init total-time)))
        (printf "Reproduction:    %8.1f seconds, %4.1f%%\n" (/ reproduction 1000.0) (* 100.0 (/ reproduction total-time)))
        (printf "Fitness Testing: %8.1f seconds, %4.1f%%\n" (/ fitness 1000.0) (* 100.0 (/ fitness total-time)))
        (printf "Report:          %8.1f seconds, %4.1f%%\n" (/ report-time 1000.0) (* 100.0 (/ report-time total-time)))
        (printf "Other:           %8.1f seconds, %4.1f%%\n" (/ other 1000.0) (* 100.0 (/ other total-time)))))
    (println ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;")
    (flush)
    (when print-csv-logs (csv-print population generation csv-log-filename
                                    log-fitnesses-for-all-cases))
    (when print-json-logs (json-print population generation json-log-filename
                                      log-fitnesses-for-all-cases json-log-program-strings))
    (cond (or (<= (:total-error best) error-threshold)
              (:success best)) [:success best]
          (>= generation max-generations) [:failure best]
          (>= @point-evaluations-count max-point-evaluations) [:failure best]
          :else [:continue best])))
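
To make the "handful of non-imperative lines" claim concrete: the inline stats helpers could be hoisted out into plain, self-contained functions, something like the following sketch (names are arbitrary; note it returns nil for empty input rather than the error strings the current code embeds):

```clojure
;; Sketch: the helpers extracted from report-and-check-for-success so the
;; report body shrinks to calls like (median errors), and each helper can
;; be unit-tested in isolation.
(defn mean [nums]
  (when (seq nums)
    (/ (reduce +' nums) (double (count nums)))))

(defn median [nums]
  (when (seq nums)
    (let [sorted (vec (sort nums))
          n      (count sorted)
          mid    (quot n 2)]
      (if (odd? n)
        (nth sorted mid)
        (/ (+' (nth sorted mid) (nth sorted (dec mid))) 2.0)))))

(defn quartiles [nums]
  (when (seq nums)
    (let [sorted (vec (sort nums))
          n      (count sorted)]
      ;; same indices as the inline version: floor(n/4), floor(n/2), floor(3n/4)
      (mapv #(nth sorted (quot (* % n) 4)) [1 2 3]))))
```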

Vaguery avatar Feb 22 '15 11:02 Vaguery

Yes report-and-check-for-success is a classic example of what has happened with several people adding small new bits of functionality over many years, without refactoring, which I agree is called for. Plus there's the original sin of combining reporting (which is necessarily imperative, I think) with checking for success.

On analysis scripts, this will be very ugly but I have things like the following, which takes a string that matches only output lines of interest and a file name, and prints the minimum of the numbers found at the end of the matching lines:

if [ -z "$2" ]
then
    echo "Requires 2 arguments: a pattern unique to data lines and a file name"
else
    echo "(progn (princ (min " > tmpfile
    grep $1 $2 | gawk '{ print $NF }' >> tmpfile
    echo ")) (terpri) (quit))" >> tmpfile
    lisp -eval "(setq load-verbose nil)" -load tmpfile
    rm -f tmpfile
fi

If that ain't ugly I don't know what is!
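
(For comparison, a sketch of the same min-of-last-field computation done in awk alone, skipping the Lisp round-trip; the pattern and file name here are placeholders, not part of the original script:)

```shell
# Sketch: print the minimum of the last field on lines matching a pattern.
# "Mean:" and report.txt are placeholder examples.
awk '/Mean:/ { if (!seen || $NF < min) { min = $NF; seen = 1 } } END { print min }' report.txt
```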

lspector avatar Feb 22 '15 14:02 lspector

I don't have time to organize them or make them pretty, but I just pushed all of my Python scripts to a new GitHub repo for your perusal:

https://github.com/thelmuth/Clojush-Tools

thelmuth avatar Feb 22 '15 14:02 thelmuth

Thanks. This is obviously a low-priority thing, but in general I'd argue that for other people who (hopefully) will want to use the codebase, we should explain all of these, insofar as they cast a long shadow across the experience of running it.

As I said, I won't change anything, but will (maybe with @NicMcPhee's help) make a separate module that we can prove does everything you actually want.

Vaguery avatar Feb 22 '15 14:02 Vaguery

Coming in a bit late:

I really like @Vaguery's proposed design for handling the output; I think that will make it a lot more flexible moving forward. I also appreciate the need to maintain the existing output so we don't screw up @thelmuth's ability to graduate :-).

I think we definitely want to have at least some of those tools in a more visible place. As Bill said, without them (or at least being able to envision them), it can be hard to have a sense of "what next" after Clojush (or pretty much any EC system I've used) blurps out its pile of stuff. Having Tom's new repo gives us a place to start, and we can maybe pick through some of that and move some over/incorporate some into the new reporting framework.

I think this quote is really crucial to understanding some of the (quite substantial) differences in perspective:

I like the way you say "setup some database and generate web pages" like that was harder than reading and writing raw Clojure code and bash scripts. :)

I really like the idea of saving results into more structured places (like DBs) and then accessing them in more flexible ways (like remote web dashboards). My poking around at that over the past few years, though, always gets stuck at the "setup some databases and web services" infrastructure end of things. Part of that, to be honest, is that I've got decades of experience with bash, where I'm still/constantly/forever chasing the baying pack of nifty, cutting edge tools that we could be using instead.

At the moment, for example, I have a U of MN VM running Neo4J that I can't for love nor money seem to get configured properly so that I can access it via the REST interface. Enough time has been sunk down that rabbit hole that my students and I are seriously considering just dumping out text files, copying them over to the VM, and then loading them into Neo4J there. Low-tech and stupid? Yes. But it'll work, and we can't seem to get the "right" way to play nice.

And part of it is straight up cost. Sure there are all these amazing, nifty cloud services out there, but to use them in any sort of "industrial" way costs money. Not tons, but it easily could turn into $100's to $1000's per month, against a research budget of effectively $0. Alternatively, I can dump out a bunch of text files, munge through the data with awk, write a paper or two, and wander away, all with no financial investment.

I'm hoping that I've just missed the boat and that there are technically straightforward, cost effective tools out there that I can use to do all the nifty cool things, which is part of why I'm keen to be working with @Vaguery on all this.

NicMcPhee avatar Feb 22 '15 22:02 NicMcPhee