training-set options don't work

Open KGOH opened this issue 7 years ago • 1 comments

In synaptic/datasets.clj there is a mistake in how options are passed:

(defn training-set
  "Create a training set from samples and associated labels.
  The training set consists of one or more batches and optionally a validation set.
  It also has a map that will allow converting y's back to the original labels.
  
  Options:
    :name        - a name for the training set
    :type        - the type of training data (e.g. :binary-image, :grayscale-image ...)
    :fieldsize   - [width height] of each sample data (for images)
    :nvalid      - size of the validation set (default is 0, i.e. no validation set)
    :batch       - size of a mini-batch (default is the number of samples, after
                   having set apart the validation set)
    :online true - set this flag for online training (same as batch size = 1)
    :rand false  - unset this flag to keep original ordering (by default, samples
                   will be shuffled before partitioning)."
  [samples labels & [options]]
  {:pre [(= (count samples) (count labels))]}
  (let [batchsize  (if (:online options) 1 (:batch options))
        trainsize  (if (:nvalid options) (- (count samples) (:nvalid options)))
        randomize  (if (nil? (:rand options)) true (:rand options))
        [binlb uniquelb]    (u/tobinary labels)
        [smp lb]   (if randomize (shuffle-vecs samples binlb) [samples binlb])
        [trainsmp validsmp] (if trainsize (split-at trainsize smp) [smp nil])
        [trainlb  validlb]  (if trainsize (split-at trainsize lb) [lb nil])
        [batchsmp batchlb]  (partition-vecs batchsize trainsmp trainlb)
        trainsets  (mapv dataset batchsmp batchlb)
        validset   (if trainsize (dataset validsmp validlb))
        timestamp  (System/currentTimeMillis)
        header     {:name (or (:name options) timestamp)
                    :timestamp timestamp
                    :type (:type options)
                    :fieldsize (or (:fieldsize options)
                                   (u/divisors (count (first samples))))
                    :batches (mapv (partial count-labels uniquelb) batchlb)
                    :valid (count-labels uniquelb validlb)
                    :labels uniquelb}]
    (TrainingSet. header trainsets validset)))

The function's arguments are declared as [samples labels & [options]], but they should be [samples labels & options], with options (apply hash-map options) added as the first binding in the let, so that the keyword options are actually read. As written, options is bound to only the first keyword passed rather than to the full set of options.
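To make the difference concrete, here is a minimal sketch of the two argument styles. The functions opts-destructured and opts-hash-map are hypothetical stand-ins written for this illustration, not part of synaptic; they only echo back two of the options so the binding behavior is visible.

```clojure
;; Hypothetical stand-ins for training-set; only the argument handling is shown.

(defn opts-destructured
  "Mirrors the original signature: & [options] binds only the first rest argument."
  [samples labels & [options]]
  {:batch (:batch options) :nvalid (:nvalid options)})

(defn opts-hash-map
  "Mirrors the proposed fix: collect all rest arguments, then build a map from them."
  [samples labels & options]
  (let [options (apply hash-map options)]
    {:batch (:batch options) :nvalid (:nvalid options)}))

;; Called with keyword-value pairs, `options` is bound to the keyword :batch.
;; Looking a key up in a keyword yields nil, so every option is silently dropped.
(opts-destructured [1 2] [:a :b] :batch 10 :nvalid 1)
;; => {:batch nil, :nvalid nil}

;; With the fix, all pairs are collected into a map.
(opts-hash-map [1 2] [:a :b] :batch 10 :nvalid 1)
;; => {:batch 10, :nvalid 1}
```

Note that the original signature does pick up the options when they are passed as a single map, e.g. (opts-destructured [1 2] [:a :b] {:batch 10 :nvalid 1}), so it may have been written with that calling style in mind.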

Here is code updated by me:

(defn training-set
  "Create a training set from samples and associated labels.
  The training set consists of one or more batches and optionally a validation set.
  It also has a map that will allow converting y's back to the original labels.
  
  Options:
    :name        - a name for the training set
    :type        - the type of training data (e.g. :binary-image, :grayscale-image ...)
    :fieldsize   - [width height] of each sample data (for images)
    :nvalid      - size of the validation set (default is 0, i.e. no validation set)
    :batch       - size of a mini-batch (default is the number of samples, after
                   having set apart the validation set)
    :online true - set this flag for online training (same as batch size = 1)
    :rand false  - unset this flag to keep original ordering (by default, samples
                   will be shuffled before partitioning)."
  [samples labels & options]
  {:pre [(= (count samples) (count labels))]}
  (let [options (apply hash-map options)
        batchsize  (if (:online options) 1 (:batch options))
        trainsize  (if (:nvalid options) (- (count samples) (:nvalid options)))
        randomize  (if (nil? (:rand options)) true (:rand options))
        [binlb uniquelb]    (u/tobinary labels)
        [smp lb]   (if randomize (shuffle-vecs samples binlb) [samples binlb])
        [trainsmp validsmp] (if trainsize (split-at trainsize smp) [smp nil])
        [trainlb  validlb]  (if trainsize (split-at trainsize lb) [lb nil])
        [batchsmp batchlb]  (partition-vecs batchsize trainsmp trainlb)
        trainsets  (mapv dataset batchsmp batchlb)
        validset   (if trainsize (dataset validsmp validlb))
        timestamp  (System/currentTimeMillis)
        header     {:name (or (:name options) timestamp)
                    :timestamp timestamp
                    :type (:type options)
                    :fieldsize (or (:fieldsize options)
                                   (u/divisors (count (first samples))))
                    :batches (mapv (partial count-labels uniquelb) batchlb)
                    :valid (count-labels uniquelb validlb)
                    :labels uniquelb}]
    (TrainingSet. header trainsets validset)))

KGOH · Mar 03 '17 14:03

This bug also appears in every function that takes its arguments as keyword-value pairs.

KGOH · Mar 03 '17 14:03
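Since several functions are affected, one possible approach is a shared helper that accepts either calling style. The sketch below is only an assumption about how that could look: normalize-options and example-fn are hypothetical names, not existing synaptic functions.

```clojure
(defn normalize-options
  "Hypothetical helper: turns rest arguments into an options map.
  Accepts either a single map or a flat sequence of keyword-value pairs."
  [args]
  (if (and (= 1 (count args)) (map? (first args)))
    (first args)               ; already a map, use it as-is
    (apply hash-map args)))    ; flat key/value pairs (or nothing at all)

(defn example-fn
  "Hypothetical function using the helper; returns the :name option."
  [x & options]
  (let [options (normalize-options options)]
    (:name options)))

(example-fn 1 :name "a" :batch 10)     ; keyword-value pairs
(example-fn 1 {:name "a" :batch 10})   ; single options map
(example-fn 1)                         ; no options at all
```

This keeps existing calls that pass a single map working while also supporting the keyword-value style described in the docstrings. An odd number of keyword-value arguments will still throw from hash-map.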