
More frequent feedback from NeuralNet

Open BenjaminBossan opened this issue 10 years ago • 1 comment

As mentioned here, there are situations where the user wants more frequent feedback from the net than just once per epoch. This is especially true with the arrival of RNNs, which are hungry for tons of data but slow to train. More frequent feedback would also allow for neat stuff such as stopping early after, say, 2.5 epochs.

The solution proposed in the PR would solve the issue but feels a little bit like cheating, since the batch iterator pretends the epoch is over when it really isn't.

I have an implementation lying around that has an on_epoch_finished callback. Unfortunately, that complicates matters, since you have to synchronize the loops through the train and eval data (which in turn requires adjusting the batch size for eval).

So does anybody have another solution? I would help out with coding if necessary.

BenjaminBossan avatar Aug 13 '15 18:08 BenjaminBossan

An on_batch_finished handler has been added since. But it won't cover your use case where you do early stopping between epochs.
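
For illustration, here's a minimal sketch of hooking into that handler. I'm assuming the callbacks receive the net and its train history, mirroring the on_epoch_finished convention, and the toy layer setup is made up only to keep the snippet self-contained:

# A minimal sketch, assuming on_batch_finished callbacks receive
# (net, train_history) like on_epoch_finished callbacks do.
# The toy layers below are illustrative, not from the thread.
from lasagne.layers import InputLayer, DenseLayer
from lasagne.nonlinearities import softmax
from nolearn.lasagne import NeuralNet

def report_batch(nn, train_history):
    # train_history only gains an entry per finished epoch, so here
    # we merely note that another batch has passed through the net
    print("batch done (epochs completed: %d)" % len(train_history))

net = NeuralNet(
    layers=[
        (InputLayer, {'shape': (None, 20)}),
        (DenseLayer, {'num_units': 10, 'nonlinearity': softmax}),
    ],
    update_learning_rate=0.01,
    max_epochs=5,
    on_batch_finished=[report_batch],
)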

However, I think that @dirtysalt's MiniBatchIterator works well enough for your case. Sure, the output will say you iterated one epoch when you didn't, but I think that can be dealt with.

I'll reproduce the MiniBatchIterator class here for the record:


from nolearn.lasagne import BatchIterator


class MiniBatchIterator(BatchIterator):
    def __init__(self, batch_size=128, iterations=32):
        BatchIterator.__init__(self, batch_size)
        self.iterations = iterations
        self.X = None
        self.y = None
        self.cidx = 0  # index of the current batch
        self.midx = 0  # total number of batches in the data set

    def __call__(self, X, y=None):
        # reset the batch index when a new data set comes in
        if not (self.X is X and self.y is y):
            self.cidx = 0
            n_samples = X.shape[0]
            bs = self.batch_size
            self.midx = (n_samples + bs - 1) // bs
        self.X, self.y = X, y
        return self

    def __iter__(self):
        bs = self.batch_size
        for i in range(self.iterations):
            sl = slice(self.cidx * bs, (self.cidx + 1) * bs)
            self.cidx += 1
            # wrap around to the start of the data set
            if self.cidx >= self.midx:
                self.cidx = 0
            Xb = self.X[sl]
            yb = self.y[sl] if self.y is not None else None
            yield self.transform(Xb, yb)
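
For completeness, a usage sketch: plugging the iterator in as batch_iterator_train makes one reported epoch equal exactly `iterations` mini-batches, regardless of the data set size. The layer setup below is made up for illustration:

# Usage sketch: one reported "epoch" now equals `iterations` batches.
# The layers are a made-up toy classifier, only here so the snippet
# is self-contained.
from lasagne.layers import InputLayer, DenseLayer
from lasagne.nonlinearities import softmax
from nolearn.lasagne import NeuralNet

net = NeuralNet(
    layers=[
        (InputLayer, {'shape': (None, 20)}),
        (DenseLayer, {'num_units': 10, 'nonlinearity': softmax}),
    ],
    update_learning_rate=0.01,
    max_epochs=10,
    batch_iterator_train=MiniBatchIterator(batch_size=128, iterations=32),
)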

dnouri avatar Mar 26 '16 03:03 dnouri