
Does not converge

Open · dbaechtel opened this issue · 1 comment

Does not converge for the sample program below. Why?

    private static int _numInputParameters;
    private static int _numHiddenLayers;
    private static int _hiddenNeurons;
    private static int _numOutputParameters;
    private static double MinimumError = 0.1;
    private static NeuralNet _network;
    private static List<DataSet> _dataSets;

    static void Main(string[] args)
    {
        _numInputParameters = 2;
        _numOutputParameters = 1;
        _numHiddenLayers = _numInputParameters;
        _hiddenNeurons = _numInputParameters;
        _network = new NeuralNet(_numInputParameters, _hiddenNeurons, _numOutputParameters);
        _dataSets = new List<DataSet>();
        for(int i=0; i<10; i++)
        {
            for(int j=0; j<10; j++)
            {
                double[] inputs = { (double)i/10, (double)j/10 };
                double[] targets = { (double)i/10 + (double)j/10 };
                var ds = new DataSet(inputs, targets);
                _dataSets.Add(ds);
            }
        }
        _network.Train(_dataSets, MinimumError);
    }

dbaechtel, Oct 23 '17

Ok, so there are a few issues.

1. You are using the base learning rate and momentum, which are far too high for the granularity you are aiming for; something like 0.1 and 0.1 would suit you better. (I have gotten it to work with the base values once the further changes below are made.)

2. Your dataset ordering is a problem for the way this network trains. It goes through all the sets in sequence (which it probably oughtn't, but that's a separate issue). Because your dataset runs from the lowest values to the highest, backpropagation gradually tunes the whole network higher and higher through each epoch, then starts out badly again when it gets back to the low values. If you let it run for around 500 epochs instead of stopping at error 0.1, it would probably end up outputting something close to 1 for everything.

3. Lastly, this library uses the sigmoid function for activation, which has clamping problems with this data. I wrote a small leaky ReLU class, replaced the sigmoid with it, and it worked pretty well on this dataset once shuffled. If you want to try it with leaky ReLU, use this class:

    public static class ReLU {
        public static double Output(double x)
        { return System.Math.Max(x * .01, x); }
        public static double Derivative(double x)
        { return x > 0 ? 1 : .01; }
    }
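To see why sigmoid clamps on this particular dataset: the targets go up to 0.9 + 0.9 = 1.8, but sigmoid output is strictly between 0 and 1, so the largest targets are unreachable no matter how long training runs. A quick standalone check (the `ActivationDemo` name is illustrative, not from this repo):

```csharp
using System;

public static class ActivationDemo
{
    // Standard logistic sigmoid: output is always strictly between 0 and 1.
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Leaky ReLU, matching the class above: unbounded for x > 0.
    public static double LeakyReLU(double x) => Math.Max(x * 0.01, x);

    public static void Main()
    {
        // Targets in the dataset reach 1.8, which sigmoid can never produce,
        // so the error floor stays high regardless of training time.
        Console.WriteLine(Sigmoid(10.0));   // close to 1, but never reaches 1
        Console.WriteLine(LeakyReLU(1.8));  // 1.8: targets above 1 are reachable
    }
}
```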

Just paste the ReLU class below the Sigmoid class in Neuron.cs, then replace the references to Sigmoid with ReLU. With shuffling and ReLU it converges to error 0.01 in 400 epochs. I also tried adding a second hidden layer along with shuffling and ReLU, and it converged to 0.01 error in 20 epochs. (If you are interested in using multiple hidden layers, I have forked this repo and added that feature on my fork, and I also submitted a patch to this repo.)
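The shuffling step isn't shown in this thread; a minimal in-place Fisher–Yates shuffle you could apply to `_dataSets` before training might look like this (the `ListShuffle` class and `Shuffle` name are my own, not part of the repo):

```csharp
using System;
using System.Collections.Generic;

public static class ListShuffle
{
    private static readonly Random Rng = new Random();

    // In-place Fisher–Yates shuffle: swaps each element with a randomly
    // chosen earlier (or same) position, giving a uniform random permutation.
    public static void Shuffle<T>(IList<T> list)
    {
        for (int i = list.Count - 1; i > 0; i--)
        {
            int j = Rng.Next(i + 1); // 0 <= j <= i
            T tmp = list[i];
            list[i] = list[j];
            list[j] = tmp;
        }
    }
}
```

You would call `ListShuffle.Shuffle(_dataSets);` before `_network.Train(...)`, or before each epoch if you modify the training loop itself.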

Jalae, Dec 31 '17