Data normalization/de-normalization

Open jSaso opened this issue 4 years ago • 2 comments

Description

Create a normalization class with two methods:

  • normalize - scales a number to fit the network model (range: 0 to 1, or -1 to 1)
  • de-normalize - converts a normalized number back to its original scale, e.g. when network training is finished (onEpoch or onComplete listeners)

The normalize object must also take parameters: min and max (the minimum and maximum of our number range) and interval (the target interval: 0 to 1, or -1 to 1).
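The issue does not spell out a formula. Assuming ordinary min-max scaling is what is meant, with the target interval written as [a, b] (the symbols a, b, x' and y are introduced here only for illustration), the two methods would compute:

$$
x' = a + \frac{(x - \min)\,(b - a)}{\max - \min}
\qquad\text{and}\qquad
x = \min + \frac{(y - a)\,(\max - \min)}{b - a}
$$

The first maps a raw value x into [a, b]; the second maps a normalized value y back. Both require max > min and b > a.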

Example: we have a range of real numbers that needs to be normalized: [1, 5, 7, 12, 16, 19, 23, 3, 6, 33]. The normalize class will have:

  • the target interval (0 to 1, or -1 to 1)
  • the minimum, which in this case is 1
  • the maximum, which in this case is 33

With this information we can normalize each number so that it is ready for training or testing the model. Every number that enters the network input as a normalized number will carry its normalization object. During training we can then easily de-normalize any number and compare it with our test data set (which also needs to be de-normalized).
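To make the proposal concrete, here is a minimal sketch in plain Java. It assumes min-max scaling; the class name `MinMaxNormalizer` and its methods are hypothetical and are not part of DJL or ND4J. It works on plain `double` values rather than on NDArray, so it only illustrates the arithmetic, not the requested data set integration.

```java
import java.util.Arrays;

/** Hypothetical min-max normalizer; not an existing DJL class. */
final class MinMaxNormalizer {
    private final double min;    // smallest value of the raw data range
    private final double max;    // largest value of the raw data range
    private final double lower;  // lower bound of the target interval (0 or -1)
    private final double upper;  // upper bound of the target interval (1)

    MinMaxNormalizer(double min, double max, double lower, double upper) {
        if (!(max > min) || !(upper > lower)) {
            throw new IllegalArgumentException("need max > min and upper > lower");
        }
        this.min = min;
        this.max = max;
        this.lower = lower;
        this.upper = upper;
    }

    /** Derives min and max from the data itself. */
    static MinMaxNormalizer fit(double[] data, double lower, double upper) {
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        return new MinMaxNormalizer(min, max, lower, upper);
    }

    /** Maps a raw value into [lower, upper]. */
    double normalize(double x) {
        return lower + (x - min) * (upper - lower) / (max - min);
    }

    /** Maps a normalized value back to the raw scale. */
    double denormalize(double y) {
        return min + (y - lower) * (max - min) / (upper - lower);
    }

    public static void main(String[] args) {
        double[] data = {1, 5, 7, 12, 16, 19, 23, 3, 6, 33};

        MinMaxNormalizer unit = MinMaxNormalizer.fit(data, 0.0, 1.0);
        System.out.println(unit.normalize(33.0));   // 1.0
        System.out.println(unit.normalize(17.0));   // 0.5
        System.out.println(unit.denormalize(0.5));  // 17.0

        MinMaxNormalizer signed = MinMaxNormalizer.fit(data, -1.0, 1.0);
        System.out.println(signed.normalize(1.0));  // -1.0
        System.out.println(signed.normalize(17.0)); // 0.0
    }
}
```

Keeping min, max and the interval inside one object is what lets a listener de-normalize predictions later without knowing how the inputs were scaled. A real implementation would probably keep one such object per data set or per column rather than per number.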

Will this change the current api? How?

Yes. Normalization should be part of the data set: each number in an INDArray should also have a normalization object, so that every number that reaches the network input is normalized to 0 to 1 or -1 to 1. While the network is training and we use a listener, predicted numbers can then easily be de-normalized and compared with the de-normalized test data set.

Who will benefit from this feature?

Everybody. Working with normalized data sets becomes simpler when normalization/de-normalization is provided for the numbers that enter the network input, and also while network training is in progress.

References

Example: https://github.com/eclipse/deeplearning4j/blob/b5f0ec072f3fd0da566e32f82c0e43ca36553f39/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/dataset/api/preprocessor/MultiDataNormalization.java

I think this one is not good enough; it is only simple normalization.

jSaso avatar Mar 19 '20 22:03 jSaso

Thank you for creating this issue @jSaso. I think it is a good place to start for anyone looking to make their first contribution.

keerthanvasist avatar Mar 20 '20 17:03 keerthanvasist

@jSaso Can you help explain some of your thoughts behind this? Is your goal more to be able to view the input of the model and the output of the model denormalized, or more to see intermediate values denormalized?

zachgk avatar Mar 23 '20 22:03 zachgk