
reduce memory load with quantization

Open andrewheusser opened this issue 8 years ago • 2 comments

I noticed that we're using 32-bit precision for at least some of the model components (maybe the data too?). I was reading about quantization (reducing numerical precision) as a way to make neural net parameters smaller in memory: https://www.tensorflow.org/performance/quantization

Maybe worth considering?
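
For a rough sense of the potential savings, here's a minimal numpy sketch (the array name and shape are arbitrary, just for illustration); downcasting from float32 to float16 halves the memory footprint:

```python
import numpy as np

# hypothetical model component stored at 32-bit precision
# (shape is arbitrary, chosen only for illustration)
model = np.random.randn(2000, 2000).astype(np.float32)

# "quantize" by downcasting to half precision
model_16 = model.astype(np.float16)

print(model.nbytes / 1e6)     # ~16.0 MB
print(model_16.nbytes / 1e6)  # ~8.0 MB
```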

andrewheusser · May 19 '17 18:05

Note: the link I posted specifically talks about quantizing deep neural nets. Because there are many layers/params, less precision can sometimes be OK. I'm not exactly sure how it would apply in our case, but it may be worth exploring.

andrewheusser · May 19 '17 19:05

We could potentially move to 16-bit floats, i.e. float16 ("half-precision floating point numbers"). Initially the data were stored as 64-bit floats ("double precision"), so moving to float32 ("single precision") was already a reduction. It really depends on the ranges of numbers we need to represent.

16-bit floats have a range of roughly [+/- 6.5e4]
32-bit floats have a range of roughly [+/- 3.4e38]
64-bit floats have a range of roughly [+/- 1.8e308]
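
Those ranges (and the corresponding relative precision) can be read directly from numpy's finfo; a quick sketch:

```python
import numpy as np

for dtype in (np.float16, np.float32, np.float64):
    info = np.finfo(dtype)
    print(dtype.__name__, info.max, info.eps)
# float16   max ~6.55e+04    eps ~9.8e-04
# float32   max ~3.40e+38    eps ~1.2e-07
# float64   max ~1.80e+308   eps ~2.2e-16
```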

We should go with the minimum precision needed to reliably represent the data and do the computations under all reasonably expected scenarios (e.g. with the largest datasets we could imagine), while building in a good amount of tolerance beyond that.
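
One way to sanity-check this before committing to a lower precision would be a round-trip comparison on a representative dataset; a minimal sketch, with placeholder tolerances and synthetic data standing in for real recordings:

```python
import numpy as np

def safe_to_downcast(data, dtype=np.float16, rtol=1e-3, atol=1e-4):
    """Return True if `data` survives a round trip through `dtype`:
    no overflow to inf, and values preserved within (placeholder) tolerances."""
    cast = data.astype(dtype)
    no_overflow = np.all(np.isfinite(cast))
    roundtrip = cast.astype(data.dtype)
    return bool(no_overflow and np.allclose(data, roundtrip, rtol=rtol, atol=atol))

# synthetic, unit-scale data standing in for a real dataset
data = np.random.randn(1000, 100).astype(np.float32)
print(safe_to_downcast(data))  # typically True for data on this scale
```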

jeremymanning · May 20 '17 10:05