Consider casting to float32 by default in TableVectorizer

Open GaelVaroquaux opened this issue 1 year ago • 2 comments

Problem Description

Using float64 instead of float32 doubles the memory needed for numeric output and typically increases compute cost, and users often do not have this in mind.
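To make the memory cost concrete, here is a minimal sketch using only the Python standard library. It is not skrub code: the variable names are illustrative, and it assumes the usual platform sizes of 8 bytes for a C double and 4 bytes for a C float.

```python
from array import array

# One million values stored as C doubles (float64) and as C floats (float32).
n_values = 1_000_000
as_float64 = array("d", [0.1]) * n_values
as_float32 = array("f", [0.1]) * n_values

# Bytes needed for the raw values in each representation.
bytes_float64 = as_float64.itemsize * len(as_float64)
bytes_float32 = as_float32.itemsize * len(as_float32)
memory_ratio = bytes_float64 / bytes_float32

# The price of the smaller type: 0.1 is stored with a larger rounding error.
rounding_error = abs(as_float32[0] - 0.1)

print(f"float64: {bytes_float64} bytes, float32: {bytes_float32} bytes")
print(f"ratio: {memory_ratio}, float32 rounding error on 0.1: {rounding_error:.2e}")
```

The trade-off is reduced precision (roughly 7 significant decimal digits for float32 versus roughly 16 for float64), which is usually acceptable for features fed to a machine-learning model.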

Feature Description

We should add an option to the TableVectorizer to output float32. We should also consider whether this should be the default.
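As a rough illustration of what such an option could do, the sketch below downcasts a float64 array to float32 with NumPy. The helper `downcast_floats` is hypothetical and is not part of skrub; it only assumes that the vectorizer's numeric output can be viewed as a NumPy array.

```python
import numpy as np


def downcast_floats(X):
    """Hypothetical helper: return float64 input as float32, anything else unchanged."""
    X = np.asarray(X)
    if X.dtype == np.float64:
        return X.astype(np.float32)
    return X


# Stand-in for numeric features produced by a vectorizer (float64 by default).
features = np.random.default_rng(0).normal(size=(1000, 8))
features32 = downcast_floats(features)

print(features.dtype, features.nbytes)      # dtype and size before the cast
print(features32.dtype, features32.nbytes)  # dtype and size after the cast
```

Leaving non-float64 input untouched is a design choice made for this sketch, so that integer or boolean columns are not silently converted.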

Alternative Solutions

N/A

Additional Context

N/A

GaelVaroquaux avatar Dec 19 '23 10:12 GaelVaroquaux

that also applies (maybe even more) to encoders, for example MinHash outputs float64

jeromedockes avatar Dec 19 '23 11:12 jeromedockes

> that also applies (maybe even more) to encoders, for example MinHash outputs float64

Absolutely! Thanks for raising this. Maybe we should start there

GaelVaroquaux avatar Dec 19 '23 15:12 GaelVaroquaux

Closing because this has been addressed in the postprocessing step of the TableVectorizer in #902.

TheooJ avatar May 28 '24 15:05 TheooJ