woodwork
[SPIKE] Run performance tests against changing the sample size for inference
We currently sample 100,000 rows during inference. This spike investigates reducing that number and possibly adding a heuristic that scales the sample size with the length of the data. Ideally, we would sample only 10,000 rows, chosen so that they still represent the entire dataset (beginning, middle, and end).
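As a starting point for the spike, here is a minimal sketch of what such a heuristic could look like. It is not Woodwork's implementation: the function names, the `fraction` parameter, and its default of 0.1 are hypothetical and chosen only for illustration. The 10,000 floor and 100,000 cap are taken from the numbers above.

```python
def choose_sample_size(n_rows, floor=10_000, cap=100_000, fraction=0.1):
    """Hypothetical heuristic: sample a fraction of the rows, clamped to [floor, cap].

    Datasets no longer than `floor` are used in full.
    """
    if n_rows <= floor:
        return n_rows
    return int(min(cap, max(floor, n_rows * fraction)))


def evenly_spaced_positions(n_rows, sample_size):
    """Return `sample_size` distinct row positions spread from the first row to the last.

    Even spacing is one assumed way to cover the beginning, middle, and end
    of the data; random or stratified sampling would be alternatives to compare.
    """
    if sample_size >= n_rows:
        return list(range(n_rows))
    if sample_size <= 0:
        return []
    if sample_size == 1:
        return [0]
    # Integer arithmetic keeps the first position at 0 and the last at n_rows - 1.
    return [i * (n_rows - 1) // (sample_size - 1) for i in range(sample_size)]


n_rows = 1_000_000
size = choose_sample_size(n_rows)
positions = evenly_spaced_positions(n_rows, size)
```

With a pandas Series, the positions could be applied as `series.iloc[positions]`. The performance tests would then compare inference time and inferred types across different values of `floor`, `cap`, and `fraction`.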