fix: Add overflow protection to memory estimation
Fixes integer overflow crashes in memory estimation and partition operations during large dataset processing.
Changes:
- Memory estimation: Add overflow guards, cap at usize::MAX/2
- FixedSizeList: Limit 1M elements, check infinity
- Shuffle: Use checked_mul(), fallback on overflow
- Partitioning: Enforce 0 < n ≤ 100K
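The guards listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `MAX_ESTIMATE`, `estimate_bytes`, and `f64_to_capped_usize` are hypothetical names, and the fallback-to-cap behaviour is an assumption based on the bullet points.

```rust
/// Hypothetical cap, mirroring the "usize::MAX/2" limit named in the description.
pub const MAX_ESTIMATE: usize = usize::MAX / 2;

/// Estimate the bytes needed for `num_rows` rows of `bytes_per_row` each.
/// `checked_mul` returns `None` on overflow, in which case we fall back to
/// the cap instead of wrapping or panicking.
pub fn estimate_bytes(num_rows: usize, bytes_per_row: usize) -> usize {
    num_rows
        .checked_mul(bytes_per_row)
        .map(|n| n.min(MAX_ESTIMATE))
        .unwrap_or(MAX_ESTIMATE)
}

/// Convert a floating-point size estimate to `usize`.
/// NaN and non-positive values map to 0; infinity and anything at or above
/// the cap map to `MAX_ESTIMATE`.
pub fn f64_to_capped_usize(x: f64) -> usize {
    if x.is_nan() || x <= 0.0 {
        return 0;
    }
    if x >= MAX_ESTIMATE as f64 {
        return MAX_ESTIMATE; // also covers +infinity
    }
    x as usize // truncates toward zero; safe because x is below the cap
}
```

Saturating at a cap keeps downstream arithmetic (for example, adding two estimates) from overflowing again, which is one plausible reason for choosing `usize::MAX/2` rather than `usize::MAX`.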
Breaking Change: IntoPartitionsConfig now requires validated constructor:
- Before: IntoPartitionsConfig { num_partitions: 100 }
- After: IntoPartitionsConfig::new(100)?
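For reviewers who want a picture of what a validated constructor looks like, here is a minimal sketch. It assumes a private field, a hypothetical `PartitionConfigError` type, and a hypothetical `MAX_PARTITIONS` constant taken from the "0 < n ≤ 100K" rule; the actual struct layout and error type in Daft are not shown in this description and may differ.

```rust
use std::fmt;

/// Hypothetical upper bound, taken from the "0 < n ≤ 100K" rule above.
pub const MAX_PARTITIONS: usize = 100_000;

/// Hypothetical error type; the real PR may use a different one.
#[derive(Debug, PartialEq, Eq)]
pub enum PartitionConfigError {
    Zero,
    TooMany { requested: usize, max: usize },
}

impl fmt::Display for PartitionConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Zero => write!(f, "num_partitions must be greater than 0"),
            Self::TooMany { requested, max } => {
                write!(f, "num_partitions {requested} exceeds maximum {max}")
            }
        }
    }
}

impl std::error::Error for PartitionConfigError {}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IntoPartitionsConfig {
    // Private, so callers cannot bypass validation with a struct literal.
    num_partitions: usize,
}

impl IntoPartitionsConfig {
    /// Accepts only 0 < num_partitions <= MAX_PARTITIONS.
    pub fn new(num_partitions: usize) -> Result<Self, PartitionConfigError> {
        if num_partitions == 0 {
            return Err(PartitionConfigError::Zero);
        }
        if num_partitions > MAX_PARTITIONS {
            return Err(PartitionConfigError::TooMany {
                requested: num_partitions,
                max: MAX_PARTITIONS,
            });
        }
        Ok(Self { num_partitions })
    }

    pub fn num_partitions(&self) -> usize {
        self.num_partitions
    }
}
```

Making the field private is what turns this into a breaking change: existing call sites that build the struct literally stop compiling and must go through `new`, which is the point of the migration.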
Testing: 25 new overflow tests, all existing tests pass
Relates to: #4724
The 100K and 1M limits are heuristics. I'm happy to add doc comments or tighten the FixedSizeList cap if you'd prefer.