Large scale dataset training
Hi, I have encountered an issue where the dataset I load is too large to be read; if it is particularly large, the process gets Killed. For example:
```
Loading extension module split_decision...
Using /root/.cache/torch_extensions/py38_cu118 as PyTorch extensions root...
No modifications detected for re-loaded extension module split_decision, skipping build step...
Loading extension module split_decision...
Killed
```
How can I solve this problem? Does PGBM support batch training? Thanks
Thanks for reporting, looking into it. PGBM currently doesn't support batch training, unfortunately. I'd suggest trying the CPU version based on scikit-learn - let me know if that one works for you.