sagemaker-xgboost-container
Distributed training: Add nanny process to terminate Rabit
As an additional safeguard, add a nanny process that terminates Rabit if it unexpectedly hangs at the end of training.
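
The general idea of a nanny can be sketched as follows. This is only an illustration under stated assumptions, not the code from this change: the function name `terminate_if_hung`, the `grace_seconds` parameter, and the use of a plain child process as a stand-in for Rabit are all hypothetical. The supervisor gives the child a grace period to exit once training is done and kills it if it is still running afterwards.

```python
import subprocess
import sys


def terminate_if_hung(proc, grace_seconds):
    """Give `proc` a grace period to exit; kill it if it is still running.

    Hypothetical helper for illustration only. `proc` is a
    subprocess.Popen object standing in for a process that may hang
    at the end of training.

    Returns True if the process had to be killed, False if it exited
    on its own within the grace period.
    """
    try:
        proc.wait(timeout=grace_seconds)
        return False
    except subprocess.TimeoutExpired:
        # The process did not exit in time: force termination and reap it
        # so that no zombie process is left behind.
        proc.kill()
        proc.wait()
        return True


if __name__ == "__main__":
    # Stand-in for a worker that hangs after training has finished.
    hung_worker = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    was_killed = terminate_if_hung(hung_worker, grace_seconds=1.0)
    print("killed by nanny:", was_killed)
```

A real nanny would probably run separately from the training code so that it keeps working even when the training process itself is blocked; the sketch keeps everything in one supervisor for brevity.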