MC17
We encountered the same problem. As @elezar says, ECC errors are reported as `nvmlEventTypeDoubleBitEccError` and `nvmlEventTypeSingleBitEccError`, but [https://github.com/NVIDIA/gpu-monitoring-tools/blob/master/bindings/go/nvml/bindings.go#L48](https://github.com/NVIDIA/gpu-monitoring-tools/blob/master/bindings/go/nvml/bindings.go#L48) only defines `XidCriticalError = C.nvmlEventTypeXidCriticalError`. To fix the issue, gpu-monitoring-tools need...
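To illustrate the gap described above, here is a minimal sketch of what registering for the ECC event types alongside the Xid event could look like. It is not the actual gpu-monitoring-tools code: the constants are plain Go values mirroring the bit flags as I recall them from `nvml.h` (please verify against your header), and `healthEventMask` and `describeEvent` are hypothetical helper names used only for this example.

```go
package main

import "fmt"

// Event type bit flags mirroring nvml.h (assumed values; check your header).
// They are declared as plain constants so the sketch builds without cgo.
const (
	SingleBitEccError uint64 = 0x1 // nvmlEventTypeSingleBitEccError
	DoubleBitEccError uint64 = 0x2 // nvmlEventTypeDoubleBitEccError
	XidCriticalError  uint64 = 0x8 // nvmlEventTypeXidCriticalError
)

// healthEventMask (hypothetical helper) returns the bitmask a health checker
// would register for if it wanted ECC events in addition to Xid events.
func healthEventMask() uint64 {
	return XidCriticalError | SingleBitEccError | DoubleBitEccError
}

// describeEvent (hypothetical helper) maps a single event type to a label.
func describeEvent(eventType uint64) string {
	switch eventType {
	case XidCriticalError:
		return "xid critical error"
	case SingleBitEccError:
		return "single-bit ECC error"
	case DoubleBitEccError:
		return "double-bit ECC error"
	default:
		return "unhandled event"
	}
}

func main() {
	fmt.Printf("mask = %#x\n", healthEventMask())
	for _, e := range []uint64{XidCriticalError, SingleBitEccError, DoubleBitEccError} {
		fmt.Printf("%#x: %s\n", e, describeEvent(e))
	}
}
```

With only `XidCriticalError` in the mask, the two ECC event types would never reach the handler, which matches the behavior reported in this thread.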
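After restarting, it now runs successfully. I don't know why that strange error appeared the first time. Thanks. The scenario is this: the suffix of the nginx_access.log.xxxx-xx-xx file to be read from /root/log changes every day to match the current date. It now runs successfully and I am testing it. Thanks.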
Hello, a year has passed. Is there any news about GPU CUDA/memory isolation?