sdk
Allow system metrics to work with cgroups-v2
LAY-3722 System metrics erroneously try to collect cgroup metrics on Ubuntu
Description
If /sys/fs/cgroup/ exists on the user's machine and they do a local layer run, we try to read the cgroup metrics. On Ubuntu, at least, the files are laid out differently (for me it fails because /sys/fs/cgroup/cpu/cpuacct.usage_user does not exist).
We should either have a better way to determine whether we are in a local or a remote run, or support the different cgroup layout if we decide this is the accurate way of collecting the data.
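One possible way to support both layouts, sketched under assumptions: in the cgroup v2 unified hierarchy the mount root contains a `cgroup.controllers` file, and CPU time is reported in `cpu.stat` as `user_usec` (microseconds) rather than in `cpuacct.usage_user` (nanoseconds). The function names below (`detect_cgroup_version`, `parse_cpu_stat`, `read_cpu_user_ns`) are hypothetical and not part of the SDK, and the v1 path is simply the one quoted in this ticket.

```python
from pathlib import Path
from typing import Dict, Optional

CGROUP_ROOT = Path("/sys/fs/cgroup")


def detect_cgroup_version(root: Path = CGROUP_ROOT) -> Optional[int]:
    """Return 2 for the unified hierarchy, 1 for the legacy layout, else None."""
    # cgroup.controllers only exists at the root of a cgroup v2 mount.
    if (root / "cgroup.controllers").is_file():
        return 2
    # Legacy layout: one directory per controller (cpu, cpuacct, memory, ...).
    if any((root / name).is_dir() for name in ("cpu", "cpuacct", "memory")):
        return 1
    return None


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the flat 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_cpu_user_ns(root: Path = CGROUP_ROOT) -> Optional[int]:
    """Return user CPU time in nanoseconds, or None if no cgroup data is found."""
    version = detect_cgroup_version(root)
    if version == 2:
        stats = parse_cpu_stat((root / "cpu.stat").read_text())
        user_usec = stats.get("user_usec")
        # Convert microseconds to nanoseconds to match the v1 unit.
        return None if user_usec is None else user_usec * 1000
    if version == 1:
        # Path taken from the ticket; assumed to be the one the SDK reads today.
        return int((root / "cpu" / "cpuacct.usage_user").read_text().strip())
    return None
```

Whether the cgroup numbers are the right source for a local run at all is still the open question above; this only shows that the two layouts can be told apart cheaply.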
We should probably also make collecting these metrics non-fatal.
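A minimal sketch of what non-fatal collection could look like, assuming each metric is read by a zero-argument callable. `collect_safely` and the collector names are hypothetical, not existing SDK functions. A missing file such as cpuacct.usage_user raises `FileNotFoundError`, which is a subclass of `OSError`, so it is logged and skipped instead of crashing the run.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger(__name__)


def collect_safely(
    collectors: Dict[str, Callable[[], float]]
) -> Dict[str, Optional[float]]:
    """Run each collector; record None for any metric that cannot be read."""
    results: Dict[str, Optional[float]] = {}
    for name, collect in collectors.items():
        try:
            results[name] = collect()
        except (OSError, ValueError) as exc:
            # OSError covers missing or unreadable files; ValueError covers
            # files whose contents do not parse as a number.
            logger.warning("Skipping system metric %r: %s", name, exc)
            results[name] = None
    return results
```

Catching only `OSError` and `ValueError` is a deliberate choice in this sketch: unexpected programming errors still surface, while an unfamiliar cgroup layout does not abort training.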
I can reproduce this on Ubuntu 22.04 both locally and in a Docker container (ubuntu:latest), but it seems to work fine on Ubuntu 20.04 (at least it doesn't crash; I don't know whether the data it collects is accurate).
Repro script:
from sklearn.svm import SVC
import layer

layer.login()  # ("https://dev-ragnarok.layer.co/")
layer.init("test")

@layer.model("iris-model")
@layer.pip_requirements(packages=["scikit-learn==0.23.2"])
def train_model():
    from sklearn import datasets
    iris = datasets.load_iris()
    clf = SVC(max_iter=-1)
    result = clf.fit(iris.data, iris.target)
    print("model1 computed")
    return result

train_model()
Acceptance Criteria
- [ ]