load: Issues loading models from DVC managed remote storage
I'm storing models as part of a DVC pipeline in a research repository (let's call it "research") and I'm trying to load them on a VM inside the production system. In the research repository, I have the following:
.mlem.yaml:
core:
  storage:
    type: dvc
.dvc/config:
[core]
    remote = gcloud
['remote "gcloud"']
    url = gs://********************
The model is stored at models/my-model / models/my-model.mlem as part of a DVC pipeline. The .mlem file is visible in the git repository, and the bucket contains a file named after the hash of the model binary.
I'm trying to load the model in the production system with
import mlem
model = mlem.api.load('models/my-model', project='https://github.com/my-company/research')
That gives me the following error:
Traceback (most recent call last):
File "test.py", line 11, in <module>
mlem.api.load(f'models/{model}', project=repo)
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 99, in load
meta = load_meta(
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 167, in load_meta
location = Location.resolve(
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 103, in resolve
return UriResolver.resolve(
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 142, in resolve
return cls.find_resolver(path, project, rev, fs).process(
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 194, in process
fs, project = cls.get_fs(project, rev)
File "/home/******/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 290, in get_fs
fs, _, (path,) = get_fs_token_paths(
File "/home/******/.venv/lib/python3.8/site-packages/fsspec/core.py", line 639, in get_fs_token_paths
fs = cls(**options)
File "/home/******/.venv/lib/python3.8/site-packages/fsspec/spec.py", line 76, in __call__
obj = super().__call__(*args, **kwargs)
File "/home/******/.venv/lib/python3.8/site-packages/fsspec/implementations/github.py", line 52, in __init__
r.raise_for_status()
File "/home/******/.venv/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.github.com/repos/my-company/research
This is working perfectly fine without dvc (i.e. the model binary is directly committed to git). Hence, this might be related to #47.
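For anyone debugging the same traceback: the 404 is raised while fsspec's GitHub filesystem requests the repository metadata from api.github.com, before any DVC logic runs. GitHub answers 404 (not 403) for a private repository when the request is unauthenticated. The sketch below is not MLEM or fsspec code; `github_api_url` and `probe_repo` are hypothetical helper names, and the `token` argument is assumed to be a personal access token.

```python
import re
import urllib.error
import urllib.request
from typing import Optional


def github_api_url(project: str) -> str:
    """Map an HTTPS or SSH GitHub repository URL to its REST API URL."""
    match = re.match(
        r"^(?:https://github\.com/|git@github\.com:)([^/]+)/([^/]+?)(?:\.git)?/?$",
        project,
    )
    if match is None:
        raise ValueError(f"Not a recognised GitHub repository URL: {project}")
    org, repo = match.groups()
    return f"https://api.github.com/repos/{org}/{repo}"


def probe_repo(project: str, token: Optional[str] = None) -> int:
    """Return the HTTP status GitHub gives for the repository metadata request."""
    request = urllib.request.Request(github_api_url(project))
    if token:
        # Assumption: a personal access token with read access to the repository.
        request.add_header("Authorization", f"token {token}")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `probe_repo('https://github.com/my-company/research')` without a token should reproduce the 404 for a private repository, independently of MLEM and DVC.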
Hi @igordertigor! Thanks for reporting! Could you please check the first lines of the .mlem file? Does it have a type: dvc line like here https://github.com/iterative/example-mlem-get-started/blob/dvc/models/rf.mlem#L5 ? This is the instruction that lets MLEM know it needs to work with DVC.
The other question: do you have DVC installed in the env where you run mlem.api.load?
Hi @aguschin, here is my .mlem.yaml:
$ cat .mlem.yaml
core:
  storage:
    type: dvc
Regarding your second question: I tried with and without DVC installed and with and without the respective remote configured where I run mlem.api.load. Does that answer your question?
@igordertigor, thanks!
In the first question I meant checking models/my-model.mlem. Could you check type: dvc exists in models/my-model.mlem? Like in this MLEM metafile?
Re 2nd question: thanks, I got it 🙏🏻
cc @mike0sv
Is this answering your question:
$ head models/my-model.mlem
artifacts:
  data.pkl:
    hash: f47d01924e75f480b416338e313b3812
    size: 5759
    type: dvc
    uri: my-model
model_type:
  io:
    type: pickle
  methods:
Similar for the other one. So yes, there is type: dvc there.
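The check being asked for here can be sketched in a few lines. This is only an illustration, not MLEM's own parsing: the dictionary mirrors the `head` output above (its nesting is an assumption inferred from the key order), and `dvc_artifacts` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Dictionary form of the `head models/my-model.mlem` output.
# Assumption: hash, size, type and uri are nested under the data.pkl artifact.
meta: Dict[str, Any] = {
    "artifacts": {
        "data.pkl": {
            "hash": "f47d01924e75f480b416338e313b3812",
            "size": 5759,
            "type": "dvc",
            "uri": "my-model",
        }
    }
}


def dvc_artifacts(metafile: Dict[str, Any]) -> List[str]:
    """Return the names of artifacts whose storage type is `dvc`."""
    artifacts = metafile.get("artifacts") or {}
    return [name for name, spec in artifacts.items() if spec.get("type") == "dvc"]


print(dvc_artifacts(meta))  # prints ['data.pkl']
```

An empty result would mean the metafile does not point at DVC storage, which is the case the first question was trying to rule out.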
Hey @igordertigor ! Is your github repo private? If that is the case, did you provide GITHUB_TOKEN env for your production environment?
I don't think that's it. The repository is private, but I'm getting the same issue locally if I just create a new virtual env/repository and everything. However, locally I am still logged in with github and I am able to load models that are stored in git. It is only an issue with dvc. I'm happy to try again if there is anything specific that you want me to look out for.
Thanks @igordertigor. Let's try to debug this 🙌🏻
Does dvc get work and download the binary? Something like
$ dvc get https://github.com/my-company/research models/my-model
(should download the model locally)
@igordertigor, did you have a chance to check it?
Hi @aguschin , apologies for the late reply. I've been away from my computer for a couple of days and will likely be here sporadically in the next couple of days as well. I'll try to at least check once per working day.
I did run dvc get and it works if I specify an ssh url:
$ dvc get git@github.com:my-company/my-repo models/my-model
downloads models/my-model to my-model, but doesn't fetch the associated .mlem file. This doesn't work with the https url.
This made me try @mike0sv's suggestion of providing a GITHUB_TOKEN. But that doesn't seem to help. I can however use the github command line app gh to authenticate. If I select https as my preferred protocol for Git operations (which it isn't, I prefer ssh), then I can use
$ dvc get https://github.com/my-company/my-repo models/my-model
to download the model binary.
However, with those insights, mlem still doesn't work:
$ cat test_load_https.py
import mlem
mlem.api.load(
    'models/my-model',
    project='https://github.com/my-company/my-repo',
)
$ python test_load_https.py
Traceback (most recent call last):
File "load_mlem_model_through_dvc.py", line 3, in <module>
mlem.api.load(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 99, in load
meta = load_meta(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 167, in load_meta
location = Location.resolve(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 103, in resolve
return UriResolver.resolve(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 142, in resolve
return cls.find_resolver(path, project, rev, fs).process(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 194, in process
fs, project = cls.get_fs(project, rev)
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/meta_io.py", line 290, in get_fs
fs, _, (path,) = get_fs_token_paths(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/fsspec/core.py", line 639, in get_fs_token_paths
fs = cls(**options)
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/fsspec/spec.py", line 76, in __call__
obj = super().__call__(*args, **kwargs)
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/fsspec/implementations/github.py", line 52, in __init__
r.raise_for_status()
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.github.com/repos/my-company/my-repo
With ssh, it seems to get a little further:
$ cat test_load_ssh.py
import mlem
mlem.api.load(
    'models/my-model',
    project='git@github.com:my-company/my-model',
)
$ python test_load_ssh.py
Traceback (most recent call last):
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 206, in find_meta_location
_, path = find_object(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/objects.py", line 1168, in find_object
raise ValueError(
ValueError: Object models/my-model not found, search of fs <fsspec.implementations.local.LocalFileSystem object at 0x7f8a5b5be4c0> at models/feature_engine
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "load_mlem_model_through_dvc_ssh.py", line 3, in <module>
mlem.api.load(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 99, in load
meta = load_meta(
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 176, in load_meta
location=find_meta_location(location),
File "/home/ingo/tmp/dvc-mlem-issue/.venv/lib/python3.8/site-packages/mlem/core/metadata.py", line 210, in find_meta_location
raise MlemObjectNotFound(
mlem.core.errors.MlemObjectNotFound: MLEM object was not found at `git@github.com:my-company/my-repo/models/my-model`
In fact, this error message is correct: There is no models/my-model in the repository. It's in the dvc-controlled bucket.
I hope this helps.
Thanks @igordertigor! The only idea we have so far is that you somehow didn't populate GITHUB_TOKEN to be consumed by your script 😅 We'll try to reproduce this and investigate.
Tried to reproduce. What I can say:
- If GITHUB_TOKEN and GITHUB_USERNAME are set, authentication with gh doesn't work. Got a different issue and created a bug https://github.com/iterative/mlem/issues/528
- Without gh auth login and without GITHUB_TOKEN and GITHUB_USERNAME set, running python test_load_https.py fails for me with requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.github.com/repos/aguschin/example-mlem-private (expected)
- With GITHUB_TOKEN set (but without GITHUB_USERNAME set) mlem clone ... fails with ValueError: Auth required both username and token (expected)
- After gh auth login (using HTTPS) and without GITHUB_TOKEN / GITHUB_USERNAME, I'm getting requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.github.com/repos/aguschin/example-mlem-private (looks like a bug to me, maybe @mike0sv can clarify).
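The environment-variable cases in the list above can be checked up front, before calling mlem.api.load. The following is a minimal sketch that mirrors the observed behaviour (no variables: anonymous access; only one of the two: the "Auth required both username and token" error). It is not the MLEM or fsspec implementation, and `github_credentials` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional, Tuple


def github_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (username, token) from the environment, or None if neither is set."""
    env = os.environ if env is None else env
    username = env.get("GITHUB_USERNAME")
    token = env.get("GITHUB_TOKEN")
    if not username and not token:
        # Anonymous access: private repositories will answer 404.
        return None
    if not (username and token):
        # Same message as the ValueError reported in the list above.
        raise ValueError("Auth required both username and token")
    return username, token
```

Running `github_credentials()` in the production environment shows which of the listed cases applies there.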
Thanks @aguschin , I think those are pretty much the issues I'm facing. Any idea why it doesn't work for ssh connections? After all, I can dvc get with the ssh url. And I believe the cloud storage stuff is independent of that anyway.
Hi @igordertigor! Sorry for the long wait 😞 . We did some investigation for #528, and figured out one should be authorized with both SSH and HTTP. I hope this will fix your issue. I wrote it down here https://mlem-ai-dvc-private-rep-yxphuh.herokuapp.com/doc/user-guide/dvc/#working-with-private-repositories
I understand this is not the best way to ask for both SSH and HTTP, so we'll fix it, but for now this is the only solution we have in mind.