
Loading model using DGMR.from_pretrained("openclimatefix/dgmr") fails

Open agijsberts opened this issue 1 year ago • 6 comments

Describe the bug

When loading the pretrained model from HuggingFace as per the instructions in README.md, I get a KeyError: 'config' exception. The issue is that, when following the instructions, **model_kwargs is empty in NowcastingModelHubMixin._from_pretrained, which nevertheless attempts to pass **model_kwargs['config'] to the class constructor in https://github.com/openclimatefix/skillful_nowcasting/blob/8c40296e79b3a5849269e8da475b6f9f6fd1ac5f/dgmr/hub.py#L154

To Reproduce

>>> from dgmr import DGMR
>>> model = DGMR.from_pretrained("openclimatefix/dgmr")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/hub_mixin.py", line 420, in from_pretrained
    instance = cls._from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/git/skillful_nowcasting/dgmr/hub.py", line 154, in _from_pretrained
    model = cls(**model_kwargs["config"])
                  ~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'config'

Expected behavior

I expect the variable model to contain the pretrained DGMR model.

Additional context

One can trivially work around the problem by passing an empty config argument when loading the pretrained model:

model = DGMR.from_pretrained("openclimatefix/dgmr", config={})

so maybe this is just a matter of updating README.md.
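
This works because the explicit argument ends up in model_kwargs, so the 'config' lookup succeeds and the constructor simply falls back to its defaults. A minimal sketch of the effect, assuming the code path described above:

from dgmr import DGMR

model_kwargs = {"config": {}}              # populated by the explicit config={} argument
model = DGMR(**model_kwargs["config"])     # equivalent to DGMR(), i.e. all default arguments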

Still, I'd argue that the cleaner fix would be to pass the entire **model_kwargs to the class constructor, thus

model = cls(**model_kwargs)

That would also match how HuggingFace themselves do it in https://github.com/huggingface/huggingface_hub/blob/855ee997202e003b80bafd3c02dac65146c89cc4/src/huggingface_hub/hub_mixin.py#L633.

agijsberts avatar Apr 14 '24 20:04 agijsberts

Since I was baffled that nobody seemed to have reported such a basic problem earlier, I investigated the issue further and, not surprisingly, it is a bit more complicated than I initially thought.

It turns out the issue surfaced only recently due to changes introduced in huggingface_hub >= 0.22.0. So for the moment the best workaround is to downgrade huggingface_hub to version 0.21.4:

pip3 install huggingface-hub==0.21.4

Prior to https://github.com/huggingface/huggingface_hub/commit/45147c518ad3c1f70ecb462de4bf23cd553ba54b, config would be injected into model_kwargs if the model class accepted a **kwargs argument. From version 0.22.0 onwards, the logic for when config is injected has changed. Looking at the code and the docs at https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-more-complex-approach-class-inheritance, I believe that config is injected if (a) the model class expects a config parameter in its __init__, and/or (b) the model class expects a config parameter in its _from_pretrained.
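
To make that condition concrete, here is a paraphrased sketch of how I read the new dispatch; this is my reconstruction, not the actual huggingface_hub source:

import inspect

def expects_config(fn) -> bool:
    # does the callable declare an explicit `config` parameter?
    return "config" in inspect.signature(fn).parameters

def should_inject_config(cls) -> bool:
    # cases (a) and/or (b) from above: __init__ or _from_pretrained
    return expects_config(cls.__init__) or expects_config(cls._from_pretrained)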

More relevantly though, if the model class accepts a **kwargs argument in its __init__, then the contents of the config are injected one by one into **model_kwargs (as opposed to injecting the config as-is as **model_kwargs['config']). That behavior actually seems the more applicable one for NowcastingModelHubMixin: wouldn't it be sufficient to replace model = cls(**model_kwargs['config']) with model = cls(**model_kwargs) in _from_pretrained? And wouldn't that also allow us to simplify DGMR.__init__? A sketch of what I mean follows below.
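
Again a paraphrased sketch of the >= 0.22.0 behavior, not the library source, to show why the one-line replacement would line up with upstream:

import inspect

def merge_config_contents(cls, model_kwargs: dict, config: dict) -> dict:
    # with **kwargs in cls.__init__, the config's contents are merged
    # key by key into model_kwargs instead of being nested under
    # model_kwargs["config"] ...
    params = inspect.signature(cls.__init__).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        model_kwargs.update(config)
    return model_kwargs

# ... so in _from_pretrained, model = cls(**model_kwargs) would receive
# the config values as ordinary keyword arguments.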

agijsberts avatar Apr 22 '24 12:04 agijsberts

Thanks @agijsberts for finding this. Should we pin the requirements file to huggingface-hub==0.21.4? Would you be interested in making a PR for this?

Would you be able to expand further on the model_kwargs point? Perhaps even some links to the code where it could be improved?

peterdudfield avatar Jun 08 '24 19:06 peterdudfield

Pinning huggingface-hub==0.21.4 is a sensible stopgap in my opinion. I'll prepare a PR right away.

Regarding the model_kwargs, and without going into too much detail: a significant chunk of DGMR.__init__ is dedicated to dealing with a potential config parameter and, when it is passed, using it to override the normal arguments. If I understand the workings of ModelHubMixin correctly, it should no longer be necessary to rely on such a "magic" config argument, and removing that logic would make DGMR.__init__ agnostic to the mixin (see the sketch below).
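
A hedged before/after sketch of what I have in mind; the parameter name forecast_steps is illustrative, not a verbatim copy of the repository code:

class DGMRBefore:
    def __init__(self, forecast_steps: int = 18, config: dict | None = None, **kwargs):
        config = config or {}
        # the "magic" config, when passed, overrides the regular arguments
        self.forecast_steps = config.get("forecast_steps", forecast_steps)

class DGMRAfter:
    def __init__(self, forecast_steps: int = 18, **kwargs):
        # mixin-agnostic: any config contents arrive as ordinary kwargs
        self.forecast_steps = forecast_steps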

I volunteer to have a look at what would be the best way to resolve this.

agijsberts avatar Jun 10 '24 19:06 agijsberts

Thanks @agijsberts, yeah, if you were able to have a look at DGMR.__init__ and what the best way to resolve this is, that would be super

peterdudfield avatar Jun 11 '24 08:06 peterdudfield

I have just pushed a possible __init__ simplification to my fork https://github.com/agijsberts/skillful_nowcasting/tree/refactor_huggingface-hub. It already includes tests to make sure that serialization-deserialization returns an identical object for DGMR, Discriminator, etc. (along the lines of the sketch below). Unfortunately, that introduces a pip dependency on safetensors. In any case, after my holidays I want to check that predictions on some test data are still the same before turning this into a PR.
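
A sketch of such a round-trip test, assuming DGMR accepts a forecast_steps argument and inherits save_pretrained/from_pretrained from the mixin; the real tests live in the fork linked above:

import torch
from dgmr import DGMR

def test_save_load_roundtrip(tmp_path):
    # save a freshly initialized model, reload it, and check that
    # every parameter tensor is bit-identical
    model = DGMR(forecast_steps=4)
    model.save_pretrained(tmp_path)
    reloaded = DGMR.from_pretrained(tmp_path)
    original_state = model.state_dict()
    reloaded_state = reloaded.state_dict()
    assert original_state.keys() == reloaded_state.keys()
    for name, tensor in original_state.items():
        assert torch.equal(tensor, reloaded_state[name])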

Perhaps someone can clarify why DGMR needs a custom NowcastingModelHubMixin instead of using huggingface-hub's PyTorchModelHubMixin (as used by Discriminator, Sampler, etc.). My impression is that both do the same thing (except that PyTorchModelHubMixin writes in safetensors format), but I'm probably missing something. For comparison, the standard mixin pattern is sketched below.
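
A minimal sketch of that standard pattern (TinyModel and its hidden parameter are made up for illustration):

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class TinyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden: int = 8):
        super().__init__()
        self.layer = nn.Linear(hidden, hidden)

# save_pretrained / from_pretrained / push_to_hub all come from the mixin;
# recent huggingface_hub versions write the weights as model.safetensors
# and record the __init__ arguments in config.json
model = TinyModel(hidden=16)
model.save_pretrained("tiny-model")
reloaded = TinyModel.from_pretrained("tiny-model")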

agijsberts avatar Jun 18 '24 21:06 agijsberts

Thanks @agijsberts - have a good holiday, and I look forward to seeing the PR

If I'm honest I'm not totally sure why NowcastingModelHubMixin is used instead of PyTorchModelHubMixin, but I'll have to have a look at it

peterdudfield avatar Jun 19 '24 07:06 peterdudfield