
Update BLOOM parameter counts

Open Muennighoff opened this issue 3 years ago β€’ 11 comments

Update parameter counts of BLOOM models. The original counts were incorrect & have already been updated on the hub. I can't add reviewers, but @younesbelkada @thomasw21 may want to review

Script for counting:

from transformers import AutoModelForCausalLM

def count_parameters(model):
    # Count only trainable parameters, matching the sizes reported on the model cards
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

count_parameters(AutoModelForCausalLM.from_pretrained("bigscience/bloom-350m"))
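The raw count is easier to compare across checkpoints when formatted; a small helper sketch (the name `format_param_count` is made up here for illustration):

```python
def format_param_count(n: int) -> str:
    """Format a raw parameter count as a short human-readable string."""
    for threshold, suffix in ((10**9, "B"), (10**6, "M"), (10**3, "K")):
        if n >= threshold:
            return f"{n / threshold:.2f}{suffix}"
    return str(n)

# An example count in the hundreds of millions:
print(format_param_count(559_214_592))  # → 559.21M
```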

πŸŒΈπŸ€—

Muennighoff avatar Aug 08 '22 16:08 Muennighoff

Hi @Muennighoff! Thanks for the fix. Just FYI, the original model sizes were taken from: https://github.com/bigscience-workshop/bigscience/tree/master/train/tr11-176B-ml/smaller_models. I am afraid that changing the model names can lead to breaking changes (I am thinking especially of all the Spaces that use these models). It may be safer to rename the models back and discuss here how we can fix this.

younesbelkada avatar Aug 08 '22 16:08 younesbelkada

I think it's fine as old links still work

New: Automatic Redirection
All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}

Muennighoff avatar Aug 08 '22 16:08 Muennighoff

The documentation is not available anymore as the PR was closed or merged.

Ok if this is the case sounds good to me! πŸ’ͺ Thanks for the fix!

younesbelkada avatar Aug 08 '22 16:08 younesbelkada

Note that the Spaces will probably still break, as e.g. AutoTokenizer.from_pretrained("bigscience/bloom-350m") no longer works.
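One stopgap for affected Spaces, until redirects are honored, is to try candidate repo IDs in order. A minimal sketch, where the helper name and the commented-out usage are illustrative only:

```python
def load_first_available(loader, candidate_ids):
    """Try each repo ID in order and return the first that loads.

    `loader` is any from_pretrained-style callable that raises OSError
    when a repo ID cannot be resolved.
    """
    errors = []
    for repo_id in candidate_ids:
        try:
            return loader(repo_id)
        except OSError as err:
            errors.append(f"{repo_id}: {err}")
    raise OSError("No candidate repo ID could be loaded:\n" + "\n".join(errors))

# Hypothetical usage with transformers (repo IDs illustrative):
# from transformers import AutoTokenizer
# tok = load_first_available(
#     AutoTokenizer.from_pretrained,
#     ["bigscience/bloom-350m", "bigscience/<new-name>"],
# )
```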

Muennighoff avatar Aug 08 '22 20:08 Muennighoff

Wait, I think you might have broken old links.


Traceback (most recent call last):
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 619, in _get_config_dict
   resolved_config_file = cached_path(
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/utils/hub.py", line 285, in cached_path
   output_path = get_from_cache(
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/utils/hub.py", line 509, in get_from_cache
   raise OSError(
OSError: Distant resource does not have an ETag, we won't be able to reliably ensure reproducibility.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/models/auto/auto_factory.py", line 423, in from_pretrained
   config, kwargs = AutoConfig.from_pretrained(
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/models/auto/configuration_auto.py", line 731, in from_pretrained
   config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 557, in get_config_dict
   config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
 File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 659, in _get_config_dict
   raise EnvironmentError(
OSError: Can't load config for 'bigscience/bloom-350m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bigscience/bloom-350m' is the correct path to a directory containing a config.json file

I'm using transformers=4.21.0

thomasw21 avatar Aug 09 '22 08:08 thomasw21

Yes, I can confirm this breaks loading the model through pipeline and tokenizers as well (using transformers==4.21.0 on Google Colab).

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

MAX_NEW_TOKENS = 128
model_name = "bigscience/bloom-350m"
text = "Hello my name is"

pipe = pipeline(task="text-generation", model=model_name)
OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in _get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    655         except EnvironmentError:
    656             raise EnvironmentError(
--> 657                 f"Can't load config for '{pretrained_model_name_or_path}'. If you were trying to load it from "
    658                 "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
    659                 f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "

OSError: Can't load config for 'bigscience/bloom-350m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bigscience/bloom-350m' is the correct path to a directory containing a config.json file

It also does not work for loading models:

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

Could you point us to where you got:


New: Automatic Redirection
All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}

We can probably fix it through a PR.

younesbelkada avatar Aug 09 '22 09:08 younesbelkada

I think it's fine as old links still work

New: Automatic Redirection
All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}

This just means that the old URLs still work, i.e. https://huggingface.co/bigscience/bloom-350m (it's from the Settings screen on the Hub).

The model names need to be updated (which is not a bug I think).

Muennighoff avatar Aug 09 '22 09:08 Muennighoff

I'd say this is a breaking change. @sgugger, does the from_pretrained method not take redirection into account?

thomasw21 avatar Aug 09 '22 09:08 thomasw21

I proposed a potential fix in https://github.com/huggingface/transformers/pull/18542. Now I can load BLOOM models with the old links, but I am not sure whether this breaks anything else (maybe let's wait for a review and the CI results there).

younesbelkada avatar Aug 09 '22 10:08 younesbelkada

huggingface_hub does not take redirections into account in its download methods. The issue was given low priority from what I understand; you can bug folks internally to show it's a bit important :-)
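Conceptually, honoring a Hub rename just means following the 30x Location header before resolving the file. A minimal sketch of that resolution step, with a pluggable `head` function so it runs offline (in real use, `head` might wrap something like `requests.head(url, allow_redirects=False)`; all names here are hypothetical):

```python
def resolve_redirects(url, head, max_hops=5):
    """Follow HTTP redirects to the final URL.

    `head` takes a URL and returns (status_code, headers_dict).
    """
    for _ in range(max_hops):
        status, headers = head(url)
        if status in (301, 302, 307, 308) and "Location" in headers:
            url = headers["Location"]
        else:
            return url
    raise RuntimeError(f"Too many redirects starting from {url}")
```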

sgugger avatar Aug 09 '22 12:08 sgugger

Let's merge this? I think the damage is done, and reverting now would just cause more. I will communicate such a change more extensively next time; sorry for the inconvenience caused.

Muennighoff avatar Aug 10 '22 19:08 Muennighoff