Update BLOOM parameter counts
Update the parameter counts of the BLOOM models. The original counts were incorrect and have already been updated on the Hub. I can't add reviewers, but @younesbelkada @thomasw21 may want to review.
Script for counting:
from transformers import AutoModelForCausalLM

def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

count_parameters(AutoModelForCausalLM.from_pretrained("bigscience/bloom-350m"))
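If downloading the full checkpoint is a bottleneck, the same count can be reproduced from the config alone. A minimal sketch, assuming the config is still reachable under this repo id (not part of this PR):

# Build the model from its config (random-initialized weights), which is
# enough for counting parameters and avoids downloading the checkpoint.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("bigscience/bloom-350m")
model = AutoModelForCausalLM.from_config(config)
print(sum(p.numel() for p in model.parameters() if p.requires_grad))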
Hi @Muennighoff! Thanks for the fix. Just FYI, the original model sizes were taken from: https://github.com/bigscience-workshop/bigscience/tree/master/train/tr11-176B-ml/smaller_models. I am afraid changing the model names can lead to some breaking changes (thinking especially of all the Spaces that are using these models). I think it may be safer to rename the models back to what they were and discuss here how we can fix that.
I think it's fine as old links still work
New: Automatic Redirection
All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}
The documentation is not available anymore as the PR was closed or merged.
Ok, if this is the case, sounds good to me! Thanks for the fix!
Note that the Spaces will probably still break, as e.g. AutoTokenizer.from_pretrained("bigscience/bloom-350m") no longer works.
Wait, I think you might have broken the old links:
Traceback (most recent call last):
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 619, in _get_config_dict
resolved_config_file = cached_path(
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/utils/hub.py", line 285, in cached_path
output_path = get_from_cache(
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/utils/hub.py", line 509, in get_from_cache
raise OSError(
OSError: Distant resource does not have an ETag, we won't be able to reliably ensure reproducibility.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/models/auto/auto_factory.py", line 423, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/models/auto/configuration_auto.py", line 731, in from_pretrained
config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 557, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/Users/thomas/code/bigscience/transformers-Official/src/transformers/configuration_utils.py", line 659, in _get_config_dict
raise EnvironmentError(
OSError: Can't load config for 'bigscience/bloom-350m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bigscience/bloom-350m' is the correct path to a directory containing a config.json file
I'm using transformers=4.21.0
Yes, I can confirm this breaks loading the model via pipeline and the tokenizer as well (using transformers=4.21.0 on Google Colab).
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
MAX_NEW_TOKENS = 128
model_name = "bigscience/bloom-350m"
text = "Hello my name is"
pipe = pipeline(task="text-generation", model=model_name)
OSError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in _get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
655 except EnvironmentError:
656 raise EnvironmentError(
--> 657 f"Can't load config for '{pretrained_model_name_or_path}'. If you were trying to load it from "
658 "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
659 f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
OSError: Can't load config for 'bigscience/bloom-350m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bigscience/bloom-350m' is the correct path to a directory containing a config.json file
It also does not work when loading the model directly:
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
Could you point us to how you got:
New: Automatic Redirection
All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}
We can probably fix it through a PR
I think it's fine as old links still work
New: Automatic Redirection All links to this model will automatically redirect to the new location, including git operations. However, to avoid confusion, we recommend updating any existing local clones to point to the new repository URL. To do so, you can use the following command: git remote set-url origin {NEW_URL}
This just means that the old URLs still work, i.e. https://huggingface.co/bigscience/bloom-350m (It's from the Settings screen on the Hub).
The model names need to be updated (which is not a bug I think).
I'd say this is a breaking change. @sgugger does the from_pretrained method not take redirection into account?
I addressed a potential fix in https://github.com/huggingface/transformers/pull/18542. Now I can load BLOOM models with the old links, but I am not sure whether it breaks anything else (maybe let's wait for a review and the results of the CI tests there).
huggingface_hub does not take redirections into account in its download methods. The issue was given low priority from what I understand; you can bug folks internally to show it's a bit important :-)
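For context, a minimal sketch (not part of the linked PR) of how one could check whether the Hub itself answers the old repo URL with an HTTP redirect, which the library-side download path was not following; it only assumes the old URL from this thread:

# Query the old repo URL without following redirects. A 3xx status plus a
# Location header would show the Hub redirects even if the download code
# does not follow it.
import requests

old_url = "https://huggingface.co/bigscience/bloom-350m"  # old name from this thread
resp = requests.head(old_url, allow_redirects=False)
print(resp.status_code)
print(resp.headers.get("Location"))  # redirect target, if any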
Let's merge this? I think the damage is done and reverting now would just cause more damage. I will communicate such a change more extensively next time; sorry for the inconvenience caused.