cvt2distilgpt2
'type' object is not subscriptable
After I ran "!python3 main.py --task mimic_cxr_jpg_chen", I got the following error:
warnings.warn(f"Workstation configuration for {socket.gethostname()} does not exist. Using default "
- CUDA:
- GPU:
- A100-SXM4-40GB
- available: True
- version: 11.3
- GPU:
- Packages:
- numpy: 1.21.6
- pyTorch_debug: False
- pyTorch_version: 1.12.0+cu113
- pytorch-lightning: 1.5.10
- tqdm: 4.64.0
- System:
- OS: Linux
- architecture:
- 64bit
- processor: x86_64
- python: 3.7.13
- version: #1 SMP Sun Apr 24 10:03:06 PDT 2022
Traceback (most recent call last):
  File "main.py", line 214, in <module>
    main(clargs)
  File "main.py", line 58, in main
    config = get_config(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 54, in get_config
    config = load_config(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 26, in load_config
    config = getattr(importlib.import_module(module), "config")()
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/content/drive/MyDrive/cvt2distilgpt2/task/mimic_cxr_jpg_chen/config/cvt_21_to_distilgpt2_scst.py", line 1, in <module>
    from config.cvt_21_to_distilgpt2_chexbert import config as external_config
  File "/content/drive/MyDrive/cvt2distilgpt2/config/cvt_21_to_distilgpt2_chexbert.py", line 1, in <module>
    from config.cvt_21_to_distilgpt2 import config as external_config
  File "/content/drive/MyDrive/cvt2distilgpt2/config/cvt_21_to_distilgpt2.py", line 1, in <module>
    from transmodal.network.cvt import spatial_position_feature_size
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/network/cvt.py", line 27, in <module>
    class CvT(Module):
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/network/cvt.py", line 77, in CvT
    def forward(self, images: torch.FloatTensor) -> Union[dict[str, Tensor], dict[str, Union[Tensor, Any]]]:
TypeError: 'type' object is not subscriptable
Please guide me through this. Thank you.
Hi @jainnipun11,
I think I have found the cause of the issue: the return type hints on this line are incompatible with Python 3.7 (but work fine with Python 3.9): https://github.com/aehrc/cvt2distilgpt2/blob/13d450ac3509e0d671f2aed701d57d01e2d73618/transmodal/network/cvt.py#L77
e.g., if we have tmp.py:
from typing import Optional, Union, Any
def forward(images) -> Union[dict[str, float], dict[str, Union[float, Any]]]:
return None
The following error occurs with Python 3.7, but not with Python 3.9:
compute-i1 ~$ module load python/3.7.11
Loading python/3.7.11
Unloading conflict: python/3.9.4
compute-i1 ~$ python3 tmp.py
Traceback (most recent call last):
File "tmp.py", line 4, in <module>
def forward(images) -> Union[dict[str, float], dict[str, Union[float, Any]]]:
TypeError: 'type' object is not subscriptable
compute-i1 ~$ module load python/3.9.4
Loading python/3.9.4
Unloading conflict: python/3.7.11
compute-i1 ~$ python3 tmp.py
compute-i1 ~$
So, I have removed the return type hints for that function: https://github.com/aehrc/cvt2distilgpt2/blob/77656703a67b03c4ea415fbaab4766c1218c479e/transmodal/network/cvt.py#L77
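As an aside, there are alternatives to removing the hints entirely (a sketch, not what the repo does; `forward_v1` is a hypothetical name for illustration):

```python
# Two Python 3.7-compatible alternatives to subscripting the built-in dict
# (PEP 585 built-in generics such as dict[str, float] need Python 3.9+).
from typing import Any, Dict, Union

# Option 1: use typing.Dict, which is subscriptable on Python 3.7.
def forward_v1(images) -> Union[Dict[str, float], Dict[str, Union[float, Any]]]:
    return {"logits": 0.0}

# Option 2: put `from __future__ import annotations` as the first statement
# of the module; annotations then stay unevaluated strings, so even
# `dict[str, float]` is accepted on Python 3.7.
```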
If you could check whether this works, that would be great. Please let me know of any further issues.
Thanks, Aaron.
Hey Aaron, I made the changes you suggested, and now I get this error:
/content/drive/MyDrive/cvt2distilgpt2/transmodal/utils.py:39: UserWarning: Workstation configuration for 5f6e9f0cc0f3 does not exist. Using default configuration: num_workers=5, total_gpus=1, total_memory=16
  warnings.warn(f"Workstation configuration for {socket.gethostname()} does not exist. Using default "
- CUDA:
- GPU:
- Tesla V100-SXM2-16GB
- available: True
- version: 11.3
- GPU:
- Packages:
- numpy: 1.21.6
- pyTorch_debug: False
- pyTorch_version: 1.12.0+cu113
- pytorch-lightning: 1.5.10
- tqdm: 4.64.0
- System:
- OS: Linux
- architecture:
- 64bit
- processor: x86_64
- python: 3.7.13
- version: #1 SMP Sun Apr 24 10:03:06 PDT 2022
Traceback (most recent call last):
  File "main.py", line 214, in <module>
    main(clargs)
  File "main.py", line 58, in main
    config = get_config(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 89, in get_config
    local_files_only=True,
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py", line 609, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1812, in from_pretrained
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1836, in _from_pretrained
    **(copy.deepcopy(kwargs)),
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1950, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/gpt2/tokenization_gpt2.py", line 192, in __init__
    with open(merges_file, encoding="utf-8") as merges_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
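For context, this TypeError is exactly what open() raises when it is handed None, i.e. the tokenizer never resolved a path to merges.txt. A minimal reproduction (illustrative only):

```python
# The tokenizer's merges_file argument resolved to None because the local
# distilgpt2 files were not found, and open(None, ...) raises the same
# TypeError shown in the traceback above.
try:
    open(None, encoding="utf-8")
except TypeError as error:
    print(error)  # expected str, bytes or os.PathLike object, not NoneType
```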
Thanks.
Hi @jainnipun11,
Can you confirm that you have downloaded the files for distilgpt2 from https://huggingface.co/distilgpt2/tree/main and have placed them in https://github.com/aehrc/cvt2distilgpt2/tree/main/checkpoints/distilgpt2?
I have made some updates to check for path issues. Can you please pull the latest version and test?
And if problems persist, can you please try Python 3.8 or 3.9?
Hope this helps, Aaron.
Hey Aaron,
I followed your path instructions and placed the four files in the recommended folder. I also updated Python to 3.9.1, and this is the result:
/content/drive/MyDrive/cvt2distilgpt2/transmodal/utils.py:39: UserWarning: Workstation configuration for ad62cbeafeaa does not exist. Using default configuration: num_workers=5, total_gpus=1, total_memory=16
  warnings.warn(f"Workstation configuration for {socket.gethostname()} does not exist. Using default "
- CUDA:
- GPU:
- Tesla V100-SXM2-16GB
- available: True
- version: 10.2
- GPU:
- Packages:
- numpy: 1.22.1
- pyTorch_debug: False
- pyTorch_version: 1.10.1+cu102
- pytorch-lightning: 1.5.8
- tqdm: 4.62.3
- System:
- OS: Linux
- architecture:
- 64bit
- processor: x86_64
- python: 3.9.1
- version: #1 SMP Sun Apr 24 10:03:06 PDT 2022
/content/drive/MyDrive/cvt2distilgpt2/transmodal/ext/cvt/models/cls_cvt.py:558: SyntaxWarning: "is" with a literal. Did you mean "=="?
or pretrained_layers[0] is '*'
Traceback (most recent call last):
  File "/content/drive/MyDrive/cvt2distilgpt2/main.py", line 214, in <module>
    main(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/main.py", line 58, in main
    config = get_config(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 87, in get_config
    config["tokenizer"] = AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 550, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1747, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1773, in _from_pretrained
    slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1882, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/gpt2/tokenization_gpt2.py", line 186, in __init__
    with open(merges_file, encoding="utf-8") as merges_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Hi @jainnipun11,
It seems that you did not pull the latest version of the repo.
In your error, the AutoTokenizer call is on line 87:
File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 87, in get_config
config["tokenizer"] = AutoTokenizer.from_pretrained(
Whereas, in the latest repo, it is on line 93: https://github.com/aehrc/cvt2distilgpt2/blob/618bdba5de308d9cb2059c75158464b7ec5b408f/transmodal/config.py#L93
Please pull the latest repo and we can go from there.
Aaron.
Hey Aaron. This time I am encountering a different error altogether. I am using Python 3.9 and installed all the requirements using this.
Traceback (most recent call last):
  File "/content/drive/MyDrive/cvt2distilgpt2/main.py", line 9, in <module>
ValueError: transformers.models.auto.__spec__ is None
Hi @jainnipun11,
Try this: https://github.com/huggingface/transformers/issues/15212
Hey Aaron! I followed the issue you suggested; it resolved the error I was encountering earlier, but the tokenizer TypeError occurred again.
/content/drive/MyDrive/cvt2distilgpt2/transmodal/utils.py:39: UserWarning: Workstation configuration for b563c33267e1 does not exist. Using default configuration: num_workers=5, total_gpus=1, total_memory=16
  warnings.warn(f"Workstation configuration for {socket.gethostname()} does not exist. Using default "
- CUDA:
- GPU:
- Tesla V100-SXM2-16GB
- available: True
- version: 10.2
- GPU:
- Packages:
- numpy: 1.22.1
- pyTorch_debug: False
- pyTorch_version: 1.10.1+cu102
- pytorch-lightning: 1.5.8
- tqdm: 4.62.3
- System:
- OS: Linux
- architecture:
- 64bit
- processor: x86_64
- python: 3.9.1
- version: #1 SMP Sun Apr 24 10:03:06 PDT 2022
Traceback (most recent call last):
  File "/content/drive/MyDrive/cvt2distilgpt2/main.py", line 214, in <module>
    main(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/main.py", line 58, in main
    config = get_config(clargs)
  File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 90, in get_config
    config["tokenizer"] = AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 550, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1747, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1773, in _from_pretrained
    slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1882, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/gpt2/tokenization_gpt2.py", line 186, in __init__
    with open(merges_file, encoding="utf-8") as merges_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Thank you. Nipun
Hi Nipun,
I will look into this tomorrow, but in the meantime could you please try this GPT2 example in the same environment and let me know if it works? https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.GPT2LMHeadModel.forward.example
Thanks for your patience, Aaron.