
README example fails: 'NoneType' object has no attribute 'tokenizer'

Open · ayushnoori opened this issue 2 years ago · 1 comment

Hi there, I followed the README example setup script exactly and received the following error. Could you please help me resolve it?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 m = TransformerLanguageModel.from_pretrained(
      2         "checkpoints/Pre-trained-BioGPT", 
      3         "checkpoint.pt", 
      4         "data",
      5         tokenizer='moses', 
      6         bpe='fastbpe', 
      7         bpe_codes="data/bpecodes",
      8         min_len=100,
      9         max_len_b=1024)

File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/models/fairseq_model.py:261, in BaseFairseqModel.from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
    238 """
    239 Load a :class:`~fairseq.models.FairseqModel` from a pre-trained model
    240 file. Downloads and caches the pre-trained model file if needed.
   (...)
    257         model archive path.
    258 """
    259 from fairseq import hub_utils
--> 261 x = hub_utils.from_pretrained(
    262     model_name_or_path,
    263     checkpoint_file,
    264     data_name_or_path,
    265     archive_map=cls.hub_models(),
    266     **kwargs,
    267 )
    269 cls.upgrade_args(x["args"])
    271 logger.info(x["args"])

File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/hub_utils.py:70, in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
     67 if "user_dir" in kwargs:
     68     utils.import_user_module(argparse.Namespace(user_dir=kwargs["user_dir"]))
---> 70 models, args, task = checkpoint_utils.load_model_ensemble_and_task(
     71     [os.path.join(model_path, cpt) for cpt in checkpoint_file.split(os.pathsep)],
     72     arg_overrides=kwargs,
     73 )
     75 return {
     76     "args": args,
     77     "task": task,
     78     "models": models,
     79 }

File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/checkpoint_utils.py:279, in load_model_ensemble_and_task(filenames, arg_overrides, task, strict, suffix, num_shards)
    277 if not PathManager.exists(filename):
    278     raise IOError("Model file not found: {}".format(filename))
--> 279 state = load_checkpoint_to_cpu(filename, arg_overrides)
    280 if shard_idx == 0:
    281     args = state["args"]

File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/checkpoint_utils.py:231, in load_checkpoint_to_cpu(path, arg_overrides)
    229 if arg_overrides is not None:
    230     for arg_name, arg_val in arg_overrides.items():
--> 231         setattr(args, arg_name, arg_val)
    232 state = _upgrade_state_dict(state)
    233 return state

AttributeError: 'NoneType' object has no attribute 'tokenizer'

My environment variables are set as follows:

(test_env) an583@PHS030015 project_dir % conda env config vars list
MOSES = /Users/ayush/project_dir/mosesdecoder
FASTBPE = /Users/ayush/project_dir/fastBPE

I would be grateful for any assistance in resolving this. Thanks!

ayushnoori avatar Feb 18 '23 15:02 ayushnoori

I fixed this by reinstalling fairseq 0.12.0. The package had been downgraded to version 0.10.0 while installing sacremoses.

fairseq version == 0.12.0:
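As a rough sketch of the steps described above (package name and version taken from this thread; exact commands depend on your environment and package manager):

pip show fairseq                # check which fairseq version is currently installed
pip install fairseq==0.12.0    # pin fairseq back to 0.12.0 if it was downgraded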

brigmecham avatar Mar 29 '23 19:03 brigmecham