molfeat
Can't retrieve model ChemGPT-1.2B from the store!
Is there an existing issue for this?
- [X] I have searched the existing issues and found nothing
Bug description
I've been trying to use ChemGPT-1.2B, but I'm getting this error: `Can't retrieve model ChemGPT-1.2B from the store !`
Just FYI, I have successfully used the following models, so the issue appears to be specific to ChemGPT-1.2B:
- GPT2-Zinc480M-87M
- ChemBERTa-77M-MLM
- ChemGPT-19M
How to reproduce the bug
from molfeat.trans.pretrained import PretrainedHFTransformer

smiles = ["CCO", "c1ccccc1"]  # placeholder: any list of SMILES strings
transformer = PretrainedHFTransformer(kind='ChemGPT-1.2B', notation='selfies', dtype=float)
features = transformer(smiles)
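In case it helps triage: I also tried clearing the cached artifact before retrying. A sketch of that (assumption: the cache path is the macOS one reported in the error below and will differ on other platforms):

```python
import pathlib
import shutil

# Cache location taken from the error message (macOS); this path is an
# assumption and will differ on other platforms.
cache_dir = pathlib.Path.home() / "Library" / "Caches" / "molfeat" / "ChemGPT-1.2B"

# Remove any partially downloaded / mismatched artifact so the next
# transformer call re-downloads it from the store.
if cache_dir.exists():
    shutil.rmtree(cache_dir)
```

Since the error message says the mismatched artifact "has been removed" anyway, this doesn't seem to change anything, which suggests the problem is with the remote artifact rather than a stale local copy.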
Error messages and logs
0%| | 0.00/736 [00:00<?, ?B/s]
0%| | 0/7 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ModelStoreError Traceback (most recent call last)
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:100, in PretrainedStoreModel._load_or_raise(cls, name, download_path, store, **kwargs)
99 modelcard = store.search(name=name)[0]
--> 100 artifact_dir = store.download(modelcard, download_path, **kwargs)
101 except Exception:
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/modelstore.py:239, in ModelStore.download(self, modelcard, output_dir, chunk_size, force)
238 mapper.fs.delete(output_dir, recursive=True)
--> 239 raise ModelStoreError(
240 f"""The destination artifact at {model_dest_path} has a different sha256sum ({cache_sha256sum}) """
241 f"""than the Remote artifact sha256sum ({modelcard.sha256sum}). The destination artifact has been removed !"""
242 )
244 return output_dir
ModelStoreError: The destination artifact at /Users/chunj/Library/Caches/molfeat/ChemGPT-1.2B/model.save has a different sha256sum (4d8819f7c8c91ba94ba446d32f29342360d62971a9fa37c8cab2e31f9c3fc4c5) than the Remote artifact sha256sum (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855). The destination artifact has been removed !
During handling of the above exception, another exception occurred:
ModelStoreError Traceback (most recent call last)
Cell In[6], line 1
----> 1 features = transformer(smiles)
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/base.py:384, in MoleculeTransformer.__call__(self, mols, enforce_dtype, ignore_errors, **kwargs)
359 def __call__(
360 self,
361 mols: List[Union[dm.Mol, str]],
(...)
364 **kwargs,
365 ):
366 r"""
367 Calculate features for molecules. Using __call__, instead of transform.
368 If ignore_error is True, a list of features and valid ids are returned.
(...)
382
383 """
--> 384 features = self.transform(mols, ignore_errors=ignore_errors, enforce_dtype=False, **kwargs)
385 ids = np.arange(len(features))
386 if ignore_errors:
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/sklearn/utils/_set_output.py:316, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
314 @wraps(f)
315 def wrapped(self, X, *args, **kwargs):
--> 316 data_to_wrap = f(self, X, *args, **kwargs)
317 if isinstance(data_to_wrap, tuple):
318 # only wrap the first output for cross decomposition
319 return_tuple = (
320 _wrap_data_with_container(method, data_to_wrap[0], X, self),
321 *data_to_wrap[1:],
322 )
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/base.py:207, in PretrainedMolTransformer.transform(self, smiles, **kwargs)
204 mols = [mols[i] for i in ind_to_compute]
206 if len(mols) > 0:
--> 207 converted_mols = self._convert(mols, **kwargs)
208 out = self._embed(converted_mols, **kwargs)
210 if not isinstance(out, list):
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:367, in PretrainedHFTransformer._convert(self, inputs, **kwargs)
358 def _convert(self, inputs: list, **kwargs):
359 """Convert the list of molecules to the right format for embedding
360
361 Args:
(...)
365 processed: pre-processed input list
366 """
--> 367 self._preload()
369 if isinstance(inputs, (str, dm.Mol)):
370 inputs = [inputs]
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:326, in PretrainedHFTransformer._preload(self)
324 def _preload(self):
325 """Perform preloading of the model from the store"""
--> 326 super()._preload()
327 self.featurizer.model.to(self.device)
328 self.featurizer.max_length = self.max_length
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/base.py:90, in PretrainedMolTransformer._preload(self)
88 """Preload the pretrained model for later queries"""
89 if self.featurizer is not None and isinstance(self.featurizer, PretrainedModel):
---> 90 self.featurizer = self.featurizer.load()
91 self.preload = True
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:209, in HFModel.load(self)
207 if self._model is not None:
208 return self._model
--> 209 download_output_dir = self._artifact_load(
210 name=self.name, download_path=self.cache_path, store=self.store
211 )
212 model_path = dm.fs.join(download_output_dir, self.store.MODEL_PATH_NAME)
213 self._model = HFExperiment.load(model_path)
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:81, in PretrainedStoreModel._artifact_load(cls, name, download_path, **kwargs)
79 if not dm.fs.exists(download_path):
80 cls._load_or_raise.cache_clear()
---> 81 return cls._load_or_raise(name, download_path, **kwargs)
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:103, in PretrainedStoreModel._load_or_raise(cls, name, download_path, store, **kwargs)
101 except Exception:
102 mess = f"Can't retrieve model {name} from the store !"
--> 103 raise ModelStoreError(mess)
104 return artifact_dir
ModelStoreError: Can't retrieve model ChemGPT-1.2B from the store !
Environment
Current environment
molfeat 0.10.1
pytorch 2.4.0
rdkit 2024.03.5
macOS Ventura 13.6.7
scikit-learn 1.5.2
Used conda to install molfeat
Additional context
I'm using my local laptop + Jupyter Lab.
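One observation that may help: the "Remote artifact sha256sum" in the error (`e3b0c44298fc...`) is the SHA-256 of empty input, which suggests the store's recorded checksum (or the remote artifact itself) may be empty or corrupt for this model. Quick check:

```python
import hashlib

# SHA-256 of zero bytes matches the "Remote artifact sha256sum"
# reported in the traceback above.
print(hashlib.sha256(b"").hexdigest())
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```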