xlstm_large/load_from_pretrained.ipynb: xLSTMLargeConfig.__init__() got an unexpected keyword argument '_name_or_path'
After two weeks of experimentation and hours of comparing environments, I found out that pip install -e . is the problem.
Several days in, I had switched my env away from pip install xlstm to the editable install, because I wanted my 2-line code change for multi-GPU to propagate from the xlstm-fork to the env.
I still need to find out what exactly differs between the two install variants.
So this does work with the frozen env!
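One way to start narrowing down the difference between the two install variants is to check, in each env, which files Python actually imports and which distribution version pip has registered. The following is a minimal sketch using only the standard library; the helper name describe_install is made up for illustration and is not part of xlstm.

```python
import importlib.metadata
import importlib.util


def describe_install(package: str) -> dict:
    """Report where a top-level package is imported from and its installed version.

    Hypothetical helper for comparing two environments; not part of xlstm.
    """
    # find_spec returns None for a top-level package that cannot be found.
    spec = importlib.util.find_spec(package)
    try:
        version = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        # No installed distribution metadata under this name.
        version = None
    return {
        "package": package,
        "origin": spec.origin if spec is not None else None,
        "version": version,
    }
```

Running print(describe_install("xlstm")) in both envs should show whether the import resolves to the fork's working tree or to site-packages, and whether the registered versions differ. It does not explain the failure by itself, but it rules out importing a different copy of the code than expected.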
2 of 2 .ipynb no-gos
Once again worried that I had not read up enough to succeed, I found that this notebook doesn't work either.
While trying to port pretty much the only working streaming inference from Torch to Triton, I ran into a cascade of errors around config.yaml.
xlstm_large/load_from_pretrained.ipynb cell 4:
model = load_from_pretrained(checkpoint_path=CHECKPOINT_PATH)
TypeError: xLSTMLargeConfig.__init__() got an unexpected keyword argument '_name_or_path'
46 config = OmegaConf.load(checkpoint_path / "config.yaml")
---> 48 mlstm_config = xLSTMLargeConfig(**config)
First, config.yaml exists neither in the model dir nor anywhere else, so I did a quick conversion from config.json.
Then it rejected '_name_or_path', the first key.
It turns out it rejects about half of the two dozen keys, but trimming config.yaml down didn't help either, and neither did dynamically passing only the wanted key-value pairs.
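For reference, the key trimming described above can be sketched as follows, assuming the config class is a dataclass. DemoConfig is a hypothetical stand-in with an invented field list, not the real xLSTMLargeConfig, and split_known_keys is an illustrative helper. This only addresses the unexpected-keyword error; as noted at the top, it did not fix the underlying problem for me.

```python
from dataclasses import dataclass, fields


@dataclass
class DemoConfig:
    # Hypothetical stand-in for xLSTMLargeConfig; the real field list differs.
    embedding_dim: int = 0
    num_heads: int = 0
    num_blocks: int = 0
    vocab_size: int = 0


def split_known_keys(raw: dict, config_cls) -> tuple[dict, dict]:
    """Split raw into keys the dataclass constructor accepts and the rest."""
    allowed = {f.name for f in fields(config_cls) if f.init}
    known = {k: v for k, v in raw.items() if k in allowed}
    dropped = {k: v for k, v in raw.items() if k not in allowed}
    return known, dropped


# Example input shaped like a Hugging Face style config.json (values invented).
raw = {
    "_name_or_path": "some/model",
    "embedding_dim": 4096,
    "num_heads": 8,
    "torch_dtype": "float32",
}
known, dropped = split_known_keys(raw, DemoConfig)
config = DemoConfig(**known)
print("dropped keys:", sorted(dropped))
```

When the config comes from OmegaConf.load, OmegaConf.to_container(config) turns it into a plain dict that can be passed to a helper like this. Printing the dropped keys makes it easy to see which entries of config.json the class does not know about.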
Error message:
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 model = load_from_pretrained(checkpoint_path=CHECKPOINT_PATH)

File ~/dl/xlstm-fork/notebooks/xlstm_large/../../xlstm/xlstm_large/from_pretrained.py:48, in load_from_pretrained(checkpoint_path, return_last_states, chunkwise_kernel_name, sequence_kernel_name, step_kernel_name, backend_mode, chunk_size)
     45 sharded_path = checkpoint_path / f"model_{n}.safetensors"
     46 config = OmegaConf.load(checkpoint_path / "config.yaml")
---> 48 mlstm_config = xLSTMLargeConfig(**config)
     49 # Note: The default weight mode is single.
     50 # For fused weight mode convert the weights using convert_single_weights_to_fused_weights.
     51 mlstm_config.weight_mode = "single"

TypeError: xLSTMLargeConfig.__init__() got an unexpected keyword argument '_name_or_path'