xlstm_large/load_from_pretrained.ipynb xLSTMLargeConfig.__init__() unexpected kwd '_name_or_path'

Open ai-bits opened this issue 1 year ago • 0 comments

After two weeks of experimentation and hours of comparison, I found that pip install -e . is the problem. Several days in, I had switched my env away from pip install xlstm, because I wanted the 2-line code change for multi-GPU to propagate from the xlstm-fork to the env. I still need to find out what exactly differs between the two install variants. So this does work with the frozen env!
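One way to start narrowing down the difference between the two install variants is to check which copy of the package Python actually imports in each env. The following is only a minimal sketch: `describe_install` is a hypothetical helper (not part of xlstm), and it assumes the distribution name matches the module name unless told otherwise.

```python
import importlib
import importlib.metadata
from pathlib import Path
from typing import Optional


def describe_install(module_name: str, dist_name: Optional[str] = None) -> dict:
    """Report where a module is imported from and which version is registered.

    Hypothetical helper for comparing environments; not part of xlstm.
    """
    module = importlib.import_module(module_name)
    origin = getattr(module, "__file__", None)
    try:
        # Assumption: the distribution name equals the module name if not given.
        version = importlib.metadata.version(dist_name or module_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {
        "module": module_name,
        # An editable install usually resolves into the source checkout,
        # a regular install into the environment's package directory.
        "origin": str(Path(origin).resolve()) if origin else None,
        "version": version,
    }
```

Running `print(describe_install("xlstm"))` in both envs and comparing the `origin` paths would show whether the notebook picks up the fork checkout or the installed release, which is a first hint before diffing the two source trees.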

2 of 2 .ipynb no-gos. Again, while worrying that I had not read up enough to succeed, I found that this one doesn't work either.

While trying to port pretty much the only working streaming inference from Torch to Triton, I ran into a cascade of errors around config.yaml.

xlstm_large/load_from_pretrained.ipynb cell 4: model = load_from_pretrained(checkpoint_path=CHECKPOINT_PATH)

TypeError: xLSTMLargeConfig.__init__() got an unexpected keyword argument '_name_or_path'

     46 config = OmegaConf.load(checkpoint_path / "config.yaml")
---> 48 mlstm_config = xLSTMLargeConfig(**config)

First, config.yaml does not exist in the model dir or anywhere else, so I did a quick conversion from config.json. Then it rejected '_name_or_path', the first key. It turns out it rejects about half of the two dozen keys, but trimming down config.yaml didn't help either, and neither did dynamically feeding in only the wanted key-value pairs.
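For reference, the "feed only the wanted keys" idea can be sketched as below. This is an illustration under assumptions, not the library's loader: it assumes the config class is a dataclass, uses a hypothetical stand-in `DemoConfig` with illustrative field names instead of the real xLSTMLargeConfig, and reads config.json directly rather than a converted config.yaml. It only addresses the unexpected-keyword TypeError; as noted above, trimming the keys alone did not get the model to load.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class DemoConfig:
    # Hypothetical stand-in for xLSTMLargeConfig; field names are illustrative.
    embedding_dim: int
    num_heads: int
    num_blocks: int
    vocab_size: int


def filter_kwargs(config_cls, raw):
    """Split raw into (kwargs the dataclass accepts, sorted list of dropped keys)."""
    accepted = {f.name for f in fields(config_cls) if f.init}
    kept = {k: v for k, v in raw.items() if k in accepted}
    dropped = sorted(k for k in raw if k not in accepted)
    return kept, dropped


def load_config_json(path, config_cls):
    """Read a JSON config file and build config_cls from the accepted keys only."""
    raw = json.loads(Path(path).read_text())
    kept, dropped = filter_kwargs(config_cls, raw)
    return config_cls(**kept), dropped
```

Printing the `dropped` list makes it visible which keys from config.json the config class does not know, which may help when comparing against the keys the loader expects.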

Error message:

TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 model = load_from_pretrained(checkpoint_path=CHECKPOINT_PATH)

File ~/dl/xlstm-fork/notebooks/xlstm_large/../../xlstm/xlstm_large/from_pretrained.py:48, in load_from_pretrained(checkpoint_path, return_last_states, chunkwise_kernel_name, sequence_kernel_name, step_kernel_name, backend_mode, chunk_size)
     45 sharded_path = checkpoint_path / f"model_{n}.safetensors"
     46 config = OmegaConf.load(checkpoint_path / "config.yaml")
---> 48 mlstm_config = xLSTMLargeConfig(**config)
     49 # Note: The default weight mode is single.
     50 # For fused weight mode convert the weights using convert_single_weights_to_fused_weights.
     51 mlstm_config.weight_mode = "single"

TypeError: xLSTMLargeConfig.__init__() got an unexpected keyword argument '_name_or_path'

Mar 01 '25 08:03 ai-bits
