checkpoint loading issue

Open ylq1996 opened this issue 1 year ago • 3 comments

I downloaded the checkpoint from the provided repo on Hugging Face. However, when I ran the code, loading the checkpoint with tfm.load_from_checkpoint('checkpoint') failed with this error:

ValueError: Dimension to ungroup is not divisible by its index sizes. Group "(np)" expects size 228, but its indices "p" have combined specified size 32.
ERROR conda.cli.main_run:execute(124): conda run python /opt/project/test.py failed. (See above for error)

ylq1996, May 10 '24 09:05

This looks like an issue that happened while jitting the model, and it might be because the context_len was not a multiple of 32. What were the params you used to initialize the model instance?

Also updated README to make this requirement explicit.

siriuz42, May 10 '24 17:05
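
As a minimal sketch of the multiple-of-32 requirement described above, the helper below rounds a requested context length up to the next multiple of 32. The names round_up_context_len and PATCH_LEN are hypothetical and not part of timesfm; the value 32 is taken from the reply and the error message, and this is only an illustration of the arithmetic.

```python
# Hypothetical helper, not part of timesfm.
# Assumption: context_len must be a multiple of 32, as stated in the reply above.
PATCH_LEN = 32


def round_up_context_len(context_len: int, patch_len: int = PATCH_LEN) -> int:
    """Return the smallest multiple of patch_len that is >= context_len."""
    if context_len <= 0:
        raise ValueError("context_len must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-context_len // patch_len) * patch_len


if __name__ == "__main__":
    for requested in (100, 128, 200):
        print(requested, "->", round_up_context_len(requested))
```

With this helper, a requested context_len of 100 becomes 128, which is the value that resolved the error later in this thread.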

I had set context_len=100, which led to this error. After changing it to context_len=128, the error went away. However, I then ran into a new issue. I stored the checkpoint weights in ['/usr/src/app/checkpoint'], and calling tfm.load_from_checkpoint('/usr/src/app/') results in:

ValueError: No checkpoints were found in directory checkpoint_dir=PosixGPath('/usr/src/app').

I found that no step is resolved in this part of the code:

    if step is None:
      step = checkpoint_manager.latest_step()
      if step is None:
        raise ValueError(
            f'No checkpoints were found in directory {checkpoint_dir=!r}'
        )

ylq1996, May 13 '24 03:05

I ran into the same problem earlier. My "checkpoint" path was [/home/dedong/huggingface/timesfm/checkpoints/checkpoint_1100000/state]. Changing the path to [/home/dedong/huggingface/timesfm/checkpoints] loaded the model successfully.

zhaokui001, May 31 '24 09:05
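
The path fix above suggests that the loader wants the directory that contains the checkpoint_<step> folder, not the step folder or anything inside it. The sketch below illustrates that idea under the assumption that step folders are named checkpoint_<digits>, as in the path quoted in this thread. The functions latest_step_in and find_checkpoint_root are hypothetical helpers written for illustration; they are not part of timesfm or of the checkpoint library it uses.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: step folders are named "checkpoint_<digits>",
# e.g. checkpoint_1100000 as in the path quoted in this thread.
STEP_DIR = re.compile(r"^checkpoint_(\d+)$")


def latest_step_in(directory: Path) -> Optional[int]:
    """Return the highest step found directly under directory, or None."""
    if not directory.is_dir():
        return None
    steps = []
    for child in directory.iterdir():
        match = STEP_DIR.match(child.name)
        if child.is_dir() and match:
            steps.append(int(match.group(1)))
    return max(steps, default=None)


def find_checkpoint_root(path) -> Optional[Path]:
    """Walk upward from path to the first directory holding a step folder.

    This only handles a path that points too deep (for example at .../state).
    It returns None if neither path nor any of its parents qualifies.
    """
    start = Path(path)
    for candidate in (start, *start.parents):
        if latest_step_in(candidate) is not None:
            return candidate
    return None
```

If find_checkpoint_root returns a directory, that directory would be the one to pass to tfm.load_from_checkpoint, matching the fix reported above. If it returns None, the given path is probably above the checkpoint folder rather than below it, and the directory layout needs to be checked by hand.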