@cmikeh2 this is another error with zero_stage=3
I deleted `free_param(param)` at line 1115 of `deepspeed/runtime/zero/partition_parameters.py` and it seems to work, but I don't know whether that is the right fix. For example, I then ran into another CUDA out-of-memory...
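For reference, a less invasive way to try the same workaround is to monkey-patch the call instead of editing the installed package. This is a debugging sketch only: it assumes `free_param` is a module-level function in `deepspeed/runtime/zero/partition_parameters.py` (as the line reference above suggests), and skipping the free is exactly the kind of thing that raises GPU memory pressure, which would be consistent with the OOM that followed.

```python
# Debugging sketch only: no-op DeepSpeed's free_param instead of deleting the
# call in the source. Assumes free_param is a module-level function in
# partition_parameters.py, as referenced above.
import deepspeed.runtime.zero.partition_parameters as pp

def _noop_free_param(param):
    # Skipping the free keeps the partitioned parameter's storage alive,
    # which may avoid the crash but increases GPU memory pressure (a
    # plausible cause of the CUDA OOM seen afterwards).
    pass

pp.free_param = _noop_free_param  # apply before deepspeed.initialize(...)
```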
Here are the pre-processed fine-tuning data; each archive is made up of pickle files and can be used directly. The SpokenWOZ and pre-training data will come a bit later:

https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/iemocap.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mintrec.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mosei.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mosi.tgz
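A minimal sketch of pulling one of these archives and loading it. The extracted file names and the pickle schema are assumptions, so inspect the archive contents first:

```python
# Minimal sketch: download and unpack one fine-tuning archive, then load a
# pickle from it. The file name "train.pkl" is hypothetical; list the
# extracted directory to see the real layout.
import pickle
import tarfile
import urllib.request

url = "https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mosi.tgz"
urllib.request.urlretrieve(url, "mosi.tgz")

with tarfile.open("mosi.tgz", "r:gz") as tar:
    print(tar.getnames())  # inspect the actual file layout
    tar.extractall("mosi")

with open("mosi/train.pkl", "rb") as f:  # hypothetical path
    data = pickle.load(f)
print(type(data))
```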
Is your transformers version 4.18? In principle that should be fine, since 4.28 also runs on my end. I searched the whole codebase and couldn't find this `cache_dir` anywhere. Is your traceback cut off? I can't tell from it where in the code the error occurred.
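To gather the two pieces of information asked for above, a quick check of the installed transformers version plus a repo-wide search for `cache_dir` (the search call is just an illustration of what was suggested):

```python
# Quick diagnostics: print the installed transformers version and search the
# working tree for "cache_dir" (equivalent to `grep -rn cache_dir .`).
import subprocess
import transformers

print(transformers.__version__)
result = subprocess.run(["grep", "-rn", "cache_dir", "."],
                        capture_output=True, text=True)
print(result.stdout or "no occurrences of cache_dir found")
```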
Oh, the source code doesn't seem to include a setup for downstream fine-tuning directly from a WavLM and a RoBERTa; it has to start from a pretrained model. You can try changing that yourself; it should be fairly simple. Just follow how ATForPretraining is written.
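A rough sketch of what such a modification could look like, assuming standard transformers classes. `ATForDownstream`, the checkpoint names, and the fusion head are all hypothetical; the real code would mirror however ATForPretraining builds its encoders.

```python
# Hypothetical sketch: a downstream model initialized directly from a WavLM
# and a RoBERTa checkpoint rather than from the project's pretrained model.
import torch
from transformers import RobertaModel, WavLMModel

class ATForDownstream(torch.nn.Module):  # hypothetical class name
    def __init__(self, num_labels: int):
        super().__init__()
        # Checkpoint names are assumptions; substitute the ones you use.
        self.audio_encoder = WavLMModel.from_pretrained("microsoft/wavlm-base-plus")
        self.text_encoder = RobertaModel.from_pretrained("roberta-base")
        hidden = (self.audio_encoder.config.hidden_size
                  + self.text_encoder.config.hidden_size)
        self.classifier = torch.nn.Linear(hidden, num_labels)

    def forward(self, input_values, input_ids, attention_mask):
        # Mean-pool the audio states; take RoBERTa's <s> token for text.
        a = self.audio_encoder(input_values).last_hidden_state.mean(dim=1)
        t = self.text_encoder(input_ids,
                              attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.classifier(torch.cat([a, t], dim=-1))
```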