JunZhan2000 issues

Results: 11 issues of JunZhan2000

Error when using multi-GPU training: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

I am trying to train a Chinese conformer model. When I train on four 2080 Ti GPUs, an error occurs partway through an epoch: CUDA_ERROR_ILLEGAL_ADDRESS: an...

bug

more info needed

The issue with training results.

Hello, thank you very much for your code and videos! I'm using this code repository to train on the flowers dataset with a batch size of 32 for 200 epochs,...

【Training Error】 IndexError: list index out of range

Traceback (most recent call last): | 0/1 [00:00

torch.load() returns a dict during inference

When I run the inference code, torch.load() returns a dict while loading the model, which raises this error: > File "vall-e/vall_e/__main__.py", line 30, in main ar = torch.load(args.ar_ckpt).to(args.device) AttributeError: 'dict' object has no attribute...
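
A note on the error above: torch.load() returns whatever object was saved, so a checkpoint written as a dict (weights, possibly wrapped with training metadata) has no `.to()` method. The sketch below is one hedged way to unwrap such a checkpoint before passing it to `load_state_dict`. The wrapper key names in `CANDIDATE_KEYS` and the `build_model()` constructor mentioned in the comments are assumptions for illustration, not names taken from the vall-e repository.

```python
from collections.abc import Mapping

# Hypothetical wrapper keys; a real checkpoint may use different names.
CANDIDATE_KEYS = ("model", "state_dict", "module")


def extract_state_dict(checkpoint):
    """Return the weights mapping from a dict-like checkpoint.

    If the checkpoint wraps the weights under one of CANDIDATE_KEYS,
    return the inner mapping; otherwise assume the checkpoint itself
    is already the weights mapping and return it unchanged.
    """
    if not isinstance(checkpoint, Mapping):
        raise TypeError(
            "expected a dict-like checkpoint, got " + type(checkpoint).__name__
        )
    for key in CANDIDATE_KEYS:
        inner = checkpoint.get(key)
        if isinstance(inner, Mapping):
            return inner
    return checkpoint


# Intended use with PyTorch (not executed here; build_model is hypothetical):
#   ckpt = torch.load(args.ar_ckpt, map_location="cpu")
#   if isinstance(ckpt, dict):
#       model = build_model()
#       model.load_state_dict(extract_state_dict(ckpt))
#   else:
#       model = ckpt  # the whole module object was pickled
#   ar = model.to(args.device)
```

Whether this applies depends on how the checkpoint was saved; inspecting `ckpt.keys()` first shows which case you are in.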

What data did you use for training?

I'm not familiar with the music domain. Are there any open-source datasets available?

MosIT data

Great job! When will you open-source the MosIT data?

Is there a plan to open-source the instruction-tuning data?

Multi-GPU training issues

Hello, thank you very much for your work. Could you provide code for multi-GPU or multi-node training?

xformers error: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs

> NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs: query : shape=(8, 1024, 1, 64) (torch.float32) key : shape=(8, 1024, 1, 64) (torch.float32) value : shape=(8, 1024, 1, 64) (torch.float32)...

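[QA] Can InternEvo load pretrained llama2 parameters?

### Describe the problem Can InternEvo load pretrained llama2 parameters and then continue pretraining? Should the weights be in the hf format or the original format?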

question