
Unable to Train Google Speech Dataset

Open implosion07 opened this issue 1 year ago • 6 comments

What happened + What you expected to happen

We are unable to train on the Google Speech dataset. During training, we get the error shown below. Please help us out.

Screenshot from 2023-07-15 12-13-29

Versions / Dependencies

We are running on Red Hat servers with Python 3.7.

Reproduction script

The problem seems to be in the ResNet code: the error is caused by a mismatch in the number of channels.

Issue Severity

High: It blocks me from completing my task.

implosion07 avatar Jul 15 '23 06:07 implosion07

Thanks for trying FedScale. The model you are using may be the one from torchvision. Please refer to these specialized models for the speech task instead: https://github.com/SymbioticLab/FedScale/blob/master/fedscale/cloud/fllibs.py#L129-L162
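For readers unfamiliar with this class of error, here is a minimal sketch of why a vision model can reject speech input. It uses plain Python and does not call FedScale or PyTorch. The function name `conv_channel_mismatch`, the batch shape, and the single-channel stem are assumptions made for illustration; the only fact relied on is that a torchvision ResNet's first convolution expects 3-channel (RGB) input.

```python
def conv_channel_mismatch(input_shape, weight_shape, groups=1):
    """Return an error message if a 2-D convolution would reject the input, else None.

    input_shape:  (batch, channels, height, width)
    weight_shape: (out_channels, in_channels // groups, kernel_h, kernel_w)
    """
    expected = weight_shape[1] * groups
    actual = input_shape[1]
    if actual != expected:
        return f"expected input with {expected} channels, but got {actual} channels"
    return None


# Weight shape of a torchvision-style ResNet stem: 64 filters over 3 (RGB) channels.
rgb_stem = (64, 3, 7, 7)
# Hypothetical single-channel stem, as a speech-specific model might use.
mono_stem = (64, 1, 7, 7)
# Assumed shape of a batch of single-channel spectrograms (illustrative only).
speech_batch = (20, 1, 32, 32)

print(conv_channel_mismatch(speech_batch, rgb_stem))   # reports the mismatch
print(conv_channel_mismatch(speech_batch, mono_stem))  # None: shapes agree
```

If the traceback reports a mismatch of this kind, it suggests that the model actually instantiated at run time is not the speech-specific one, whatever the configuration file says.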

fanlai0990 avatar Jul 17 '23 10:07 fanlai0990

Thanks for the reply, @fanlai0990. However, we are using the same model as provided in the repository, and the same configuration as provided in the config file: https://github.com/SymbioticLab/FedScale/blob/master/benchmark/configs/speech/google_speech.yml In particular, the task is set to speech.

implosion07 avatar Jul 19 '23 06:07 implosion07

Hello, I'm encountering the same problem mentioned by @implosion07. I've used the model provided in the repository and followed the configuration exactly as specified in the config file https://github.com/SymbioticLab/FedScale/blob/master/benchmark/configs/speech/google_speech.yml. Unfortunately, I haven't been able to resolve the issue. I just wanted to add my experience here to raise awareness of the problem. Thanks in advance!

PanPapag avatar Jul 19 '23 07:07 PanPapag

Thanks for letting us know. I just finished a test run, and the code works well for me. Did you install FedScale from the source code?

Screen Shot 2023-07-19 at 4 16 48 PM

fanlai0990 avatar Jul 19 '23 08:07 fanlai0990

Thanks for checking, @fanlai0990. We are using the following configuration. Can you help us find the mistake, if there is one?

Screenshot from 2023-07-19 19-11-00

implosion07 avatar Jul 19 '23 13:07 implosion07

It's hard for us to identify the issue because we cannot reproduce it. For now: (1) the data_dir seems to differ slightly from the default one, so please double-check it; (2) roll back to the default configuration here and check whether the error persists.
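One way to carry out both checks is to compare the modified configuration against the default key by key. The sketch below assumes both configurations have already been loaded into flat Python dictionaries. The helper `diff_config`, the keys, and the values are hypothetical and are not taken from the actual FedScale config file.

```python
def diff_config(default, custom):
    """Return {key: (default_value, custom_value)} for every key whose value differs.

    A key that is absent on one side is reported with None on that side.
    """
    keys = sorted(set(default) | set(custom))
    return {
        key: (default.get(key), custom.get(key))
        for key in keys
        if default.get(key) != custom.get(key)
    }


# Hypothetical values, for illustration only.
default_conf = {"task": "speech", "data_dir": "/data/google_speech", "model": "resnet34"}
custom_conf = {"task": "speech", "data_dir": "/data/google_speech/", "model": "resnet34"}

for key, (expected, found) in diff_config(default_conf, custom_conf).items():
    print(f"{key}: default={expected!r} custom={found!r}")
```

Any key this reports is a candidate to roll back first; in the hypothetical values above, only the trailing slash in `data_dir` differs.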

fanlai0990 avatar Jul 19 '23 13:07 fanlai0990