Input dimension error on "CNN + MNIST"

Open royukira opened this issue 2 years ago • 3 comments

Hi, when I tried to run "CNN + MNIST" with the mqtt_s3_fedavg_mnist_lr_example, I hit a bug caused by the shape of the input tensor.

Error

Traceback (most recent call last):
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/core/distributed/communication/mqtt_s3/mqtt_s3_multi_clients_comm_manager.py", line 224, in _on_message
    self._on_message_impl(client, userdata, msg)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/core/distributed/communication/mqtt_s3/mqtt_s3_multi_clients_comm_manager.py", line 220, in _on_message_impl
    self._notify(payload_obj)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/core/distributed/communication/mqtt_s3/mqtt_s3_multi_clients_comm_manager.py", line 184, in _notify
    observer.receive_message(msg_type, msg_params)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/core/distributed/client/client_manager.py", line 103, in receive_message
    handler_callback_func(msg_params)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/cross_silo/horizontal/fedml_client_manager.py", line 77, in handle_message_init
    self.__train()
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/cross_silo/horizontal/fedml_client_manager.py", line 164, in __train
    weights, local_sample_num = self.trainer.train(self.round_idx)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/cross_silo/horizontal/fedml_trainer.py", line 41, in train
    self.trainer.train(self.train_local, self.device, self.args)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/cross_silo/horizontal/trainer/my_model_trainer_classification.py", line 42, in train
    log_probs = model(x)
  File "/home/fedml/anaconda3/envs/fedml/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fedml/FedML/python/examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line/../../../../fedml/model/cv/cnn.py", line 134, in forward
    x = self.conv2d_1(x)
  File "/home/fedml/anaconda3/envs/fedml/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fedml/anaconda3/envs/fedml/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/fedml/anaconda3/envs/fedml/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 3, 3], but got 3-dimensional input of size [10, 1, 784] instead

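According to the traceback, the MNIST dataloader doesn't reshape each sample x from size [784] to [28, 28], so the CNN receives a 3-dimensional batch of size [10, 1, 784] instead of the 4-dimensional [10, 1, 28, 28] that conv2d expects. I have temporarily fixed the bug in my forked repo; please review at: 551f551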
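
For anyone hitting the same error, here is a minimal sketch of the kind of reshape described above. It is not the contents of commit 551f551, which is not shown in this thread. The helper name `to_image_batch` is hypothetical, and the `Conv2d` layer only mirrors the weight shape [32, 1, 3, 3] reported in the traceback; it is not FedML's actual `cnn.py` model.

```python
import torch
import torch.nn as nn


def to_image_batch(x: torch.Tensor, height: int = 28, width: int = 28) -> torch.Tensor:
    """Reshape flattened samples [..., height * width] to [N, 1, height, width].

    Hypothetical helper for illustration; assumes single-channel images.
    """
    if x.dim() == 4:
        return x  # already image-shaped, leave untouched
    return x.reshape(-1, 1, height, width)


# Stand-in layer with the same weight shape as in the traceback: [32, 1, 3, 3].
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)

flat = torch.zeros(10, 1, 784)   # the input shape reported in the error message
images = to_image_batch(flat)    # 4-dimensional batch that conv2d accepts
out = conv(images)               # no longer raises the RuntimeError

print(images.shape, out.shape)
```

Whether the reshape belongs in the dataloader or at the top of the model's `forward` is a design choice; this thread does not say which one the fork uses.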

royukira • May 07 '22 11:05

Hi

Can you run FedML with CNN + MNIST using python torch_fedavg_mnist_lr_step_by_step_example.py --cf fedml_config.yaml or python torch_fedavg_mnist_lr_custum_data_and_model_example.py --cf fedml_config.yaml?

Could you show me how to edit the source code? Thank you

trannict • May 13 '22 07:05

@Nicole456 please double-check whether the code still has this issue. Do we need to merge 551f551?

chaoyanghe • Aug 19 '22 17:08

Hi, the example under python/examples/simulation/sp_fedavg_mnist_cnn_example might meet your needs. You can run FedML with CNN + MNIST by running python torch_fedavg_mnist_cnn_step_by_step_example.py --cf fedml_config.yaml from that directory.

There are also many examples combining FL algorithms (fedavg, fedopt, etc.), models, and datasets under python/examples/simulation, so you can refer to them and customize them according to your needs.

Nicole456 • Aug 24 '22 06:08

Closing this issue due to inactivity.

fedml-dimitris • Oct 24 '23 20:10