One server and one client example in FedCV object detection
Hi,
I built the FedML platform using the Docker container provided by the authors.
To check the performance of a simple one-server, one-client example, I ran the run_server.sh and run_client.sh
scripts inside the object_detection/ directory. Then I started receiving the following output in the server and client terminals:
#Server side:
mqtt_s3.send_message: msg topic = fedml_yolov5_0_2
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:249:send_message] mqtt_s3.send_message: MQTT msg sent
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [fedml_server_manager.py:101:handle_messag_connection_ready] Connection ready for client2
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:140:on_connected] mqtt_s3.on_connect: server subscribes
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 5
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [server_manager.py:155:receive_message] receive_message. rank_id = 0, msg_type = 5.
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [fedml_server_manager.py:111:handle_message_client_status_update] self.client_online_mapping = {'1': True}
[FedML-Server(0) @device-id-0] [Thu, 21 Jul 2022 02:36:06] [INFO] [fedml_server_manager.py:126:handle_message_client_status_update] sender_id = 1, all_client_is_online = False
#Client side
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [mqtt_s3_multi_clients_comm_manager.py:148:on_connected] mqtt_s3.on_connect: client subscribes real_topic = fedml_yolov5_0_1, mid = 17, result = 0
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 6
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [fedml_client_master_manager.py:162:send_client_status] send_client_status
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [client_manager.py:157:send_message] Sending message (type 5) to server
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:33:03] [INFO] [mqtt_s3_multi_clients_comm_manager.py:277:send_message] mqtt_s3.send_message: MQTT msg sent
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [fedml_client_master_manager.py:62:handle_message_connection_ready] Connection is ready!
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [mqtt_s3_multi_clients_comm_manager.py:148:on_connected] mqtt_s3.on_connect: client subscribes real_topic = fedml_yolov5_0_1, mid = 19, result = 0
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 6
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [fedml_client_master_manager.py:162:send_client_status] send_client_status
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [client_manager.py:157:send_message] Sending message (type 5) to server
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:34:04] [INFO] [mqtt_s3_multi_clients_comm_manager.py:277:send_message] mqtt_s3.send_message: MQTT msg sent
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [fedml_client_master_manager.py:62:handle_message_connection_ready] Connection is ready!
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [mqtt_s3_multi_clients_comm_manager.py:148:on_connected] mqtt_s3.on_connect: client subscribes real_topic = fedml_yolov5_0_1, mid = 21, result = 0
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 6
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [fedml_client_master_manager.py:162:send_client_status] send_client_status
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [client_manager.py:157:send_message] Sending message (type 5) to server
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:35:05] [INFO] [mqtt_s3_multi_clients_comm_manager.py:277:send_message] mqtt_s3.send_message: MQTT msg sent
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [fedml_client_master_manager.py:62:handle_message_connection_ready] Connection is ready!
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:148:on_connected] mqtt_s3.on_connect: client subscribes real_topic = fedml_yolov5_0_1, mid = 23, result = 0
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 6
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [fedml_client_master_manager.py:162:send_client_status] send_client_status
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [client_manager.py:157:send_message] Sending message (type 5) to server
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:36:06] [INFO] [mqtt_s3_multi_clients_comm_manager.py:277:send_message] mqtt_s3.send_message: MQTT msg sent
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [fedml_client_master_manager.py:62:handle_message_connection_ready] Connection is ready!
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [mqtt_s3_multi_clients_comm_manager.py:148:on_connected] mqtt_s3.on_connect: client subscribes real_topic = fedml_yolov5_0_1, mid = 25, result = 0
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [mqtt_s3_multi_clients_comm_manager.py:198:_on_message_impl] mqtt_s3.on_message: not use s3 pack
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [mqtt_s3_multi_clients_comm_manager.py:172:_notify] mqtt_s3.notify: msg type = 6
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [fedml_client_master_manager.py:162:send_client_status] send_client_status
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [client_manager.py:157:send_message] Sending message (type 5) to server
[FedML-Client(1) @device-id-1] [Thu, 21 Jul 2022 02:37:07] [INFO] [mqtt_s3_multi_clients_comm_manager.py:277:send_message] mqtt_s3.send_message: MQTT msg sent
I suspect there is some problem in the connection between the client and the server. I am using the same PC for both the client and the server. Any hint or suggestion is highly appreciated!
It seems that the connection is successful. However, in config/fedml_config.yaml, client_num_per_round is set to 2. You can either change it to 1, or launch two clients using bash run_client.sh 1 and bash run_client.sh 2 (see the config sketch below).
By the way, you can find more advanced usage here.
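For reference, here is a minimal sketch of the relevant train_args block in config/fedml_config.yaml. The surrounding key names follow a typical FedML config and the values are only illustrative assumptions; adjust them to the number of clients you actually launch:

```yaml
# config/fedml_config.yaml (excerpt; values are illustrative assumptions)
train_args:
  federated_optimizer: "FedAvg"
  client_num_in_total: 1     # total number of clients in the federation
  client_num_per_round: 1    # must not exceed the number of clients you actually start
  comm_round: 10             # number of federated communication rounds
```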
@beiyuouo Thanks for your quick and kind response. It has been a while since I started the training between the client and the server, but so far I haven't seen any details about the training status, neither the number of training epochs nor any intermediate accuracy calculation. Does FedML automatically calculate accuracy metrics, or should that be added to the code?
@Adeelbek We already log a lot of metric information during the training process, so users do not need to add it themselves. Maybe something went wrong before the training process started. Can you provide more information? Also, did you run bootstrap.sh in config/ before starting the server? If not, you should run it first.
Hi @beiyuouo
In my previous trial, I did not run bootstrap.sh before starting the training. I stopped the training and then ran bash bootstrap.sh located in the config/ directory. After that, I ran training with a one-server, two-client scenario, but I still cannot get any training metric information (mAP, AP, or recall). I am getting only the communication messages between the clients and the server, as shown above. Do I need to install additional libraries? Currently, I have a Docker environment with OpenCV, seaborn, pandas, etc. preinstalled. The following are my environment details:
Package Version
----------------------- --------------------
absl-py 1.1.0
addict 2.4.0
aliyun-log-python-sdk 0.7.9
asttokens 2.0.5
backcall 0.2.0
backports.zoneinfo 0.2.1
blis 0.7.8
boto3 1.22.11
botocore 1.25.11
cachetools 5.2.0
catalogue 2.0.7
certifi 2019.11.28
cffi 1.15.0
chardet 3.0.4
charset-normalizer 2.0.12
click 8.1.3
cmake 3.22.4
commonmark 0.9.1
cycler 0.11.0
cymem 2.0.6
dataclasses 0.6
dateparser 1.1.1
dbus-python 1.2.16
decorator 5.1.1
dill 0.3.5.1
docker-pycreds 0.4.0
elastic-transport 8.1.2
elasticsearch 8.2.0
executing 0.8.3
fedml 0.7.210
flatbuffers 2.0
fonttools 4.34.4
future 0.18.2
gensim 4.2.0
gitdb 4.0.9
GitPython 3.1.27
google-auth 2.9.1
google-auth-oauthlib 0.4.6
grpcio 1.46.0
h5py 3.6.0
idna 2.8
importlib-metadata 4.12.0
intel-openmp 2022.1.0
iotop 0.6
ipython 8.4.0
jedi 0.18.1
Jinja2 3.1.2
jmespath 1.0.0
joblib 1.1.0
kiwisolver 1.4.4
langcodes 3.3.0
Markdown 3.4.1
MarkupSafe 2.1.1
matplotlib 3.5.2
matplotlib-inline 0.1.3
mkl 2022.1.0
mkl-include 2022.1.0
MNN 1.1.6
mpi4py 3.0.3
multiprocess 0.70.13
murmurhash 1.0.7
nano 0.10.0
networkx 2.8
ninja 1.10.2.3
numpy 1.22.3
oauthlib 3.2.0
onnx 1.7.0
onnx-simplifier 0.4.0
onnxruntime 1.11.1
onnxsim-no-ort 0.4.0
opencv-python 4.6.0.66
opencv-python-headless 4.6.0.66
packaging 21.3
paho-mqtt 1.6.1
pandas 1.4.3
parso 0.8.3
pathtools 0.1.2
pathy 0.6.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.1.0
pip 20.0.2
preshed 3.0.6
promise 2.3
prompt-toolkit 3.0.30
protobuf 3.19.4
psutil 5.9.0
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycocotools 2.0.4
pycparser 2.21
pydantic 1.9.1
Pygments 2.12.0
PyGObject 3.36.0
pynvml 11.4.1
pyparsing 3.0.8
python-apt 2.0.0+ubuntu0.20.4.7
python-dateutil 2.8.2
pytz 2022.1
pytz-deprecation-shim 0.1.0.post0
PyYAML 5.3.1
regex 2022.3.2
requests 2.27.1
requests-oauthlib 1.3.1
requests-unixsocket 0.2.0
rich 12.5.1
rsa 4.8
s3transfer 0.5.2
scikit-learn 1.1.0rc1
scipy 1.8.0
seaborn 0.11.2
sentry-sdk 1.5.12
setproctitle 1.2.3
setuptools 45.2.0
shortuuid 1.0.9
six 1.14.0
sklearn 0.0
smart-open 6.0.0
smmap 5.0.0
spacy 3.4.0
spacy-legacy 3.0.9
spacy-loggers 1.0.3
srsly 2.4.3
stack-data 0.3.0
supervisor 4.2.4
tbb 2021.6.0
tensorboard 2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
thinc 8.1.0
thop 0.1.1.post2207130030
threadpoolctl 3.1.0
torch 1.11.0
torch-geometric 2.0.5
torchvision 0.12.0
tqdm 4.64.0
traitlets 5.3.0
typer 0.4.2
typing-extensions 4.2.0
tzdata 2022.1
tzlocal 4.2
urllib3 1.26.9
wandb 0.12.16
wasabi 0.9.1
wcwidth 0.2.5
Werkzeug 2.1.2
wget 3.2
wheel 0.34.2
zipp 3.8.1
@Adeelbek Hi, could you run fedml env to provide more context information?
Hi @beiyuouo,
Thanks for your support. Actually, the problem was solved after upgrading torch from 1.11.0 to 1.12.0+cu116. Anyone who uses the Docker image directly should probably double-check their CUDA driver version and torch compatibility. Currently, I have 8 GPUs (RTX 3090) on my server PC, but when I run the fedml env command, it shows a "no GPU" message in the terminal, as follows:
======== FedML (https://fedml.ai) ========
FedML version: 0.7.210
Execution path:/usr/local/lib/python3.8/dist-packages/fedml/__init__.py
======== Running Environment ========
OS: Linux-5.4.0-117-generic-x86_64-with-glibc2.29
Hardware: x86_64
Python version: 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0]
PyTorch version: 1.12.0+cu116
MPI4py is installed
======== CPU Configuration ========
The CPU usage is : 26%
Available CPU Memory: 205.3 G / 376.5395622253418G
======== GPU Configuration ========
No GPU devices
fedml@gpusystem:/home/gpuadmin/OPD/FedML$ nvidia-smi
Tue Jul 26 00:45:28 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08 Driver Version: 510.73.08 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:1D:00.0 Off | N/A |
| 65% 68C P2 229W / 350W | 23697MiB / 24576MiB | 40% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:1E:00.0 Off | N/A |
| 30% 37C P8 23W / 350W | 2MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:1F:00.0 Off | N/A |
| 30% 38C P8 21W / 350W | 2MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA GeForce ... Off | 00000000:20:00.0 Off | N/A |
| 30% 35C P8 24W / 350W | 2MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 4 NVIDIA GeForce ... Off | 00000000:21:00.0 Off | N/A |
| 68% 70C P2 303W / 350W | 15389MiB / 24576MiB | 68% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 5 NVIDIA GeForce ... Off | 00000000:22:00.0 Off | N/A |
| 76% 72C P2 312W / 350W | 15163MiB / 24576MiB | 90% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 6 NVIDIA GeForce ... Off | 00000000:23:00.0 Off | N/A |
| 88% 74C P2 304W / 350W | 15165MiB / 24576MiB | 88% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 7 NVIDIA GeForce ... Off | 00000000:24:00.0 Off | N/A |
| 71% 70C P2 312W / 350W | 15163MiB / 24576MiB | 93% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+----------------------------------------------------
Do you have any idea why it can't recognize the GPUs?
While running training, I tried to use specific GPUs by creating gpu_mapping.yaml. It does use all the GPUs in the way I pre-assigned them. However, the GPU usage is very low, about 1~3% of GPU memory. Is this normal?
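For anyone else assigning workers to GPUs, here is a rough sketch of the gpu_mapping.yaml format as I understand it from the FedML examples. The mapping key, hostname, and per-GPU worker counts below are assumptions; they have to match your own machine and the gpu_mapping_key referenced in the device_args section of fedml_config.yaml:

```yaml
# gpu_mapping.yaml (illustrative sketch; hostname and counts are assumptions)
mapping_config_8gpu:
  gpusystem: [1, 1, 0, 0, 0, 0, 0, 0]   # number of processes placed on each of the 8 GPUs
                                        # (here: server on GPU 0, one client on GPU 1)
```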
Hi there,
I trained YOLOv5 on the server and the client for 120 epochs. However, I did not get any stored weights for the server or the client in the predefined directory, ~/object_detection/runs/. What could be the problem?
One more thing: in the ./config/fedml_config.yaml file, I see the weights are initialized as weights='none'. Why don't we just use the pretrained weights that are publicly available in the model (YOLOv5, v6, v7) GitHub repos (e.g. weights='yolov5s.pt')?
Yes, you're right. You can use a pretrained model by setting weights in the config file. If you are using MLOps, you can download the final model directly. But if you are using the simulation scheme, it may not save checkpoints currently.
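A minimal sketch of how a pretrained checkpoint could be plugged in, assuming the FedCV YOLOv5 example reads a weights field from the model section of its config; the section name and file path here are assumptions, not the confirmed layout:

```yaml
# config/fedml_config.yaml (excerpt; key placement and path are assumptions)
model_args:
  weights: "yolov5s.pt"   # point to a pretrained YOLOv5 checkpoint instead of 'none'
```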
Thanks for your suggestion. I tried changing weights: "yolov5s.py", but runs/train/exp10/weights is still empty. How can I see the effect of training? Thanks!
@xierongpytorch Actually, you can enable wandb in your configuration file to see the training details while doing distributed training. At least, I was able to see the effect of training through wandb when I used their old platform, which has since been deleted from their previous GitHub repo. In the current object detection task, they include a wandb enable/disable option in the config/fedml_config.yaml file, defined as enable_wandb: false. You can simply change false to true. However, when you enable the wandb option, you will run into many runtime errors which might not be solvable, so currently the effect of training cannot be seen without fixing bugs or adding some code.
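For reference, the wandb switch usually sits with the other tracking settings; a rough sketch assuming the usual FedML tracking_args layout, where the key, project, and run names are placeholders to replace with your own:

```yaml
# config/fedml_config.yaml (excerpt; wandb key/project/name are placeholders)
tracking_args:
  enable_wandb: true                      # default is false
  wandb_key: "<your-wandb-api-key>"
  wandb_project: "fedcv_object_detection"
  wandb_name: "fedml_yolov5_run"
```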
I also cannot get any training metric information (mAP, AP, or recall); wandb only shows "BusyTime, ..." charts, and the command line only shows mloss.
Thanks for the instructive advice! Following your suggestion, I was able to use wandb, but I mostly get "Time" charts. I think I don't understand the meaning of the FedCV parameters; how can I learn more about the parameter details? Also, I would sincerely like to ask: how can I view the training weights? Many thanks!