FedVision

Federated Computer Vision Engine

Results 14 FedVision issues

I ran `sh examples/paddle_mnist/run.sh 127.0.0.1:10002` and I am sure the training started, but how do I know when the training has finished? And if it is finished, how to...

Before the block, there is a root:ERROR: message. I don't know whether there is any relation between them, or whether the program was blocked for other reasons.

The machine runs Ubuntu 20.04 with one Nvidia 3090 GPU, and the Python environment is 3.8. All other package versions were installed as specified in the README and requirements.txt. However, running `sh FedVision/examples/paddle_mnist/run.sh 127.0.0.1:10002` from the Run examples section produces the following error: ```[Master]2023-05-10 21:52:39.512 | ERROR | fedvision.framework.master.master:_co_handler:492:run jobs failed: compile error Traceback (most recent call last): File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main return...

I used the command fedvision-deploy deploy deploy --config standalone_template.yaml. The error message is as follows: Traceback (most recent call last): File "/root/fedvision/fedvision/bin/fedvision-deploy", line 8, in sys.exit(app()) File "/root/fedvision/fedvision/lib/python3.6/site-packages/fedvision_deploy_toolkit/_deploy.py", line 44, in deploy _maybe_create_python_venv(machine) File "/root/fedvision/fedvision/lib/python3.6/site-packages/fedvision_deploy_toolkit/_deploy.py", line 69, in...

After a job is submitted, is there a visualization page that displays it? How can I view the results after the program has run? ![image](https://user-images.githubusercontent.com/19608744/227679176-1ade2396-a6a3-47a9-b98b-0f2c15005a04.png)

The training process runs normally, but the .pdparams file written by checkpoint.save during training is empty.
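One quick way to confirm this symptom is to check the size of the saved file on disk. The following is only a minimal sketch: `is_empty_file` is an illustrative helper (not part of FedVision or Paddle), and the path in the usage line is a hypothetical placeholder for wherever the checkpoint was written.

```python
import os


def is_empty_file(path):
    """Return True if `path` is an existing regular file of zero bytes."""
    return os.path.isfile(path) and os.path.getsize(path) == 0


if __name__ == "__main__":
    # Hypothetical path; replace with the actual checkpoint location.
    checkpoint_path = "checkpoint.pdparams"
    print(checkpoint_path, "is empty:", is_empty_file(checkpoint_path))
```

A missing file also reports `False` here, so it may be worth checking `os.path.exists` separately to tell "not written at all" apart from "written but empty".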

Hi, when I run the command, the training always blocks at DEBUG:data loader ready: sh FedVision/examples/paddle_mnist/run.sh 127.0.0.1:10002. If I run sh FedVision/examples/paddle_mnist/run.sh 127.0.0.1:10003, master2 can train normally; however, master1 can't work...

Could you help double check the officially released template.yaml file? In cluster1, how do the two workers, worker1 and worker2, map to machine1 and machine2 respectively? This does not quite match the diagram in the release, and it also goes wrong when doing multi-level training. clusters: - name: cluster1 manager: machine: machine1 port: 10001 workers: - name: worker1 machine: machine1 ports: 12000-12999 max_tasks: 10 - name: worker2 machine: machine2 ports: 13000-13999...
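To make the worker-to-machine question concrete, the visible part of the quoted config can be written out and inspected. This is only a sketch: the dict below transcribes just the fields shown in the report (everything after worker2's `ports` is truncated and left out), and `worker_placement` is a hypothetical helper, not a function of fedvision-deploy.

```python
# Visible portion of the quoted template.yaml, transcribed as a Python dict.
# Fields after worker2's ports are truncated in the report and are omitted.
config = {
    "clusters": [
        {
            "name": "cluster1",
            "manager": {"machine": "machine1", "port": 10001},
            "workers": [
                {
                    "name": "worker1",
                    "machine": "machine1",
                    "ports": "12000-12999",
                    "max_tasks": 10,
                },
                {
                    "name": "worker2",
                    "machine": "machine2",
                    "ports": "13000-13999",
                },
            ],
        }
    ]
}


def worker_placement(cfg):
    """Map each cluster name to {worker name: machine name} (hypothetical helper)."""
    return {
        cluster["name"]: {w["name"]: w["machine"] for w in cluster["workers"]}
        for cluster in cfg["clusters"]
    }


placement = worker_placement(config)
print(placement)
```

As quoted, cluster1 has its manager and worker1 on machine1 and worker2 on machine2, so a single cluster spans two machines. Whether that matches the intended layout is the open question in the report.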