toufunao issues

Results: 7 issues of toufunao

How to run a oneshot strategy such as DARTS in data-parallel with multiple GPUs?

**Describe the issue**: I want to run the DARTS examples on multiple GPUs, so I wrapped the model with DDP and shared the data with DistributedSampler. However, I found the final 2...

ImportError: Cannot use a path to identify something from __main__.

**Describe the issue**: When I tried to use NAS, it reported 'ImportError: Cannot use a path to identify something from __main__.' and 'ValueError: Pickle too large when trying to dump...

Question about which party should be arbiter?

In hetero-LR settings, an arbiter role is needed. I would like to know which party I should assign this role to. In the tutorial, some choose the guest to be the...

Using 2 GPUs to train DARTS in parallel, but getting 2 different searched architectures?

**Describe the issue**: I used 2 GPUs to train DARTS, but from the output I see that I get 2 different results. I also used 'export_onnx', but I didn't get...

NAS 2.0

[BUG] Installing pycocotools fails in a purely internal-network environment

**Problem Description** Installing the pycocotools dependency fails in a purely internal-network environment. **Environment Information** OS: Red Hat (commercial edition) 7.7 python: 3.10.9 **Additional Information** Add any other information related to the issue.

bug

issue with example_completion.py

When I tried codellama-7b and codellama-34b to test code completion, all results were garbled. Environment: OS: Red hat 4.8.5-36 GCC: 4.8.5 32G V100 cuda: 11.7 torch: 2.0.0 fairscale 0.4.13 sentencepiece: 0.1.99...

TensorParallel object has no attribute save_pretrained

I used tensor_parallel to fine-tune a qwen model with LoRA in a tensor-parallel way. However, it cannot save the model at the end. Can you provide any help? Thanks.