FATE-LLM
Federated Learning for LLMs.
Following the tutorial at https://github.com/FederatedAI/FATE-LLM/blob/main/doc/tutorial/parameter_efficient_llm/ChatGLM3-6B_ds.ipynb, train.json needs to be uploaded to the storage engine. With the upload config {"file":"xxxx/train.json","head":false,"partition":4,"meta":{},"namespace":"experiment","name":"ad"}, the upload fails and asks for a setting: Please provide sample_id_name
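The error itself names the missing field. A guessed fix, assuming the FATE Flow upload schema accepts a `sample_id_name` entry under `meta` (that placement, and the placeholder column name `id`, are assumptions — substitute the actual ID column of train.json; note the original config also had a syntax error, `"head",false` instead of `"head":false`):

```json
{
  "file": "xxxx/train.json",
  "head": false,
  "partition": 4,
  "meta": {
    "sample_id_name": "id"
  },
  "namespace": "experiment",
  "name": "ad"
}
```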
When training a GPT model with FATE-LLM, the job gets stuck at this point and stops making progress. At first I assumed it was a resource problem, so I ran it across 2 machines with 1 GPU each, but it still hangs — no error, no log output. Does anyone know how to fix this?
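A hang with no error and no logs is more often a GPU-visibility or inter-party connectivity problem than a resource shortage. A minimal pre-flight sketch to run inside each party's container before submitting the job — plain PyTorch, not FATE-LLM API:

```python
import torch

# Sanity-check that the training process can actually see and use the GPU.
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())
if torch.cuda.is_available():
    x = torch.ones(1, device="cuda")
    print("Simple CUDA op OK:", float(x + x))

# For the real job, exporting NCCL_DEBUG=INFO in the worker environment
# makes NCCL log its connection attempts, which often reveals where a
# silent multi-machine hang is stuck.
```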
When training a model following https://github.com/FederatedAI/FATE-LLM/blob/main/doc/tutorial/parameter_efficient_llm/ChatGLM3-6B_ds.ipynb, an FP16 error appears after submitting the job — submitted from inside the client's docker container, with FATE-LLM/python added to the PYTHONPATH environment variable. How can this be resolved? Thanks. The error: FP16 Mixed precision training with AMP or APEX ('--fp16') and FP16 half precision evaluation ('--fp16_full_eval') can only be used on CUDA or NPU devices or certain XPU devices (with IPEX)
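This message is raised by HuggingFace transformers when fp16 training is requested but the process cannot see a CUDA (or NPU/XPU) device — inside docker that usually means the container was started without GPU access (e.g. missing `--gpus all`) rather than a FATE-LLM bug. A minimal sketch of the guard, with placeholder arguments, not the tutorial's exact config:

```python
import torch
from transformers import TrainingArguments

# Request fp16 only when a CUDA device is actually visible; otherwise fall
# back to fp32 so the Trainer does not raise the fp16/CUDA error.
use_fp16 = torch.cuda.is_available()

training_args = TrainingArguments(
    output_dir="./out",             # placeholder output path
    per_device_train_batch_size=1,  # placeholder values
    num_train_epochs=1,
    fp16=use_fp16,
)
```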
Your paper mentions The FedLLM Privacy Hub, but I don't see that code anywhere in FATE-LLM. How is your differential privacy scheme actually implemented?
I'd like to understand how the FedCoLLM module is implemented. Is its code in the FATE-LLM repository or in the FATE repository?
Does this require a cluster deployment to work?
Environment: Debian GNU/Linux 12 (bookworm), Python 3.8. Steps followed for installing FATE version 1.11.3: pulled a Docker python 3.8 image, ...
> File "demo.py", line 100, in pipeline.compile() │ └ └ File "/data/zhihao/anaconda3/envs/fate_env/lib/python3.8/site-packages/pipeline/backend/pipeline.py", line 428, in compile self._train_conf = self._construct_train_conf() │ │ │ └ │ │ └ │ └ {'dsl_version': 2,...