BMCourse

Workaround for when a mainland-China network or a server network cannot automatically download files from huggingface datasets

Open ShengdingHu opened this issue 3 years ago • 1 comments

ShengdingHu, Jul 11 '22 15:07

Use a machine that can reach huggingface datasets and can also scp files to the target server (your local machine, for example). On that machine, run this Python code:

from datasets import load_dataset
mydataset = load_dataset("glue", "mrpc")
mydataset.save_to_disk("YOURPATH/glue.mrpc")  # the directory does not have to be called glue.mrpc; any name works

Then, in a terminal:

scp -r YOURPATH/glue.mrpc USERNAME@IP:THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET

After that, on the server, run this Python code:

from datasets import load_from_disk
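# If the scp target directory already existed on the server, the dataset ends up one level down, in THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET/glue.mrpc
mydataset = load_from_disk("THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET")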
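
Depending on whether the scp target directory already existed, the copied dataset may sit either directly at the target path or in a glue.mrpc subdirectory under it. Below is a minimal sketch of a helper that checks both places. The helper name find_saved_dataset is hypothetical, and the marker file names are an assumption about what save_to_disk writes in recent versions of datasets (dataset_dict.json for a DatasetDict, dataset_info.json for a single Dataset); adjust them if your version differs.

```python
import os

# Assumption: save_to_disk leaves one of these files in the dataset's
# top-level directory. Check your own saved folder to confirm.
MARKERS = ("dataset_dict.json", "dataset_info.json")


def find_saved_dataset(target, name="glue.mrpc"):
    """Return the directory that load_from_disk should be pointed at.

    target: the path given to scp on the server side.
    name:   the directory name used with save_to_disk on the source machine.
    """
    # Try the target itself first, then the nested copy scp creates
    # when the target directory already existed.
    for candidate in (target, os.path.join(target, name)):
        if any(os.path.isfile(os.path.join(candidate, m)) for m in MARKERS):
            return candidate
    raise FileNotFoundError(f"no saved dataset found under {target!r}")
```

It could then be used as load_from_disk(find_saved_dataset("THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET")). The helper only inspects the file system, so it does not need network access or the datasets package itself.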
