
Where should a manually downloaded model be placed?

Open helloxz opened this issue 1 year ago • 16 comments

I saw this line in the instructions:

If downloading the checkpoint from the Hugging Face Hub is slow, you can also download it manually from here.

I downloaded the model manually. Which local folder should I put it in?

helloxz avatar Mar 15 '23 07:03 helloxz

Anywhere you like. Just point to the local path when loading. For example:

mypath = "/home/xxxx/public/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
model = AutoModel.from_pretrained(mypath, trust_remote_code=True).half().quantize(4).cuda()  # int4 quantization happens here
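Once loaded, a minimal usage sketch (the chat call follows the ChatGLM-6B README; it continues from the model and tokenizer defined above):

model = model.eval()
# One-turn chat; history carries the conversation across calls.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)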

yaleimeng avatar Mar 15 '23 11:03 yaleimeng

That doesn't work for me; it raises an error:

>>> mypath="D:/apps/ChatGLM-6B/model"
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program Files\Python37\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 614, in from_pretrained
    pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
  File "D:\Program Files\Python37\lib\site-packages\transformers\models\auto\configuration_auto.py", line 852, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\transformers\configuration_utils.py", line 565, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\transformers\configuration_utils.py", line 632, in _get_config_dict
    _commit_hash=commit_hash,
  File "D:\Program Files\Python37\lib\site-packages\transformers\utils\hub.py", line 381, in cached_file
    f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
OSError: D:/apps/ChatGLM-6B/model does not appear to have a file named config.json. Checkout 'https://huggingface.co/D:/apps/ChatGLM-6B/model/None' for available files.

Am I doing something wrong?

helloxz avatar Mar 15 '23 12:03 helloxz

I'm running into the same problem.

ykallan avatar Mar 15 '23 12:03 ykallan

You should download all the files from here, except those you have already downloaded from the Tsinghua cloud drive.

feixyz10 avatar Mar 15 '23 14:03 feixyz10

I have downloaded all files from huggingface. However, when I execute

mypath = "G:/chatGLM-6B/model"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)

it returns:

OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: 'C:\\Users\\<My Username>\\.cache\\huggingface\\modules\\transformers_modules\\G:'

(WinError 123 is Windows for "the filename, directory name, or volume label syntax is incorrect.") It seems something is going wrong in my cache path handling.

Ling-YangHui avatar Mar 16 '23 07:03 Ling-YangHui

Path names on Windows need special handling (just Google it). Using the pathlib package, or replacing "/" with "\\", might solve your problem.
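For example, a small sketch of both workarounds (the directory below is just the one from the error report above):

from pathlib import Path
from transformers import AutoTokenizer

# Let pathlib produce the right separators for the current OS...
mypath = str(Path("D:/apps/ChatGLM-6B/model"))
# ...or spell the backslashes out yourself (doubled, since "\" escapes):
# mypath = "D:\\apps\\ChatGLM-6B\\model"

tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)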

feixyz10 avatar Mar 16 '23 07:03 feixyz10

From the error it looks like path parsing went wrong: the program thinks the path it was given is malformed (Windows format; on Linux, paths start with / and never contain a colon). Most of this code is written with Linux in mind, and Windows differs in both path format and default encoding (not UTF-8), so there are many pitfalls. Try placing the files at the path the error message shows; if that fails, consider dual-booting or running a Linux VM under Windows (VMs can apparently use the GPU now).

yaleimeng avatar Mar 16 '23 08:03 yaleimeng

The problem should be solved now. My current workaround: first call AutoModel with the model name, wait for the download to finish, then copy the model from the ~/.cache/xxx path into the project directory and adjust the script. That starts fine. A model downloaded directly from the cloud drive is missing some files and won't start on its own; a rough sketch of the copy step follows. @yaleimeng @feixyz10 @helloxz @Ling-YangHui
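A rough sketch of that copy step, assuming the default hub cache layout described later in this thread (the snapshot directory is named after a commit hash and differs per machine):

import shutil
from pathlib import Path

# Where the first online run leaves the model, assuming the default hub cache.
snapshots = Path.home() / ".cache/huggingface/hub/models--THUDM--chatglm-6b/snapshots"
snapshot = next(snapshots.iterdir())  # a hash-named snapshot directory

# Copy the complete snapshot into the project (dereferencing the blob symlinks),
# then load with from_pretrained("chatglm-6b-local", trust_remote_code=True).
shutil.copytree(snapshot, "chatglm-6b-local", dirs_exist_ok=True)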

ykallan avatar Mar 16 '23 09:03 ykallan

Oh... we downloaded from huggingface, so we never hit this. You'd think a single archive could simply be mirrored everywhere; odd that different sources ship different files.

yaleimeng avatar Mar 16 '23 09:03 yaleimeng

The cloud drive only has the large weight files; just fetch all the small files from huggingface as well. Tested and working.

Liu-Steve avatar Mar 16 '23 12:03 Liu-Steve

Under %USERPROFILE%\.cache\huggingface\hub\models--THUDM--chatglm-6b\snapshots there are one or more directories named like git commit hashes; put the files under the newest one.
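A small sketch that picks that newest snapshot automatically (same Windows path as above; USERPROFILE is the Windows home-directory variable):

import os
from pathlib import Path
from transformers import AutoTokenizer

snapshots = Path(os.environ["USERPROFILE"]) / ".cache/huggingface/hub/models--THUDM--chatglm-6b/snapshots"
# The hash-named directories; take the most recently modified one.
latest = max(snapshots.iterdir(), key=lambda p: p.stat().st_mtime)
tokenizer = AutoTokenizer.from_pretrained(str(latest), trust_remote_code=True)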

marszhao avatar Mar 16 '23 12:03 marszhao

> That doesn't work for me; it raises an error:
> OSError: D:/apps/ChatGLM-6B/model does not appear to have a file named config.json. Checkout 'https://huggingface.co/D:/apps/ChatGLM-6B/model/None' for available files.
> Am I doing something wrong?

Has this been resolved?

nikshe avatar Mar 17 '23 10:03 nikshe

@nikshe Could some files be missing? Besides the 8 weight files, all of the small files in the hugging face repo have to go into the model directory. From this error, config.json is missing from the model directory; a quick check is sketched below.
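A minimal sanity-check sketch, assuming the file list of the chatglm-6b repo on Hugging Face at the time (the shard names follow the standard transformers pattern):

from pathlib import Path

mypath = Path("D:/apps/ChatGLM-6B/model")
# Small files from the repo (the .py files for trust_remote_code are needed too),
# plus the index for the 8 weight shards.
required = ["config.json", "tokenizer_config.json", "ice_text.model",
            "pytorch_model.bin.index.json"]
required += [f"pytorch_model-{i:05d}-of-00008.bin" for i in range(1, 9)]

missing = [name for name in required if not (mypath / name).exists()]
print("missing:", missing or "none")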

jerrylususu avatar Mar 17 '23 16:03 jerrylususu

Do the 8 .bin files need to be concatenated with cat? I get "model file not found", but after concatenating them it fails to read:

root@2227e6c2b8b1:/work/chatglm-6b/ChatGLM-6B# python cli_demo.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "cli_demo.py", line 6, in <module>
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(8).cuda()
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 459, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/transformers/modeling_utils.py", line 2164, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory THUDM/chatglm-6b.
root@2227e6c2b8b1:/work/chatglm-6b/ChatGLM-6B#

ttsking avatar Mar 18 '23 00:03 ttsking

Solved mine; it turned out pytorch_model.bin.index.json was missing.

ttsking avatar Mar 18 '23 00:03 ttsking

On my side I had to use an absolute Windows path.

nikohpng avatar Mar 18 '23 02:03 nikohpng

> Solved mine; it turned out pytorch_model.bin.index.json was missing.

Could you share the code you use to run the model files locally?

luieswww avatar Mar 31 '23 16:03 luieswww

My model files were complete and the path was correct, but I still got an error. In the end I uninstalled keras and the problem disappeared. Bizarre!

DelaiahZ avatar Apr 03 '23 03:04 DelaiahZ

In WSL I had to switch both lines (tokenizer and model) to absolute paths.

bash99 avatar Apr 04 '23 08:04 bash99

> I have downloaded all files from huggingface. However, when I execute
> tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
> it returns:
> OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: 'C:\\Users\\<My Username>\\.cache\\huggingface\\modules\\transformers_modules\\G:'

Try mypath = 'G:\\chatGLM-6B\\model', i.e. with double backslashes '\\'.

tiejiang8 avatar Apr 07 '23 15:04 tiejiang8

Here is how I run it locally.

First run, online:

  • The model downloads automatically to C:\Users\Administrator\.cache\huggingface\hub\models--THUDM--chatglm-6b-int4

After that it can be loaded offline:

mypath = r'C:\Users\Administrator\.cache\huggingface\hub\models--THUDM--chatglm-6b-int4\snapshots\9163f7e6d9b2e5b4f66d9be8d0288473a8ccd027'

tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
  • Replace 9163f7e6d9b2e5b4f66d9be8d0288473a8ccd027 with your own snapshot hash.

mingyue0094 avatar Apr 11 '23 12:04 mingyue0094

My error output:

root@DESKTOP-FMBI0K0:/data/ChatGLM-6b# python3 demo.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "demo.py", line 5, in <module>
    tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1813, in from_pretrained
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/tokenization_chatglm.py", line 205, in __init__
    self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/tokenization_chatglm.py", line 55, in __init__
    assert vocab_file is not None
AssertionError

My code:

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html

mypath = "/data/chatglm-6b-int4"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
model = AutoModel.from_pretrained(mypath, trust_remote_code=True).half().cuda()
model = model.eval()

I'm on WSL; the downloaded model is under /data/chatglm-6b-int4.
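That AssertionError means the tokenizer never found its vocabulary file. A quick check, assuming the vocabulary file is named ice_text.model as in the Hugging Face repo:

from pathlib import Path

# List what is actually in the model directory; the assertion at
# tokenization_chatglm.py line 55 fires when the sentencepiece vocab is missing.
mypath = Path("/data/chatglm-6b-int4")
print(sorted(p.name for p in mypath.iterdir()))  # ice_text.model should appear here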

xuji755 avatar Apr 11 '23 13:04 xuji755

This change works in cli_demo.py:

local_path = "/home/somepath/somepath/ChatGLM-6B/huggingface_chatglm-6m"
tokenizer = AutoTokenizer.from_pretrained(local_path, trust_remote_code=True)
model = AutoModel.from_pretrained(local_path, trust_remote_code=True).quantize(4).half().cuda()

At first I downloaded the .bin files from the THU cloud drive and the other files in the chatglm-6b folder from huggingface, and an OSError showed up when running cli_demo.py. Then I used "git clone https://huggingface.co/THUDM/chatglm-6b" instead, and it works.

taraliu23 avatar Apr 12 '23 06:04 taraliu23

> My error output: ... AssertionError (assert vocab_file is not None)
> My code: mypath = "/data/chatglm-6b-int4" ...
> I'm on WSL; the downloaded model is under /data/chatglm-6b-int4.

Same here: I gave it the local path, but it still goes through the cache.

xiaoxinxin666666 avatar Apr 12 '23 06:04 xiaoxinxin666666

Please follow the instructions at https://github.com/THUDM/ChatGLM-6B#%E4%BB%8E%E6%9C%AC%E5%9C%B0%E5%8A%A0%E8%BD%BD%E6%A8%A1%E5%9E%8B

duzx16 avatar Apr 12 '23 15:04 duzx16

After changing it to

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("E:/model/LanguageModel/ChatGLM/chatglm-6b", trust_remote_code=True).half().cuda()

I get this error:

OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: 'C:\\Users\\Administrator\\.cache\\huggingface\\modules\\transformers_modules\\E:'

linonetwo avatar Apr 20 '23 15:04 linonetwo

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("e:\\model\\LanguageModel\\ChatGLM\\chatglm-6b", trust_remote_code=True).half().cuda()


This works.

linonetwo avatar Apr 20 '23 15:04 linonetwo

Add/modify the following after the imports in demo.py:


import os
model_path = os.path.join(".", "models")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model = model.eval()

Then put the files git cloned from huggingface into the models directory. This works on both Linux and Windows.

LeXwDeX avatar May 10 '23 06:05 LeXwDeX

I'm using my own fine-tuned model in HF format, but running web_demo gives no response and no output. How can I fix this?

12lxr avatar Jun 08 '23 09:06 12lxr

The model weights aren't read from the cache, but why are the model's own .py files still loaded from the cache? How do I fix that?

daihuangyu avatar Jun 15 '23 08:06 daihuangyu