Grounded-Segment-Anything
For offline use, which parts of the config files need to be modified, and which files need to be downloaded?
For offline use, which parts of the config files need to be modified? Files such as config.json apparently need to be downloaded, but where should they go? When running on a server without internet access, I get: OSError: We couldn't connect to https://huggingface.co to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
Thanks!
Take a look at the interfaces in the code: the config and pretrained model paths can be specified directly, so you can download the files manually and place them at suitable paths.
I'm not clear on the overall workflow. Could you explain exactly how the code specifies the config and pretrained model paths? Thanks!
Take grounded_sam_demo.py as an example: --grounded_checkpoint sets the pretrained model path, and --config sets the config path.
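For example, a minimal invocation (file paths are illustrative and should point at your local downloads):

```
python grounded_sam_demo.py \
  --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
  --grounded_checkpoint groundingdino_swint_ogc.pth \
  --sam_checkpoint sam_vit_h_4b8939.pth \
  --input_image assets/demo1.jpg \
  --output_dir outputs \
  --text_prompt "bear"
```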
Sorry, maybe I didn't express myself clearly; the question is not how to specify the config when running the code. The problem is that partway through execution, at line 17 of GroundingDINO/groundingdino/util/get_tokenlizer.py, tokenizer = AutoTokenizer.from_pretrained(text_encoder_type), continuing from this point requires downloading some files from the internet. That is what I asked at the start: the error says files such as config.json need to be downloaded, but where should they go? On a server without internet access, I get: OSError: We couldn't connect to https://huggingface.co to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json. Thanks!
You can first run the Hugging Face-related code on a machine with internet access to download the files; such code usually looks like xx.from_pretrained().
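For example, a one-time warm-up run on a connected machine (a sketch; it populates the default Hugging Face cache, which you can then copy to the offline server):

```python
from transformers import AutoTokenizer, BertModel

# Running these once on an online machine caches bert-base-uncased locally
# (under ~/.cache/huggingface by default); copy that cache to the offline server.
AutoTokenizer.from_pretrained("bert-base-uncased")
BertModel.from_pretrained("bert-base-uncased")
```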
Indeed, I ran into this too: because every run requests the remote Hugging Face hub, connection failures are common. I downloaded the required model files, but I haven't yet figured out how to change the code to load the local files.
Once the model has been downloaded, it will be loaded automatically from the cache.
Maybe you need this link to download the whole model by installing huggingface_hub, then replace the input parameter of from_pretrained with your local model directory.
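A sketch of that download step, assuming a recent huggingface_hub (the local_dir value is just the example folder used later in this thread):

```python
from huggingface_hub import snapshot_download

# Download the whole bert-base-uncased repo into a local folder; pass this
# folder to from_pretrained() instead of the model name.
snapshot_download(
    repo_id="bert-base-uncased",
    local_dir="Grounded-Segment-Anything/huggingface/bert-base-uncased",
)
```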
Here is my workaround to run the model without connecting to huggingface:
- Step 1: download the necessary files listed in huggingface-bert-base-uncased, including config.json, flax_model.msgpack, pytorch_model.bin, tf_model.h5, tokenizer.json, tokenizer_config.json, vocab.txt
- Step 2: put the files downloaded in Step 1 into a local folder, for example Grounded-Segment-Anything/huggingface/bert-base-uncased
- Step 3: modify text_encoder_type in get_tokenlizer.py#L17 and get_tokenlizer.py#L23 to point at your local folder from Step 2, as sketched below
- Step 4: run the model and enjoy it
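A sketch of the Step 3 edit (the folder path is the Step 2 example, and it assumes L17/L23 are the two from_pretrained calls in get_tokenlizer.py):

```python
from transformers import AutoTokenizer, BertModel

# Replace the hub name "bert-base-uncased" with the local folder from Step 2.
local_bert = "Grounded-Segment-Anything/huggingface/bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(local_bert)  # get_tokenlizer.py#L17
model = BertModel.from_pretrained(local_bert)          # get_tokenlizer.py#L23
```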
That's quite a hassle... mainly because the last step means modifying the source code... How do I keep in sync with master afterwards (though the odds of that code changing later are low)?
@SlongLiu could you add a config option for this, so the directory can be passed through at initialization?
Locate the load_model_hf method, add a debug statement of your choice to print the local path of cache_file, copy the model file somewhere, then comment out the cache_file line and point cache_file at your local path.
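A hypothetical sketch of that hack (resolve_checkpoint and its parameters are illustrative; it assumes load_model_hf originally obtains cache_file via hf_hub_download):

```python
from typing import Optional
from huggingface_hub import hf_hub_download

def resolve_checkpoint(repo_id: str, filename: str, local_path: Optional[str] = None) -> str:
    """Return a checkpoint path, preferring a local copy over a hub download."""
    if local_path is not None:
        print("cache_file overridden with local path:", local_path)
        return local_path
    cache_file = hf_hub_download(repo_id=repo_id, filename=filename)
    print("cache_file resolved to:", cache_file)  # the suggested debug print
    return cache_file
```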
We will highlight it in this issue! Thanks for your solution; we will refine the code in a future release.
Good job! Thanks.
Here is my modified code:

```python
from transformers import AutoTokenizer, BertModel, RobertaModel, RobertaTokenizerFast, BertTokenizer

def get_tokenlizer(text_encoder_type):
    if not isinstance(text_encoder_type, str):
        if hasattr(text_encoder_type, "text_encoder_type"):
            text_encoder_type = text_encoder_type.text_encoder_type
        elif text_encoder_type.get("text_encoder_type", False):
            text_encoder_type = text_encoder_type.get("text_encoder_type")
        else:
            raise ValueError(
                "Unknown type of text_encoder_type: {}".format(type(text_encoder_type))
            )
    print("final text_encoder_type: {}".format(text_encoder_type))

    tokenizer_path = "Grounded-Segment-Anything/huggingface/bert-base-uncased"
    tokenizer = BertTokenizer.from_pretrained(tokenizer_path, use_fast=False)
    return tokenizer

def get_pretrained_language_model(text_encoder_type):
    if text_encoder_type == "bert-base-uncased":
        model_path = "Grounded-Segment-Anything/huggingface/bert-base-uncased/pytorch_model.bin"
        return BertModel.from_pretrained(model_path)
    if text_encoder_type == "roberta-base":
        return RobertaModel.from_pretrained(text_encoder_type)
    raise ValueError("Unknown text_encoder_type {}".format(text_encoder_type))
```
But I still get an error:
```
(gsa) D:\forwork\Grounded-Segment-Anything>python grounded_sam_demo.py --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py --grounded_checkpoint groundingdino_swint_ogc.pth --sam_checkpoint sam_vit_h_4b8939.pth --input_image assets/demo1.jpg --output_dir "outputs" --box_threshold 0.3 --text_threshold 0.25 --text_prompt "bear" --device "cuda"
D:\Anaconda3\envs\gsa\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3191.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
Traceback (most recent call last):
  File "grounded_sam_demo.py", line 181, in ...
    [traceback truncated] ... repo_type argument if needed.
```
Is there any good solution?
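Judging from the repo_type tail of that traceback, a likely cause is that from_pretrained is being handed the path of pytorch_model.bin itself, which huggingface_hub rejects as an invalid repo id; transformers expects the directory containing config.json and the weights. A minimal sketch of the fix:

```python
from transformers import BertModel

# Pass the directory that holds config.json and pytorch_model.bin,
# not the path of pytorch_model.bin itself.
model_dir = "Grounded-Segment-Anything/huggingface/bert-base-uncased"
model = BertModel.from_pretrained(model_dir)
```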
Set the proxy in Docker:

```
export http_proxy="http://192.168.30.127:4780"
export https_proxy="http://192.168.30.127:4780"
```
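For example, the proxy can also be passed into the container as environment variables (a sketch; the image name is hypothetical and the proxy address is from the post above):

```
docker run \
  -e http_proxy="http://192.168.30.127:4780" \
  -e https_proxy="http://192.168.30.127:4780" \
  gsa-image python grounded_sam_demo.py ...
```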