bonuschild
Examples:
- https://huggingface.co/TheBloke/CodeLlama-7B-AWQ: 4GB on disk, but it uses about 20GB of VRAM
- https://huggingface.co/TheBloke/deepseek-coder-33B-instruct-AWQ: 17GB on disk, but it cannot run on a dual-A100 (40GB) server

> `--tensor-parallel-size 2` is...
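The gap between file size and observed VRAM usage is mostly the KV cache: vLLM pre-allocates a fixed fraction of GPU memory up front (`--gpu-memory-utilization`, default 0.9), so observed usage tracks total VRAM rather than checkpoint size. A rough back-of-envelope sketch, with model figures (32 layers, 32 KV heads, head dim 128) and a 16K-token context that are my assumptions for CodeLlama-7B, not values from the issue:

```python
# Rough VRAM estimate for serving an AWQ model: the checkpoint only
# covers the quantized weights; the fp16 KV cache usually dominates.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # 2x for the separate key and value tensors, fp16 (2 bytes) by default
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch * dtype_bytes

weights_gb = 4.0  # AWQ checkpoint on disk (~4-bit weights), assumed
cache_gb = kv_cache_bytes(
    num_layers=32, num_kv_heads=32, head_dim=128,
    seq_len=16384, batch=1) / 1024**3  # hypothetical single 16K-token request

print(f"weights ~{weights_gb:.1f} GB, KV cache ~{cache_gb:.1f} GB")
```

Even one full-context request roughly triples the weight footprint here, and vLLM reserves cache space for many concurrent requests. `--tensor-parallel-size 2` shards the weights and cache across two GPUs, but each GPU still pre-allocates its share of memory.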
I think we need:
- a VS Code plugin
- a model loading method
- an API server to communicate between the plugin and the model backend

Is there a mature solution for this?
First, there are two repositories: https://github.com/ginuerzh/gost/ and https://github.com/go-gost/gost
- The latter's documentation describes configuration in YAML and JSON.
- From the former repository's code and issues, it appears only JSON is supported, and the two formats differ considerably, which makes learning confusing.

Could the author point the way: to start this repository's `gost` from a configuration file, how should it be written? Below is the error reported when using an incorrect configuration file:
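For reference, a minimal sketch of what I believe the ginuerzh/gost (v2) JSON configuration looks like; the `ServeNodes`/`ChainNodes` keys mirror the `-L`/`-F` command-line flags, and the listen/forward URLs below are placeholders, not values from this issue:

```json
{
    "Debug": true,
    "Retries": 0,
    "ServeNodes": [
        "socks5://:1080"
    ],
    "ChainNodes": [
        "http://192.168.1.1:8080"
    ]
}
```

If this is right, it would be started with `gost -C config.json`; the YAML format documented under go-gost/gost applies only to v3.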
### Class | 类型
Large language model
### Feature Request | 功能请求
Looking at the documentation, it seems the supported workflow is to pull a model yourself and run it locally. Could gpt-academic instead be given the HTTP address of an API running on another IP directly? It is also possible I haven't read the full documentation; please advise, thank you!
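If I recall correctly, gpt-academic reads its settings from `config.py` (overridable via `config_private.py`), and there is a redirect option for pointing chat requests at another endpoint. A sketch from memory; the option names and the remote address are assumptions and may differ by version:

```python
# config_private.py -- hypothetical sketch; option names are from my
# recollection of gpt-academic's config.py and may differ by version.
API_URL_REDIRECT = {
    "https://api.openai.com/v1/chat/completions":
        "http://192.168.0.10:8000/v1/chat/completions",  # placeholder remote server
}
LLM_MODEL = "gpt-3.5-turbo"
```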
### req1: svg storage location
- In Obsidian, there is an option for setting the location of newly created attachments such as pictures:  ### req2: svg...
In draw.io, a newly created diagram has "grid view" and "page view" turned on:  When I use draw.io to open the `.svg` file created by the plugin `drawio-obsidian`, it shows no...
I saw that you mock `llama.cpp`, but I still have GPU resources available, even though I also have enough CPU and RAM. I just want to figure out whether this is the right scenario for deploying it.
As the title says, this repository is missing an official `requirements.txt` to guide developers in installing dependencies. Will one be added later?
- Already posted at https://github.com/vllm-project/vllm/issues/1479
- My GPU is an RTX 3060 with 12GB VRAM
- My target model is [CodeLlama-7B-AWQ](https://huggingface.co/TheBloke/CodeLlama-7B-AWQ), whose size is
Here is my output after executing:

```bash
(autogptq) root@XXX:/mnt/e/Downloads/AutoGPTQ-API# python blocking_api.py
Traceback (most recent call last):
  File "/mnt/e/Downloads/AutoGPTQ-API/blocking_api.py", line 29, in <module>
    model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
  File "/root/miniconda3/envs/autogptq/lib/python3.10/site-packages/auto_gptq/modeling/auto.py", line 108, in from_quantized
...
```