REF: Remove some builtin old models and `ggmlv3` model format
- Remove some old built-in models:
  - Baichuan, baichuan-chat
  - Starcoder, starcoderplus, starchat
  - glaive-coder
  - wizardlm-v1.0
  - vicuna-v1.3, vicuna-v1.5
  - OpenBuddy
  - orca
  - falcon
  - Chatglm, chatglm-2
  - Tiny-llama (TODO)
  - opt (TODO)
- Change the `llama-2` / `llama-2-chat` `ggmlv3` format to the `ggufv2` format.
- Remove support for the `ggmlv3` format.
- Remove `s3` schema support for registered models.
- Remove support for self-hosted models.
- Remove some unused code.
- Remove `opencv` as a required dependency. Place it in the `all` dependency group.
- Rename the `pytorch` dir to `transformers` and rename the `ggml` dir to `llama_cpp`.
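
Since `opencv` moves from a required dependency to the `all` group, code that needs it can no longer assume it is installed. One common way to handle this is a guarded import that fails with an actionable message. The sketch below is only an illustration of that pattern, not the code in this PR: the helper name `require_optional` and its `extra` parameter are hypothetical, and the default `"all"` simply mirrors the group named above.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str = "all") -> ModuleType:
    """Import an optional dependency, or raise an error naming the extra to install.

    `require_optional` is a hypothetical helper used for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is an optional dependency and is not installed. "
            f"Install the {extra!r} dependency group to enable this feature."
        ) from exc
```

A call site would then import lazily, for example `cv2 = require_optional("cv2")` inside the function that actually processes images (`cv2` being the import name of the OpenCV Python bindings), so installations without `opencv` still start normally.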