Anima
33B Chinese LLM, DPO, QLoRA, 100K context, AirLLM 70B inference on a single 4GB GPU
When doing context-extrapolation training, does the max-position-embeddings parameter need to be adjusted? For example, if it was previously 4096 and we extrapolate to 8K, should it be changed to 8192?
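Not an authoritative answer from the maintainers, but for context: with linear position interpolation, a common way to extrapolate RoPE models, the new positions are compressed back into the originally trained range while a RoPE scaling factor compensates. A minimal sketch of the arithmetic, using the 4096 → 8192 numbers from the question:

```python
# Linear position interpolation: squeeze 8K position indices into the
# trained 4K range by dividing each index by the scaling factor.
orig_ctx = 4096     # previously trained context length
target_ctx = 8192   # desired extrapolated context length

scale = target_ctx / orig_ctx                     # RoPE scaling factor = 2.0
interpolated = [p / scale for p in range(target_ctx)]

print(scale)               # 2.0
print(max(interpolated))   # 4095.5 -- still inside the trained range
```

The point of the sketch: the scaled positions all land below 4096, which is why interpolation-based schemes keep working even as the raw context length grows past the original limit.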
Hello, a Medium article stated that we can optimize the model in four different ways: 1. Layer-wise inference 2. Single-layer optimization → Flash Attention 3. Model file...
```
(env) e:\AI\AirLLM>python airllm.py
Traceback (most recent call last):
  File "e:\AI\AirLLM\airllm.py", line 1, in <module>
    from airllm import AirLLMLlama2
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\AI\AirLLM\airllm.py", line 1, in <module>
    from airllm import AirLLMLlama2
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: cannot...
```
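A likely cause, inferred from the traceback rather than confirmed in the thread: the user's script is itself named `airllm.py`, and it appears twice in the traceback, so `from airllm import AirLLMLlama2` resolves to the script instead of the installed `airllm` package, a classic self-shadowing import. Assuming standard CPython path resolution, the same effect can be reproduced with a script named after any importable module:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Create a script named json.py that imports "json": Python finds the script
# itself on sys.path first, so the import hits a half-initialized module and
# fails -- the same failure mode as a user script named airllm.py.
with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, "json.py")  # shadows the stdlib 'json' module
    with open(script, "w") as f:
        f.write(textwrap.dedent("""\
            from json import dumps  # resolves to this very file, not stdlib
            print(dumps({}))
        """))
    result = subprocess.run([sys.executable, script],
                            capture_output=True, text=True)

shadowed = "ImportError" in result.stderr
print(shadowed)
```

The usual fix is simply to rename the local script (e.g. to `run_airllm.py`) so it no longer shadows the package.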
Code:

```python
from airllm import AirLLMLlama2

MAX_LENGTH = 128

# could use hugging face model repo id:
model = AirLLMLlama2("D:/models/Qwen-72B-Chat")

input_text = [
    'What is the capital of United States?',
    # ...
```
Hi, first off, awesome work. Love the idea. Secondly, do you happen to have run-time metrics? Would love to know how long the model takes to perform inference with and...
How should I handle this error: `f"{hf_cache_path}/pytorch_model.bin.index.json should exists."`?
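For context, an assumption about the cause rather than a confirmed fix: `pytorch_model.bin.index.json` is the index file Hugging Face writes for sharded `.bin` checkpoints; models shipped as a single `pytorch_model.bin` or as safetensors shards do not have it. A hedged sketch of a pre-flight check, with `hf_cache_path` as a placeholder path:

```python
import os

hf_cache_path = "/path/to/hf/cache/model"  # placeholder, not a real path

# Sharded HF checkpoints ship an index file mapping weight names to shard
# files; single-file or safetensors checkpoints will not have the .bin index.
candidates = [
    "pytorch_model.bin.index.json",   # sharded .bin checkpoint
    "model.safetensors.index.json",   # sharded safetensors checkpoint
    "pytorch_model.bin",              # single-file checkpoint
]
present = [name for name in candidates
           if os.path.exists(os.path.join(hf_cache_path, name))]
print(present)  # shows which checkpoint layout the download actually has
```

If the listing shows only safetensors or a single `.bin` file, the model layout simply does not match what the error message is checking for, which narrows the problem to the checkpoint format rather than the code.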
The requirements in `setup.py` should specify compatible versions (at least the minimum versions required). For instance, I have a pod running `torch-1.12.0+cu116`, which is not compatible:

```sh
AttributeError: module 'torch'...
```
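A sketch of what pinned minimums in `setup.py`'s `install_requires` could look like. The version floors and the package list below are illustrative assumptions, not verified minima; the exact floors would need to be confirmed against each release:

```python
# Illustrative (unverified) minimum pins for install_requires in setup.py.
install_requires = [
    "torch>=2.0.0",          # torch 1.12 hits the AttributeError above
    "transformers>=4.30.0",  # assumed floor, not verified
    "accelerate>=0.20.0",    # assumed floor, not verified
    "safetensors>=0.3.0",    # assumed floor, not verified
]

# Every entry carries an explicit minimum version specifier:
print(all(">=" in req for req in install_requires))
```

Even approximate floors would let pip fail fast at install time instead of surfacing an `AttributeError` deep inside inference.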
![image](https://github.com/lyogavin/Anima/assets/35209678/6f5f4c0f-7361-4e37-99ad-a67011c189d5) ![image](https://github.com/lyogavin/Anima/assets/35209678/3cd9f51c-1aac-4b8c-bbc7-108702069ad0) Training script: Anima/rlhf/run_dpo_training.sh