
33B Chinese LLM, DPO + QLoRA, 100K context, AirLLM 70B inference with a single 4GB GPU

Results 64 Anima issues

### Discussed in https://github.com/lyogavin/Anima/discussions/113 Originally posted by **janmartin** February 11, 2024 AirLLM is great. And it desperately needs a simple installer and UI like AUTOMATIC1111 (for stable diffusion) for Windows....

When running the airllm code and reaching the line `model = AirLLMLlama2("/home/user/models/Anima-7B-100K")`, the following error appears: ``` model = AirLLMLlama2("/home/user/models/Anima-7B-100K") found index file... found_layers:{'model.embed_tokens.': True, 'model.layers.0.': True, 'model.layers.1.': True, 'model.layers.2.': True, 'model.layers.3.': True, 'model.layers.4.': True, 'model.layers.5.': True, 'model.layers.6.': True, 'model.layers.7.': True,...

bug

I'm not sure it makes sense to load more than one layer from a performance standpoint, but using 1.6GB out of the 11GB/16GB of a typical consumer GPU is not optimal (and super...

Does AirLLM support AMD GPUs?

Using `is_flash_attn_available` is deprecated and will be removed in v4.38. Please use `is_flash_attn_2_available` instead. Traceback (most recent call last): File "/opt/ai/test/inference_example_test.py", line 8, in model = AirLLMLlama2("/root/autodl-tmp/ai/Yi-34B-Chat",layer_shards_saving_path="/root/autodl-tmp/ai/layerSave") File "/root/miniconda3/lib/python3.10/site-packages/airllm/airllm.py", line...

Will the airllm framework support streaming output for different models in the future?

future work

The large models loaded in the examples are all "base" models; there is no example of loading a "chat" model. If chat models are usable, could you provide some examples? Thanks.

question

The sample code (taken from AirLLM examples): ```python from airllm import AirLLMLlamaMlx import mlx.core as mx MAX_LENGTH = 128 model = AirLLMLlamaMlx("garage-bAInd/Platypus2-7B") input_text = [ 'I like', ] input_tokens =...

bug

If we can run 70B models with just a 4GB VRAM graphics card, does that mean it is also possible to finetune a 70B model on a single 4090 with...

future work