
Does it work on a local machine or for someone with limited resources?

rabsher opened this issue 2 years ago • 5 comments

Does it work on my local machine, or is a GPU necessary to run this model?

I tried to load the model on my local machine with 15 GB of RAM, but it gets stuck during model loading because of the size of the model.

Is there any way to run it on a local machine? If so, please guide me on which additional parameters need to be passed.

```python
import transformers
import torch

config = transformers.AutoConfig.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True
)
config.attn_config['attn_impl'] = 'triton'

model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16
)
model.to(device='cuda:0')
```

rabsher avatar May 09 '23 05:05 rabsher

It should work on a local machine with enough RAM. I just tried this on my machine with 16 GB of RAM (no GPU), and it loaded successfully. Each parameter is 2 bytes (bf16 = 16 bits), so 2 bytes × 7B parameters ≈ 14 GB. I don't know why this wouldn't work on your machine, but it is possible that, accounting for some overhead, 15 GB is not quite enough RAM.
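
(For reference, a CPU-only load would look roughly like the sketch below. This is an untested sketch, not the exact code used above: it assumes the default `'torch'` attention implementation, since `'triton'` requires a GPU, and `low_cpu_mem_usage=True` needs the `accelerate` package installed.)

```python
import transformers
import torch

config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
# 'triton' attention needs a GPU; keep the default 'torch' implementation on CPU
config.attn_config['attn_impl'] = 'torch'

model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16,  # ~2 bytes/param, so ~14 GB of weights for a 7B model
    low_cpu_mem_usage=True,      # avoid materializing an extra full copy while loading
)
# Note: no .to('cuda:0') call -- the model stays on CPU
```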

nik-mosaic avatar May 09 '23 19:05 nik-mosaic

Would you please share your code, if it's different from mine?

rabsher avatar May 10 '23 05:05 rabsher

If you are running transformer models locally without a GPU, including MPT, you should probably check out the GGML project. There is an open PR to add support for MPT: https://github.com/ggerganov/ggml/pull/145

samhavens avatar May 11 '23 00:05 samhavens

Run `nvidia-smi` to see if something else is also using GPU resources. On Linux, the windowing system can use a few hundred MB. I see the hf_chat.py program using 14014 MiB with mosaicml/mpt-7b-chat.
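
(As an untested illustration, assuming `nvidia-smi` is on your PATH, you can also query memory usage programmatically:)

```python
import subprocess

# Query per-GPU memory usage via nvidia-smi's CSV output
result = subprocess.run(
    ["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```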

patrickhwood avatar May 12 '23 06:05 patrickhwood

@rabsher You mentioned that you have a machine with 15 GB of RAM, and it sounds like it has no GPU. I do not think you have enough RAM to use mosaicml/mpt-7b. You might be able to load it with 15 GB of RAM, but I don't think you will have enough memory left to actually run it or train it, since activations, the KV cache during generation, and optimizer state during training all take memory on top of the ~14 GB of weights.

The llm-foundry code generally works on my local machine (16 GB RAM, no GPUs), but I am limited to using small models.

alextrott16 avatar May 12 '23 23:05 alextrott16

Closing this issue for cleanup, but feel free to reopen, @rabsher, if you have additional questions.

abhi-mosaic avatar May 17 '23 22:05 abhi-mosaic