Please, how do I call it locally?
@NanshaNansha To load a model locally with the `PeftModel` class, you need to ensure that the base model and the adapter files are available locally. Try these steps and let me know if it works:
- Install the dependencies: `pip install transformers peft`
- Prepare local paths: set the paths where your pretrained base model, the adapter, and the cache directory are located. Example snippet:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Define the local paths to the base model, the adapter, and the cache directory
base_model_path = 'path/to/your/base_model'
model_id = 'path/to/your/local_model_directory'  # directory containing the adapter
cache_dir = 'path/to/your/cache_directory'

# Load the base model first; PeftModel.from_pretrained takes a model instance,
# not a path, as its first argument
base_model = AutoModelForCausalLM.from_pretrained(base_model_path, cache_dir=cache_dir)

# Attach the PEFT adapter from the local path on top of the base model
model = PeftModel.from_pretrained(base_model, model_id, cache_dir=cache_dir)

# Example: use the model for inference
tokenizer = AutoTokenizer.from_pretrained(base_model_path)
input_text = "Your input text here"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model(**inputs)
print(outputs)
```
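Since the base model here is a causal LM, you will usually want generated text rather than the raw logits a plain forward pass returns. A minimal sketch, reusing the `model` and `tokenizer` above (`max_new_tokens=64` is just an illustrative value):

```python
# Generate a continuation instead of inspecting raw logits
generated = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```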
Let me know if it works. Thanks!
I did make it work on my M4 MacBook Pro for most of the demo. You will need to change some code and install packages, but it's doable.
The parts that are working:
- Prepare the training dataset from Finnhub and convert it to the Llama format.
- Use the adapter in the repo on top of Llama as the model to forecast a given stock.
The part that is not working for me: after step 1 above, I actually need to fine-tune Llama to generate my own adapter, since the adapter inside this repo is around a year old. It looks like the training step requires CUDA, which I don't have.
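For reference, here is a quick way to confirm which accelerators PyTorch actually sees on a given machine (a minimal check; whether the repo's training code can use the MPS backend at all is a separate question):

```python
import torch

# Apple Silicon exposes the MPS backend rather than CUDA, so a CUDA-only
# training script will not run on an M4 MacBook Pro
print("CUDA available:", torch.cuda.is_available())
print("MPS available:", torch.backends.mps.is_available())
```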
I tried to run it in the cloud with an NVIDIA card, using around 1,000 rows as the training set and 200 rows as the test set. Training works there too, but it's super slow: around 13 hours.
Also, for the training process I was running train.sh on a machine with a 16 GB V100 card. It says the memory is not big enough; PyTorch is holding almost all of the VRAM, while the allocation that actually fails is less than 200 MB:
```
[rank0]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB.
GPU 0 has a total capacity of 15.77 GiB of which 16.19 MiB is free. Including non-PyTorch memory,
this process has 15.75 GiB memory in use. Of the allocated memory 15.45 GiB is allocated by PyTorch,
and 1.74 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.
See documentation for Memory Management
(https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
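For what it's worth, a 16 GB card is tight for fine-tuning a 7B-class model. The error message itself suggests one mitigation, and the standard `transformers` levers for peak memory are a smaller micro-batch with gradient accumulation plus gradient checkpointing. A minimal sketch using stock `TrainingArguments` options; whether train.sh exposes equivalents depends on the repo's own argument parsing:

```python
import os

# Suggested by the error message; must be set before CUDA is initialized
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

from transformers import TrainingArguments

# Hypothetical settings to lower peak GPU memory on a 16 GB V100
args = TrainingArguments(
    output_dir="my_adapter",          # hypothetical output directory
    per_device_train_batch_size=1,    # smallest possible micro-batch
    gradient_accumulation_steps=8,    # keeps the effective batch size at 8
    gradient_checkpointing=True,      # trade compute for activation memory
    fp16=True,                        # half precision; V100 has no bf16 support
)
```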
@BruceYanghy is this expected?