Puyuan Liu

Results 24 comments of Puyuan Liu

I got the same error with NousResearch/Nous-Capybara-34B:

```
File "/home/ec2-user/SageMaker/anaconda3/envs/ot-gpt-package/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
File "/home/ec2-user/SageMaker/anaconda3/envs/ot-gpt-package/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3480, in from_pretrained
    ) = cls._load_pretrained_model(
File "/home/ec2-user/SageMaker/anaconda3/envs/ot-gpt-package/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3870, in...
```

I got the same error

The script worked for OPT models but does not work for other models. I suspect it has something to do with the model's module naming.

I found the solution. Basically, you have to change `--lora_module_name decoder.layers.` to the appropriate module name for your model; for example, `--lora_module_name h.` for BLOOM and GPT-Neo.
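For context, scripts like this typically apply LoRA to every layer whose qualified name contains the `--lora_module_name` substring, which is why the value has to match your model's naming scheme. A minimal sketch of that selection logic (the module names below are illustrative, not taken from any specific checkpoint):

```python
def select_lora_modules(module_names, lora_module_name):
    """Return the module names that would receive LoRA adapters,
    assuming a simple substring match on the qualified name."""
    return [name for name in module_names if lora_module_name in name]

# OPT-style qualified names contain "decoder.layers." ...
opt_names = [
    "model.decoder.layers.0.self_attn.q_proj",
    "model.decoder.layers.0.fc1",
]
# ...while BLOOM/GPT-Neo-style names use "h." for the block list.
bloom_names = [
    "transformer.h.0.self_attention.query_key_value",
    "transformer.h.0.mlp.dense_h_to_4h",
]

print(select_lora_modules(opt_names, "decoder.layers."))  # matches both OPT layers
print(select_lora_modules(bloom_names, "decoder.layers."))  # [] -> LoRA silently applied to nothing
print(select_lora_modules(bloom_names, "h."))  # matches both BLOOM layers
```

With the wrong substring, nothing matches, so training proceeds without any LoRA parameters, which is easy to miss.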

I get this error sometimes. It disappears once I restart the server or slightly change a training parameter (e.g., change max_length from 1024 to 1025).

@mrwyattii Thanks for the reply! I was using `nvidia-smi` to measure the memory cost. I was able to train pythia-2.8B with max_length=1280 using stage=1, but got an OOM error with stage=2.

> Hello, do you use local mode or server mode?
>
> Could you show your collection info?

Thank you for your reply! It's running in local mode. The...

I am trying to insert 20k batches, with 64 embeddings per batch. The speed drops from 0.7 seconds per batch to 3 seconds per batch after 500 iterations. This only happens...

Thank you. I observed that retrieval from the sparse collection was also much (~20x) slower than from the dense one in this case.

It appears that the latency during upsert operations is due to resizing. This resizing process is initiated when we keep adding sparse vectors to a collection that contains both sparse...
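One way to confirm a resize-driven slowdown like this is to time each upsert batch and look for periodic latency spikes. The sketch below uses a toy in-memory stand-in for the collection (the `GrowableIndex` class, its doubling growth factor, and the simulated resize cost are all assumptions, not the real store's internals):

```python
import time

class GrowableIndex:
    """Toy stand-in for a collection that reallocates as it grows.
    Real stores amortize this, but each reallocation still shows up
    as a latency spike on the batch that triggers it."""
    def __init__(self):
        self.capacity = 1024
        self.size = 0

    def upsert(self, n):
        self.size += n
        resized = False
        while self.size > self.capacity:
            self.capacity *= 2  # simulated reallocation
            resized = True
        return resized

index = GrowableIndex()
latencies = []
for batch in range(200):
    start = time.perf_counter()
    resized = index.upsert(64)  # 64 embeddings per batch, as above
    if resized:
        time.sleep(0.001)  # pretend the reallocation costs extra time
    latencies.append(time.perf_counter() - start)

# Batches whose latency stands out are the ones that triggered a resize.
spikes = [i for i, t in enumerate(latencies) if t > 5e-4]
print(f"batches with resize spikes: {spikes}")
```

With a doubling capacity, the spikes land at geometrically spaced batches; a real collection that resizes on every sparse-vector insert past some threshold would instead show a sustained per-batch slowdown like the 0.7s-to-3s jump described above.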