phalexo
> Normally I was getting OverflowError, but I followed this [PR](https://github.com/artidoro/qlora/pull/20) and it was resolved; now I get this error. I see the same thing. I changed LlamaTokenizerFast back to...
There seems to be a discrepancy in the spelling of LLaMATokenizer. The class in transformers is LlamaTokenizer: the second letter is a lowercase 'l', not 'L', and it ends in 'ma', not 'MA'. So, I had to...
```python
def prefetch_tensor(A, to_cpu=False):
    assert A.is_paged, 'Only paged tensors can be prefetched!'
    if to_cpu:
        deviceid = -1
    else:
        deviceid = A.page_deviceid
    num_bytes = dtype2bytes[A.dtype] * A.numel()
    lib.cprefetch(get_ptr(A), ct.c_size_t(num_bytes), ct.c_int32(deviceid))
```

The above function is...
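For anyone puzzled by the `num_bytes` line: it is just element count times bytes per dtype. A minimal self-contained sketch, where `dtype2bytes` is a stand-in for the lookup table the library keeps (the exact table contents here are my assumption):

```python
# Stand-in for the library's dtype-to-byte-width lookup table (assumed values).
dtype2bytes = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

def tensor_num_bytes(dtype: str, numel: int) -> int:
    """Bytes needed to back a tensor of `numel` elements of `dtype`."""
    return dtype2bytes[dtype] * numel

print(tensor_num_bytes("float16", 1024))  # 2048
```

That byte count is what gets handed to the prefetch call along with the target device id (-1 meaning CPU).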
```
trainable params: 79953920.0 || all params: 3660320768 || trainable: 2.184341894267557
loaded model
Using pad_token, but it is not set yet.
pad_token_id = -1
Traceback (most recent call last):
  File "/home/developer/qlora/qlora.py", ...
```
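Side note on reading that log: the `trainable:` figure is just the percentage of parameters that are trainable, which you can reproduce from the two counts shown:

```python
# Reproducing the "trainable" percentage from the numbers in the log above.
trainable_params = 79_953_920
all_params = 3_660_320_768

pct = 100 * trainable_params / all_params
print(f"trainable: {pct}")  # ~2.1843%, matching the log
```

So roughly 2.2% of the 3.66B parameters are LoRA-trainable, which is expected for this kind of adapter setup; the error comes later, from the unset pad_token.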
7B is a tiny model. You cannot expect any quality in the output.
Changing the tokenizer back to LlamaTokenizer from LlamaTokenizerFast removed the infinite recursion problem, BUT it brought back the "core dump" I described in another issue. So, the new tokenizer...
> I have the same issue. Do you solve this problem? Well, you can use the other Tokenizer without the "Fast" ending. It should get rid of recursion. In my...
Did you pull the updated file? There were some other changes related to pad_token.

On Tue, May 30, 2023, 5:31 PM jianchao ji ***@***.***> wrote:

> 'qlora.py' is the only...
> Hello, I added some information on the multi-GPU setup in the README. In `qlora.py` we use Accelerate. You are correct that per_device_train/eval_batch_size refers to the global batch size, unlike the...
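To make the distinction above concrete: under plain data parallelism the per-device size is multiplied by the GPU count, whereas in a model-parallel setup (one model sharded across GPUs) the batch is not replicated, so per_device_* already is the global size. A sketch of the two arithmetics, with illustrative numbers that are not from this thread:

```python
def effective_batch_ddp(per_device: int, num_gpus: int, grad_accum: int) -> int:
    # Data parallel: each GPU processes its own micro-batch.
    return per_device * num_gpus * grad_accum

def effective_batch_model_parallel(global_bs: int, grad_accum: int) -> int:
    # Model parallel: one sharded model, so only accumulation multiplies.
    return global_bs * grad_accum

print(effective_batch_ddp(4, 2, 4))            # 32
print(effective_batch_model_parallel(16, 2))   # 32
```

This is why the same flag value gives different effective batch sizes depending on how the GPUs are used.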
> Hello, I tried the same command and it worked on my side.
>
> ```
> python qlora.py --learning_rate 0.0001 --model_name_or_path huggyllama/llama-7b
> ```
>
> Could you make...