Truncation not explicitly mentioned

Open udbhav-44 opened this issue 1 year ago • 5 comments

I get this error when I try to run a query:

Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
C:\Users\Tarun Sridhar\.conda\envs\mummy\lib\site-packages\transformers\models\llama\modeling_llama.py:648: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(

What can be possible fixes?
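
For context, the warning itself points at one: passing truncation=True wherever the tokenizer is called. I haven't confirmed where that happens inside localGPT, so the snippet below is only a minimal sketch of the idea; the model name and max_length are placeholders, not localGPT's actual configuration.

```python
# Hypothetical sketch, not localGPT's actual code: explicitly enable truncation
# wherever the HuggingFace tokenizer is invoked so the warning is not emitted.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")  # placeholder model

inputs = tokenizer(
    "your prompt here",
    truncation=True,    # silences "Truncation was not explicitly activated"
    max_length=4096,    # placeholder; use the model's real context length
    return_tensors="pt",
)
```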

udbhav-44 avatar Jun 30 '24 05:06 udbhav-44

I also face the same problem when I try to run a query, but the system only shows "Setting pad_token_id to eos_token_id:128001 for open-end generation." Have you solved the problem yet? Please help.

GregChiang0201 avatar Jul 18 '24 06:07 GregChiang0201

I got the same message and the query takes forever... Is there any explanation of the error, and does it affect the query results?

KansaiTraining avatar Jul 25 '24 08:07 KansaiTraining

I found the problem: the author built the program to run serially instead of in parallel. While run_localGPT is running, you can monitor your CPU usage (with top or htop). In my case, only 1-2 CPU cores are utilized, which is why it runs so slowly.
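
If the run really is CPU-bound, one thing worth checking (just a guess, not something the repo documents) is how many threads torch is allowed to use. A minimal sketch:

```python
# Hypothetical check, not part of localGPT: inspect how many threads torch uses
# for CPU inference and raise it to the machine's core count. This only helps
# if the model is actually running on the CPU rather than on CUDA.
import os
import torch

print("torch intra-op threads:", torch.get_num_threads())
torch.set_num_threads(os.cpu_count())  # allow all cores for CPU inference
print("now using:", torch.get_num_threads())
```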

GregChiang0201 avatar Jul 25 '24 09:07 GregChiang0201

Same issue here... I also see a lot of SSD reads from the Python 3.10 process, even after getting:

Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
Setting pad_token_id to eos_token_id:128001 for open-end generation.

Has anyone found a solution?
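
For what it's worth, the "Setting pad_token_id to eos_token_id" line is usually just informational; it can be silenced by passing pad_token_id explicitly at generation time. A rough sketch (the pipeline setup here is a placeholder, not localGPT's actual code):

```python
# Hypothetical sketch: set pad_token_id explicitly instead of letting
# generate() default it to eos_token_id, which is what triggers the notice.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
out = pipe(
    "your prompt here",
    max_new_tokens=128,
    pad_token_id=tokenizer.eos_token_id,  # explicit, so no notice is emitted
)
```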

maxrmp avatar Aug 09 '24 15:08 maxrmp

I have the same issue, any help?

slivmi avatar Nov 25 '24 11:11 slivmi