Serhii Korol

Results: 83 comments by Serhii Korol

The problem on Mac is with the underlying download script (associative arrays and a nested loop). It's Linux-oriented and should be adapted to macOS. TBH, I quit trying to fix different...
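For context, a minimal sketch of the kind of guard that would surface this early. It assumes the failure comes from the stock macOS `/bin/bash` being 3.2, while associative arrays (`declare -A`) need bash 4.0 or newer. The function name `supports_assoc_arrays` is hypothetical and not part of `download_community.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical guard, not taken from download_community.sh.
# Assumption: the script relies on `declare -A`, which needs bash 4.0+.

# Succeeds (exit status 0) when the given major version is 4 or higher.
supports_assoc_arrays() {
    [ "${1:-0}" -ge 4 ]
}

# In bash, $BASH_VERSINFO expands to the major version number;
# in other shells it is unset, so fall back to 0.
if supports_assoc_arrays "${BASH_VERSINFO:-0}"; then
    echo "associative arrays available"
else
    echo "bash 4+ required for associative arrays" >&2
fi
```

On macOS a newer bash can be installed separately (for example with Homebrew's `bash` formula) and used to run the script instead of `/bin/bash`.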

Because it's not the root cause. It never enters this [loop](https://github.com/juncongmoo/pyllama/blob/main/llama/download_community.sh#L134-L139).

```shell
python3 quant_infer.py --wbits 4 --load pyllama-7B4b.pt --text "The meaning of life is" --max_length 24 --cuda cuda:0
```

Several people are complaining about garbage in the output in #58.

Noticed the same on the 4-bit model: just garbage in the output. Now I'm trying to quantize from the downloaded files. Will post the result here later.

BTW, found an interesting observation in #58: `--groupsize 128` affects the results somehow. Need to try quantizing w/o this flag.

@DrewSBAI this number would differ even on a single device if you re-ran it 10 times. For me, slam-toolbox prints a new number on almost every run, in ~447-453...

Any updates? JB plugin still doesn't work: ![Screenshot from 2024-08-03 14-17-47](https://github.com/user-attachments/assets/9c2f4353-a6a2-4d71-a5f5-45cd4ca9494e)

Never mind, it's fixed in the dev branch. Just build it and install from the .zip.