Qingyou Meng
It looks like this PR refactors the current code base too much. That's not a big problem if the changes are `urgent`, but this is arguable. * First of all, there are...
Looks like the source files will be re-structured sooner or later. https://github.com/ggerganov/llama.cpp/issues/384#issuecomment-1480276524
Segmentation fault caused by an unchecked NULL pointer when the memory pool gets full? https://github.com/ggerganov/llama.cpp/issues/373#issuecomment-1479948004
> [main-2023-03-22-161321.ips.zip](https://github.com/ggerganov/llama.cpp/files/11041861/main-2023-03-22-161321.ips.zip)
>
> Here is the crash log.

The above log indicates an invalid access to address `KERN_INVALID_ADDRESS at 0x0000000000000038`; a fault at such a small offset usually means a struct member was read through a NULL pointer. The `not enough space in the context's memory pool ...`...
This should be due to lack of memory: `convert-pth-to-ggml.py` loads the entire 13GB `models/7B/consolidated.00.pth` as a PyTorch model. Another similar issue: https://github.com/ggerganov/llama.cpp/issues/402
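For context, a minimal sketch of why the whole checkpoint ends up in RAM; the path matches the one above, but the loop body is illustrative, not the converter's actual code:

```python
import torch

# torch.load() deserializes the entire 13 GB checkpoint into memory at once,
# so the converter process needs at least that much free RAM before it can
# even start writing the ggml file.
model = torch.load("models/7B/consolidated.00.pth", map_location="cpu")

# the checkpoint is a dict mapping parameter name -> tensor
for name, tensor in model.items():
    print(name, tuple(tensor.shape))
```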
See this issue https://github.com/ggerganov/llama.cpp/pull/428
> My fault :)

It's this issue: https://github.com/ggerganov/llama.cpp/issues/431
I'm wondering what the value of `f16_model_parts_paths` is:

```python
f16_model_parts_paths = map(
    lambda filename: os.path.join(f16_model_path_base, filename),
    glob.glob(f"{f16_model_path_base}*")
)
# add this line; map() is lazy, so wrap it in list() to see the values
# (note this consumes the iterator)
print(list(f16_model_parts_paths))
```
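A quick standalone illustration of why the `list()` wrapper matters; the names here are made up for the demo:

```python
# printing a map object shows the iterator itself, not its contents
paths = map(str.upper, ["a", "b"])
print(paths)        # <map object at 0x...>
print(list(paths))  # ['A', 'B']; the iterator is now exhausted
print(list(paths))  # [] on a second pass
```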
Perhaps it would be clearer to change the commit message to something like: "Fix GPTQ converter: corrected output file magic"
I can confirm this bug. Add to quantize.py, line 81:

```python
for v in f16_model_parts_paths:
    print(v)
```

Run:

```sh
python3 quantize.py --models-path models 7B
```

Output:

```
models/7B/ggml-model-f16.bin/models/7B/ggml-model-f16.bin
Succesfully quantized...
```
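If I read quantize.py right, the doubled path comes from joining the base path onto glob results that already include it. A minimal sketch of that behavior, with the path copied from the output above:

```python
import os

f16_model_path_base = "models/7B/ggml-model-f16.bin"

# glob.glob(f"{f16_model_path_base}*") returns paths that already start with
# the base, e.g. "models/7B/ggml-model-f16.bin", so joining the base on
# again doubles it:
filename = "models/7B/ggml-model-f16.bin"  # what glob would return
print(os.path.join(f16_model_path_base, filename))
# -> models/7B/ggml-model-f16.bin/models/7B/ggml-model-f16.bin
```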