llama.cpp
Python3 script instead of bash
#!/usr/bin/env python3
import os
import sys

# Accept either one argument (the model size) or two (size plus --remove-f16).
if not (len(sys.argv) in (2, 3) and sys.argv[1] in ["7B", "13B", "30B", "65B"]):
    print(f"\nUsage: {sys.argv[0]} 7B|13B|30B|65B [--remove-f16]\n")
    sys.exit(1)

for i in os.listdir(f"models/{sys.argv[1]}"):
    if i.endswith("ggml-model-f16.bin"):
        os.system(f"./quantize {os.path.join('models', sys.argv[1], i)} {os.path.join('models', sys.argv[1], i.replace('f16', 'q4_0'))} 2")
        if len(sys.argv) == 3 and sys.argv[2] == "--remove-f16":
            os.remove(os.path.join('models', sys.argv[1], i))
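As an aside, the same command-line interface could be expressed with argparse, which generates the usage message and validates the size choice automatically (a sketch of an alternative, not part of the original script):

```python
import argparse

# Sketch: argparse handles both the 7B|13B|30B|65B validation and the
# optional --remove-f16 flag, printing a usage message on bad input.
parser = argparse.ArgumentParser()
parser.add_argument("size", choices=["7B", "13B", "30B", "65B"])
parser.add_argument("--remove-f16", action="store_true")

args = parser.parse_args(["7B", "--remove-f16"])
print(args.size, args.remove_f16)  # → 7B True
```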
I made a CMD script, but Python is more sensible considering it's already a requirement.
@(
    SETLOCAL EnableDelayedExpansion

    ECHO ---------------------------------------------------------------
    ECHO convert and quantize facebook LLaMA models for use by llama.cpp
    ECHO ---------------------------------------------------------------

    REM directory containing original facebook models
    set "SRC=Y:\LLaMA"
    REM directory containing ggml model files - both f16 and q4
    set "DEST=."
    REM free disk space by deleting ggml f16 models after quantization
    REM set DELETE_F16=1

    ECHO Starting ... This could take a couple hours! ...
    REM todo: quantize in parallel
    REM stop if any model files exist, DO NOT OVERWRITE

    IF NOT EXIST "!DEST!\llama-7B\ggml-model*.bin*" (
        python ..\convert-pth-to-ggml.py !SRC!\7B\ 1
        md !DEST!\llama-7B 2> NUL
        move !SRC!\7B\ggml-model-f16.bin !DEST!\llama-7B\ggml-model.bin
        quantize !DEST!\llama-7B\ggml-model.bin !DEST!\llama-7B\ggml-model-q4_0.bin 2
        IF DEFINED DELETE_F16 del !DEST!\llama-7B\ggml-model.bin
    ) ELSE (
        ECHO remove model files from "!DEST!\llama-7B" to re-generate.
        DIR /B "!DEST!\llama-7B\ggml-model*.bin*"
        ECHO ---------------------------------------------------------
    )

    IF NOT EXIST "!DEST!\llama-13B\ggml-model*.bin*" (
        python ..\convert-pth-to-ggml.py !SRC!\13B\ 1
        md !DEST!\llama-13B 2> NUL
        move !SRC!\13B\ggml-model-f16.bin !DEST!\llama-13B\ggml-model.bin
        move !SRC!\13B\ggml-model-f16.bin.1 !DEST!\llama-13B\ggml-model.bin.1
        quantize !DEST!\llama-13B\ggml-model.bin !DEST!\llama-13B\ggml-model-q4_0.bin 2
        quantize !DEST!\llama-13B\ggml-model.bin.1 !DEST!\llama-13B\ggml-model-q4_0.bin.1 2
        IF DEFINED DELETE_F16 del !DEST!\llama-13B\ggml-model.bin*
    ) ELSE (
        ECHO remove model files from "!DEST!\llama-13B" to re-generate.
        DIR /B "!DEST!\llama-13B\ggml-model*.bin*"
        ECHO ---------------------------------------------------------
    )

    IF NOT EXIST "!DEST!\llama-30B\ggml-model*.bin*" (
        python ..\convert-pth-to-ggml.py !SRC!\30B\ 1
        md !DEST!\llama-30B 2> NUL
        move !SRC!\30B\ggml-model-f16.bin !DEST!\llama-30B\ggml-model.bin
        move !SRC!\30B\ggml-model-f16.bin.1 !DEST!\llama-30B\ggml-model.bin.1
        move !SRC!\30B\ggml-model-f16.bin.2 !DEST!\llama-30B\ggml-model.bin.2
        move !SRC!\30B\ggml-model-f16.bin.3 !DEST!\llama-30B\ggml-model.bin.3
        quantize !DEST!\llama-30B\ggml-model.bin !DEST!\llama-30B\ggml-model-q4_0.bin 2
        quantize !DEST!\llama-30B\ggml-model.bin.1 !DEST!\llama-30B\ggml-model-q4_0.bin.1 2
        quantize !DEST!\llama-30B\ggml-model.bin.2 !DEST!\llama-30B\ggml-model-q4_0.bin.2 2
        quantize !DEST!\llama-30B\ggml-model.bin.3 !DEST!\llama-30B\ggml-model-q4_0.bin.3 2
        IF DEFINED DELETE_F16 del !DEST!\llama-30B\ggml-model.bin*
    ) ELSE (
        ECHO remove model files from "!DEST!\llama-30B" to re-generate.
        DIR /B "!DEST!\llama-30B\ggml-model*.bin*"
        ECHO ---------------------------------------------------------
    )

    IF NOT EXIST "!DEST!\llama-65B\ggml-model*.bin*" (
        python ..\convert-pth-to-ggml.py !SRC!\65B\ 1
        md !DEST!\llama-65B 2> NUL
        move !SRC!\65B\ggml-model-f16.bin !DEST!\llama-65B\ggml-model.bin
        move !SRC!\65B\ggml-model-f16.bin.1 !DEST!\llama-65B\ggml-model.bin.1
        move !SRC!\65B\ggml-model-f16.bin.2 !DEST!\llama-65B\ggml-model.bin.2
        move !SRC!\65B\ggml-model-f16.bin.3 !DEST!\llama-65B\ggml-model.bin.3
        move !SRC!\65B\ggml-model-f16.bin.4 !DEST!\llama-65B\ggml-model.bin.4
        move !SRC!\65B\ggml-model-f16.bin.5 !DEST!\llama-65B\ggml-model.bin.5
        move !SRC!\65B\ggml-model-f16.bin.6 !DEST!\llama-65B\ggml-model.bin.6
        move !SRC!\65B\ggml-model-f16.bin.7 !DEST!\llama-65B\ggml-model.bin.7
        quantize !DEST!\llama-65B\ggml-model.bin !DEST!\llama-65B\ggml-model-q4_0.bin 2
        quantize !DEST!\llama-65B\ggml-model.bin.1 !DEST!\llama-65B\ggml-model-q4_0.bin.1 2
        quantize !DEST!\llama-65B\ggml-model.bin.2 !DEST!\llama-65B\ggml-model-q4_0.bin.2 2
        quantize !DEST!\llama-65B\ggml-model.bin.3 !DEST!\llama-65B\ggml-model-q4_0.bin.3 2
        quantize !DEST!\llama-65B\ggml-model.bin.4 !DEST!\llama-65B\ggml-model-q4_0.bin.4 2
        quantize !DEST!\llama-65B\ggml-model.bin.5 !DEST!\llama-65B\ggml-model-q4_0.bin.5 2
        quantize !DEST!\llama-65B\ggml-model.bin.6 !DEST!\llama-65B\ggml-model-q4_0.bin.6 2
        quantize !DEST!\llama-65B\ggml-model.bin.7 !DEST!\llama-65B\ggml-model-q4_0.bin.7 2
        IF DEFINED DELETE_F16 del !DEST!\llama-65B\ggml-model.bin*
    ) ELSE (
        ECHO remove model files from "!DEST!\llama-65B" to re-generate.
        DIR /B "!DEST!\llama-65B\ggml-model*.bin*"
        ECHO ---------------------------------------------------------
    )
)
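The "todo: quantize in parallel" note above could be addressed in Python with a thread pool, since each shard is an independent external process. A minimal sketch, assuming the same shard naming as the batch script (ggml-model.bin, .bin.1, ...); the quantize binary path is a parameter so it can be adapted:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Sketch: run quantize on each shard of a model concurrently. Threads are
# fine here because the work happens in external processes, not Python.
def quantize_shards(model_dir, shards, quantize_cmd="./quantize"):
    jobs = []
    for i in range(shards):
        suffix = f".{i}" if i else ""
        src = f"{model_dir}/ggml-model.bin{suffix}"
        dst = f"{model_dir}/ggml-model-q4_0.bin{suffix}"
        jobs.append([quantize_cmd, src, dst, "2"])
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map preserves shard order in the returned results
        return list(pool.map(subprocess.run, jobs))
```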
Sometimes I just use the tool where I'm at - not the brightest choice. 😃
os.system(f"./quantize {os.path.join('models', sys.argv[1], i)} {os.path.join('models', sys.argv[1], i.replace('f16', 'q4_0'))} 2")
Consider using something like subprocess.call to prevent security issues such as command injection via file names.

This change replaces the use of os.system in the Python script with subprocess.run, improving security by preventing potential command injection through file names. os.system is prone to command injection when used with dynamic file names, as it does not automatically escape special characters. In contrast, subprocess.run accepts the command and its arguments as a list of strings, minimizing the risk of injection. Although subprocess.call has not been officially deprecated in the Python documentation, subprocess.run has been the recommended interface since Python 3.5.
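To illustrate the difference, here is a small demonstration with a hypothetical file name containing shell metacharacters (using echo as a stand-in for the quantize binary):

```python
import subprocess

# Hypothetical file name containing shell metacharacters.
name = "ggml-model-f16.bin; echo INJECTED"

# os.system(f"./quantize {name} ...") would let the shell execute the
# injected "echo INJECTED" as a separate command. Passing arguments as a
# list bypasses the shell, so the name reaches the program verbatim:
result = subprocess.run(["echo", name], capture_output=True, text=True)
print(result.stdout)  # the whole string is printed literally; nothing is executed
```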
#!/usr/bin/env python3
import os
import subprocess
import sys

# Accept either one argument (the model size) or two (size plus --remove-f16).
if not (len(sys.argv) in (2, 3) and sys.argv[1] in ["7B", "13B", "30B", "65B"]):
    print(f"\nUsage: {sys.argv[0]} 7B|13B|30B|65B [--remove-f16]\n")
    sys.exit(1)

for i in os.listdir(f"models/{sys.argv[1]}"):
    if i.endswith("ggml-model-f16.bin"):
        subprocess.run(["./quantize", os.path.join("models", sys.argv[1], i), os.path.join("models", sys.argv[1], i.replace("f16", "q4_0")), "2"])
        if len(sys.argv) == 3 and sys.argv[2] == "--remove-f16":
            os.remove(os.path.join("models", sys.argv[1], i))
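As a possible refinement, subprocess.run can also surface failures from the quantize binary via check=True, so a bad shard is not silently followed by deletion of its f16 source (a sketch, not part of the proposed change):

```python
import subprocess

# Sketch: check=True makes a non-zero exit status raise CalledProcessError
# instead of being silently ignored by the loop.
def run_checked(cmd):
    try:
        subprocess.run(cmd, check=True)
        return True
    except subprocess.CalledProcessError as err:
        print(f"command failed (exit {err.returncode}): {cmd}")
        return False
```

Only remove the f16 file when run_checked(...) returns True.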
I hope this can help you!
Already done in https://github.com/ggerganov/llama.cpp/pull/222