
Embed yolo files

Open katsu560 opened this issue 9 months ago • 14 comments

Some apps, such as yolov3-tiny, need additional files to run, for example the label file (coco.names) and the alphabet label images (100_0.png, ...). If these files are embedded in the model (GGUF) file and the app reads them from there, the app becomes more portable.

I added the following:

  • a new GGUF_TYPE_NAMEDOBJECT type, holding a name (file path) and a value (file body), for adding files to a GGUF file
  • NAMEDOBJECT support in gguf-py: constants.py, gguf_reader.py, gguf_writer.py
    • please see the pull request to llama.cpp
  • a gguf-addfile.py script that adds files to a GGUF file
    • files are added either as individual NAMEDOBJECT entries (general.namedobject.N) or, with the --array option, as a NAMEDOBJECT array (general.namedobject[N])
  • NAMEDOBJECT support in ggml: ggml.h, ggml.c
  • yolov3-tiny now reads coco.names and the alphabet labels from the GGUF file
    • it tries the GGUF file first and falls back to reading the file from disk if that fails
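The "GGUF first, then file" fallback in the last bullet can be sketched roughly as follows. This is only an illustration, not the code from the patch: `embedded_lookup` and `load_resource` are hypothetical names, and the callback stands in for whatever wraps the embedded NAMEDOBJECT lookup in the real app.

```cpp
#include <fstream>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical callback type (not part of ggml): fills `out` from an embedded
// NAMEDOBJECT and returns true on success.
using embedded_lookup = std::function<bool(const std::string & name, std::string & out)>;

// Hypothetical helper: try the embedded copy first, then fall back to the file on disk.
inline bool load_resource(const std::string & path,
                          const embedded_lookup & from_gguf,
                          std::string & out) {
    if (from_gguf && from_gguf(path, out)) {
        return true;                    // served from the GGUF file
    }
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return false;                   // neither source has the resource
    }
    std::ostringstream ss;
    ss << file.rdbuf();                 // read the whole file into memory
    out = ss.str();
    return true;
}
```

With this shape, a model file without embedded resources keeps working as long as the resource files are still present on disk.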

A NAMEDOBJECT consists of a name (file path) and a value (file body):

    struct gguf_nobj {
        uint64_t nname;  // length of name
        char   * name;   // name in utf8
        uint64_t n;      // length of data in bytes
        char   * data;   // data body (file body)
    };
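Because both fields carry an explicit length, neither name nor data has to be treated as a NUL-terminated string, which matters for binary bodies such as PNG images. A minimal sketch of reading the fields that way is below; the struct is copied from above so the example compiles on its own, and `nobj_found`, `nobj_name` and `nobj_body` are hypothetical helpers, not functions from the patch.

```cpp
#include <cstdint>
#include <string>

// Local copy of the struct from this issue, so the sketch builds without the patched ggml.h.
struct gguf_nobj {
    uint64_t nname;  // length of name
    char   * name;   // name in utf8
    uint64_t n;      // length of data in bytes
    char   * data;   // data body (file body)
};

// Hypothetical helper: the issue treats n == 0 as "not found".
inline bool nobj_found(const gguf_nobj & o) {
    return o.n != 0;
}

// Hypothetical helpers: copy name/body using the stored lengths, never strlen().
inline std::string nobj_name(const gguf_nobj & o) {
    return o.name ? std::string(o.name, o.nname) : std::string();
}

inline std::string nobj_body(const gguf_nobj & o) {
    return o.data ? std::string(o.data, o.n) : std::string();
}
```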

Function usage:

    struct gguf_nobj gguf_find_name_nobj(const struct gguf_context * ctx, const char * name);

Call gguf_find_name_nobj() with ctx, a pointer to the gguf_context, and name, a UTF-8 encoded string such as a file name. The function searches for the NAMEDOBJECT called name and returns a struct gguf_nobj. If nothing is found, it returns an empty struct (0, NULL, 0, NULL), so nobj.n == 0 means 'not found'. If the object is found, nobj.name holds the name, nobj.n holds the length of nobj.data in bytes, and nobj.data holds the data bytes.

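    // look up the embedded file by name
    struct gguf_nobj nobj = gguf_find_name_nobj(ctx, filename);
    if (nobj.n == 0) {
        return false;  // not found in the gguf file
    }
    // membuf: a std::streambuf over the byte range [data, data + n)
    membuf buf(nobj.data, nobj.data + nobj.n);
    std::istream file_in(&buf);
    if (!file_in) {
        return false;
    }
    // read the labels, one per line
    std::string line;
    while (std::getline(file_in, line)) {
        labels.push_back(line);
    }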
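The snippet above uses `membuf` without showing its definition. One common way to write such a type is a small `std::streambuf` subclass whose get area points at the existing bytes; the sketch below assumes that shape and may differ from what the patch actually does. `read_lines` is a hypothetical helper added only to make the example testable.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Assumed definition of `membuf`: a read-only stream buffer over [begin, end).
// No bytes are copied; the caller must keep the memory alive while reading.
struct membuf : std::streambuf {
    membuf(char * begin, char * end) {
        assert(begin <= end);
        setg(begin, begin, end);  // get area: start, current position, end
    }
};

// Hypothetical helper: split an in-memory text into lines, as the label loader does.
inline std::vector<std::string> read_lines(char * data, std::size_t n) {
    membuf buf(data, data + n);
    std::istream in(&buf);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

This minimal buffer supports sequential reads only. Seeking with seekg() or tellg() would need seekoff() and seekpos() overrides as well.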

Script usage:

    python3 gguf-addfile.py [--array] input-gguf-file output-gguf-file files ...

  • without --array, files are added as individual NAMEDOBJECT entries (general.namedobject.N)
  • with --array, files are added as a NAMEDOBJECT array (general.namedobject[N])

katsu560, May 19 '24 11:05