Create a C-style API similar to whisper.cpp

Open thomasantony opened this issue 1 year ago • 5 comments

This change makes the code easier to use as a library, for example to build Python bindings on top of it. It extracts the following functions into llama.cpp:

  • llama_model_load
  • llama_eval
  • llama_model_quantize

It also moves the relevant struct definitions to llama.h. This helps, for example, to avoid redefining llama_hparams in quantize.cpp. Please let me know if you have any suggestions for improvement.

See here for an example of this library structure in use.

thomasantony avatar Mar 13 '23 03:03 thomasantony

In my fork I added this struct to bundle up all the relevant data:

struct llama_state {
    gpt_vocab vocab;
    llama_model model;
    struct {
        int64_t t_load_us = -1;
        int64_t t_sample_us = -1;
        int64_t t_predict_us = -1;
    } timing;
};
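
As an illustration only (this is not code from the fork), here is a minimal sketch of how a bundled state like the one above might be filled in. The bodies of gpt_vocab and llama_model are placeholder stand-ins, and now_us and demo_load are hypothetical helpers invented for this example. The timing fields start at -1 so that "never measured" can be told apart from a real measurement.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-ins for the real types, which are defined elsewhere in the project.
// Their contents here are placeholders for illustration only.
struct gpt_vocab   { std::map<std::string, int> token_to_id; };
struct llama_model { std::vector<float> weights; };

struct llama_state {
    gpt_vocab vocab;
    llama_model model;
    struct {
        int64_t t_load_us = -1;
        int64_t t_sample_us = -1;
        int64_t t_predict_us = -1;
    } timing;
};

// Hypothetical helper: current time in microseconds from a monotonic clock.
static int64_t now_us() {
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}

// Hypothetical loader: fills the state and records how long loading took.
// A real loader would read the vocabulary and weights from a model file.
bool demo_load(llama_state & state) {
    const int64_t t_start = now_us();
    state.vocab.token_to_id["hello"] = 0;   // placeholder vocabulary entry
    state.model.weights.assign(16, 0.0f);   // placeholder weights
    state.timing.t_load_us = now_us() - t_start;
    return true;
}
```

With everything in one object, a caller passes a single llama_state reference around instead of separate vocab, model, and timing variables.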

j-f1 avatar Mar 13 '23 16:03 j-f1

@ggerganov I have made the changes. Please let me know what you think.

thomasantony avatar Mar 16 '23 03:03 thomasantony

@j-f1 @Green-Sky @ggerganov I have done another pass at refactoring and also fixed a few logical bugs that left interactive mode broken in my original version (among other things). I have verified that interactive mode now works as intended and inference remains just as fast as before.

I have also rebased onto the latest master branch. Please take another look. Thanks!

thomasantony avatar Mar 18 '23 02:03 thomasantony

@thomasantony We want to have a C-style API in llama.h. We cannot expose C++ constructs.

For now, leave it like this and let me apply the necessary changes on top of yours to demonstrate what I have in mind, probably tomorrow or the day after. Thanks for contributing!
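
For readers unfamiliar with the pattern, a C-style API usually means an opaque handle plus free functions that take only C types, with std::string, std::vector, and references kept inside the implementation file. The sketch below shows that general pattern under stated assumptions: every name in it (demo_context, demo_init, demo_n_ctx, demo_get_tokens, demo_free) is hypothetical and is not the API that was eventually added to llama.h.

```cpp
#include <stddef.h>
#include <string>
#include <vector>

// ---- What a C-compatible header could declare (hypothetical names) ----
extern "C" {
    // Opaque handle: callers only ever hold a pointer to this type.
    struct demo_context;

    demo_context * demo_init(const char * model_path, int n_ctx);
    int            demo_n_ctx(const demo_context * ctx);
    // Copies up to `capacity` token ids into `out`; returns the number copied.
    size_t         demo_get_tokens(const demo_context * ctx, int * out, size_t capacity);
    void           demo_free(demo_context * ctx);
}

// ---- Implementation side: C++ constructs stay hidden from the header ----
struct demo_context {
    std::string      model_path;
    int              n_ctx;
    std::vector<int> tokens;
};

extern "C" demo_context * demo_init(const char * model_path, int n_ctx) {
    if (model_path == nullptr || n_ctx <= 0) return nullptr;
    try {
        // Placeholder token data; a real implementation would load a model.
        return new demo_context{model_path, n_ctx, {1, 2, 3}};
    } catch (...) {
        return nullptr;  // never let a C++ exception cross the C boundary
    }
}

extern "C" int demo_n_ctx(const demo_context * ctx) {
    return ctx != nullptr ? ctx->n_ctx : 0;
}

extern "C" size_t demo_get_tokens(const demo_context * ctx, int * out, size_t capacity) {
    if (ctx == nullptr || out == nullptr) return 0;
    const size_t n = ctx->tokens.size() < capacity ? ctx->tokens.size() : capacity;
    for (size_t i = 0; i < n; ++i) out[i] = ctx->tokens[i];
    return n;
}

extern "C" void demo_free(demo_context * ctx) {
    delete ctx;  // deleting a null pointer is a no-op
}
```

Because the header side uses only pointers, integers, and C strings, bindings for Python or other languages can call it through a plain C foreign-function interface.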

ggerganov avatar Mar 18 '23 17:03 ggerganov

Okay, thanks. In the meantime, I will keep this branch up to date with the new changes on the master branch.

thomasantony avatar Mar 18 '23 17:03 thomasantony

Superseded by #370

ggerganov avatar Mar 21 '23 20:03 ggerganov