llama.cpp icon indicating copy to clipboard operation
llama.cpp copied to clipboard

ggml : move rope type enum to ggml.h

Open danbev opened this issue 1 year ago • 6 comments

This commit moves the llama_rope_type enum from llama.h to ggml.h and changes its name to ggml_rope_type.

The motivation for this change is to address the TODO in llama.h and use the enum in ggml.

Note: This commit does not change the mode parameter to be of type enum ggml_rope_type. The name mode and its usage suggest that it might be more generic and possibly used as a bit field for multiple flags. Further investigation/discussion may be needed to determine if mode should be restricted to RoPE types.

danbev avatar Aug 09 '24 13:08 danbev

If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum. However, I don't think it is a bit field at the moment, even if it used to be in the past. Also, support for GLM rope was removed a while ago, so from what I can tell, the only two valid options are either normal or neox. GGML_ROPE_TYPE_NONE also doesn't really make sense in ggml, I understand that the reason it exists in llama.cpp is to support models that don't use rope, but that shouldn't be carried to ggml.

slaren avatar Aug 09 '24 15:08 slaren

Yes make sense. Probably we need ggml_rope_type with "norm" and "neox" in ggml and then another enum llama_rope_type in llama that combines the ggml_rope_type values with an extra "skip" value.

ggerganov avatar Aug 09 '24 15:08 ggerganov

@slaren @ggerganov Thanks for the reviews/feedback! I've updated with a commit containing your suggestions as I understood them.

Would it makes sense to follow up this PR with one that updates the mode parameter to be of type ggml_rope_type for the ggml_rope_* functions?

danbev avatar Aug 10 '24 05:08 danbev

If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum.

@slaren Sorry, I overlooked this yesterday. What's the advantage of having the values as constants compared to enum? And to become a bit field, do you mean that the 0 values should be changed to 1?

@danbev It would be nice to make the change in a single PR. But let's first clarify how to express the rope types before we proceed.

ggerganov avatar Aug 10 '24 05:08 ggerganov

What's the advantage of having the values as constants compared to enum?

I just think it is clearer to define the flags as int constants, because an enum is only meant to be used with one value at a time, not as a combination of flags. The type created by the enum would also not be usable (you can still use it as an int in C due to implicit casts, but in C++ that would require explicit casts).

And to become a bit field, do you mean that the 0 values should be changed to 1?

What I meant is that in practice only one value can be used at the same time, so there is no reason to make this a bit field, and using an enum is fine.

slaren avatar Aug 10 '24 12:08 slaren

Ok, got it. So how about we define NeoX type like this and we don't even need the "norm" type within ggml:

diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index 22f5dfed..a7ee17d2 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -244,6 +244,8 @@
 #define GGML_EXIT_SUCCESS 0
 #define GGML_EXIT_ABORTED 1
 
+#define GGML_ROPE_TYPE_NEOX 2
+
 #define GGUF_MAGIC "GGUF"
 
 #define GGUF_VERSION 3
@@ -437,12 +439,6 @@ extern "C" {
         GGML_FTYPE_MOSTLY_Q4_0_8_8 = 27, // except 1d tensors
     };
 
-    // Rotary Positional Embedding (RoPE) types
-    enum ggml_rope_type {
-        GGML_ROPE_TYPE_NORM =  0,
-        GGML_ROPE_TYPE_NEOX =  2,
-    };
-
     // available tensor operations:
     enum ggml_op {
         GGML_OP_NONE = 0,

And we keep the mode argument ggml_rope as int so that in the future we could use it as a bit field to store multiple options if needed.

@danbev We would need to update mode & 2 -> mode & GGML_ROPE_TYPE_NEOX in the backends as ell.

ggerganov avatar Aug 10 '24 13:08 ggerganov