ggml : move rope type enum to ggml.h
This commit moves the llama_rope_type enum from llama.h to ggml.h and changes its name to ggml_rope_type.
The motivation for this change is to address the TODO in llama.h and use the enum in ggml.
Note: This commit does not change the mode parameter to be of type enum ggml_rope_type. The name mode and its usage suggest that it might be more generic and possibly used as a bit field for multiple flags. Further investigation/discussion may be needed to determine if mode should be restricted to RoPE types.
- [x] I have read the contributing guidelines
- Self-reported review complexity:
- [x] Low
- [ ] Medium
- [ ] High
If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum. However, I don't think it is a bit field at the moment, even if it used to be in the past. Also, support for GLM rope was removed a while ago, so from what I can tell, the only two valid options are either normal or neox. GGML_ROPE_TYPE_NONE also doesn't really make sense in ggml, I understand that the reason it exists in llama.cpp is to support models that don't use rope, but that shouldn't be carried to ggml.
Yes make sense. Probably we need ggml_rope_type with "norm" and "neox" in ggml and then another enum llama_rope_type in llama that combines the ggml_rope_type values with an extra "skip" value.
@slaren @ggerganov Thanks for the reviews/feedback! I've updated with a commit containing your suggestions as I understood them.
Would it makes sense to follow up this PR with one that updates the mode parameter to be of type ggml_rope_type for the ggml_rope_* functions?
If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum.
@slaren Sorry, I overlooked this yesterday. What's the advantage of having the values as constants compared to enum? And to become a bit field, do you mean that the 0 values should be changed to 1?
@danbev It would be nice to make the change in a single PR. But let's first clarify how to express the rope types before we proceed.
What's the advantage of having the values as constants compared to enum?
I just think it is clearer to define the flags as int constants, because an enum is only meant to be used with one value at a time, not as a combination of flags. The type created by the enum would also not be usable (you can still use it as an int in C due to implicit casts, but in C++ that would require explicit casts).
And to become a bit field, do you mean that the
0values should be changed to1?
What I meant is that in practice only one value can be used at the same time, so there is no reason to make this a bit field, and using an enum is fine.
Ok, got it. So how about we define NeoX type like this and we don't even need the "norm" type within ggml:
diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index 22f5dfed..a7ee17d2 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -244,6 +244,8 @@
#define GGML_EXIT_SUCCESS 0
#define GGML_EXIT_ABORTED 1
+#define GGML_ROPE_TYPE_NEOX 2
+
#define GGUF_MAGIC "GGUF"
#define GGUF_VERSION 3
@@ -437,12 +439,6 @@ extern "C" {
GGML_FTYPE_MOSTLY_Q4_0_8_8 = 27, // except 1d tensors
};
- // Rotary Positional Embedding (RoPE) types
- enum ggml_rope_type {
- GGML_ROPE_TYPE_NORM = 0,
- GGML_ROPE_TYPE_NEOX = 2,
- };
-
// available tensor operations:
enum ggml_op {
GGML_OP_NONE = 0,
And we keep the mode argument ggml_rope as int so that in the future we could use it as a bit field to store multiple options if needed.
@danbev We would need to update mode & 2 -> mode & GGML_ROPE_TYPE_NEOX in the backends as ell.