ggml icon indicating copy to clipboard operation
ggml copied to clipboard

change of address storage of ptr value tensor->data from uint64_t to …

Open adamdadd opened this issue 2 years ago • 6 comments

Small change that I hope makes sense, the data value in the struct is of type void* so a cast to uintptr_t rather than uint64_t seems more appropriate here and may prove useful in the case of none-64-bit architectures.

Great project! Many thanks

adamdadd avatar Jun 18 '23 04:06 adamdadd

Thanks!

I used uint64_t specifically to handle none-64-bit architectures, because with the proposed change the output will depend on the architecture, while with the version on master it is independent from the architecture. Do I have a flaw in my logic?

ggerganov avatar Jun 18 '23 10:06 ggerganov

imagine running ggml on CHERI though, which will hardware fault if you try to use a mere integer as a pointer

LoganDark avatar Jun 19 '23 20:06 LoganDark

imagine running ggml on CHERI though, which will hardware fault if you try to use a mere integer as a pointer

Do you mean CheriBSD?

lin72h avatar Jun 25 '23 03:06 lin72h

imagine running ggml on CHERI though, which will hardware fault if you try to use a mere integer as a pointer

Do you mean CheriBSD?

No, all CHERI architectures support capabilities at a hardware level, which means they will fault regardless of the presence of an operating system.

I believe they do have a compatibility mode that does allow non-capability pointers (so just integers), which GGML probably would depend on if compiled for CHERI (assuming it can even store pointers at all, it seems to assume they are 64-bit).

LoganDark avatar Jun 25 '23 04:06 LoganDark

Some brief background on CHERI may be useful here. I'll attempt to give an overview of the idea. The fundamental aim is to provide memory safety on a hardware level. Current support is for RISCV, MIPS, ARMv8-A based CPUs using a ported version of LLVM-13.

Capabilities (128-bit pointers, 64-bit address + metadata such as permissions, access bounds etc.) are created and enforced on a hardware level and their validity is checked using an addtional tag-bit in specific capability registers. This is a hardware check on the authenticity of the capability preventing forgery.

In the case of CHERI there are indeed 2 modes. These are chosen at compile time:

  • There is the default recommended mode PURECAP which compiles all pointers to capabilities.

  • The second mode is HYBRID, this keeps pointers but allows for capabilities along-side them that must be specified explicitly as a type.

In the case of purecap dereferencing an int64_t type that holds a pointer value would throw a SIGPROT as it would not have the correct permissions for access.

In it's current implementation I believe it would throw a warning at compile time and the compiler would attempt an implicit conversion but my reasoning is that a use of int64_t doesn't necessarily equate to support of an address on all architectures.

Hope that was somewhat helpful 😊

adamdadd avatar Jun 28 '23 02:06 adamdadd

That's actually hilarious that I correctly pointed out CHERI if that was the real motivation for this pull request :D

LoganDark avatar Jun 28 '23 02:06 LoganDark

Decided to remove the pointer address saving as it is not used anyway. At some point I had some ideas to utilize it when loading exported ggml graphs, but later it turned out to not be necessary

ggerganov avatar Jul 02 '23 15:07 ggerganov