Cesium
Architecture-dependent sizes in portable code
There's a problem: if we compile to `AnyCPU`, then we cannot generally determine `sizeof(void*)`, which has to be a constant expression according to the standard. The same goes for, say, `size_t` or `ptrdiff_t`, and actually any architecture-dependent types, mostly derived from pointers.
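To make the problem concrete, here is a small C sketch of places where the pointer size has to be known at compile time. The names `pointer_sized_buffer`, `node` and `buffer_size` are made up for illustration, and the static assertion assumes a mainstream target with 4- or 8-byte pointers.

```c
#include <stddef.h>

/* 1. A file-scope array bound must be an integer constant expression,
      so sizeof(void *) has to be resolved by the compiler. */
char pointer_sized_buffer[sizeof(void *)];

/* 2. _Static_assert (C11) also takes a constant expression.
      Assumption: the target has 4- or 8-byte pointers. */
_Static_assert(sizeof(void *) == 4 || sizeof(void *) == 8,
               "unexpected pointer size");

/* 3. The layout of any struct that contains a pointer depends on it. */
struct node {
    struct node *next;
    int value;
};

/* Returns the size chosen for the buffer at compile time. */
size_t buffer_size(void) {
    return sizeof pointer_sized_buffer;
}
```

None of these can be decided once for a binary that is supposed to run with either pointer size.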
There are two possible strategies of dealing with that.
- We can forbid using `sizeof` with a pointer and any pointer-sized types in portable code. Our users will have to specify the architecture when compiling a binary that uses these features.

  It is possible that this will effectively prevent compiling most C code to `AnyCPU`.

  Also, using such C libraries from external code would sometimes be problematic (the library author will have to distribute versions built for each architecture, which kind of defeats the point of Cesium).
- We can introduce a special architecture-independent compilation mode which will use 8 bytes for any pointer or `size_t`, and generally will prefer bigger object sizes, but will still allow building to `AnyCPU`.

  This may be problematic for cases when a C library exposes some .NET interface which operates on pointer types or `IntPtr`: it's unclear how to properly compile that (or if it is even possible).
For now, I am considering going both ways: add a flag to enable (or disable) architecture-independent compilation, and allow the user to specify the architecture that will be used to calculate `sizeof` and whatnot.
Depends on:
- [ ] #353,
- [x] #354.
Basically, what I propose is to replace `int IType.SizeInBytes` with an `IExpression IType.GetSizeInBytes()` function or property. Paired with constant folding, this gives the same IL as today, and we can emit dynamic IL for types like `void*[10]`.
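The idea of a size that is an expression rather than a number can be sketched in C as follows. This is only an illustration, not Cesium's code: `size_expr`, `try_fold` and `evaluate` are hypothetical names, and the model ignores padding and alignment, treating a size as "constant bytes plus some number of pointer-sized slots".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size expression: constant_bytes + pointer_count * sizeof(void*). */
typedef struct {
    size_t constant_bytes; /* part that is known at compile time */
    size_t pointer_count;  /* number of pointer-sized slots */
} size_expr;

size_expr size_of_int32(void)   { return (size_expr){4, 0}; }
size_expr size_of_pointer(void) { return (size_expr){0, 1}; }

/* Size of T[n], given the size expression of T. */
size_expr size_of_array(size_expr element, size_t n) {
    return (size_expr){element.constant_bytes * n, element.pointer_count * n};
}

/* Constant folding: succeeds only when no pointer-sized part is left. */
bool try_fold(size_expr e, size_t *out) {
    if (e.pointer_count != 0) {
        return false;
    }
    *out = e.constant_bytes;
    return true;
}

/* Stand-in for the code that would otherwise be emitted and run at runtime. */
size_t evaluate(size_expr e, size_t pointer_size) {
    return e.constant_bytes + e.pointer_count * pointer_size;
}
```

In this model `int[10]` folds to the constant 40, so nothing changes for such types, while `void*[10]` cannot be folded and is evaluated with the actual pointer size (80 bytes with 8-byte pointers, 40 with 4-byte ones).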
Ok, after some thought, I've decided we'll try to support the following architecture sets in Cesium:
- `32b`: an architecture with 32-bit pointers, gets compiled as an x86 assembly (maybe with a flag to make it compatible with ARM32, though I'm not sure such a flag exists),
- `64b`: an architecture with 64-bit pointers, gets compiled as an x64 assembly (with obligatory ARM64 support),
- `wide`: an architecture forcing pointers to be 64-bit even on a 32-bit platform (we'll have a lot of fun with that I reckon),
- `dynamic`: an architecture calculating pointer size dynamically at runtime.
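As a rough sketch of how the four sets differ with respect to pointer size, the mapping could look like this in C. The enum and function names are hypothetical and do not come from the Cesium code base.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enumeration of the four architecture sets listed above. */
typedef enum { ARCH_32B, ARCH_64B, ARCH_WIDE, ARCH_DYNAMIC } arch_set;

/* Writes the pointer size to *out when it is known at compile time.
   Returns false for the dynamic set, where it is only known at runtime. */
bool static_pointer_size(arch_set arch, size_t *out) {
    switch (arch) {
    case ARCH_32B:
        *out = 4;
        return true;
    case ARCH_64B:
    case ARCH_WIDE: /* 8 bytes even when running on a 32-bit platform */
        *out = 8;
        return true;
    case ARCH_DYNAMIC:
        return false;
    }
    return false;
}
```

Only the `dynamic` case leaves the size open, which is why it is the one that needs size expressions evaluated at runtime.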
Notes:
- I am specifically calling `32b` and `64b` architecture sets and not architectures, and explicitly not calling them `x86` and `x64`, to avoid confusion with the actual x86 and x86_64 architectures. We only impose restrictions on pointer size (and, likely, memory layout in the future, when we implement `offsetof` and whatnot), and not the instruction set.

  Though it's possible that we won't be able to make produced binaries compatible with both x86 and ARM32, or both x86_64 and ARM64. In such a case, the idea will be to introduce four "real" architectures instead (x86, ARM32, x86_64, ARM64). The general scheme won't change in such a case; internally in the compiler code, we'll have a flag for architecture bitness and not the actual output architecture.
- Both `wide` and `dynamic` should provide portable Any CPU-targeted binaries.
- Ideologically, `dynamic` will work as @kant2002 proposed.

  Yet for `dynamic`, I don't want to extend its magic scope too much for now. Certain things will be forbidden in `dynamic`, such as fixed arrays whose size is based on type sizes or offsets (e.g. `struct foo { char x[sizeof(void*)]; };`). All these things should fail at compile time.

  It is possible to make these types work by abusing runtime dispatch, but their interop story will be very confusing (while codegen will just become mildly messy). We may consider supporting them in the future, if demand arises.
- I want to introduce all of these in one run right now, but only the future will tell how fruitful these ideas are; maybe we'll drop some of the more exotic variants, or significantly limit the scope of what's allowed in `wide` and `dynamic`.

  I really want at least one architecture to support 100% of the C standard, but I'm ready to make compromises on all the others. This will still allow us to call Cesium a C17-compliant compiler for that architecture, right?
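To illustrate the restriction on `dynamic` described in the notes, here are the two kinds of `sizeof(void*)` use side by side in C. The struct is the example from the note; `make_pointer_table` is a made-up helper, and the claim that this second form stays valid under `dynamic` is an assumption based on the pointer size being computed at runtime there.

```c
#include <stdlib.h>

/* Would be rejected under `dynamic` per the note above: the array bound,
   and therefore the layout of the struct, depends on the pointer size. */
struct foo { char x[sizeof(void *)]; };

/* Assumed to remain fine under `dynamic`: sizeof(void *) only appears in
   an ordinary runtime expression, so it can be computed when the code runs. */
void **make_pointer_table(size_t count) {
    return calloc(count, sizeof(void *));
}
```

An ordinary C compiler accepts both, since it always targets a single pointer size; the distinction only matters when the size is not fixed at compile time.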
Seems like `wide` will require a separate version of the standard library.
We could either compile it using `#if` magic, or have fun with Cecil and post-processing.