[Proposal]: ReadOnlySpan initialization from static data
- [x] Proposed
- [ ] Prototype: Not Started
- [ ] Implementation: Not Started
- [ ] Specification: Not Started
Summary
Provide a syntax for initializing a ReadOnlySpan<T> from constant data and with guaranteed zero allocation.
Motivation
https://github.com/dotnet/roslyn/pull/24621 added compiler support that translates:
ReadOnlySpan<byte> data = new byte[] { const, values };
into non-allocating code that blits the binary data into the assembly data section and creates a span that points directly to that data. The same optimization is done for:
static ReadOnlySpan<byte> Data => new byte[] { const, values };
We now rely on this all over the place in dotnet/runtime and elsewhere, as it provides a very efficient means for accessing a collection of constant values with minimal overhead and in a way the JIT is able to optimize consumption of very well.
However, there are multiple problems with this:
- The optimization only applies to byte-sized primitive T values, namely byte, sbyte, and bool. Specify any other type, and you fall off a massive cliff, as code you were hoping to be allocation-free now allocates a new array on every access.
- It's easy to accidentally fall off a similar cliff if at least one of the values turns out to be non-const (or becomes non-const), e.g. if a const value referred to in the initialization is changed elsewhere from a const to a static readonly.
- The syntax is confusing, as it looks like it's allocating, and PRs that optimize code from:
private static readonly byte[] s_bytes = new byte[] { ... };
to
private static ReadOnlySpan<byte> Bytes => new byte[] { ... };
are often met with confusion and misinformation.
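As a minimal sketch of the two patterns being contrasted above (the `Tables` class, the `First` constant, and the `Sum*` methods are hypothetical names used only for illustration), both members below expose the same values; per the description above, only the property form can be compiled to point at data stored in the assembly:

```csharp
using System;

static class Tables
{
    // Hypothetical constant used in the initializers. Per the motivation above,
    // changing this from const to static readonly is one way to silently lose
    // the optimization.
    private const byte First = 1;

    // Array form: a heap allocation is made once, when the type is initialized.
    private static readonly byte[] s_bytes = new byte[] { First, 2, 3 };

    // Span form: looks like it allocates, but for constant byte-sized data the
    // compiler can emit the bytes into the assembly and wrap them in a span.
    private static ReadOnlySpan<byte> Bytes => new byte[] { First, 2, 3 };

    public static int SumArray()
    {
        int sum = 0;
        foreach (byte b in s_bytes) sum += b;
        return sum;
    }

    public static int SumSpan()
    {
        int sum = 0;
        foreach (byte b in Bytes) sum += b;
        return sum;
    }
}
```

Both methods observe identical data; the difference is only in where that data lives and whether accessing it allocates, which is not visible from the syntax.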
Detailed design
Add dedicated syntax for creating spans without allocating that:
- Doesn't visually look like it's allocating (i.e. avoid use of 'new').
- Provides validation with errors if the data isn't provably constant.
As the following syntax fails to compile today:
ReadOnlySpan<byte> data = { 1, 2, 3 };
and
static ReadOnlySpan<byte> Data => { 1, 2, 3 };
they could be co-opted for this purpose.
Opening up this syntax via the removal of new T[] doesn't prevent the optimization from being applied by the compiler when new T[] is used, but it would guarantee a non-allocating implementation when the new T[] isn't used.
This could also tie in with params Span<T>: the syntax for the local variant could blit the data into the assembly if possible, or else fall back to the same implementation it would use for a params Span<T> method argument (assuming that params syntax doesn't itself fall back to heap allocation).
Implementation-wise, the compiler would use RVA statics whenever possible and fall back to static readonly arrays otherwise.
Drawbacks
TBD
Alternatives
TBD
Unresolved questions
Related to this, we have prototyped in both the runtime and the C# compiler support for extending this optimization to more than just byte-sized primitives. The difficulty with other primitives is endianness, and it can be addressed in a manner similar to array initialization: the runtime exposes a helper that either returns the original pointer, or sets up a cache of a copy of the data reversed based on the current endianness. There are also multiple fallback code generation options available for when that API isn't available. Such improvements to the compiler are related but separate from the improvements to the language syntax for this existing optimization.
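The endianness handling described above could look roughly like the following hand-written sketch. This is not the runtime helper or the compiler's actual code generation; `Int32Table`, `RawData`, and `CreateCopy` are hypothetical names, and the data is assumed to be stored little-endian:

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

static class Int32Table
{
    // Little-endian encoding of the int values { 1, 2, 3 }.
    private static ReadOnlySpan<byte> RawData => new byte[]
    {
        1, 0, 0, 0,
        2, 0, 0, 0,
        3, 0, 0, 0,
    };

    // Lazily created copy, used only when the machine is big-endian.
    private static int[] s_bigEndianCopy;

    public static ReadOnlySpan<int> Values
    {
        get
        {
            if (BitConverter.IsLittleEndian)
            {
                // Stored layout already matches the machine: reinterpret in place.
                // Caveat: this sketch ignores alignment of the underlying data.
                return MemoryMarshal.Cast<byte, int>(RawData);
            }

            // Otherwise decode once and cache the result.
            return s_bigEndianCopy ??= CreateCopy(RawData);
        }
    }

    private static int[] CreateCopy(ReadOnlySpan<byte> raw)
    {
        var copy = new int[raw.Length / sizeof(int)];
        for (int i = 0; i < copy.Length; i++)
        {
            copy[i] = BinaryPrimitives.ReadInt32LittleEndian(raw.Slice(i * sizeof(int)));
        }
        return copy;
    }
}
```

On a little-endian machine no array is allocated; on a big-endian machine the cost is a one-time copy, which matches the "initialization required" caveat rather than a per-access allocation.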
Design meetings
- https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-21.md#readonlyspan-initialization-from-static-data
cc: @jaredpar
As specified in https://github.com/dotnet/csharplang/blob/3c8559f186d4c5df5d1299b0eaa4e139ae130ab6/spec/arrays.md#array-creation the syntax localOrField = { 1, 2, 3 }; is just a "shortcut" for localOrField = new int[]{ 1, 2, 3 };
So I think the optimization should be done regardless how the array initialization is written syntactically. Currently we have these forms AFAIK:
- new Type[] { value1, value2 }
- new[] { value1, value2 }
- { value1, value2 }
To force this const-array-behaviour we could maybe introduce a new intrinsic Function like System.Array.CreateConstArray(1, 2, 3).
BTW: it would be nice if we could write:
static ReadOnlySpan<byte> Data => { 1, 2, 3 };
or
static ReadOnlySpan<byte> Data { get { return { 1, 2, 3 }; } }
So I think the optimization should be done regardless how the array initialization is written syntactically.
As I noted, the compiler would be free to do so... it already does so. But one of the key aspects here is the compiler preventing you from shooting yourself in the foot, guaranteeing that the syntax is non-allocating (at least beyond any initialization required, e.g. in the case of a big endian machine reading the assembly little endian data), and it would be a breaking change if the existing usage that allocates stopped compiling.
Provides validation with warnings or errors if the expression would allocate
This feels like the best approach to me. Essentially we could standardize on = { ... } as being the non-allocating form. Then we could warn in any situation where it would cause an allocation:
- The contents of the { ... } were non-const
- The target of the expression was Span<byte>
- The target of the expression was ReadOnlySpan<T> where T is not byte, bool or sbyte. This of course would change if we implemented the optimization for non-byte sized types.
The one part that still bothers me is arrays. That syntax works for arrays today but has none of the safeguards. I've thought a bit about adding a disabled warning here that could be enabled by developers who hit this a lot but I'm having trouble convincing myself that meets the bar. Given that this only works on fields and locals today, maybe just knowing it doesn't work on arrays is enough for developers.
BTW: it would be nice if we could write: static ReadOnlySpan<byte> Data => { 1, 2, 3 };
That is asking for { ... } to be a more general expression where today it's limited to field and local declarations. It's not un-doable but a bit of a bigger ask.
If we took this then I'd feel a bit more strongly about finding a way to extend this warning to cases where it crossed into arrays. Take for example the code M({1, 2, 3}) which would be legal if we generalized this to expressions. Whether or not it allocates depends on overload resolution.
I was just doing an experiment and stumbled upon what seems to be a bug:

Surely the 'buffer' is - implicitly - fixed in the sense that the GC is never going to relocate it if it's inside a ref struct?
https://github.com/dotnet/csharplang/issues/1792
@Korporal see https://github.com/dotnet/csharplang/issues/1792
--
Beaten by @stephentoub :)
Not exactly the same but related discussion: #955, more along the lines of
const ReadOnlySpan<byte> data = new byte[] { const, values };
I particularly like the idea of putting const somewhere in the expression to indicate that the result must be considered "constant" (and therefore non-allocating, etc).
I feel like this could be more generally extended in the future to handle things like constexpr or unmanaged constants (the ability to define constants for blittable data types whose layout will never change, like Guid or Vector2/3/4, etc).
- https://github.com/dotnet/csharplang/discussions/688 proposes unmanaged constant support utilizing the same functionality as constant arrays of primitives and likewise a helper method for fixing up endianness. The actual data declarations fully support such declarations being structured, so the only real language requirement is some constraint/expectation/contract that the layout of a type won't change or that such layout changes (e.g. packing differences) can be handled by the runtime
I'd thought a bit about const here but discarded it for the following reasons.
Firstly if we allowed const here then we need to consider all the places in which const expressions are legal: attribute arguments, optional parameters, folding, etc ... For ReadOnlySpan<byte> I haven't dug deeply into all of these but at a high level I think there are answers for these questions. Several of them involve a significant amount of work compared to the initial suggestion of re-using an existing syntax for initialization.
One of the other suggestions we are looking at though is expanding this optimization to types which are not byte sized. For example ReadOnlySpan<char>, ReadOnlySpan<int>, etc ... That brings endianness concerns into the conversation and at the moment I don't think there are great answers for how we work those into our existing const semantics. That becomes a significant work item for us to dig into.
Secondly const only solves the issue for fields and locals. That doesn't cover all the cases where there is a desire to have the non-allocating syntax. Reading into the issue you can see the desire exists to essentially see this at the expression level hence const doesn't solve it.
That's not to say I don't want const ReadOnlySpan<byte> to work in the future (I'd actually like it). I just don't think it's necessarily the best solution for the particular problem being stated here.
@jaredpar What if the new const expressions were completely separate from the existing constant expressions and const fields/variables? Syntactically, it could look something like:
ReadOnlySpan<byte> data = const byte[] { 1, 2, 3 };
(This may be what @tannergooding meant by "putting const somewhere in the expression".)
I think this is nice, because:
- it more clearly expresses that it's the expression that's changing, not the field or variable,
- it doesn't have to work for attributes etc.,
- it works everywhere you can have expressions,
- it's consistent with new and stackalloc: the keyword indicates where you're allocating from.
It's also problematic, because it would be very confusing if const meant two related, but very different things. Though that could be solved by using a different keyword (constalloc? constexpr? static?).
@svick
What if the new const expressions were completely separate from the existing constant expressions and const fields/variables?
There isn't inherently anything wrong with this. But it's introducing a new expression form which is going to have a higher bar than re-using existing expression form in more places.
it doesn't have to work for attributes etc.,
It would be really weird if it didn't though. For example it would be weird if I could say const i = const 42 but not [My(const 1)]. I can't see introducing this without introducing the complete picture.
It's also problematic, because it would be very confusing if const meant two related, but very different things.
This doesn't bother me. It's close enough to use for both concepts. Or said differently it's not different enough that I would warrant a new keyword for it.
It would be really weird if it didn't though. For example it would be weird if I could say const i = const 42 but not [My(const 1)]. I can't see introducing this without introducing the complete picture.
It doesn't need to be supported for all constant expressions, only array initializer expressions:
ReadOnlySpan<byte> span = new byte[] { 1, 2, 3 }; // creates span from System.Array (unless optimizable)
ReadOnlySpan<byte> span = stackalloc byte[] { 1, 2, 3 }; // creates span from stack array
ReadOnlySpan<byte> span = const byte[] { 1, 2, 3 }; // creates span from assembly data section
It doesn't need to be supported for all constant expressions, only array initializer expressions:
That seems very counterintuitive. If const is meant to explicitly identify expressions that are const why should it be limited to a subset of const expressions?
Yes I agree that const 42 (or any other literal) is a bit silly but what about const A.B? Lacking const that could be a side effecting property but with const prefix I've guaranteed I'm accessing a constant here. Or even const A + B as another example.
I can't see introducing const as a way to identify const array expressions but exclude it from all the other expressions that are constant.
Championing this. I too fell into:
The syntax is confusing, as it looks like it's allocating, and PRs that optimize code from ... are often met with confusion and misinformation.
It would be nice to have a form that was less magical about how it was working with the compiler to get this result.
As the following syntax fails to compile today:
Yup. And, afaict, within roslyn, there would be no complexity to parse this at all. This would already fit directly into our syntax model. So this would just be something on the semantic side to enable. I don't know the compiler impl here well, but i have a feeling it could be done with minimal effort. Specifically:
byte[] b = { ... } is already supported, and so the compiler already effectively has to bind and translate that form to byte[] b = new byte[the_length] { ... } internally. So this would just be:
- do the same if you have { ... } on the RHS of a ReadOnlySpan<X> X = { ... } or as the expr in a ReadOnlySpan<X> X => expr property body. The rest of binding/emit should then take over and give the same optimization as today.
- Relax the rule on ref-structs as fields to allow them as static-readonly fields. Though need runtime/gc people to state if that will be a problem or not. Conceptually for C# i don't see any problems there.
- Ensure that binding can understand that if you have ReadOnlySpan<X> x = { ... } what the right type of the array initializer is. For example, if you have ReadOnlySpan<byte> X = { 1, 2, 3 } we want this to be viewed as ReadOnlySpan<byte> X = new byte[] { 1, 2, 3 } not ReadOnlySpan<byte> X = new int[] { 1, 2, 3 }. Not sure if that's trivial, or if that may require some hackery to understand these special types. Given the optimization around arrays/ROS already in the compiler today, my guess is that there are already these special type hacks, so this would be nbd to encode into the language and support here.
@jaredpar I'm interested in taking a stab at this once we have space in the schedule. Let's chat about when/if you think this is possible to slot in. Definitely low pri, but also seems nice to have, hopefully cheap, and well received by the community here.
Relax the rule on ref-structs as fields to allow them as static-readonly fields. Though need runtime/gc people to state if that will be a problem or not.
Yes, that's a problem for the runtime. One of the primary reasons for ref-structs to exist is that they cannot appear on the GC heap.
@jkotas Can we relax this in some fashion? Conceptually in this case there should be no problem that i can see (def correct me if i'm wrong). However, i'm not sure if this would be just something about ROS, or something that could extend to ref-structs in general. And, to be clear, i solely mean the use of them in static-readonly locations, nowhere else. Thanks!
--
Note: i suppose one (though unpalatable to me) possibility would be to allow someone to write:
static readonly ReadOnlySpan<byte> X = { 1, 2, 3 };
and have that actually translate to a property under the covers. conceptually that would then work as far as the runtime and everything else was concerned. But it would def be a bit wonky as that syntax really implies this is a field through and through.
Can we relax this in some fashion?
static fields are stored on the GC heap just like anything else.
This is a hard trade-off for the runtime to make. If we relax this, we will end up with increased GC latencies that translate to worse P99s, which is something that top-tier services care about a lot.
Can you clarify where the increased GC latency comes from? Conceptually, a user can (and already did) write:
private static readonly T[] static_array = { ... }
Wrapping that with ROS to make the array non-mutable doesn't seem like it should have a GC impact.
The byrefs in Span/ReadOnlySpan are expensive for the GC to walk. The GC has to find the containing object for them, which is an expensive operation.
The GC has to find the containing object for them
Why does the GC have to find the containing object for them?
So that it can mark the object as reachable.
Why does it need to mark the containing object as reachable? (aside, this back and forth seems to be really inefficient). Would it be possible for you to break down the entirety of what's going on here so each message doesn't need a further 'why' to get a sense of the very next step in the algorithm? Thanks.
I feel that I am trying to explain the basic internal working of the GC. There are books and talk series dedicated to explaining the full chain of why things are done in certain way and the trade-offs involved in the GC design. I agree that it is not useful to replay those here.
It would certainly be technically possible (though a ton of work) to allow byrefs on the heap. The problem is the performance trade-offs that would come from that decision. We are not convinced that these trade-offs would be an overall win.
The short story is that marking byrefs is expensive. Marking byrefs has to find a containing object. It is ~100x more expensive than a simple pointer dereference that is used to mark object references. Limiting byrefs to active stack only makes this cost manageable (the typical code has only so many byrefs on live stacks) and keeps the core marking algorithm simple.
There are many possible trade-offs in this area. For example, the classic Java VMs do not have byrefs like .NET. Instead, they explicitly store and pass around the containing object of the byref. It means that the code has slower throughput, but the GC is simpler and can do less work because it does not need to support lookup of the containing object from an arbitrary pointer.
I feel that I am trying to explain the basic internal working of the GC.
This doesn't feel basic to me. It's unclear to me why the containing type is relevant at all for a static ROS field. Perhaps you can go the other direction and show an example of why it would matter?
Consider static ReadOnlySpan<byte> s_field = new byte[] { 1, 2, 3 };. Something needs to keep the byte[] alive.
Or
static readonly ReadOnlySpan<byte> s_span = MemoryMarshal.CreateSpan(ref CreateSomeObject().SomeField, 1);
needing to keep that SomeObject alive.
Consider static ReadOnlySpan<byte> s_field = new byte[] { 1, 2, 3 };. Something needs to keep the byte[] alive.
Sure. I expect the field would keep that alive. I don't see what that has to do with the containing type of 's_field' as per:
The GC has to find the containing object for them, which is an expensive operation.
It's the containing object for what the ref inside of the span refers to, not the containing type for the static field. Forget spans for a moment, allowing spans there is akin to allowing static ref fields, and so if you hypothetically had:
static readonly ref byte s_byte = ref CreateSomeObject().SomeByteField;
that ref field needs to keep alive the object containing that SomeByteField that's being referenced.
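The same relationship can be observed today with a span that lives on the stack, where byrefs are already allowed. The following is only an illustrative sketch (the `Holder` class and `InteriorRefDemo` are hypothetical names): the span holds a ref into the middle of an object, and that ref is the only thing keeping the object reachable across the collection:

```csharp
using System;
using System.Runtime.InteropServices;

sealed class Holder
{
    public byte Value = 42;
}

static class InteriorRefDemo
{
    public static byte ReadThroughInteriorRef()
    {
        var holder = new Holder();

        // The span stores a byref pointing at the Value field inside the object,
        // not a reference to the Holder object itself.
        Span<byte> span = MemoryMarshal.CreateSpan(ref holder.Value, 1);

        // Drop the only object reference; just the interior ref remains.
        holder = null;
        GC.Collect();

        // To keep this read valid, the GC had to map the interior ref back to
        // its containing object and treat that object as reachable.
        return span[0];
    }
}
```

Allowing a span in a static field would require the GC to do that same interior-pointer lookup for heap-stored byrefs, which is the cost being discussed above.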