stride icon indicating copy to clipboard operation
stride copied to clipboard

Move to a more modern and consistent API using Spans

Open dotlogix opened this issue 8 months ago • 14 comments

Is your feature request related to a problem? Please describe. When using the engine I found a bunch of places where we can benefit from using spans over the traditional array APIs. A common pattern atm is passing an array+length or array+index+length which is exactly what a Span is as well. Most of these usages copy the content of the Array to an internal Buffer so they absolutly don't care if we use a Span instead.

This would allow us to mark the array APIs as obsolete and move to more memory/cache friendly constructs around Span & Ref. Spans can also be pinned to use them as Native memory where needed. We could even use NativeMemory or other Buffer constructs internally for high performance usages.

I guess it would also make it easier to work with Silk.net which heavily relies on Spans as well

Describe the solution you'd like Instead of APIs like this: void SetValue(T arg1) void SetValue(T[] args) void SetValue(int count, T[] args)

I would suggest using void SetValue(T arg1) => SetValue(MemoryMarshal.CreateSpan(ref arg1, 1)) void SetValue(ReadOnlySpan<T> args)

This makes it easier to maintain as we only need to implement a single method compared to 3 while preserving all of the abilities as previously. It also allows you to use partial or stackallocated arrays (stackalloc int[] { 1, 2, 3 }), collection expressions (NET 8 [1,2,3] ), or fixed size buffers (NET 8) to be passed without any memory allocations.

Describe alternatives you've considered Array Pooling could work but seems like a waste when passing only part's of an array.

Additional context Collection expressions: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/collection-expressions Spans: https://learn.microsoft.com/en-us/archive/msdn-magazine/2018/january/csharp-all-about-span-exploring-a-new-net-mainstay

I would happily volunteer to do this migration :)

dotlogix avatar Oct 23 '23 13:10 dotlogix

May also be of interest: #1972 which shows what I am aiming for in more places.

dotlogix avatar Oct 23 '23 14:10 dotlogix

some notes

  1. files that get shared with a netstandard2.0 library will have problems with it as most span APIs dont exist in netstandard , in particualr here with assemblyprocessor which targets netstandard2.0 and net6.0 and both have to work
  2. spans, memory and so on is harder to use, so i wouldnt like it if the array or whatever apis get declared obsolete, it would make the entrance for new gamedevs harder
  3. for work that allready targets this was attempted by me here https://github.com/stride3d/stride/pull/1957 but i ran exactly into point 1
  4. most unsafe code can be replaced with memorymarshal and what not , but parts of memorymarshal and so on came with .net7 which means you cant convert yet this section

it would be good to have a summary of what needs to be rewritten and what can be rewritten after the port to net8

IXLLEGACYIXL avatar Oct 23 '23 14:10 IXLLEGACYIXL

Thx for your comment. For 1: Actually a lot of that can be polyfilled with libraries like Fody.

For 2: Can you describe what is more difficult to use with spans? Arrays can be converted to spans without any issues and even be sliced and copied more easily using spans. With NET 8 we also get collection expressions support for spans making them the much better choice in general (obviously biased 😄 )

For 3 and 4: I can provide these APIs as a polyfill if we are willing to reference FodyIL in this project. I assume there are also existing solutions for most of these APIs available out there. If you are interested here is a link: https://github.com/ltrzesniewski/InlineIL.Fody

I use it quite a lot in my ECS where I need to do stuff not currently supported outside of IL 🙈

dotlogix avatar Oct 23 '23 14:10 dotlogix

FYI, for 1 very few files are concerned. But I agree that it's not great UX-wise that you often notice AssemblyProcessor/VSPackage is broken much after a dev spent time making those changes. Especially before I fixed teamcity to compile those projects as part of PR checks, we would only notice during package build, delaying the release. Note: maybe those common files should become part of shared/extra projects that gets compiled with .NET Standard even as part of the normal Stride.sln build to quickly notice issues during refactors? This will result in a tiny bit of extra compile time though. However, now that teamcity auto build it, maybe it's not as necessary anymore?

xen2 avatar Oct 23 '23 15:10 xen2

the release for 4.2 is planned for november when net8 gets released

so adding now an extra library for that wouldnt make to much sense ( for the net6 section ) , for the netstandard2.0 section this is only a small portion but it should be paid attention to it

for the question "what is more difficult to use with spans" , if you learn C# and just want to get started the entrance level for working with spans and memory and blabla is much higher than just potato lame arrays as you learn arrays from the beginning

microsoft doesnt even mention spans or memory or all that at their getting started page which covers more than just "getting started" , you can call yourself advanced when you can do all of thatt https://learn.microsoft.com/en-us/dotnet/csharp/

also microsoft also keeps the array apis if the entrance level is too high you will just lose new comers who dont know much about C# yet

IXLLEGACYIXL avatar Oct 23 '23 15:10 IXLLEGACYIXL

@IXLLEGACYIXL Ah got it makes sense to me. We can keep them as is then and just redirect to the Span APIs. I think with dotnet 8 colelction expressions will be the more popular choice though making Spans more accessible for newbies. And also imho Stride is not really aiming for beginner devs but maybe that should/could change in the near future.

dotlogix avatar Oct 23 '23 15:10 dotlogix

What do you think if I build these changes onto the dotnet 8 branch instead? That way we can introduce them when the newer APIs are avaliable for all projects reducing the risks to introduce incompatibilities?

dotlogix avatar Oct 23 '23 15:10 dotlogix

What do you think if I build these changes onto the dotnet 8 branch instead? That way we can introduce them when the newer APIs are avaliable for all projects reducing the risks to introduce incompatibilities?

making a summary first on what even could be made better would be good to start with so more can participate as alot of areas in stride are from the old times

IXLLEGACYIXL avatar Oct 23 '23 15:10 IXLLEGACYIXL

So this is a quick summary of all the files containing methods with arrays. Listing all methods would be too much so the files have to do for now :) Most of what I saw should be very simple to change optimization can still be done afterwards report.txt

dotlogix avatar Oct 23 '23 16:10 dotlogix

https://gist.github.com/IXLLEGACYIXL/2bf489aeb66d0e4d0031f620b0a6a3d9 the report as gist

IXLLEGACYIXL avatar Oct 23 '23 16:10 IXLLEGACYIXL

Summary

The issue proposes moving to a more modern and consistent API using Spans, suggesting the use of Spans instead of traditional array APIs for improved performance and flexibility. The author highlights the benefits of Spans, including better memory/cache utilization and support for Native memory. They also mention that Silk.net heavily relies on Spans.

Key Takeaways

The author suggests replacing array APIs with Spans for improved performance and memory utilization. Spans can be used to work with partial arrays, stackallocated arrays, collection expressions, and fixed size buffers without memory allocations. The use of Spans can be facilitated through libraries like Fody. Compatibility issues may arise with netstandard2.0 libraries that lack support for Spans. There are concerns about the higher learning curve for new developers when working with Spans compared to arrays. The proposed changes can be made on the dotnet 8 branch for compatibility and reduced risks of incompatibilities. A report containing files with methods using arrays can help identify areas for improvement in the Stride project.

Leaving this here...

I haven't made up my mind yet, but I'm leaning towards the Spans API. Since Stride is a real-time app and performance is a key feature. I would like to how much performance improvement we can actually achieve. The .NET 8 runtime includes numerous performance optimizations. will it be 1%, 10%, 30%?

tebjan avatar Oct 25 '23 11:10 tebjan

OK so I went ahead and created some benchmarks here are the results: As you may see Spans are in average ~20-50% faster except of some outliers (LoopCopy) which you wouldn't really do in code anyway because there are intrinsic functions available for these operations.

Benchmarked with .Net 7 for reference.

Method Mean Error StdDev Gen0 Allocated
InvokeWith_ParamsArray 4.626 ns 0.0838 ns 0.0784 ns 0.0057 48 B
InvokeWith_Span 2.864 ns 0.0148 ns 0.0131 ns - -
InvokeWith_ExplicitArray 4.605 ns 0.0775 ns 0.0687 ns 0.0057 48 B
InvokeWith_SpanTuple 1.897 ns 0.0088 ns 0.0078 ns - -
InvokeWith_StaticArray 1.886 ns 0.0058 ns 0.0051 ns - -
InvokeWith_StaticSpan 2.126 ns 0.0097 ns 0.0091 ns - -
Method Mean Error StdDev Gen0 Allocated
IterateWith_ParamsArray 6.582 ns 0.1316 ns 0.2746 ns 0.0057 48 B
IterateWith_Span 3.273 ns 0.0288 ns 0.0255 ns - -
IterateWith_ExplicitArray 6.181 ns 0.0629 ns 0.0558 ns 0.0057 48 B
IterateWith_SpanTuple 2.891 ns 0.0217 ns 0.0203 ns - -
IterateWith_StaticArray 2.846 ns 0.0170 ns 0.0142 ns - -
IterateWith_StaticSpan 3.110 ns 0.0222 ns 0.0196 ns - -
IterateWith_BigStaticArray 45.863 ns 0.2340 ns 0.2189 ns - -
IterateWith_BigStaticSpan 3.121 ns 0.0359 ns 0.0336 ns - -
Method Mean Error StdDev Gen0 Allocated
CopyWith_ParamsArray 9.697 ns 0.1688 ns 0.1579 ns 0.0057 48 B
CopyWith_Span 6.677 ns 0.0317 ns 0.0297 ns - -
CopyWith_ExplicitArray 9.479 ns 0.1031 ns 0.0861 ns 0.0057 48 B
CopyWith_SpanTuple 6.607 ns 0.0205 ns 0.0181 ns - -
CopyWith_StaticBackedSpan 4.534 ns 0.0386 ns 0.0342 ns - -
CopyWith_StaticArray 5.719 ns 0.0305 ns 0.0271 ns - -
CopyWith_BigStaticArray 12.826 ns 0.0624 ns 0.0553 ns - -
CopyWith_BigStaticBackedSpan 13.004 ns 0.0327 ns 0.0290 ns - -
Method Mean Error StdDev Gen0 Allocated
LoopCopyWith_ParamsArray 9.100 ns 0.2121 ns 0.6254 ns 0.0057 48 B
LoopCopyWith_Span 5.387 ns 0.0528 ns 0.0494 ns - -
LoopCopyWith_ExplicitArray 7.183 ns 0.1367 ns 0.1404 ns 0.0057 48 B
LoopCopyWith_SpanTuple 4.617 ns 0.0725 ns 0.0678 ns - -
LoopCopyWith_StaticArray 3.807 ns 0.0392 ns 0.0348 ns - -
LoopCopyWith_StaticBackedSpan 4.144 ns 0.0221 ns 0.0207 ns - -
LoopCopyWith_BigStaticArray 39.892 ns 0.3126 ns 0.2771 ns - -
LoopCopyWith_BigStaticBackedSpan 68.370 ns 0.4283 ns 0.4006 ns - -
Method Mean Error StdDev Gen0 Allocated
IndexOfWith_BigStaticArray 23.186 ns 0.1333 ns 0.1247 ns - -
IndexOfWith_BigStaticSpan 4.838 ns 0.0709 ns 0.0664 ns - -
Method Mean Error StdDev Gen0 Allocated
LoopEqualWith_BigStaticArray 38.094 ns 0.2014 ns 0.1884 ns - -
SequenceEqualWith_BigStaticArray 27.509 ns 0.5464 ns 0.6910 ns - -
SequenceEqualWith_BigStaticSpan 4.258 ns 0.0211 ns 0.0187 ns - -

Code: SpanBenchmarks.txt

dotlogix avatar Oct 25 '23 13:10 dotlogix

Using Span is highly recommended not just instead of arrays but also in favor of almost all IntPtr/nint/void* usage.

Once you have Span pinning is mostly not necessary anymore. Namely ultimately only still in interop scenario's.

And in some key places for best performance even Span can be eliminated in favor of ref local and Unsafe.Add and friends. That can still shave tens of percents of performance off, depending on the scenario and willingness to write more complex, longer code.

ericwj avatar Jan 20 '24 00:01 ericwj