aspnetcore icon indicating copy to clipboard operation
aspnetcore copied to clipboard

Blazor.platform.readStringField not working anymore after dotnet 7.0 upgrade

Open devedse opened this issue 1 year ago • 20 comments

Description

Hello, I'm currently a user of nKast's WASM implementation of MonoGame: https://github.com/nkast/Wasm https://github.com/kniEngine/kni

However after a dotnet 7 upgrade this logic doesn't work anymore.

I've been trying to dive into the issue here but it seems the code that was previously used to pass parameters to the frontend isn't working anymore: image

Previously the var id should have the value theCanvas.

The way this functionality is being called can be found here: https://github.com/nkast/Wasm/blob/main/Wasm.Dom/Document.cs#L40

Reproduction Steps

I created a branch of my 3d Maze Generator making use of MonoGame / Wasm to run in the browser here: https://github.com/devedse/DeveMazeGeneratorCore/tree/dotnet7

Simply open the DeveMazeGeneratorCoreMonoGame.sln, set the Blazor project as startup and click F5.

This error should popup in the console window:

Unhandled Exception:
blazor.webassembly.js:1 System.NullReferenceException: Object reference not set to an instance of an object.
dotnet.7.0.0.3dkjxfc2jf.js:5 Uncaught Error: System.NullReferenceException: Object reference not set to an instance of an object.
   at :5259/MonoGame.Framework.BlazorGameWindow.get_ClientBounds() in C:\XGit\kni\MonoGame.Framework\Platform\Blazor\BlazorGameWindow.cs:line 62
   at Microsoft.Xna.Platform.ConcreteGraphicsDeviceManager..ctor(:5259/Game game) in C:\XGit\kni\MonoGame.Framework\ConcreteGraphicsDeviceManager.Blazor.cs:line 16
   at Microsoft.Xna.Framework.GraphicsDeviceManager..ctor(:5259/Game game) in C:\XGit\kni\MonoGame.Framework\GraphicsDeviceManager.cs:line 39
   at DeveMazeGeneratorMonoGame.TheGame..ctor(:5259/IContentManagerExtension contentManagerExtension, Nullable`1 desiredScreenSize, Platform platform) in C:\XGitPrivate\DeveMazeGeneratorCore\DeveMazeGeneratorCore.MonoGame.Core\TheGame.cs:line 152
   at :5259/DeveMazeGeneratorCore.MonoGame.Blazor.Pages.Index.TickDotNet() in C:\XGitPrivate\DeveMazeGeneratorCore\DeveMazeGeneratorCore.MonoGame.Blazor\Pages\Index.razor.cs:line 37
   at Microsoft.JSInterop.Infrastructure.DotNetDispatcher.InvokeSynchronously(:5259/JSRuntime jsRuntime, DotNetInvocationInfo& callInfo, IDotNetObjectReference objectReference, String argsJson)
   at Microsoft.JSInterop.Infrastructure.DotNetDispatcher.Invoke(:5259/JSRuntime jsRuntime, DotNetInvocationInfo& invocationInfo, String argsJson)
   at Microsoft.AspNetCore.Components.WebAssembly.Services.DefaultWebAssemblyJSRuntime.InvokeDotNet(:5259/String assemblyName, String methodIdentifier, String dotNetObjectId, String argsJson)
    at qi (dotnet.7.0.0.3dkjxfc2jf.js:5:80156)
    at Ji (dotnet.7.0.0.3dkjxfc2jf.js:5:80040)
    at _Microsoft_AspNetCore_Components_WebAssembly__Microsoft_AspNetCore_Components_WebAssembly_Services_DefaultWebAssemblyJSRuntime_InvokeDotNet (_Microsoft_AspNetCore_Components_WebAssembly__Microsoft_AspNetCore_Components_WebAssembly_Services_DefaultWebAssemblyJSRuntime_InvokeDotNet:30:5)
    at Object.invokeDotNetFromJS (blazor.webassembly.js:1:45192)
    at g (blazor.webassembly.js:1:1621)
    at A.invokeMethod (blazor.webassembly.js:1:3812)
    at tickJS ((index):61:32)

Expected behavior

I know that this behaviour is now deprecated (as of: https://learn.microsoft.com/en-us/aspnet/core/blazor/javascript-interoperability/import-export-interop?view=aspnetcore-7.0#call-javascript-from-net)

However I would expect the funcionality to remain working.

Actual behavior

The Blazor.platform.readStringField is returning: image

but should return theCanvas

Regression?

No response

Known Workarounds

Probably rewrite everything to [JSImport] [JSExport], but that's quite some work I'd rather not do now.

Configuration

dotnet 7.0 X64 Google Chrome

Other information

No response

devedse avatar Nov 22 '22 21:11 devedse

@devedse could you please create simpler repro ? As far as I could tell, we didn't remove any functionality in Net7 and marking it obsolete should not break it.

If I read the code correctly it uses readStringField and I don't know if that's public API. But it's fragile for sure. Maybe some memory offset changed ?

cc @danroth27 this seems most likely blazor interop.

pavelsavara avatar Nov 23 '22 10:11 pavelsavara

@pavelsavara, it's a bit hard for me to produce something smaller. The way I reproduced it is by manually including the sources of the NuGet packages I was using.

devedse avatar Nov 23 '22 10:11 devedse

@pavelsavara, @danroth27
here is a test branch. There are two projects for net6 and net7.

Here we are calling TestValueTupleString("theCanvas"); https://github.com/nkast/Wasm/blob/readStringFieldTest/Samples/CanvasGL/Pages/Index.razor.cs#L39 https://github.com/nkast/Wasm/blob/readStringFieldTest/Wasm.Dom/Document.cs#L51-L55 https://github.com/nkast/Wasm/blob/readStringFieldTest/Wasm.Dom/wwwroot/js/Document.7.0.0.js#L22-L31

nkast avatar Nov 23 '22 14:11 nkast

I'm not sure, but it seems to me that if you add string which is ref type into ValueTuple it would become non-blittable. And perhaps native offset of fields changed between net6 and net7 (as that's internal implementation detail).

pavelsavara avatar Nov 23 '22 16:11 pavelsavara

I found another example that is very similar. https://learn.microsoft.com/en-us/aspnet/core/blazor/javascript-interoperability/call-javascript-from-dotnet?view=aspnetcore-5.0#unmarshalled-javascript-interop-2 It uses a struct ('InteropStruct') to pass values to JS and the readStringField/readInt32Field methods. In my test I am passing two integer values before and after the string which I can read successfully with readInt32Field from offsets 0 and 8.

nkast avatar Nov 24 '22 09:11 nkast

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Hello, I'm currently a user of nKast's WASM implementation of MonoGame: https://github.com/nkast/Wasm https://github.com/kniEngine/kni

However after a dotnet 7 upgrade this logic doesn't work anymore.

I've been trying to dive into the issue here but it seems the code that was previously used to pass parameters to the frontend isn't working anymore: image

Previously the var id should have the value theCanvas.

The way this functionality is being called can be found here: https://github.com/nkast/Wasm/blob/main/Wasm.Dom/Document.cs#L40

Reproduction Steps

I created a branch of my 3d Maze Generator making use of MonoGame / Wasm to run in the browser here: https://github.com/devedse/DeveMazeGeneratorCore/tree/dotnet7

Simply open the DeveMazeGeneratorCoreMonoGame.sln, set the Blazor project as startup and click F5.

This error should popup in the console window:

Unhandled Exception:
blazor.webassembly.js:1 System.NullReferenceException: Object reference not set to an instance of an object.
dotnet.7.0.0.3dkjxfc2jf.js:5 Uncaught Error: System.NullReferenceException: Object reference not set to an instance of an object.
   at :5259/MonoGame.Framework.BlazorGameWindow.get_ClientBounds() in C:\XGit\kni\MonoGame.Framework\Platform\Blazor\BlazorGameWindow.cs:line 62
   at Microsoft.Xna.Platform.ConcreteGraphicsDeviceManager..ctor(:5259/Game game) in C:\XGit\kni\MonoGame.Framework\ConcreteGraphicsDeviceManager.Blazor.cs:line 16
   at Microsoft.Xna.Framework.GraphicsDeviceManager..ctor(:5259/Game game) in C:\XGit\kni\MonoGame.Framework\GraphicsDeviceManager.cs:line 39
   at DeveMazeGeneratorMonoGame.TheGame..ctor(:5259/IContentManagerExtension contentManagerExtension, Nullable`1 desiredScreenSize, Platform platform) in C:\XGitPrivate\DeveMazeGeneratorCore\DeveMazeGeneratorCore.MonoGame.Core\TheGame.cs:line 152
   at :5259/DeveMazeGeneratorCore.MonoGame.Blazor.Pages.Index.TickDotNet() in C:\XGitPrivate\DeveMazeGeneratorCore\DeveMazeGeneratorCore.MonoGame.Blazor\Pages\Index.razor.cs:line 37
   at Microsoft.JSInterop.Infrastructure.DotNetDispatcher.InvokeSynchronously(:5259/JSRuntime jsRuntime, DotNetInvocationInfo& callInfo, IDotNetObjectReference objectReference, String argsJson)
   at Microsoft.JSInterop.Infrastructure.DotNetDispatcher.Invoke(:5259/JSRuntime jsRuntime, DotNetInvocationInfo& invocationInfo, String argsJson)
   at Microsoft.AspNetCore.Components.WebAssembly.Services.DefaultWebAssemblyJSRuntime.InvokeDotNet(:5259/String assemblyName, String methodIdentifier, String dotNetObjectId, String argsJson)
    at qi (dotnet.7.0.0.3dkjxfc2jf.js:5:80156)
    at Ji (dotnet.7.0.0.3dkjxfc2jf.js:5:80040)
    at _Microsoft_AspNetCore_Components_WebAssembly__Microsoft_AspNetCore_Components_WebAssembly_Services_DefaultWebAssemblyJSRuntime_InvokeDotNet (_Microsoft_AspNetCore_Components_WebAssembly__Microsoft_AspNetCore_Components_WebAssembly_Services_DefaultWebAssemblyJSRuntime_InvokeDotNet:30:5)
    at Object.invokeDotNetFromJS (blazor.webassembly.js:1:45192)
    at g (blazor.webassembly.js:1:1621)
    at A.invokeMethod (blazor.webassembly.js:1:3812)
    at tickJS ((index):61:32)

Expected behavior

I know that this behaviour is now deprecated (as of: https://learn.microsoft.com/en-us/aspnet/core/blazor/javascript-interoperability/import-export-interop?view=aspnetcore-7.0#call-javascript-from-net)

However I would expect the funcionality to remain working.

Actual behavior

The Blazor.platform.readStringField is returning: image

but should return theCanvas

Regression?

No response

Known Workarounds

Probably rewrite everything to [JSImport] [JSExport], but that's quite some work I'd rather not do now.

Configuration

dotnet 7.0 X64 Google Chrome

Other information

No response

Author: devedse
Assignees: -
Labels:

arch-wasm, untriaged, area-VM-meta-mono

Milestone: -

ghost avatar Nov 24 '22 09:11 ghost

Observation:

  • the ValueTuple is [StructLayout(LayoutKind.Auto)]
  • the RenderTreeEdit where Blazor uses it internally is [StructLayout(LayoutKind.Explicit)] with [FieldOffset(16)]

https://github.com/dotnet/runtime/blob/89cfbfd84146e89ab18771c42928d23ff09232c1/src/libraries/System.Private.CoreLib/src/System/ValueTuple.cs#L626-L629

https://github.com/dotnet/aspnetcore/blob/ea529f4756841fdf8d7cee5b3c188210520c1323/src/Components/Components/src/RenderTree/RenderTreeEdit.cs#L14-L15

pavelsavara avatar Nov 24 '22 17:11 pavelsavara

I tried to reproduce by building your branch. And it works for me.

image

I also noticed that your screenshot has Document.6.0.2.js in it.

pavelsavara avatar Nov 24 '22 17:11 pavelsavara

Perhaps try to re-install your wasm-tools workload if you installed any of the previews.

pavelsavara avatar Nov 24 '22 17:11 pavelsavara

This issue has been marked needs-author-action and may be missing some important information.

ghost avatar Nov 24 '22 17:11 ghost

@pavelsavara I got the net7 from a VS update. I installed the wasm-tools just in case but I saw no difference.

I've added some more tests for ValueTuples and structs with varying number of arguments. The problem seems to be with single fields in structs/ValueTuples, which was the case originally with the 'GetElementById(..)'.

The value 'd' I am getting in TestValueTupleString1 and TestStructString1 can be converted directly to string with 'BINDING.conv_string(d)'. Normally 'readStringField(..)' resolves this value from HEAP32.

The TestStructInt1 receives directly the value of the struct field. 1 Bellow are the results I am getting from those tests.

*net6
TestValueTupleString1: theCanvas1
TestValueTupleString2: theCanvas2
TestValueTupleString3: theCanvas3
TestValueTupleString1: theCanvas1struct
TestValueTupleString2: theCanvas2struct
TestStructInt1: 42
TestString: theCanvas

*net7
TestValueTupleString1: ????
TestValueTupleString2: theCanvas2
TestValueTupleString3: theCanvas3
TestValueTupleString1: ????
TestValueTupleString2: theCanvas2struct
TestStructInt1: 0
TestString: theCanvas

nkast avatar Nov 25 '22 02:11 nkast

'BINDING.conv_string(d)'. Normally 'readStringField(..)' resolves this value from HEAP32.

It could be alignment problem. HEAP32 is aligned to 4 bytes, but maybe the struct is not.

pavelsavara avatar Nov 25 '22 08:11 pavelsavara

If you manage to reproduce this again, it would be very useful to know the address of the string itself to rule out any alignment issues and to identify whether a garbage pointer is being fetched via readStringField. You can do this by calling readInt32Field with the correct offset to get the address of the string for us. You could also try reading with offsets of 4 and 8 in addition to 0 in order to see whether the string was perhaps placed somewhere nearby due to ordering.

Incidentally, this bit of code from your sample: image Is fundamentally incorrect, so you shouldn't expect anything of this sort to keep working. Once a structure contains managed types like string or object there are few ordering/layout guarantees you can rely on anymore, even if you apply Sequential StructLayout to the struct. I think Explicit may work but that does not apply to ValueTuple since as a generic type it's not able to have pre-computed offsets.

It's also generally unsafe to pass raw managed pointers around as you are doing, you need to be very careful to ensure that the GC can never run while a raw pointer is in flight (among other things, this means you can never enable threading support with this code even once blazor has it available).

We'll still investigate whether there is a bug responsible for this change in behavior (and fix any bugs we find) but making your sample continue to work isn't something we can offer any guarantees about.

If migrating to .NET 7, using the new JSExport and JSImport APIs will allow you to safely pass strings, spans, and other types across the boundary with good performance.

From looking at the source of https://github.com/nkast/Wasm/blob/main/Wasm.Dom/Document.cs#L40 however, the implementation of GetElementById appears incorrect and should never have worked. readStringField(d, 0) would only be correct if you were passing a type like this to GetElementById: struct Box { string Id; } which would mean invoking it like InvokeRet<Box, int>(...). readStringField(d, 0) is the spiritual equivalent of *(string*)(d + 0).

kg avatar Nov 25 '22 09:11 kg

I'm curious on what the desired implementation would be for .net 6.0 if you need to pass multiple objects.

devedse avatar Nov 25 '22 09:11 devedse

I'm curious on what the desired implementation would be for .net 6.0 if you need to pass multiple objects.

A managed array of mixed types will be safe and have stable ordering.

kg avatar Nov 25 '22 09:11 kg

At this point we can exclude that the issue has to do with alignment, offsets and GC relocations. The test case TestStructInt1 makes this very clear.

It looks to be an optimization that elevates structs into their wrapped types, that propagates all the way down to 'InvokeUnmashalled(..)' and the JS boundary.

nkast avatar Nov 25 '22 10:11 nkast

At this point we can exclude that the issue has to do with alignment, offsets and GC relocations. The test case TestStructInt1 makes this very clear.

Even if we get this working again, all 3 of those are potential future problems with this code.

It looks to be an optimization that elevates structs into their wrapped types, that propagates all the way down to 'InvokeUnmashalled(..)' and the JS boundary.

I'll try to find out whether any optimization of that sort went into the runtime between 6 and 7. I'm not aware of one.

kg avatar Nov 25 '22 11:11 kg

added TestStructInt2(), TestStructFloat1() and TestStructFloat2().

In TestStructFloat1, instead of a reference to the Float1 struct we are getting the value 0x4228 0000. 0x4228 is the binary form of 42.0 in f32bit. The system is passing down the raw DWORD value of Float1.Begin.

@kg, All the issues you mention make sense. Hopefully I can future prof the library before wasm64 and threading is added in net8.

I can see a lot of PRs regarding struct improvements in net.core. Usually @CarolEidt oversees those changes.

nkast avatar Nov 25 '22 11:11 nkast

This sounds like it could be a calling convention issue, then. Because it's a struct and not a class, it may be permissible to pass the struct by-value in a 4-byte or 8-byte value argument instead of passing it by-reference as an address, though it is certainly surprising for it to happen in this scenario.

Are you running these tests with AOT enabled? It would be good to know whether it reproduces with AOT on and off or is limited specifically to AOT or interpreter. My understanding of how we handle structs and invocations in the interpreter suggests it shouldn't be happening there.

kg avatar Nov 25 '22 12:11 kg

The calling convention used by the runtime was changed in 7.0 to conform to the wasm calling convention spec: https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md According to the spec, small structs which fit in one integer are passed by value.

vargaz avatar Nov 25 '22 16:11 vargaz

I'm not sure we can make this work how it used to given the calling convention change. As a workaround, you can put padding at the end of your structs I think. It's still unclear to me why GetElementById would have worked before given that it's not passing a struct. Let us know if padding doesn't work.

kg avatar Nov 25 '22 21:11 kg

@kg The workaround worked. I had to add an additional element to the ValueTuple as padding. workaround fix for issue dotnet/runtime#78731

GetElementById is passed in a ValueTuple, which effectively is a struct.

nkast avatar Jan 20 '23 03:01 nkast

Closing as we have no plans to make any changes here as we consider this functionality internal to Blazor and also replaced by the new Blazor interop model.

danroth27 avatar Nov 28 '23 19:11 danroth27