AnyCPU FSC.dll OOMs on LA64

Open shushanhf opened this issue 3 years ago • 21 comments

This issue was created to follow up on discussion #13640.

Repro steps

Provide the steps required to reproduce the problem:

The sdk6.0.7-LoongArch64 was built by the installer with crossgen disabled. This SDK builds many C# projects without problems, but it fails on even a simple Hello World F# app.

Is there a known problem with running a simple Hello World F# app on LA64?

qiao@Lap5K:~/fsharp/FSharpSample$ ~/work_qiao/dotnet-runtime-loongarch/.dotnet/dotnet build
Microsoft (R) Build Engine version 17.0.0+c9eb9dd64 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  All projects are up-to-date for restore.
FSC : error FS0193: internal error: Exception of type 'System.OutOfMemoryException' was thrown. [/home/qiao/fsharp/FSharpSample/src/Library/Library.fsproj]

Build FAILED.

FSC : error FS0193: internal error: Exception of type 'System.OutOfMemoryException' was thrown. [/home/qiao/fsharp/FSharpSample/src/Library/Library.fsproj]
    0 Warning(s)
    1 Error(s)

Time Elapsed 00:00:09.77
qiao@Lap5K:~/fsharp/FSharpSample/src/App$ ~/work_qiao/dotnet-runtime-loongarch/.dotnet/dotnet run Hello World
FSC : error FS0193: internal error: Exception of type 'System.OutOfMemoryException' was thrown. [/home/qiao/fsharp/FSharpSample/src/Library/Library.fsproj]

The build failed. Fix the build errors and run again.
qiao@Lap5K:~/work_qiao/dotnet-runtime-loongarch$ file .dotnet/sdk/6.0.107/FSharp/*.dll
.dotnet/sdk/6.0.107/FSharp/fsc.dll:                                  PE32 executable (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/FSharp.Build.dll:                         PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/FSharp.Compiler.Interactive.Settings.dll: PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/FSharp.Compiler.Service.dll:              PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/FSharp.Core.dll:                          PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/FSharp.DependencyManager.Nuget.dll:       PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/fsi.dll:                                  PE32 executable (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/Microsoft.Build.Framework.dll:            PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/Microsoft.Build.Tasks.Core.dll:           PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/Microsoft.Build.Utilities.Core.dll:       PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/Microsoft.NET.StringTools.dll:            PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
.dotnet/sdk/6.0.107/FSharp/System.Resources.Extensions.dll:          PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows

Is the format of fsc.dll OK, given that it is reported as PE32 executable (console) rather than PE32 executable (DLL) (console)?
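As background for the question above, here is a minimal Python sketch of where those two parts of the file output come from. It assumes the standard PE/COFF layout: the Machine field is what gets printed as "Intel 80386" (IL-only AnyCPU assemblies are normally emitted with this value, so it does not by itself imply x86-only code), and the "(DLL)" marker corresponds to one bit in the Characteristics field. The helper names and the synthetic header are made up for illustration and are not taken from the SDK.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C  # also the value carried by IL-only (AnyCPU) assemblies
IMAGE_FILE_DLL = 0x2000           # Characteristics bit meaning "this image is a DLL"


def read_pe_header(data: bytes) -> dict:
    """Return the COFF Machine value and the DLL flag of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # Offset 0x3C of the DOS header stores the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature:
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    machine, _, _, _, _, _, characteristics = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {"machine": machine, "is_dll": bool(characteristics & IMAGE_FILE_DLL)}


def build_minimal_header(machine: int, characteristics: int) -> bytes:
    """Build a synthetic DOS + COFF header, for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature directly after the DOS header
    coff = struct.pack("<HHIIIHH", machine, 0, 0, 0, 0, 0, characteristics)
    return bytes(dos) + b"PE\0\0" + coff


# An application entry point (like fsc.dll) versus a library (like FSharp.Core.dll).
print(read_pe_header(build_minimal_header(IMAGE_FILE_MACHINE_I386, 0x0022)))
print(read_pe_header(build_minimal_header(IMAGE_FILE_MACHINE_I386, 0x2022)))
```

To inspect a real file, pass the first few hundred bytes of it, for example read_pe_header(open(path, "rb").read(1024)), where path is whichever assembly is of interest.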

Expected behavior

The build succeeds.

Provide any related information (optional):

  • Operating system is LoongArch64-Linux-debian.
  • .NET Runtime kind is .NET6.

shushanhf avatar Aug 08 '22 02:08 shushanhf

Is there any way to access some kind of publicly available LA64 machine by the way? Or is that strictly contributor only?

En3Tho avatar Aug 10 '22 12:08 En3Tho

Is there any way to access some kind of publicly available LA64 machine by the way? Or is that strictly contributor only?

I am not aware of any.

vzarytovskii avatar Aug 10 '22 13:08 vzarytovskii

Is there any way to access some kind of publicly available LA64 machine by the way? Or is that strictly contributor only?

@En3Tho @vzarytovskii How can I email you the remote-cloud-LA-linux account ? My email is [email protected]

shushanhf avatar Aug 12 '22 07:08 shushanhf

I've sent you an email.

En3Tho avatar Aug 12 '22 10:08 En3Tho

So far I haven't been able to make it not fail with OOM. Neither --tailcalls- nor DOTNET_ReadyToRun=0 helped. Also, for some reason dotnet-trace doesn't produce a trace I can actually open; it fails with the exception "Read past the end of a stream".

Passing random flags to the F# compiler works, though: it doesn't fail with OOM and it produces the correct error message. So it can at least run, and I don't think the DLL format is wrong.
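For anyone repeating this kind of experiment, a small Python sketch of running one command with a runtime setting such as DOTNET_ReadyToRun=0 applied to that process only, so the surrounding shell stays untouched. The function name is hypothetical, and the command and paths are whatever is being tested locally.

```python
import os
import subprocess


def run_with_knobs(command, knobs):
    """Run `command` with extra environment variables set for that process only."""
    env = {**os.environ, **knobs}  # copy the current environment, then overlay the knobs
    return subprocess.run(command, env=env, capture_output=True, text=True)
```

A call could look like run_with_knobs(["dotnet", "fsc.dll", "hw.fs"], {"DOTNET_ReadyToRun": "0"}); the returned object carries returncode, stdout and stderr, which makes it easy to compare runs with and without a given setting.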

En3Tho avatar Aug 18 '22 10:08 En3Tho

Is there any way to get a trace from the F# compiler, at least a stack trace or something?

En3Tho avatar Aug 18 '22 11:08 En3Tho

@jkotas sorry to tag you, but maybe you can help here? What is the best way to diagnose an OOM when the app fails right away?

I'm not sure how to collect traces from it using dotnet-trace, because they end up corrupted somehow. Running dotnet-trace collect -- dotnet fsc.dll hw.fs produces a 3 MB trace. On exit, dotnet-trace reports that the process (I guess dotnet itself) ended with code 134, and the trace cannot be opened afterwards.
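A note on exit code 134: by the usual POSIX shell convention, a status above 128 means "terminated by signal (status - 128)", and 134 - 128 = 6, which is SIGABRT on Linux. That would be consistent with the process aborting rather than exiting normally, although whether dotnet-trace reports the code in exactly this form is an assumption. A short Python sketch of the decoding (the function name is made up):

```python
import signal


def signal_from_exit_code(code):
    """Return the signal implied by a shell-style exit code, or None."""
    # POSIX shells report "terminated by signal N" as exit status 128 + N.
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128)
    except ValueError:
        return None  # no signal with that number on this platform


print(signal_from_exit_code(134))
```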

En3Tho avatar Aug 18 '22 11:08 En3Tho

This is the output of the "free" utility:

              total        used        free      shared  buff/cache   available
Mem:        4013232      828400      620368      215744     2564464     2237376
Swap:             0           0           0
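To put these numbers in more readable units, a small Python sketch that parses the table above. It assumes the values are in KiB, which is the default for free when no unit option is given; the function names are hypothetical.

```python
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:        4013232      828400      620368      215744     2564464     2237376
Swap:             0           0           0
"""


def parse_free(text):
    """Parse `free` output into {row label: {column name: value in KiB}}."""
    lines = text.strip().splitlines()
    headers = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, *values = line.split()
        # Rows with fewer columns (Swap) are paired with the leading headers only.
        table[label.rstrip(":")] = dict(zip(headers, map(int, values)))
    return table


def kib_to_mib(kib):
    return kib / 1024


mem = parse_free(FREE_OUTPUT)["Mem"]
print(f"total: {kib_to_mib(mem['total']):.0f} MiB")
print(f"available: {kib_to_mib(mem['available']):.0f} MiB")
```

Under that assumption the machine has roughly 3.8 GiB of RAM in total, about 2.1 GiB of it available, and no swap.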

En3Tho avatar Aug 18 '22 11:08 En3Tho

@En3Tho Thanks very much for your debugging.

Does fsc.dll have any special configuration, for example the page size? LA64's page size is 16k, which is different from arm64/x64.
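To confirm the page size on a given machine, a minimal Python sketch that asks the OS directly. It uses the standard sysconf name SC_PAGE_SIZE and is only a quick check, not something fsc.dll itself does.

```python
import os


def page_size_bytes():
    """Return the memory page size reported by the operating system."""
    return os.sysconf("SC_PAGE_SIZE")


size = page_size_bytes()
print(f"page size: {size} bytes ({size // 1024} KiB)")
```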

shushanhf avatar Aug 18 '22 11:08 shushanhf

I've been able to collect a better trace by modifying fsc. Will try to look into it.

Just in case I'm sharing it here

dotnet_20220818_213246.zip

En3Tho avatar Aug 18 '22 13:08 En3Tho

@shushanhf I'm not sure. It is in fact running. I'd rather see it as a GC/JIT bug, to be honest, but this needs more investigation.

En3Tho avatar Aug 18 '22 13:08 En3Tho

Maybe there are some GC knobs that need to be tuned, or something is calculated incorrectly on LA64. Max GC Heap Size: 34,337 MB. It's doing lots of Start/Stop EE after the last AllocSmall event and fails to reclaim anything, I guess.

I will try to run a memory intensive test. Maybe it will result in the same OOM

En3Tho avatar Aug 18 '22 14:08 En3Tho

Tried a small app just to test allocations. The app runs fine with a 2 GB heap, 1.2 GB of gen0, and an allocation rate of 150-180 MB/sec. Also, building an F# app on Windows and then running it on LA works too.

I'm not sure what is so special about F# compiler though :(

En3Tho avatar Aug 18 '22 15:08 En3Tho

Some new information:

OOM is deterministic, it basically has this stack:

 ---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.String.Concat(String str0, String str1, String str2)
   at FSharp.Compiler.TcGlobals.initializer@1-4(TcGlobals this, Int32 idx) in G:\source\repos\dotnet\fsharp\src\Compiler\unknown:line 341
   at FSharp.Compiler.TcGlobals.TcGlobals..ctor(Boolean compilingFSharpCore, ILGlobals ilg, CcuThunk fslibCcu, String directoryToResolveRelativePaths, Boolean mlCompatibility, Boolean isInteractive, FSharpFunc`2 tryFindSysTypeCcu, Boolean emitDebugInfoInQuotations, Boolean noDebugAttributes, PathMap pathMap, LanguageVersion langVersion) in G:\source\repos\dotnet\fsharp\src\Compiler\TypedTree\TcGlobals.fs:line 335
   at FSharp.Compiler.CompilerImports.BuildFrameworkTcImports@2392-5.Invoke(Tuple`2 _arg9) in G:\source\repos\dotnet\fsharp\src\Compiler\Driver\CompilerImports.fs:line 2405
*** omitted

It's always failing when initializing the TcGlobals object (it's quite huge).

I've been able to magically avoid this OOM by randomly putting Console.WriteLine calls all over the place.

Now it successfully passes that stage and the error message is different. A debug build doesn't reproduce the OOM and goes straight to those error messages. So it's probably a codegen issue.

(these are translated from Chinese to English)
error FS0084: Assembly reference 'System.Runtime.Remoting.dll' not found or invalid
error FS0084: Assembly reference 'System.Runtime.Serialization.Formatters.Soap.dll' not found or invalid
error FS0084: Assembly reference 'System.Web.Services.dll' not found or invalid
error FS0084: Assembly reference 'System.Windows.Forms.dll' not found or invalid

I guess I need to build fsc differently to have it run on Unix?

UPD: Probably it's because I build from main, where global.json specifies the dotnet 7.0 SDK (though it still targets net6.0), while the SDK on the LA64 machine is dotnet/dotnet-sdk-6.0.107-1-loongarch64? Could this be an issue?

En3Tho avatar Aug 18 '22 17:08 En3Tho

Interesting that it OOMs in the concatenation.

vzarytovskii avatar Aug 18 '22 19:08 vzarytovskii

it's probably a codegen issue.

Yes, this looks like a codegen issue or GC heap corruption. It does not look like a real OOM.

jkotas avatar Aug 18 '22 21:08 jkotas

it's probably a codegen issue.

Yes, this looks like a codegen issue or GC heap corruption. It does not look like a real OOM.

Interestingly, it only crashes on that arch; the compiler is AnyCPU and works fine on both arm and x86.

Since the debug build does not reproduce the issue, the only thing that comes to my mind now is tail prefixes (though I'm not sure whether we emit them even in debug for fsc).

vzarytovskii avatar Aug 18 '22 21:08 vzarytovskii

It is most likely a bug in the .NET runtime port to LA64. It is unlikely to be a bug in the F# compiler itself.

jkotas avatar Aug 18 '22 22:08 jkotas

@En3Tho @jkotas Thanks a lot. I will try to debug it on LA.

shushanhf avatar Aug 18 '22 22:08 shushanhf

@shushanhf You're welcome. If you need help, please tell me. I will do whatever I can :)

En3Tho avatar Aug 19 '22 23:08 En3Tho

@shushanhf You're welcome. If you need help, please tell me. I will do whatever I can :)

Thanks very much again for your help. I will debug it on LA first.

shushanhf avatar Aug 20 '22 01:08 shushanhf

If you can still reproduce this problem with the latest SDK, please file a bug with the runtime team.

0101 avatar Feb 13 '23 18:02 0101