AspNetCoreModule

IIS crashes with AV from time to time after installing latest ANCM

Open AndreiGorlov opened this issue 8 years ago • 30 comments

ASP.NET Core 2.0 app, IIS 8.5 on Windows Server 2012 R2 with latest updates, aspnetcore.dll file version 7.1.1982.0. IIS crashes every 1-3 hours.

After rolling back to the previous version of ANCM (uninstalling DotNetCore.2.0.0-WindowsHosting.exe and installing DotNetCore.1.0.7_1.1.4-WindowsHosting.exe, aspnetcore.dll file version 7.1.1972.0), there have been no crashes for 20+ hours.

Event log error example:

Faulting application name: w3wp.exe, version: 8.5.9600.16384, time stamp: 0x5215df96
Faulting module name: aspnetcore.dll, version: 7.1.1982.0, time stamp: 0x594ab904
Exception code: 0xc0000005
Fault offset: 0x000000000000f8a9
Faulting process id: 0x1670
Faulting application start time: 0x01d335123bc5b2f6
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\inetsrv\aspnetcore.dll
Report Id: 9a753da7-a107-11e7-80ec-00155d000301
Faulting package full name: 
Faulting package-relative application ID: 
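
For readers decoding reports like the one above, here is a minimal Python sketch that pulls a few fields out of the event text. It assumes the usual Windows conventions: the exception code is an NTSTATUS value (0xc0000005 is the access violation, the "AV" in the title) and "Faulting application start time" is a hex FILETIME (100 ns ticks since 1601-01-01 UTC). The helper names and the small lookup table are illustrative, not part of any tool mentioned in this thread.

```python
from datetime import datetime, timedelta, timezone

# Fields copied from the event log entry above (only the ones decoded here).
EVENT_TEXT = """\
Faulting application name: w3wp.exe, version: 8.5.9600.16384, time stamp: 0x5215df96
Faulting module name: aspnetcore.dll, version: 7.1.1982.0, time stamp: 0x594ab904
Exception code: 0xc0000005
Fault offset: 0x000000000000f8a9
Faulting application start time: 0x01d335123bc5b2f6
"""

# A deliberately tiny table of well-known NTSTATUS codes.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def parse_fields(text):
    """Split 'Key: value' lines into a dict; the first colon ends the key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def filetime_to_utc(hex_value):
    """Convert a hex FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    ticks = int(hex_value, 16)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


fields = parse_fields(EVENT_TEXT)
code = int(fields["Exception code"], 16)
module = fields["Faulting module name"].split(",")[0]
started = filetime_to_utc(fields["Faulting application start time"])

print(module, NTSTATUS_NAMES.get(code, hex(code)), started.isoformat())
```

Decoding the start time this way gives a rough uptime for each worker process when compared with the event's own timestamp, which helps confirm a "crashes every 1-3 hours" pattern.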

DebugDiag Analysis Report with 3 crashes: MultipleDumps_CrashHangAnalysis (5).zip

I can share DebugDiag .dmp files if necessary.

AndreiGorlov avatar Sep 24 '17 18:09 AndreiGorlov

@pan-wang @shirhatti Seems to be a lot of issues with ANCM on Win Server 2012, could be platform specific?

jkotalik avatar Sep 25 '17 19:09 jkotalik

@AndreiGorlov Could you please provide a dump? There was a known AV in the WebSocket scenario when a client disconnected without a handshake.

pan-wang avatar Sep 25 '17 21:09 pan-wang

@pan-wang dumps: https://drive.google.com/open?id=0B6nsyqRBUOTkbEstOF82ZGJmYlk

There was a known AV in the WebSocket scenario when a client disconnected without a handshake.

My app uses SignalR (and WebSockets).

Now I'm trying to build a minimal repro. I already cut out everything related to SignalR/WebSockets and installed ANCM 2.0, and IIS started crashing again. Here are new dumps (app without the WebSockets stuff): https://drive.google.com/open?id=0B6nsyqRBUOTkejAyVi03eEtUT1U

AndreiGorlov avatar Sep 26 '17 13:09 AndreiGorlov

@AndreiGorlov I looked at the dump. It seems that for a POST request, the IIS notification context, which is used to track the request state in the IIS pipeline, somehow became NULL and thus caused the AV. Based on the dump, it is impossible to tell which module cleared the notification context. Do you have any other custom IIS modules installed? Could you please share a repro app so that I can debug it in-house? Otherwise, we may need your help to collect an iDNA trace.

pan-wang avatar Sep 26 '17 22:09 pan-wang

@pan-wang I have a repro. I'm afraid to publish it because of its denial-of-service potential. How can I contact you privately?

@jkotalik Not platform specific, reproduced on WinSrv 2016/IIS 10.0 as well.

AndreiGorlov avatar Oct 24 '17 05:10 AndreiGorlov

@AndreiGorlov you can send me email at panwang @ Microsoft . com. Thank you so much!

pan-wang avatar Oct 24 '17 06:10 pan-wang

@pan-wang Mail sent.

AndreiGorlov avatar Oct 24 '17 07:10 AndreiGorlov

We had the exact same issue & downgrading to 7.1.1972.0 fixed our problem. This is a severe bug requiring immediate action.

sepehr1014 avatar Dec 14 '17 16:12 sepehr1014

Same issue with the same server configuration:

Faulting application name: w3wp.exe, version: 8.5.9600.16384, time stamp: 0x5215df96
Faulting module name: aspnetcore.dll, version: 7.1.1988.0, time stamp: 0x5a2eeca2
Exception code: 0xc0000005
Fault offset: 0x000000000000d990
Faulting process id: 0x4a7c
Faulting application start time: 0x01d378f6e76fa57f
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\inetsrv\aspnetcore.dll
Report Id: 2b46a68a-e4ea-11e7-80e0-a0369fbe0782
Faulting package full name: 
Faulting package-relative application ID: 


Fault bucket , type 0
Event Name: APPCRASH
Response: Not available
Cab Id: 0

Problem signature:
P1: w3wp.exe
P2: 8.5.9600.16384
P3: 5215df96
P4: aspnetcore.dll
P5: 7.1.1988.0
P6: 5a2eeca2
P7: c0000005
P8: 000000000000d990
P9: 
P10: 

These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_e3c0a787166295e9ef249eac737b0f57d9466d5_9e3fd63b_cab_17018a43

Analysis symbol: 
Rechecking for solution: 0
Report Id: 2ab06f4f-e4ea-11e7-80e0-a0369fbe0782
Report Status: 4
Hashed bucket: 

mtinnes avatar Dec 19 '17 18:12 mtinnes

If it helps, it's not Server 2012 specific - I'm getting it on Windows 10.

And sadly on Windows 10 (admittedly with a site making heavy use of SignalR) it's happening several times an hour [often when the site isn't even being used].

OracPrime avatar Jan 07 '18 18:01 OracPrime

I'm also finding it hard to work around. The new VS2017 publish process expects to find the 2.0 store on the hosting machine for the app. If I roll back the hosting to 1.1 (by removing 2.0.3 hosting, I'm assuming) then it removes the 2.0 store and the application won't start up.

OracPrime avatar Jan 07 '18 18:01 OracPrime

@OracPrime try installing the 2.0.3 hosting and replacing aspnetcore.dll in %SystemRoot%\system32\inetsrv with the 1.1 one.

AndreiGorlov avatar Jan 07 '18 19:01 AndreiGorlov

@AndreiGorlov thanks for the tip. However I'm finding it hard to find the 1.1 version of aspnetcore.dll anywhere. Even if I uninstall the 2.0.3 hosting and reinstall 1.1 I end up with the 7.1.1982.0 build.

OracPrime avatar Jan 08 '18 00:01 OracPrime

@OracPrime When can we expect this to be fixed? Are you aware of the team's plans?

sepehr1014 avatar Jan 08 '18 19:01 sepehr1014

@sepehr1014 I'm just an end-user struggling with both the problem and the workaround - I've no idea when it's going to be fixed :( Which is a shame as it's a bit of a showstopper.

@sepehr1014 you said you downgraded to the 1972 build - how did you manage that? I find I always end up with the 1982 aspnetcore.dll

OracPrime avatar Jan 08 '18 20:01 OracPrime

My app uses SignalR. This bug is fixed in Windows Server Hosting 2.0.4.

OmidRafiee avatar Jan 09 '18 14:01 OmidRafiee

2.0.4 installed here and whilst it hasn't been up that long, I'm crash free so far (2.0.3 would have had several in the same time frame). Looking good, thanks guys!

OracPrime avatar Jan 09 '18 20:01 OracPrime

Unfortunately it still exists for us on 2.0.5 (using .NET 4.7.1 on Windows Server 2016):

Faulting application name: w3wp.exe, version: 10.0.14393.0, time stamp: 0x57899b8a
Faulting module name: iiscore.dll, version: 10.0.14393.1532, time stamp: 0x5965b173
Exception code: 0xc0000005
Fault offset: 0x00000000000163d3
Faulting process id: 0xcbc
Faulting application start time: 0x01d38a06f3aafba5
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\inetsrv\iiscore.dll
Report Id: bbf4c674-2ab7-47e9-b781-544e8592b70c
Faulting package full name: 
Faulting package-relative application ID: 

A process serving application pool '...' suffered a fatal communication error with the Windows Process Activation Service. The process id was '3260'. The data field contains the error number.

This happens frequently (every 20 minutes or so) and we're using Web Sockets if that's related.

sepehr1014 avatar Jan 10 '18 12:01 sepehr1014

@sepehr1014 could you please send a repro to my email address?

pan-wang avatar Jan 10 '18 15:01 pan-wang

@pan-wang This is a really big project (a full blown social network & instant messaging system) we ported from ASP.NET 5 a while back. I have no clue what exactly is causing these issues ...

sepehr1014 avatar Jan 11 '18 09:01 sepehr1014

My project using SignalR (indeed DotNetify with React) is now well past 24 hours without a single problem. I might just be lucky, but I suspect the problem I was seeing is fixed.

OracPrime avatar Jan 11 '18 10:01 OracPrime

@sepehr1014 Does your application reject any requests? I have a theory about the potential root cause. Is it possible for you to provide a dump at crash time? You can use WinDbg to attach to w3wp.exe: https://github.com/Windower/Issues/wiki/Creating-crash-dumps-with-Windbg. You can notify me at "panwang AT Microsoft dot com".

pan-wang avatar Jan 11 '18 19:01 pan-wang

@sepehr1014 could you please ping me at my Microsoft email address. Need your help to verify something.

pan-wang avatar Jan 12 '18 22:01 pan-wang

@pan-wang Of course. I'd be glad to help! Just sent you the dump file.

sepehr1014 avatar Jan 13 '18 09:01 sepehr1014

The problem definitely seems related to Web Sockets: capture

sepehr1014 avatar Jan 13 '18 13:01 sepehr1014

Having the same issue but right after application startup. Windows Server 2008 R2 SP1, IIS 7.5, latest DotNetCore.2.0.5-WindowsHosting

Windows Log > Application

Error

Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7a5f8
Faulting module name: aspnetcore.dll, version: 7.1.1989.0, time stamp: 0x5a38211f
Exception code: 0xc0000005
Fault offset: 0x0001bc34
Faulting process id: 0x22b8
Faulting application start time: 0x01d3bc7307e039de
Faulting application path: C:\Windows\SysWOW64\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\inetsrv\aspnetcore.dll
Report Id: 52b802d4-2866-11e8-bc51-00155dfade45

Information

Fault bucket , type 0
Event Name: APPCRASH
Response: Not available
Cab Id: 0

Problem signature:
P1: w3wp.exe
P2: 7.5.7601.17514
P3: 4ce7a5f8
P4: aspnetcore.dll
P5: 7.1.1989.0
P6: 5a38211f
P7: c0000005
P8: 0001bc34
P9: 
P10: 

Attached files:

These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_6150e2d5b7d61824d81f7844a5fdefba1132ab3_23307c90

Analysis symbol: 
Rechecking for solution: 0
Report Id: 52b802d4-2866-11e8-bc51-00155dfade45
Report Status: 4

BeniFreitag avatar Mar 15 '18 15:03 BeniFreitag

FYI DotNetCore.2.0.6-WindowsHosting is now available with at least one ANCM AV fix.

Tratcher avatar Mar 15 '18 16:03 Tratcher

@BeniGemperle A patch was released this Tuesday. Could you please try https://go.microsoft.com/fwlink/?linkid=869674 ? If you hit any issues, please ping me at panwang'at'microsoft'dot'com

pan-wang avatar Mar 15 '18 16:03 pan-wang

Thanks for your update. The software-update didn't fix the issue but I was able to track it down to a configuration in the IIS Application Pool. We had changed these 3 default-settings which caused the error:

  1. Enable 32-Bit Applications: True
  2. CPU Limit: 70000
  3. CPU Limit Action: KillW3wp

These settings seem to be related because:

  1. When only Enable 32-Bit Applications is set to True, it works!
  2. When only the CPU limit is set, it results in an HTTP Error 502.5 - Process Failure (but no crash).
  3. When both the CPU limit and 32-Bit are set, it results in the process crash described above.

I tested this on another IIS 7.5 (Windows Web Server 2008 R2 SP1, Version 6.1 (Build 7601: Service Pack 1)). Both servers have the same build and latest DotNetCore.2.0.6-WindowsHosting and the exact same behavior!

On IIS 8.5 (Windows Server 2012 R2 Standard, Version 6.3 (Build 9600)) this issue does not occur, but the Application Pool setting there is named CPU Limit (percent) and is set to 70.

Obviously 32-Bit mode for ASP.NET Core doesn't make much sense, so we changed it to 64-Bit (the default). But the CPU limit on IIS 7.5 should work, or at least not result in a complete crash. As a workaround we disabled the CPU limit on IIS 7.5, so we now have a working setup on all servers :smile:
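
To check other servers for the same combination, a rough Python sketch like the one below could scan application pool definitions. It assumes the applicationHost.config shape, where a pool is an `add` element with an `enable32BitAppOnWin64` attribute and an optional `cpu` child whose `limit` is stored in 1/1000ths of a percent (which would explain 70000 on IIS 7.5 versus 70 percent in the IIS 8.5 UI). The sample fragment and pool names are made up for illustration, and the "risky" rule only encodes the combination reported in this comment, not a confirmed root cause.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the applicationPools section of
# applicationHost.config; the pool names are invented for this example.
SAMPLE_CONFIG = """\
<applicationPools>
  <add name="PoolA" enable32BitAppOnWin64="true">
    <cpu limit="70000" action="KillW3wp" />
  </add>
  <add name="PoolB" enable32BitAppOnWin64="true" />
  <add name="PoolC">
    <cpu limit="70000" action="KillW3wp" />
  </add>
</applicationPools>
"""


def risky_pools(config_xml):
    """Return (name, limit_percent) for pools that are 32-bit AND CPU-limited."""
    root = ET.fromstring(config_xml)
    found = []
    for pool in root.iter("add"):
        is_32bit = pool.get("enable32BitAppOnWin64", "false").lower() == "true"
        cpu = pool.find("cpu")
        # Assumption: limit is in 1/1000ths of a percent and 0 means "no limit".
        limit = int(cpu.get("limit", "0")) if cpu is not None else 0
        if is_32bit and limit > 0:
            found.append((pool.get("name"), limit / 1000))
    return found


print(risky_pools(SAMPLE_CONFIG))
```

Only PoolA is reported here, since PoolB has no CPU limit and PoolC is not 32-bit; those two correspond to the "works" and "502.5 but no crash" cases in the list above.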

BeniFreitag avatar Mar 21 '18 10:03 BeniFreitag

I was having this same problem, thanks BeniGemperle

GGavazzo avatar May 21 '18 17:05 GGavazzo