DotNetZip.Semverd icon indicating copy to clipboard operation
DotNetZip.Semverd copied to clipboard

Intermittent System.NullReferenceException at ZipFile.Reset line 2377

Open psantoro opened this issue 7 years ago • 1 comments

Firstly, this is not actually a bug in DotNetZip; however, I'm documenting a DotNetZip usage scenario in which intermittent failures can occur due to an unexpected race condition.

Scenario Description

A binary vendor.exe (with a dependency on DotNetZip.dll) calls customer in-process plugin DLLs. The vendor.exe uses DotNetZip to zip/unzip networked files. This has worked fine for years. A recent vendor.exe update included some feature changes whereby their usage of DotNetZip changed (i.e. they are now calling the ZipFile.Save method repeatedly on the same ZipFile instance). This scenario produced the following intermittent exception:

System.NullReferenceException: Object reference not set to an instance of an object. at Ionic.Zip.ZipFile.Reset(Boolean whileSaving) at Ionic.Zip.ZipFile.Save()

Neither the vendor or my colleagues were able to reproduce this issue. The intermittent nature of these exceptions suggested that I was dealing with a race condition (normally associated with concurrent applications). I initially used tracing to see if any ZipFile instances were being accessed via multiple threads. They were not.

Analysis of the Problem

Modern versions of Windows include three SMB 2.0 client caches https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-7/ff686200(v=ws.10). Client-side SMB caching is generally a very good feature, as it improves performance of applications that perform network I/O - while at the same time reduces stress on the associated file servers. Unfortunately, SMB 2.0 caching also injects potential race conditions into any Windows application that performs networked file I/O.

For example: An application/library may A) create the network file XYZ, B) then later delete or rename the network file XYZ, and C) then later recreate the network file XYZ. However, if the client-side SMB cache is refreshed by the operating system right after step B (but before step C), then the cache temporarily would be in a position (i.e. cache is stale) to report that the file is not found when it actually is there. While the application state and SMB caches are out of sync, subsequent calls by the application/library to see if the network XYZ file exists will now return an incorrect result and cause unexpected behavior. This type of logic (A, B, C above) exists in the ZipFile.Save (line 230) and is in fact what is contributing to the the intermittent NullReferenceExceptions in the Reset method:

    internal void Reset(bool whileSaving)
    {
        if (_JustSaved)
        {
            // read in the just-saved zip archive
            using (ZipFile x = new ZipFile())
            {
                if (File.Exists(this._readName ?? this._name))  // <---- sometimes returns false, when in fact the file exists
                {
                    // workitem 10735
                    x._readName = x._name = whileSaving
                        ? (this._readName ?? this._name)
                        : this._name;
                }
                else // if we just saved to a stream no file is available to read from
                {
                    if (_readstream.CanSeek)  // <---- NullReferenceException occurs here

I was able to show, via extensive tracing, that although the File.Exists call noted above sometimes returned false, if I inserted logic to wait for 4-5 seconds, the subsequent calls to File.Exists did in fact return true. This 4-5 second delay is significant, given the default value for the FileNotFoundCacheLifetime which is 5 seconds.

Solution

Here's the solution that I successfully used to prevent this race condition from happening:

Per the MS link above, create and set the following registry keys to 0 (zero) a. DirectoryCacheLifetime b. FileNotFoundCacheLifetime

Peter Santoro

psantoro avatar Feb 02 '18 14:02 psantoro

I worked with Peter on this as well. It would be helpful to add some meaningful exception handling to the ZipFile class and to this Reset method in particular. An enormous amount of work was put in by Peter for us to even discover that the file was not found in the first place. If we had known that at the outset, I think we may have been able to track this down a little quicker.

A simple null check and a throws new would be very helpful - I haven't had time to fully digest the content of the library, but as with any software, I am sure there are places for improvement. Some additional exception handling would go a long way.

matthewrishii avatar Feb 02 '18 16:02 matthewrishii