dokany icon indicating copy to clipboard operation
dokany copied to clipboard

Problem in the mirroring of files when they are read using memory mapping io.

Open grandemk opened this issue 2 years ago • 2 comments

Environment

  • Windows version: 10
  • Processor architecture: x64
  • Dokany version: 2.0.5.1000
  • Library type (Dokany/FUSE): Dokany

Check List

  • [ x ] I checked my issue doesn't exist yet
  • [ x ] My issue is valid with mirror default sample and not specific to my user-mode driver implementation
  • [ x ] I can always reproduce the issue with the provided description below.
  • [ x ] I have updated Dokany to the latest version and have reboot my computer after.
  • [ x ] I tested one of the last snapshot from appveyor CI

Description

Hi :)

The changes done in the root directory of the mirror do not seem to be mirrored in some cases related to memory mapping. The change in the size of the file is detected, but the content is wrong at the end of the file.

This problem appears in the mirror default sample. Here is a 100% reproduction of the problem.

Test1.txt used for the reproduction (you need a file sufficiently big for the error to occur) Test1.txt

# in one shell
# Create tests/Assets and tests/AssetsVirtual and put Test1.txt in tests/Assets.
# Then mount tests/Assets as tests/AssetsVirtual
$ ./x64/Release/mirror.exe /r "tests/Assets" /l "tests/AssetsVirtual" /e /d /s
# in another shell
# tests/Assets/ is mounted as tests/AssetsVirtual for this reproduction

# First test, Assets/Test1.txt and AssetsVirtual/Test1.txt read the same.
$ ./test_map.exe
SUCCESS   

# Modification of Test1.txt in the root directory Assets/
$ echo "AA" >> tests/Assets/Test1.txt          

# The modification isn't correctly seen in AssetsVirtual
$ ./test_map.exe
ERROR
[!] tests/AssetsVirtual/Test1.txt and tests/Assets/Test1.txt do not have the same content
        (acc:20074995 vs acc:20075060)                                                                                                                                                                      
# Latest character of the file:
Assets/Test1.txt: '222222\nAA\n'
AssetsVirtual/Test1.txt: '222222\nA\n ' # one A is missing, replaced by a \n and there is an extra space at the end.
// test_mapping.cpp
#include <Windows.h>
#include <fileapi.h>
#include <stdio.h>

struct TResult
{
    long long int acc = -1;
    long long int fileSize = -1;
    long long int regionSize = -1;
};

TResult test_mmap(const wchar_t* filename)
{
    TResult res {};
    HANDLE file = CreateFileW(filename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(!file)
    {
        printf("CreateFileW failed\n");
        return res;
    }
    HANDLE mo = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!mo)
    {
        printf("CreateFileMapping failed\n");
        return res;
    }
    const unsigned char* pBuf = (unsigned char*)MapViewOfFile(mo, FILE_MAP_READ, 0, 0, 0);
    if(!pBuf)
    {
        printf("MapViewOfFile failed\n");
        return res;
    }

    MEMORY_BASIC_INFORMATION mbi = {0};
    if (!VirtualQueryEx(GetCurrentProcess(), pBuf, &mbi, sizeof(mbi)))
    {
        printf("VirtualQueryEx failed\n");
        return res;
    }

    DWORD fileSizeHigh;
    DWORD fileSizeLow = GetFileSize(file, &fileSizeHigh);
    long long int fileSize = ((long long int)fileSizeHigh << 32) | fileSizeLow;

    long long int textSize = fileSize > mbi.RegionSize ? mbi.RegionSize : fileSize;

    CloseHandle(mo);
    CloseHandle(file);

    long acc = 0;
    for (long i = 0; i < textSize; ++i)
    {
        acc += pBuf[i];
    }

    UnmapViewOfFile(pBuf);

    res.acc = acc;
    res.fileSize = fileSize;
    res.regionSize = mbi.RegionSize;

    return res;
}

int main()
{
    const wchar_t* virtual_filename = L"tests/AssetsVirtual/Test1.txt";
    const wchar_t* real_filename = L"tests/Assets/Test1.txt";
    TResult virt = test_mmap(virtual_filename);
    TResult real = test_mmap(real_filename);
    int errorCode = 0;
    if (real.acc != virt.acc)
    {
        printf("ERROR\n");
        printf("[!] %ls and %ls do not have the same content\n\t(acc:%llu vs acc:%llu)\n", virtual_filename, real_filename, virt.acc, real.acc);
        errorCode = 1;
    }
    else
    {
        printf("SUCCESS\n");
    }

    printf("INFO\n");
    printf("%ls\n\tfile size: %llu\n\tregion size: %llu\n\tacc: %llu\n", real_filename, real.fileSize, real.regionSize, real.acc);
    printf("%ls\n\tfile size: %llu\n\tregion size: %llu\n\tacc: %llu\n", virtual_filename, virt.fileSize, virt.regionSize, virt.acc);
    return errorCode;
}

(Note that there is a correlation between this bug and the fact that it passes in the Cleanup then Read behaviour documented in the Dokan documentation and mentionned in https://github.com/dokan-dev/dokany/issues/1016. I couldn't replicate it without having the Cleanup then Read behavior.)

grandemk avatar Sep 22 '22 11:09 grandemk

@grandemk Sorry for the long delay. Could you elaborate more on (Note that there is a correlation between this bug and the fact that it passes in the Cleanup then Read behaviour documented in the Dokan documentation and mentionned in #1016. I couldn't replicate it without having the Cleanup then Read behavior.) ? Did you changed the sample for it to work or fail ?

Liryna avatar Dec 11 '22 22:12 Liryna

Hi, I didn't change the mirror sample.

Mostly, some sizes of the file Test1.txt trigger the bug, so I just tweaked the content of Test1.txt until the bug occured (with, for example the file I linked).

From what I remember, I talked about correlation because I could get two path of execution in the Mirror Sample when I read the content of the file:

  1. Create -> Read -> Cleanup (normal path of execution)
  2. Create -> Cleanup -> Read (the path that can happen with mmaped file)

While tweaking the content of Test1.txt to trigger the bug, I would only see 2. Create -> Cleanup -> Read when I reproduced the bug.

This would be called only one time, and then subsequent Read wouldn't show any calls to the Mirror Sample Dokan API Implementation, as if the information was cached before the user space driver and wasn't correctly invalidated in some case (probably related to some kind of rounding in page size, as writing enough data would at some point invalidate this cache and the correct data would be read.)

grandemk avatar Dec 12 '22 16:12 grandemk