nob.h icon indicating copy to clipboard operation
nob.h copied to clipboard

`NOB_EXPERIMENTAL_DELETE_OLD` on Windows

Open mikmart opened this issue 4 months ago • 16 comments

I caught the request for someone to test NOB_EXPERIMENTAL_DELETE_OLD on Windows on stream, but I didn't see the whole thing so not sure if this was already done; apologies if so.

I tried it out with 1.23.0 with the following program and found that for me it doesn't work:

#define NOB_IMPLEMENTATION
#define NOB_EXPERIMENTAL_DELETE_OLD
#include "nob.h"

int main(int arg, char **argv) {
    NOB_GO_REBUILD_URSELF(arg, argv);
    nob_log(NOB_INFO, "Hello, World!");
    return 0;
}
$ nob
[INFO] renaming nob.exe -> nob.exe.old
[INFO] CMD: cl.exe /Fe:nob.exe nob.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35213 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

nob.c
Microsoft (R) Incremental Linker Version 14.44.35213.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:nob.exe
nob.obj
[INFO] deleting nob.exe.old
[ERROR] Could not delete file nob.exe.old: Access is denied.
[INFO] CMD: nob.exe
[INFO] Hello, World!

mikmart avatar Aug 23 '25 07:08 mikmart

@mikmart yep, just as I thought. Thank you so much!

I wonder what would be a solution on Windows...

rexim avatar Aug 23 '25 11:08 rexim

As far as i know, it's never possible to delete a file when in use. Not even unlinking it. The only thing i can think of that would work would be a helper program that renames the file, calls nob again to delete that file and immediately exit nob again. So

  • Compile the helper program that renames the nob output
  • Run it after nob is done so it can rename the nob output
  • When the helper program is done, call nob with a flag of some sort to just delete the helper program and immediately exit.

I'm only familiar with nob from the stream, and i realize this is a very silly and roundabout way of doing it. But this should at least work

arxae avatar Aug 24 '25 07:08 arxae

I tried some more things, like:

HANDLE hFile = CreateFileA(
    path,
    /* dwDesiredAccess       = */ GENERIC_READ,
    /* dwShareMode           = */ FILE_SHARE_READ , // or FILE_SHARE_DELETE
    /* lpSecurityAttributes  = */ 0,
    /* dwCreationDisposition = */ OPEN_EXISTING,
    /* dwFlagsAndAttributes  = */ FILE_FLAG_DELETE_ON_CLOSE | FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_POSIX_SEMANTICS,
    /* hTemplateFile         = */ 0
);

if (hFile == INVALID_HANDLE_VALUE) {
    nob_log(NOB_ERROR, "no file? %p (%s)", hFile, nob_win32_error_message(GetLastError()));
    return false;
}

CloseHandle(hFile);

and renaming it again before delete (just in case):

char tmp_name[MAX_PATH+1] = {0};

if (!GetTempFileName(".", "del", 0, tmp_name)) { 
    nob_log(NOB_ERROR, "Could not get temp name");
    return false; 
}
if (!nob_rename(path, tmp_name)) { 
    nob_log(NOB_ERROR, "Could not rename to %s", tmp_name); 
    return false;
}
if (!DeleteFileA(tmp_name)) {
    nob_log(NOB_ERROR, "Could not delete file %s: %s", tmp_name, nob_win32_error_message(GetLastError()));
    return false;
}

But the actual problem on Windows appears to be that handles to running processes are linked to their original executable, regardless of whether that executable is moved or renamed. With Sysinternals Process Explorer, I could verify that the current running instance of nob that tries to delete the "old" nob build, is actually the nob.exe.old itself. So since it's still running and referring to itself, it cannot be deleted, hence the Access is denied error.

A working solution I came up with, is to defer the deletion of the old nob to the new nob instance. By passing an argument with the old nob's binary path, and then running the new nob async, so the old one can exit. In the new nob, because of the extra argument, we know to delete the old -now exited- nob.

This is what @arxae also said above (I was already working on the PR #116), with the difference that the temp program is just the new nob instance.

ThenTech avatar Aug 24 '25 08:08 ThenTech

A working solution I came up with, is to defer the deletion of the old nob to the new nob instance. By passing an argument with the old nob's binary path, and then running the new nob async, so the old one can exit. In the new nob, because of the extra argument, we know to delete the old -now exited- nob.

I was thinking along these lines too. But now the problem is that the new nob is running as a background process, making it difficult to e.g. terminate. I wonder if there's a way to somehow promote the new nob to the foregound process when the old nob exits?

mikmart avatar Aug 24 '25 09:08 mikmart

One way could be to just let the second instance exit after deleting the first, instead of returning and running user code in nob.c. When the user runs nob again afterwards, no rebuild or delete would be triggered, so then it's safe to proceed into user code. But then you end up having to run it twice every time it needs to rebuild, kind of defeating the purpose.

Another way, like you said before, would be to just generate a .bat file that waits a bit, deletes the old nob, and then itself (I think a script is able to delete itself?). And only do this after the new nob exits. (So let the old nob synchronously wait on the new one like normal, then create the script and run it, and then exit)

ThenTech avatar Aug 24 '25 09:08 ThenTech

Another way, like you said before, would be to just generate a .bat file that waits a bit, deletes the old nob, and then itself (I think a script is able to delete itself?). And only do this after the new nob exits. (So let the old nob synchronously wait on the new one like normal, then create the script and run it, and then exit)

Yeah and actually you can just call del. The below seems to work fairly reliably, just before exiting the old nob.

#if defined(NOB_EXPERIMENTAL_DELETE_OLD) && defined(_WIN32)
    nob_cmd_append(&cmd, "cmd.exe", "/c", "del", old_binary_path);
    nob_cmd_run_async(cmd);
#endif

It's not ideal to do this only after the new nob has finished executing, but seems like a workable solution for now?

mikmart avatar Aug 24 '25 10:08 mikmart

Didn't think of that, bat files can indeed delete themselves. Doesn't even have to be a long one. This should work (notepad for test)

$ProcessName = "notepad"
$CheckIntervalSeconds = 3

if (Get-Process -Name $ProcessName -ErrorAction SilentlyContinue) {
    while (Get-Process -Name $ProcessName -ErrorAction SilentlyContinue) {
        Start-Sleep -Seconds $CheckIntervalSeconds
    }
}

Remove-Item ./old_nob.exe
Remove-Item ./del_nob.ps1

It's in powershell, but that should be an basically any pc now anyways. But thats quite a bit shorter then doing it in bat

arxae avatar Aug 24 '25 10:08 arxae

Another way, like you said before, would be to just generate a .bat file that waits a bit, deletes the old nob, and then itself (I think a script is able to delete itself?). And only do this after the new nob exits. (So let the old nob synchronously wait on the new one like normal, then create the script and run it, and then exit)

Yeah and actually you can just call del. The below seems to work fairly reliably, just before exiting the old nob.

#if defined(NOB_EXPERIMENTAL_DELETE_OLD) && defined(_WIN32)
    nob_cmd_append(&cmd, "cmd.exe", "/c", "del", old_binary_path);
    nob_cmd_run_async(cmd);
#endif

It's not ideal to do this only after the new nob has finished executing, but seems like a workable solution for now?

Since it's an experimental feature we can go with such weird solution as this. Let's implement it and see how it works in the wild. Also, what exactly do you mean "fairly reliably"? Did you have instances when it failed?

rexim avatar Aug 24 '25 12:08 rexim

I think this can still sometimes trip over itself if the nob process doesn't exit fast enough. Seeing the circumstances, i don't think it's that weird. If we want the feature to work like this, it has to happen outside of the nob process. Not to push my solution, but i think having a script (whether it would be powershell, bat or something else) is still the way to go. Nob should just call it when exiting, and the script can delete the old nob and itself and it's all cleaned up. The only way it could go wrong is when the nob process doesn't want to exit, but then we have different issues. But i'm not an expert, just my 2 cents.

arxae avatar Aug 24 '25 13:08 arxae

Also, what exactly do you mean "fairly reliably"? Did you have instances when it failed?

@rexim I didn't see any failures, but I havent tested it very thoroughly.

I think this can still sometimes trip over itself if the nob process doesn't exit fast enough.

@arxae Yes, inserting a Sleep(1) before exiting is enough to often trigger an "Access is denied" error.

Not to push my solution, but i think having a script (whether it would be powershell, bat or something else) is still the way to go. Nob should just call it when exiting, and the script can delete the old nob and itself and it's all cleaned up.

The same race condition is still there, no? It's just a bit more convenient to deal with the waiting in a script?

mikmart avatar Aug 24 '25 13:08 mikmart

The script can wait for the process to exit, sidestepping the race condition completely. It can then delete the old nob build and itself, leaving just the new build behind.

arxae avatar Aug 24 '25 13:08 arxae

Yeah I just meant it doesn't need to be a script file to do that; e.g. something like this could do that:

#if defined(NOB_EXPERIMENTAL_DELETE_OLD) && defined(_WIN32)
    Nob_String_Builder sb = {0};
    nob_sb_appendf(&sb, "$Attempts = 0; While ($Attempts++ -Lt 10) {\n", old_binary_path);
    nob_sb_appendf(&sb, "    Remove-Item -Path %s -ErrorAction SilentlyContinue\n", old_binary_path);
    nob_sb_appendf(&sb, "    If (Test-Path -Path %s) { Start-Sleep -Milliseconds 100 }\n", old_binary_path);
    nob_sb_appendf(&sb, "}");
    nob_sb_append_null(&sb);
    Nob_Procs procs = {0};
    opt.async = &procs;
    nob_cmd_append(&cmd, "PowerShell", "-Command", sb.items);
    if (!nob_cmd_run_opt(&cmd, opt)) exit(1);
#endif // NOB_EXPERIMENTAL_DELETE_OLD
$ delete_old
[INFO] renaming delete_old.exe -> delete_old.exe.old
[INFO] CMD: cl.exe /Fe:delete_old.exe tests\delete_old.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35213 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

delete_old.c
Microsoft (R) Incremental Linker Version 14.44.35213.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:delete_old.exe
delete_old.obj
[INFO] CMD: delete_old.exe
[INFO] touching tests\delete_old.c
[INFO] CMD: PowerShell -Command '$Attempts = 0; While ($Attempts++ -Lt 10) {
    Remove-Item -Path delete_old.exe.old -ErrorAction SilentlyContinue
    If (Test-Path -Path delete_old.exe.old) { Start-Sleep -Milliseconds 100 }
}'

Which is quite funny to me, but also maybe clearer than running a file which is deleted so the user doesn't know what it was? 😄

mikmart avatar Aug 24 '25 17:08 mikmart

@mikmart I added your changes from above to my PR #116 instead of my previous attempt.

Running a script adds the least changes. I also added quotes around the %s format strings, as those are probably required for when the path contains spaces.

The only additional note I have, is that running a PowerShell script like this might trigger antivirus software to think we are trying to run malicious code. So the script might not run, or nob.exe itself could potentially be quarantined (since it "behaves strangely" by running additional processes, renaming itself, and then running a PowerShell script).

ThenTech avatar Aug 24 '25 19:08 ThenTech

I was fairly tired when i posted that, didn't come up that you could just run the script without saving it as a file. But running it like that is a lot cleaner then with a temporary file

The only additional note I have, is that running a PowerShell script like this might trigger antivirus software to think we are trying to run malicious code. So the script might not run, or nob.exe itself could potentially be quarantined (since it "behaves strangely" by running additional processes, renaming itself, and then running a PowerShell script).

I don't think it will be that much of a problem? As long as you don't do it in one of the special folders like appdata or so, we should be fine? Although i guess that could use some actual testing

arxae avatar Aug 25 '25 02:08 arxae

I don't think it will be that much of a problem? As long as you don't do it in one of the special folders like appdata or so, we should be fine? Although i guess that could use some actual testing

An antivirus with behavioural scanning will trigger (PowerShell) scripts executions like this. I use Comodo Firewall personally, which has a "Host Intrusion Protection System" that "monitors system activity and stops processes from modifying important files and interfaces." In essence, this sandboxes all indirect script executions to prevent running malicious code like I said. And the first time this is detected, it will ask to allow or block it.

It's just a side note. I expect that a developer using nob, would be familiar with what they are doing and see that specifically on Windows, nob will try to delete the old artefact with a script. So I think this is a reasonable thing to exclude from any (antivirus) software that tries to intervene. And even if the script is blocked, the only thing that happens is that nob.exe.old is still there. All other operations did work as expected.

In my PR, having nob_log(NOB_INFO, "We will use a PowerShell command to defer deleting %s...", old_binary_path); or something similar, should be enough to let the user know that this will happen.

ThenTech avatar Aug 25 '25 21:08 ThenTech

I've come up with a partial solution that does not involve defering deletion to scripts or running a del command.

I say partial because it works only with NTFS Volumes.

A detailed description of how it works is in the description of the PR as well as in the code itself. The technique is still rename -> delete workflow, but uses properties of NTFS streams to circumnavigate Windows processes desire to keep a reference to the binary holding their Image.

EdoardoFigini avatar Sep 01 '25 17:09 EdoardoFigini