Updating after corrupt file detected

Open drnickyoung opened this issue 2 years ago • 10 comments

Hi,

Firstly, excellent tool. I have used a few similar tools, the last of which saved checksums to one file per directory, but I wanted something that moves with the files. I expect this is a misunderstanding on my part, so here is what I have done:

  1. Create a test file with a known timestamp and run cshatag
echo test_1 > test
touch -t 202301300000 test
cshatag test

I get the result as I would expect:

<new> test
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: 5a18f75b3ce3ed6550c33f23bb21f833bd63a159cb592a272fd1c61f98de5111 1675036800.000000000
  2. I update the file:
echo test_2 > test
touch -t 202301300001 test 
cshatag test

And as expected get the output:

<outdated> test
 stored: 5a18f75b3ce3ed6550c33f23bb21f833bd63a159cb592a272fd1c61f98de5111 1675036800.000000000
 actual: 8f1d878efe7586c55c8f0d7578ac59efda6831778eb5fba5f68b2f21a3519609 1675036860.000000000
  3. Simulate some corruption by changing the content while keeping the timestamp from the previous step:
echo test_3 > test
touch -t 202301300001 test
cshatag test

And as expected it is detected.

Error: corrupt file "test"
<corrupt> test
 stored: 8f1d878efe7586c55c8f0d7578ac59efda6831778eb5fba5f68b2f21a3519609 1675036860.000000000
 actual: 8f89c43b0cd072e7127bcf26635d4e2febdacbb737bdb44f797e4e96b2408d73 1675036860.000000000
  4. Now, without touching anything, I rerun the command 'cshatag test'. I expect to see the same error as in step 3, but instead I get:
<ok> test

I know this is the expected result according to the 'run_tests.sh' script you have. However, I fail to see why. If a file is corrupt, then surely the attribute should not be updated; wouldn't you want it to keep showing as corrupt?
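
The four runs above can be reproduced with a small model. This is a simplified sketch in Python, not cshatag's actual code: `Tag`, `check`, and `replay` are hypothetical names, the hashes are abbreviated, and the sketch assumes (as the runs above suggest) that the stored attribute is rewritten after every run, including a run that reports corruption. It ignores the case where only the timestamp changes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for the checksum and mtime kept in the file's xattrs."""
    sha256: str
    mtime: int


def check(stored: Optional[Tag], actual: Tag) -> Tuple[str, Tag]:
    """Return (status, tag stored after the run) for one file."""
    if stored is None:
        status = "new"
    elif stored.sha256 == actual.sha256:
        status = "ok"
    elif stored.mtime != actual.mtime:
        # Content and timestamp both changed: a regular edit.
        status = "outdated"
    else:
        # Content changed but the timestamp did not: reported as corruption.
        status = "corrupt"
    # Assumption: the stored tag is always overwritten with the current one.
    return status, actual


def replay(observations: List[Tag]) -> List[str]:
    """Run check() once per observation, carrying the stored tag forward."""
    stored: Optional[Tag] = None
    statuses = []
    for actual in observations:
        status, stored = check(stored, actual)
        statuses.append(status)
    return statuses


runs = [
    Tag("5a18", 1675036800),  # step 1: first run
    Tag("8f1d", 1675036860),  # step 2: new content, new mtime
    Tag("8f89", 1675036860),  # step 3: new content, same mtime
    Tag("8f89", 1675036860),  # step 4: nothing changed
]
print(replay(runs))  # ['new', 'outdated', 'corrupt', 'ok']
```

Under that assumption, the third run stores the checksum of the corrupt content, so the fourth run compares the file against itself and reports it as ok.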

drnickyoung avatar Jan 30 '23 10:01 drnickyoung

I had the same question about how cshatag behaves when it runs into the same corrupt file again. Its current logic says the file is okay because the corrupt file has not changed since the last run, but of course this goes against the intuition that the file is still corrupt. Unfortunately, I don't think this is a straightforward problem to solve. In the meantime, I think we need to keep logs of the files that cshatag reports as corrupted.

Right now ... when cshatag says "test ok", it just means it didn't detect any new change that could have corrupted the file since the last time cshatag ran a check, but it DOES NOT mean the file is actually okay and not corrupted. Therefore, without keeping detailed logs of past corruption reports, the message "test ok" doesn't mean much.

Ken0sis avatar Apr 08 '23 11:04 Ken0sis

Maybe a command-line option could be a good approach to fix this issue? For example, one that separates updating the stored checksum from checking it?

ifsnop avatar Apr 08 '23 21:04 ifsnop

I've set up a script to save cshatag outputs to log files that I can reference, and check for occurrences of corruption in the past.
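
A logging wrapper along these lines might look like the following. It is a minimal sketch, not the script mentioned above: the function names and the log file name are hypothetical, and it assumes only the `<corrupt> NAME` output lines shown earlier in this thread.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, List


def run_cshatag(paths: Iterable[str]) -> str:
    """Run cshatag on the given files and return stdout and stderr combined."""
    proc = subprocess.run(["cshatag", *paths], capture_output=True, text=True)
    return proc.stdout + proc.stderr


def corrupt_lines(output: str) -> List[str]:
    """Pick out the '<corrupt> NAME' lines from cshatag output."""
    stripped = (line.strip() for line in output.splitlines())
    return [line for line in stripped if line.startswith("<corrupt>")]


def append_to_log(log_file: Path, output: str) -> int:
    """Append each corruption report with a UTC timestamp; return the count."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    hits = corrupt_lines(output)
    with log_file.open("a", encoding="utf-8") as fh:
        for hit in hits:
            fh.write(f"{stamp} {hit}\n")
    return len(hits)
```

A run could then be recorded with `append_to_log(Path("cshatag-corrupt.log"), run_cshatag(["test"]))`. Because the log is append-only, a file reported as corrupt once stays on record even if a later run reports it as ok.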

Ken0sis avatar Apr 10 '23 13:04 Ken0sis

Hi,

I have a possible solution for this. Patch file attached. I have added one option, '-corruptupdate', which when used will update the stored checksum of any corrupted files (i.e. the current default behaviour). Without this option, the code doesn't update the attribute, and subsequent runs will still show the file as corrupt.

I have done it this way because, in my view, a corrupted file should remain flagged as corrupt until it is fixed (or some other action is taken).

corrupted_errors.patch

drnickyoung avatar Apr 16 '23 12:04 drnickyoung

Dr. Nick Young highlighted a valid concern, which I also encountered. As a result, I developed a bash script for managing file hashing and verification that doesn't rely on extended file attributes. It's available in the "lunacopy" repository. I trust it may be of value to those still seeking such a solution.

artem-r-d avatar Aug 19 '23 04:08 artem-r-d

How can your patch be applied?

franalta avatar Aug 22 '23 14:08 franalta

Yes, the patch Dr. Nick Young posted is in the standard format produced by the git diff command, and it can be applied directly using the git apply command.

Here's a step-by-step guide on how you can apply this patch to a local copy of the repository:

  1. Save the patch to a file: If you haven't done so already, save the contents of the patch to a file, e.g., corrupted_errors.patch.

  2. Navigate to the repository: Open a terminal and navigate to the root directory of the local copy of the repository where you wish to apply the patch.

  3. Check for uncommitted changes: Before you apply the patch, ensure that you don't have any uncommitted changes. You can do this using:

    git status
    

    If there are any changes, commit them or stash them.

  4. Apply the patch: Use the git apply command to apply the patch:

    git apply path/to/corrupted_errors.patch

  5. Review the changes: After applying the patch, you can review the changes using git diff to see what modifications have been made to the working directory.

  6. Commit the changes: If you're satisfied with the changes, you can commit them:

    git commit -am "Description of the changes"
    

Please note:

  • Always make sure to review the changes brought in by the patch before committing them, especially if you're pulling the patch from an untrusted source.

  • Sometimes, patches might fail to apply cleanly due to differences between the patch's base code and the current state of the repository. In such cases, you may need to manually resolve the conflicts.

The format of the patch is quite standard. Lines prefixed with a - have been removed, and lines prefixed with a + have been added. The @@ lines are hunk headers: they give the line ranges in the old and new versions of the file that each hunk covers. Together with the unchanged context lines around each change, this helps git figure out where to apply the changes even if the file has changed slightly since the patch was created.

The patch file is plain text, so you can also open it in Notepad or any other text editor to see what the changes are.
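
To make the @@ lines concrete, here is a small Python sketch that parses a hunk header of the unified diff format. The `Hunk` type and `parse_hunk_header` function are hypothetical names for illustration, and the sample header is made up rather than taken from corrupted_errors.patch.

```python
import re
from typing import NamedTuple, Optional

# Matches headers such as "@@ -10,7 +10,9 @@ optional section text".
# A count may be omitted, in which case it defaults to 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


class Hunk(NamedTuple):
    old_start: int  # first line of the hunk in the original file
    old_count: int  # number of original lines the hunk covers
    new_start: int  # first line of the hunk in the patched file
    new_count: int  # number of patched lines the hunk covers


def parse_hunk_header(line: str) -> Optional[Hunk]:
    """Return the line ranges from a hunk header, or None for other lines."""
    match = HUNK_RE.match(line)
    if match is None:
        return None
    old_start, old_count, new_start, new_count = match.groups()
    return Hunk(
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


print(parse_hunk_header("@@ -10,7 +10,9 @@ func main() {"))
# Hunk(old_start=10, old_count=7, new_start=10, new_count=9)
```

In this made-up example, the hunk replaces 7 lines starting at line 10 of the original file with 9 lines in the patched file, so it adds two lines in total.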

artem-r-d avatar Aug 22 '23 15:08 artem-r-d

@artem-r-d thank you for the help :)

franalta avatar Sep 06 '23 19:09 franalta

There is one problem with not updating the checksum: You will not notice when the file gets corrupted again (unless you compare the checksum).

But making the behavior configurable would be a good thing.

Related: https://github.com/rfjakob/cshatag/pull/9

rfjakob avatar Oct 29 '23 12:10 rfjakob

@rfjakob Thanks for the reply.

There is one problem with not updating the checksum: You will not notice when the file gets corrupted again (unless you compare the checksum).

Yes; however, if the file is already corrupt, any further corruption is irrelevant until it is fixed. The way I see this working with the patch is as follows:

Case 1 - no action

  1. I run cshatag
  2. It detects a corrupt file.
  3. I do nothing at all
  4. I run cshatag
  5. The corrupt file is still flagged.

Case 2 - fixing the corrupt file

  1. I run cshatag
  2. It detects a corrupt file.
  3. I fix the file and then, on this file only...
  4. I run cshatag -corruptupdate, which updates the stored attribute so that the file is no longer flagged as corrupt.
  5. The file is fixed and no longer shows as corrupt on further cshatag runs. Other corrupt files continue to be flagged until they are fixed.

That way any corruption will show on every run until such time as I specifically tell it to update the file, i.e. I have fixed it.
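
The two cases can be sketched as a small model. The patch itself is not shown in this thread, so this is only an assumption about its behavior: `check_file` and its `corrupt_update` parameter are hypothetical names standing in for cshatag and the proposed '-corruptupdate' option, and the sketch assumes that a run with the option still reports the corruption once while updating the stored attribute.

```python
from typing import Optional, Tuple

# (sha256, mtime): hypothetical stand-in for the attributes stored on the file.
Tag = Tuple[str, int]


def check_file(
    stored: Optional[Tag], actual: Tag, corrupt_update: bool = False
) -> Tuple[str, Tag]:
    """Return (status, tag stored after the run) under the proposed behavior."""
    if stored is None:
        return "new", actual
    if stored[0] == actual[0]:
        return "ok", actual
    if stored[1] != actual[1]:
        return "outdated", actual
    # Same mtime but different content: keep the old tag unless told otherwise,
    # so the file keeps being reported on every run.
    return "corrupt", actual if corrupt_update else stored


good: Tag = ("aaaa", 100)
bad: Tag = ("bbbb", 100)

# Case 1: no action, the file is flagged on every run.
status, stored = check_file(good, bad)
print(status)  # corrupt
status, stored = check_file(stored, bad)
print(status)  # corrupt

# Case 2: accept the current content explicitly, then later runs are clean.
status, stored = check_file(stored, bad, corrupt_update=True)
print(status)  # corrupt (reported once more while the tag is updated)
status, stored = check_file(stored, bad)
print(status)  # ok
```

In this model, restoring the exact original bytes would also clear the flag without the option, because the checksum would match the stored one again.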

drnickyoung avatar Oct 30 '23 06:10 drnickyoung