Use file ctime, not mtime for detecting modified or new files

Open slowfranklin opened this issue 1 year ago • 2 comments

Hi David,

I just ran into the following problem using fscrawler to index files on a Samba fileserver for use with Spotlight and Apple clients:

When Macs copy a file to the Samba server, the client subsequently sets the timestamps of the copied file (atime, mtime, and btime, i.e. birthtime/creation date) to the same values as the original file.

For older files, the mtime will typically predate the last fscrawler invocation; as a result, fscrawler will ignore the file in an index run.

Here's an example stat output from today on such a file on a Samba server:

# stat 'some.tif'
  File:  some.tif
  Size: 2416300         Blocks: 4728       IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 67268640    Links: 1
Access: (0666/-rw-rw-rw-)  Uid: ( 1001/ smbtest)   Gid: ( 1001/ smbtest)
Context: system_u:object_r:default_t:s0
Access: 2022-08-03 15:36:45.779778451 +0200
Modify: 2022-03-23 07:51:38.000000000 +0100
Change: 2022-08-03 15:36:44.847786265 +0200
 Birth: 2022-08-03 15:36:42.485806064 +0200

As the mtime dates back about 5 months, fscrawler refuses to index the file unless forced with --restart.
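
The situation can be reproduced locally without Samba or a Mac. What follows is only a minimal Python sketch under a few assumptions: the file names, the 150-day age, and the `last_run` value are made up for illustration, and `copy_preserving_times` is a hypothetical helper that merely imitates the client restoring the original atime/mtime after the copy. It is not fscrawler code.

```python
import os
import shutil
import tempfile
import time


def copy_preserving_times(src, dst):
    """Copy src to dst, then set the copy's atime/mtime to the source's values.

    This imitates a client that restores the original timestamps after
    copying. The kernel still updates the copy's ctime at that moment.
    """
    shutil.copyfile(src, dst)
    src_stat = os.stat(src)
    os.utime(dst, ns=(src_stat.st_atime_ns, src_stat.st_mtime_ns))
    return os.stat(dst)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "original.tif")  # illustrative file names
    dst = os.path.join(tmp, "copy.tif")
    with open(src, "wb") as f:
        f.write(b"image data")

    # Pretend the original was last modified about five months ago.
    old = time.time() - 150 * 86400
    os.utime(src, (old, old))

    # Hypothetical timestamp of the previous crawl: one day ago.
    last_run = time.time() - 86400

    st = copy_preserving_times(src, dst)
    print("mtime newer than last run:", st.st_mtime > last_run)
    # Note: st_ctime is the inode change time on Linux/macOS only.
    print("ctime newer than last run:", st.st_ctime > last_run)
```

On a Linux filesystem the copy's mtime stays in the past while its ctime reflects the moment of the copy, which matches the stat output above.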

If fscrawler looked at the ctime value instead of mtime, this would solve the problem, as the ctime can't be set by userspace and always reflects the last time the file's inode was created or subsequently modified by any file content or metadata change.
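
To make the idea concrete, here is a rough Python sketch of such a check. It assumes a Unix-like system where `st_ctime` is the inode change time; `changed_since` and its `time_field` parameter are hypothetical names chosen for illustration and do not correspond to any existing fscrawler code or setting.

```python
import os


def changed_since(path, last_run, time_field="ctime"):
    """Return True if the file looks new or modified since last_run.

    last_run is a Unix timestamp in seconds. time_field is a hypothetical
    switch between the two strategies:
      "mtime": compare the modification time, which clients can set freely
      "ctime": compare the inode change time, which userspace cannot set
    """
    st = os.stat(path)
    if time_field == "mtime":
        return st.st_mtime > last_run
    if time_field == "ctime":
        # On Windows st_ctime is the creation time, so this sketch
        # only applies to Unix-like systems.
        return st.st_ctime > last_run
    raise ValueError(f"unknown time_field: {time_field!r}")
```

For the file in the stat output above, the "mtime" variant would skip it for any crawl that last ran after 2022-03-23, while the "ctime" variant would pick it up on the first run after 2022-08-03 15:36:44.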

If not changing the default behaviour, would it be possible to get an fscrawler option to use ctime instead of mtime?

Thanks! -slow

slowfranklin, Aug 03 '22 14:08

If fscrawler looked at the ctime value instead of mtime, this would solve the problem, as the ctime can't be set by userspace and always reflects the last time the file's inode was created or subsequently modified by any file content or metadata change.

That would surely be a good thing to do.

If not changing the default behaviour, would it be possible to get an fscrawler option to use ctime instead of mtime?

I know that changing the time field might have side effects, so an option may be better.

dadoonet, Aug 21 '22 11:08