coreutils icon indicating copy to clipboard operation
coreutils copied to clipboard

Expose touch for Nushell

Open ysthakur opened this issue 1 year ago • 25 comments

Part of https://github.com/uutils/coreutils/issues/5088 (and https://github.com/nushell/nushell/issues/11549, over in the Nushell repo). This should allow Nushell (and possibly Rust applications) to use uu_touch. I'll be making a PR in Nushell based on this one soon.

Stuff that's public now:

  • A touch function taking a list of InputFile objects and an Options object
  • An InputFile struct with a Path(PathBuf) variant (for normal files) and a Stdout variant (for -). It's essentially an Option, so maybe it would be a better idea for touch to accept an Option<&Path> so the caller doesn't have to make a PathBuf?
  • An Options struct with the following fields:
    • no_create: bool - -c, --no-create
    • no_deref: bool - -h, --no-dereference
    • change_times: ChangeTimes - An enum indicating whether the atime, mtime, or both are to be changed
    • date: Option<String> - -d, --date
    • source: Source, where Source is an enum with the following variants:
      • Reference(PathBuf) - for -r/--reference
      • Timestamp(FileTime) - for -t
      • Now - If neither -r nor -t are passed
    • strict: bool - True to error on the first error of any sort, false to only print an error and continue if a file doesn't exist and either --no-dereference was passed or the file couldn't be created
  • An error module containing a TouchError enum

Things that might need to be changed:

  • The TouchError enum has a variant for errors that happened on a specific file. To let Nushell and other similar users know exactly which file touch failed on, this variant includes the index of the file in the original list of files passed to touch. I'm not sure if this is unidiomatic in some way, and if anyone can think of alternatives, that'd be great.
  • The strict option is not great. Users don't have the option to collect errors. Because of that, I made another branch where touch only works on a single file at a time. The tradeoff is that we need to make an OptsWithTimes struct with private fields that holds the current time. If you have multiple files to touch, you would reuse an instance of this struct when touching them so that they're all set to the same time. This comment goes into more detail.

ysthakur avatar Feb 05 '24 07:02 ysthakur

The only thing I'm doubting is what format we should expose for the times. I'd like to not expose to much of the internal implementation. For example, if we use FileTime, we impose that same crate to the users of touch (like nushell). I really want this to be a single call on the nushell side, so without exposing a conversion like datetime_to_filetime.

@tertsdiepraam If we want to accommodate Nushell, maybe a DateTime? I guess uu_touch itself will always pull in chrono and filetime even if users of uutils don't use them directly.

Also, I wonder if we should expose the reference file part as part of that touch function.

I initially tried going for an enum representing the time to set rather than two separate atime and mtime. The latter is easier to implement, but I guess the former would be a lot more ergonomic, so I could try that.

Basically, we should make the errors such that nushell can decide what to do on the error. I think we shouldn't show USimpleErrors but instead make a proper error enum that we can expose to nushell.

Good idea, I was considering making an error enum but felt too lazy to bother at the time :)

ysthakur avatar Feb 06 '24 02:02 ysthakur

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Feb 07 '24 06:02 github-actions[bot]

This is all quite interesting. Let's separate this from nushell for a bit. I think the goal on our side is to provide functions that allow for calling the coreutils from Rust, with an interface that matches the command line utilities as closely as possible. But it's sometimes a bit ambiguous as to what that means.

  • For example, should we provide a way to pass a reference file? If so, should that be given as a file or as path? Or should we just do what you did now and let people compute the times from a reference file themselves. I'm not quite sure.

  • The question around the time type is similar. It could be a FileTime, DateTime or even a datetime string that we parse.

Do you have ideas on what nushell would prefer here? Whatever we choose, I do still think it basically has to be a single function call. I'd like to keep the public API to a minimum so that we can keep changing things without having to keep the public API in mind.

tertsdiepraam avatar Feb 08 '24 13:02 tertsdiepraam

@tertsdiepraam Sorry for the late response. I think:

  • Providing a way to pass the reference file would probably be nice for anyone else using uutils. I guess taking a file rather than a path would eliminate the case where the reference file doesn't exist, which would be nice.
  • Having only a datetime string would be inconvenient if callers already have a FileTime or DateTime. To avoid choosing, perhaps all three could be allowed?

I also posted in Nushell to get others' feedback on this.

I was thinking about replacing the atime and mtime fields with a single field of type Times, which would be an enum like:

enum Times {
  Separate { atime: Source, mtime: Source },
  Same(Source),
}

enum Source {
  Reference { file: File, date: Option<String> },
  Timestamp(FileTime),
  Now, // Use current time
  Keep, // Don't update time
}

// Both functions would create `Source::Timestamp`s
impl Source {
  fn from_datetime(_: DateTime) -> Source {...}
  fn from_str(_: DateTime) -> Source {...}
}

By the way, calling touch separately for every file means that Local::now() will be called for each file. Is that fine? It might be a noticeable change in behavior if you're touching a lot of files.

ysthakur avatar Feb 09 '24 23:02 ysthakur

GNU testsuite comparison:

GNU test failed: tests/touch/fail-diag. tests/touch/fail-diag is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/touch/not-owner. tests/touch/not-owner is passing on 'main'. Maybe you have to rebase?

github-actions[bot] avatar Feb 10 '24 00:02 github-actions[bot]

By the way, calling touch separately for every file means that Local::now() will be called for each file. Is that fine? It might be a noticeable change in behavior if you're touching a lot of files.

Yeah, that's not great. It also means that all the files are set to different "now" times.

tertsdiepraam avatar Feb 10 '24 10:02 tertsdiepraam

Hmm, one solution to that would be passing in a list of files, but if there's an error, for Nushell to be able to highlight which exact file touch failed on, touch would have to return which file it failed on along with the error. This could be the file itself, but that would require comparisons, or the index of the file. I don't see any obvious problems with the latter, but an index might be kind of an odd thing to return (or at least it would be in the Java world, I'm not sure how acceptable it'd be in a Rust codebase).

ysthakur avatar Feb 10 '24 20:02 ysthakur

So I refactored touch to take a list of files so that it can use the same Local::now() timestamp for all the files. If it encounters an error on a particular file, the error includes the file's path as well as the file's index in the list of files (the solution discussed in my previous comment, so Nushell knows which argument to highlight).

I also made it so that users will have to pass in DateTime<Local>s if they want to explicitly pass in a timestamp. This was slightly more convenient, since DateTimes can be converted to FileTimes without problems, but FileTimes may be invalid DateTimes. If wanted, the timestamp could be a FileTime instead. Also, the current logic will throw an error if the reference file has a filetime that can't be converted to a DateTime, whether or not -d was passed (but that can be changed).

ysthakur avatar Feb 28 '24 05:02 ysthakur

GNU testsuite comparison:

GNU test failed: tests/touch/fail-diag. tests/touch/fail-diag is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/touch/not-owner. tests/touch/not-owner is passing on 'main'. Maybe you have to rebase?

github-actions[bot] avatar Feb 28 '24 06:02 github-actions[bot]

GNU testsuite comparison:

GNU test failed: tests/touch/fail-diag. tests/touch/fail-diag is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/touch/not-owner. tests/touch/not-owner is passing on 'main'. Maybe you have to rebase?

it means that you regressed the feature of touch, sorry!

sylvestre avatar Feb 28 '24 07:02 sylvestre

@sylvestre Whoops, I didn't realize the error messages had to match the GNU touch's error messages, fixed that now. The CI will still fail on test_no_dereference_no_file, though, because when touching multiple files, this PR makes it return on the very first error instead of showing an error and continuing. Maybe you could say through the Options enum if you want it to show an error or return immediately?

ysthakur avatar Feb 28 '24 18:02 ysthakur

GNU testsuite comparison:

Congrats! The gnu test tests/mv/backup-dir is no longer failing!
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Mar 03 '24 11:03 github-actions[bot]

GNU testsuite comparison:

Congrats! The gnu test tests/chown/preserve-root is no longer failing!

github-actions[bot] avatar Mar 10 '24 00:03 github-actions[bot]

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar Mar 11 '24 17:03 github-actions[bot]

@tertsdiepraam I think this is ready to review now. I've made some changes but updated the PR description accordingly.

ysthakur avatar Mar 13 '24 15:03 ysthakur

Alright, I'll try to answer some your questions. I'm not sure I have all the answers, because it's been a while, but here are my takes :)

  • I've changed my mind and I think we need to allow people to pass in a FileTime instead of a DateTime, because that simply avoids a lot of conversions, timezone shenanigans and that kind of stuff. touch is meant to set a FileTime on a file, so that's what it should take. We still should not expose a conversion method, because the caller should decide how they want to do that conversion (for example, which timezone to use).
  • I think accepting a slice is good enough at least for now. We don't need an iterator at the moment.
  • The strict thing is interesting. Because non-fatal errors might be interesting for nushell too. Ideally, we'd be able to either print or collect the errors, but maybe that's better to do later.

tertsdiepraam avatar Mar 21 '24 09:03 tertsdiepraam

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar Mar 21 '24 15:03 github-actions[bot]

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Mar 21 '24 16:03 github-actions[bot]

  • I've changed my mind and I think we need to allow people to pass in a FileTime instead of a DateTime, because that simply avoids a lot of conversions, timezone shenanigans and that kind of stuff. touch is meant to set a FileTime on a file, so that's what it should take. We still should not expose a conversion method, because the caller should decide how they want to do that conversion (for example, which timezone to use).

So I originally went with DateTimes because it seemed to require conversions, but I just took your advice and used FileTimes instead and it does require fewer conversions now. I don't know what I was thinking before.

  • I think accepting a slice is good enough at least for now. We don't need an iterator at the moment.

Makes sense. I'll leave it as is.

  • The strict thing is interesting. Because non-fatal errors might be interesting for nushell too. Ideally, we'd be able to either print or collect the errors, but maybe that's better to do later.

Yeah, being able to collect the errors would be nice. Nushell's existing touch command fails on the first error rather than reporting non-fatal ones, but that might be mostly because Nushell doesn't have proper warnings yet. I can see two obvious solutions: returning a Result<Vec<TouchFileError>, TouchError> or something of that sort, or making touch accept a single file so that the caller can decide what to do with each error.

I'd prefer the second option (it's what I initially did). The main problem with that, though, is that if no reference file, date, or timestamp are passed, then file times will need to be set to the current time. That means that the first argument and last argument would have very different atimes/mtimes, which, as you said when this came up before, is not desirable.

One solution for this would be for touch to accept some kind of CompiledOptions object. This struct would be pretty much the same as Options, except with hidden fields. CompiledOptions::new() would put Local::now() into the CompiledOptions object, fixing the current time. Then, touch could be called a bunch of times with this CompiledOptions object and the atime/mtime would be the same for each call. But it feels like overkill. Thoughts?

ysthakur avatar Mar 21 '24 16:03 ysthakur

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar May 21 '24 15:05 github-actions[bot]

LGTM but I would Tersts to look at this

sylvestre avatar May 23 '24 11:05 sylvestre

if it is ready, remove the draft please :)

sylvestre avatar May 23 '24 12:05 sylvestre

@sylvestre Done֫. I actually kept it as a draft because I wanted some advice on collecting errors. I posted this in the Discord server but forgot to copy it here:

I made a separate branch to explore having touch only process a single file at a time (diff with main, diff with this branch). This sidesteps the issue of collecting/printing errors, allowing the caller to choose what to do themselves.

The drawback is a teeny bit of added complexity. I needed to add an OptsWithTimes struct that basically just contains the Options along with the access and modification time to set. The intention is that the same object will be reused when touching multiple files so that all of them will have the same access and/or modification time. OptsWithTimes's fields are pub(crate) so that callers can't create it and mess with atimes/mtimes.

I personally prefer the solution in that other branch to the one in this PR, where you only have two options: use strict: true and fail on the first error, or use strict: false and get log output on some errors. If someone wants to avoid failing on the first minor error but would like to handle errors themselves, the other branch gives them that control.

ysthakur avatar May 23 '24 15:05 ysthakur

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Jun 09 '24 18:06 github-actions[bot]

@tertsdiepraam @sylvestre Do you have time to review and land this so we can get this new uutils functionality landed in nushell?

fdncred avatar Jun 09 '24 19:06 fdncred

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar Jul 07 '24 02:07 github-actions[bot]

Thanks so much!

fdncred avatar Jul 09 '24 23:07 fdncred