coreutils tail: Refactor tail continued

Follow up pr of #3905

@tertsdiepraam The last pr grew pretty big mostly because of the initial extraction of modules. I'm sure I can keep it smaller this time :)

Sep 18 '22 13:09 Joining7943

this isn't really refactor but just moving stuff around. What is the goal ?

Sep 18 '22 20:09 sylvestre

I've oversimplified the process when I called the initial part of #3905 "moving code around" and should have called it by name: An extraction of modules. Maybe, a problem is, that you won't see the other refactorings only in the diff because there was a lot more happening after this extraction. @tertsdiepraam came up with the idea of splitting the pr to simplify the review, but I couldn't stop right there because the program wasn't stable, yet. I needed to proceed until I found a point where on the one hand it was stabilized, what we've got so far and on the other hand have a proper base to move on. I've left an overview of introduced changes, additions etc. in the beforementioned pr, so you get an idea of what is going on. The overall goal of these refactorings (including this one) are:

Reduce complexity, improve readablity
Improve maintainablity
Change the internal structure to make it easier to understand and cheaper to modify
Do not change the observable behaviour, but maybe fix latent bugs on the way which we haven't observed, yet
Eliminate code duplication

I intend to fix the lack of deeper explanations of specific refactorings in the heading of this pr, within the documentation of the code and in the README.

Sep 19 '22 16:09 Joining7943

GNU testsuite comparison:

GNU test failed: tests/misc/timeout. tests/misc/timeout is passing on 'main'. Maybe you have to rebase?

Nov 05 '22 02:11 github-actions[bot]

I am sorry but your refactors are huge and very hard to review. Could you please split it into small PRs? thanks

Nov 10 '22 16:11 sylvestre

Sorry for that. Would it also be ok to squash related commits and add more explanation to them? Currently, the commits are pretty raw and I guess I could limit the commits to around 10, so the diffs between them would be smaller and easier to review.

Nov 10 '22 18:11 Joining7943

I would like smaller PR with changes that make sense to be grouped together

Nov 10 '22 18:11 sylvestre

GNU testsuite comparison:

Congrats! The gnu test tests/misc/timeout is no longer failing!

Nov 11 '22 15:11 github-actions[bot]

Sorry, it's not possible to split this pr. Please see the pr as a whole. However, to better the review situation, I'll squash the commits, add some explanation and document the tail module as far as possible. I'll also write more documention for tests I added and which made changes necessary. I think we can then review the commits like a pr, if you like to.

Nov 11 '22 18:11 Joining7943

yeah, this is far from ideal to do development this way ... @jhscheer @tertsdiepraam wdyt?

Nov 11 '22 18:11 sylvestre

Yeah I'm not a big fan of this either.

I think we can then review the commits like a pr, if you like to.

Sorry, but this is still a big ask. As reviewers we have to do a lot of context switching. Every time we review, we have to reread the PR, try to remember what's changed since last time, what our feedback was, etc.. With small PRs this is nicely chunked and we can merge them one at a time with quick iteration time. With a PR like this, we might want something changed and then have to check the entire PR again, which takes a long time. This is also why I merged the previous PR: it got too big and unwieldy for me to keep track of, just too many things going on at once, so I figured that starting with a clean slate and having smaller PRs in the future would be better.

I'll look through this PR now to see if I can identify a way to split this up and then you can open new PRs based on that. Does that sound good?

Nov 12 '22 13:11 tertsdiepraam

Here are the parts I think should at the very least be split off into separate PRs:

Splitting up the code over more files
error.rs (also look more closely into UResult/UError, which should make the whole IoContextError thing redundant if I'm not mistaken)
The refactored checking of the warnings with the new atty dependency and VerrificationResult and all that.
parse_duration

Nov 12 '22 13:11 tertsdiepraam

I really try to understand your concerns regarding the review and I am honestly sorry, that there are some. However, maybe I can clear them up? Please try to understand me too, that it is by far harder, close to impossible, to change the implementation and its history than changing the review to be based on the commit history instead of standalone prs.

This is actually a draft and there's also a lot of documentation missing, which I think would clarify a lot in advance. Some changes only make sense in the whole context. We could also talk things through on the base of the topics you suggested or else have in mind. If there's something wrong on my side I only have to change that once based on the current state and don't have to transport it through a pr chain. Please let me try to fix the commits and documentation first and then let's see? Sooner or later, I need to do this anyways.

However, as a peace offering, I could try to unhook the parsing of the duration and the additional warning adding the atty dependency. These changes should't have much implications on the rest of the code. I'm not sure what you mean with 1. splitting up the code over more files?

Concerning number 2, I need the IoErrorcontext in two places. That's when printing the correct error message and when figuring out the different behavior on different platforms based on the context where the error appeared. Compliance with gnu's tail error handling and when/how they add a file/stdin for monitoring with follow makes this necessary and this enum simplifies coordinating that. The IoContextError is mainly just transporting this context which is a little bit specific to tail so I can't use UioError for that, at least as far as I've seen, but maybe you have an idea?

Nov 12 '22 17:11 Joining7943

Alright, I'm gonna try to clear your questions up.

I think that your idea of cleaning up the commits and separate PRs are already quite close. The PRs I'm suggesting don't have to be independent, but we can merge one first and then you can put up a new one that builds on the previous one. So if you have those commits ready, all you have to do is make a PR with just the first one and we'll continue from there.

This is actually a draft and there's also a lot of documentation missing, which I think would clarify a lot in advance

This might help a bit but is not the main issue, I think. The main issue is too many things happening at once, regardless of whether they're well-documented.

However, as a peace offering, I could try to unhook the parsing of the duration and the additional warning adding the atty dependency.

This would be already a great improvement!

I'm not sure what you mean with 1. splitting up the code over more files?

Nevermind, I thought this PR created new files, but that's not the case I see. Sorry!

Concerning number 2, I need the IoErrorcontext in two places.

Printing the message should be all UIoError, which I think you are indeed (partially) delegating too. For the second usecase, I see what you mean, but I can't evaluate the solution very well in the current state of the PR. The current implementation uses a lot of show_error! and set_exit_code which is less than ideal, so you're right that it needs some refactoring. But, I'm not sure that we need the whole handler and the context. I'd love to discuss this further in a separate PR :)

Nov 12 '22 21:11 tertsdiepraam

Sorry, but I have to ask, before I'm going through this. Do you know, that you can select (multiple) commits in the Github review panel and start a review on them?

Nov 13 '22 14:11 Joining7943

I do, but the functionality is far from perfect in my opinion

Nov 13 '22 16:11 tertsdiepraam

What's wrong with it? Pity that this is a matter of tooling.

Printing the message should be all UIoError, which I think you are indeed (partially) delegating too.

The delegation wasn't possible entirely because the ErrorKind::IsADirectory (and BadFileDescriptor) isn't stable and we need to detect this ErrorKind in a few places, that's one part the error handler provides.

The current implementation uses a lot of show_error! and set_exit_code which is less than ideal, so you're right that it needs some refactoring.

I haven't spotted a useless print with show_error. Can you give an example? Or maybe you confused show_error! with show!? The latter would set the error code indeed more often than necessary.

I'd love to discuss this further in a separate PR :)

Well, after I sorted out the other things we talked about into other PRs, there's still this one left, which we could use for all what's left :) . Honestly, extracting the error handling into an own pr isn't possible and doesn't make much sense, since it has been gone through some evolution pretty much from the beginning of this pr.

Nov 13 '22 17:11 Joining7943

What's wrong with it?

It's still hard to keep track of all the changes in a single PR. GitHub just does not have good support for a stacked PRs workflow.

I haven't spotted a useless print with show_error. Can you give an example?

I'm talking about the current main branch, e.g. in the tail_file function:

https://github.com/uutils/coreutils/blob/46016bf9ce06832365454754f5543dcb91b6d81b/src/uu/tail/src/tail.rs#L124-L150

Well, after I sorted out the other things we talked about into other PRs, there's still this one left, which we could use for all what's left :)

Alright, this will be the error PR then :)

Nov 13 '22 17:11 tertsdiepraam

I'm talking about the current main branch, e.g. in the tail_file function: Ah ok, sorry, You're right, the error handling wasn't very good there :)

There's the first one #4135 mainly around the check warnings refactor. Do you need the additional tests around check warnings in the other pr, or is it sufficient to keep them here? Problem is, I've taken the TestRunner for the additional tests which I've enhanced from commit to commit by need.

Nov 13 '22 18:11 Joining7943

Thanks for splitting it off! A quick scroll-through makes it look much more manageable.

The TestRunner case is interesting. We do indeed need some tests, but maybe it's better to do them without the TestRunner for now. The TestRunner could actually be a whole PR on it's own. I think that, at least in its current iteration, it's not the best abstraction. The first reason is that we should keep abstraction in the tests specifically to a minimum. Like, I would hate to figure out after a long debugging session that the bug was in the TestRunner (or CmdResult or something else). The DRY principle also does not really apply to tests. It also blurs the distinction a command and it's result, requiring some carefulness (like calling self.cleanup_resources). That being said I do like a lot of the things you've implemented on the TestRunner, like assert_alive. I'm wondering if we could actually directly implement those on our existing testing library, so that we don't need a tail-specific TestRunner.

Nov 13 '22 19:11 tertsdiepraam

Actually, I haven't written the TestRunner because I wanted to apply the DRY principle, I just wanted a simple Setup -> Run -> Assert chain in a row. One of the problems was, that I had a lot of boilerplate in the tests, which are distractions from the things that are important (also for others when they read the test and have to figure out what's actually going on) and to repeat the same setup and more complicated asserts (like stderr_to_stdout, assert_stdout_stderr or assert_alive) isn't good in tests either. Maybe tail is also a bit specific because there's a process running with follow which you have to check, assert and control somehow. I wasn't sure if this happens in other coreutils? There was also the need to run a plain cmd or sh shell with tail in the middle of a pipe and one or two fifos or whatever necessary.

I hope I haven't written too much logic which can go sideways into the Testrunner and it's more or less tested by usage, like most of test fixtures. That's currently around 15 tests or so, but it's working as far as I could experience. The last rewrites of the Testrunner were also about the specific problem to not have to care about calling cleanup_resources or reset. I forgot myself... However, if I were successful, you should be able to setup the test case, aruments and call run whichever you like intuitively without hassle or thinking much about possible sideeffects. The call to reset() in some of the tests are only a leftover.

I'm wondering if we could actually directly implement those on our existing testing library, so that we don't need a tail-specific TestRunner.

Sure :) but I don't have much insight into the other uutils, so you need to tell me which parts you're interested in. Maybe we could talk about this first a little bit, so if I'm lucky I don't have to rewrite all the tests :)

Nov 13 '22 20:11 Joining7943

I completely understand and I'm in favor of adding it, but the fact that we already have so much to discuss about just the TestRunner shows that it should in a separate PR, so let's do that and once we have that we can add the tests for the other PRs.

Nov 13 '22 20:11 tertsdiepraam

The test runner is in an own pr here #4136. I took the version from the latest commit here.

Nov 13 '22 21:11 Joining7943

GNU testsuite comparison:

Congrats! The gnu test tests/tail-2/inotify-dir-recreate is no longer failing!

Dec 07 '22 11:12 github-actions[bot]

GNU testsuite comparison:

Congrats! The gnu test tests/tail-2/inotify-dir-recreate is no longer failing!
GNU test failed: tests/misc/timeout. tests/misc/timeout is passing on 'main'. Maybe you have to rebase?

Dec 07 '22 12:12 github-actions[bot]

@Joining7943 not needed anymore, right ?

Feb 18 '23 20:02 sylvestre

It's still a WIP. I just had to do a lot of other things first.

Feb 18 '23 20:02 Joining7943

GNU testsuite comparison:

GNU test failed: tests/rm/rm2. tests/rm/rm2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/tail-2/inotify-dir-recreate. tests/tail-2/inotify-dir-recreate is passing on 'main'. Maybe you have to rebase?

Feb 18 '23 22:02 github-actions[bot]

@Joining7943 Are you still interested by this ?

Mar 27 '23 18:03 sylvestre

needs rebasing

No problem. I didn't have much time lately to work on this pr but I haven't forgotten it.

Mar 28 '23 14:03 Joining7943

The last pr grew pretty big mostly because of the initial extraction of modules. I'm sure I can keep it smaller this time :)

Reminder :)

Apr 10 '23 08:04 tertsdiepraam

coreutils coreutils copied to clipboard

tail: Refactor tail continued

coreutils
coreutils copied to clipboard