Moving displaying process state/results into library users

This PR reworks how the conserve library reports process states and logs errors.
Also including using the tracing logging system for the conserve binary. This includes support for a logfile (fixing #104).

Rework library logging including display utilities

The goal is to rework the library such it does not uses logging at all except for debugging purposes.
All output should be handled by the library users such as the conserve command line interface.

Attention: This PR submits API breaking changes!

Aug 03 '22 18:08 WolverinDEV

@sourcefrog any thoughts on this one? Turned out logging got into a more complex thing to do.

Aug 04 '22 07:08 WolverinDEV

Hi, yeah, I had not realized you were going to start right away. I can give you some guidance on the weekend.

Aug 04 '22 11:08 sourcefrog

Ahh yeah all right.
I got some spare time yesterday so I thought I would give it a shot.

After some fiddling I came up with the concept of extracting the whole logging thing from the library and move it into the binary/into the responsibility of the library users. The prevents conflicts with the use of any other logging systems along with other libraries.

My approach would be passing some kind of statistic callbacks for progress logging.
May I push some first drafts this evening (UTC+2).

Aug 04 '22 15:08 WolverinDEV

This would be easier to review as a few PRs, but I understand how you might need to write the whole thing first to understand the code or what you want to change. e.g. introducing better structured error codes could be its own PR.

Yes, perspective changed a bit.
Initially I thought, piping output trough a logging system including several targets (terminal and file in particular) would do the job. But this seemed to be a bit inflexible especially if you want to use conserve as library and not as stand alone tool.

I'm not sure it's a good idea for e.g. file listings that might be consumed by other programs, or show_versions

I skipped answering that point since I agree on that one as well, this is still one of my concerns. I'm playing with some ideas for that:

General log only into a file (which kinda defeats the whole purpose of this)
Add a flag to omit all message prefixes (I'm not really a fan of that)
Using a different output stream for the log and process output (e.g. log into stderr and output into stdout)

Right not I'm not feeling well about any of these solutions.
As the PR title mentions it's WIP and seems to bring more issues than initially thought.

The src/bin entry point should remain small

Okey, I'll keep that in mind.

Edit:
Basically this PR includes multiple steps:

Split-up library usage, process logging, and output logging
Determine feasible output formats for the end user
Implement proper logging targets

Aug 04 '22 19:08 WolverinDEV

I'm happy to talk about it more when you want! I would suggest perhaps starting with options to write trace to a log file that is not stdout, without removing -D. But, it can of course towards whatever you're energized by.

Aug 05 '22 19:08 sourcefrog

I'd thought about the system in the recent time and think, I came up with a pretty need system.
Currently I've mainly focused on the backup function (my test dummy :P). All other functions will be refactored to follow that pattern. I've also resolved the issue with logging to stdout and using a progress bar at the same time (I'm not a that big fan of the current solution, but there might be some clean-ups).

Using ProgressMonitors like I've shown in backup would be my way to go.
This easily allowed me to factor out the UI related code (ref here). There is still a lot work needed, but the basic concept is pretty much done. Lot's of polishing and cleanup is needed tought!

PS: Don't worry, you'll get your -D switch back.

Aug 05 '22 23:08 WolverinDEV

Hey,
most of the work has been done.
All nutmeg process bars have been factored out into the binary and logging is now done by using tracings logging functions.
Additional command line parameters allow logging to files (log-file) as well remove the level prefix entirely (log-raw).

Tests have been adjusted accordingly and should succeeded (only tested on Windows though).
Please leave your thoughts about the suggested changes.

Aug 09 '22 13:08 WolverinDEV

Thanks, I'll look on the weekend at the latest.

Aug 11 '22 15:08 sourcefrog

Hey,
I've not yet had time to go trough your reviews, but I've merged this PR with the main branch as well (hopefully) fixed the CI failure.

Aug 12 '22 15:08 WolverinDEV

Overall there's probably more we could do but it's a big step in the right direction.

Yes there is a lot more.
The monitors also enable a lot of test writing (which tbh I was a bit too lazy to do with this PR).
(Edit: Noticed that you've noticed it as well in your review).

Since I'm busy working and studying now I'll probably work on your review in a few days.
But it seems you're doing a great job!

Aug 14 '22 00:08 WolverinDEV

Thinking about this a bit more overnight: perhaps the monitor interface should look more like something that just accepts Event and Problem enums.

Then, the implementation used in testing can simply record the events/problems into a vec to inspect later.

A different implementation can report significant events to log events. Another monitor (or perhaps the same one with a flag set) can update nutmeg models.

pub enum Event {
    StartListBlocks,
    ListBlocks { block_count: usize },
    FinishListBlocks { block_count: usize },
    StartValidateBand { band_id: BandId },
}
pub enum Problem {
    ...
}
trait Monitor {
    fn report_event(&self, event: &Event);
}

This would still have, to some extent, parallel state machines in the application code and the monitor. But, the states are not really that complex: validate just proceeds through a few phases. So that would be fine.

Aug 14 '22 15:08 sourcefrog

perhaps the monitor interface should look more like something that just accepts Event and Problem enums.

I had the same thought at the beginning, but didn't go for it.
The reason why I didn't is that I actually tried it (first draft I had in mind):

When using an enum to propagate all status events we could just pass a callback functions which takes one argument (the progress enum). I've tried that pattern and imo DX wasn't really good. It just felt off, made code harder to understand and didn't look appealing to me. Maybe try it yourself and see how it goes, might just be me :) Side note: A problem is only an event which would mean intuitively you rather have a ProbleEvent than a separate callback for that.

In terms of testing, your right.
Saving events in a Vec<_> would be way more easy.

Aug 15 '22 11:08 WolverinDEV

The more I think about it, the satisfied am I with the amount of callback methods currently supplied.
I'll think any experiment a bit more as well.

Aug 15 '22 12:08 WolverinDEV

Hey, I've been pretty busy with my last exam for this semester but I had a couple of thoughts about the enum thing.

I intuitively used separate functions since I mainly use that kind of pattern when the state of a struct needs to be observed.
An example would be a server with rooms where each room has a room monitor which gets called on user join, leave, etc...
For such concept having extra methods works well because most likely the callback code will be a bit larger.

Adapting this approach for events, especially if these happen in a predefined order works but might not be the best.
Instead propagating the progress should just be seen as one "event" containing the current state.
These progress updates should not be used for error reporting.

Therefore I'll adjust the monitor style accordingly.

Aug 24 '22 13:08 WolverinDEV

@sourcefrog any notes on the changes made?

Sep 05 '22 10:09 WolverinDEV

Hi, I appreciate all the work you put into it.

At the moment this gets only a small amount of time from me because it's a spare time project and I'm trying to finish cargo-mutants on my weekends.

I'm not sure about merging it just as is. It is very big. It does at least indicate some interesting directions. I think the proliferation of monitor methods, the handling of bulk output, and the particular way that trace interacts with the nutmeg view is not quite what I'm comfortable with at the moment.

Sep 06 '22 16:09 sourcefrog

Hi, so I finally came back to this over the break to try to work out how to merge or otherwise resolve it. I think overall this is a really good step. The monitor seems fine as it is.

The main thing I'd want to change is to have a concept of output to stdout that is not a log message: listings whether json or text of files, versions, etc. I'll try to build on this to put that back in...

Jan 03 '23 18:01 sourcefrog

The main thing I'd want to change is to have a concept of output to stdout that is not a log message

If I' not mistaken you're referring to "show_index_json" within the "show" command? I agree, that piping the json output trought the logger is quite a crual task to to, but I haven't really had any better idea. Maybe use for the conserve binary some kind of logging abstraction?

PS: I've updated the PRs description and tile to a more (imo) accurate one as it has quite changed over time.

Jan 04 '23 23:01 WolverinDEV

What might work for this kind of case:

A library API like "write a list of versions as json to this Write."
The CLI can call that, asking it to just write to stdout.

Both of these could then be extended:

There are a few cases of "produce a list of things, render them as text or json, or maybe verbose text": versions, file listings. (Maybe just two for now?) Possibly those should change to iterators of objects that impl Serialize and Display.
Maybe the CLI code that coordinates logging and so on should have a function that when called turns off trace to stdout and progress, to make sure that nothing interferes with this bulk output.
Perhaps really trace and progress should go to stderr rather than stdout, so that you could potentially redirect output but still see error messages. (This may be annoying if it breaks a lot of existing tests making assertions about stdout.)

Jan 05 '23 05:01 sourcefrog

It occurs to me that it might be possible to separate out just the addition of the monitor concept from everything else. That would, at least, get the PR size down a bit, be a step forward itself, and also probably be possible to land without regressing any behavior by introducing log prefixes on bulk output and without needing to update so many tests...

Jan 08 '23 05:01 sourcefrog

Well, along with the monitor come the logging/ui changes. A core part of the new monitor system is to separate the whole display part. Therefore we could not just leave the old UI stuff within the conserve library. A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.

Jan 08 '23 13:01 WolverinDEV

08d26ce81c6cab605dbefeb9b5db3e82ad665832 has a work-in-progress experiment with using the monitor pattern only for validate, including collecting a structured list of problems, which will make testing validation/corruption better.

A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.

I'm not sure I'm following here. I agree with using tracing for reporting errors, warning, debug info, status messages and so on. I just think that file listings, version listings, and so on are a different kind of thing and probably should not go through tracing.

Jan 08 '23 20:01 sourcefrog

I decided to instead merge #197 which builds on these ideas. Thanks for the thoughts and provocation!

Feb 12 '23 03:02 sourcefrog

conserve
conserve copied to clipboard

Separating library functions from log/display utilities

Moving displaying process state/results into library users

Rework library logging including display utilities

conserve conserve copied to clipboard

Separating library functions from log/display utilities

Moving displaying process state/results into library users

Rework library logging including display utilities

conserve
conserve copied to clipboard