conserve
conserve copied to clipboard
Separating library functions from log/display utilities
Moving displaying process state/results into library users
This PR reworks how the conserve library reports process states and logs errors.
Also including using the tracing logging system for the conserve binary.
This includes support for a logfile (fixing #104).
Rework library logging including display utilities
The goal is to rework the library such it does not uses logging at all except for debugging purposes.
All output should be handled by the library users such as the conserve command line interface.
Attention: This PR submits API breaking changes!
@sourcefrog any thoughts on this one? Turned out logging got into a more complex thing to do.
Hi, yeah, I had not realized you were going to start right away. I can give you some guidance on the weekend.
Ahh yeah all right.
I got some spare time yesterday so I thought I would give it a shot.
After some fiddling I came up with the concept of extracting the whole logging thing from the library and move it into the binary/into the responsibility of the library users. The prevents conflicts with the use of any other logging systems along with other libraries.
My approach would be passing some kind of statistic callbacks for progress logging.
May I push some first drafts this evening (UTC+2).
This would be easier to review as a few PRs, but I understand how you might need to write the whole thing first to understand the code or what you want to change. e.g. introducing better structured error codes could be its own PR.
Yes, perspective changed a bit.
Initially I thought, piping output trough a logging system including several targets (terminal and file in particular) would do the job. But this seemed to be a bit inflexible especially if you want to use conserve as library and not as stand alone tool.
I'm not sure it's a good idea for e.g. file listings that might be consumed by other programs, or show_versions
I skipped answering that point since I agree on that one as well, this is still one of my concerns. I'm playing with some ideas for that:
- General log only into a file (which kinda defeats the whole purpose of this)
- Add a flag to omit all message prefixes (I'm not really a fan of that)
- Using a different output stream for the log and process output (e.g. log into stderr and output into stdout)
Right not I'm not feeling well about any of these solutions.
As the PR title mentions it's WIP and seems to bring more issues than initially thought.
The src/bin entry point should remain small
Okey, I'll keep that in mind.
Edit:
Basically this PR includes multiple steps:
- Split-up library usage, process logging, and output logging
- Determine feasible output formats for the end user
- Implement proper logging targets
I'm happy to talk about it more when you want! I would suggest perhaps starting with options to write trace to a log file that is not stdout, without removing -D. But, it can of course towards whatever you're energized by.
I'd thought about the system in the recent time and think, I came up with a pretty need system.
Currently I've mainly focused on the backup function (my test dummy :P). All other functions will be refactored to follow that pattern. I've also resolved the issue with logging to stdout and using a progress bar at the same time (I'm not a that big fan of the current solution, but there might be some clean-ups).
Using ProgressMonitors like I've shown in backup would be my way to go.
This easily allowed me to factor out the UI related code (ref here). There is still a lot work needed, but the basic concept is pretty much done. Lot's of polishing and cleanup is needed tought!
PS: Don't worry, you'll get your -D switch back.
Hey,
most of the work has been done.
All nutmeg process bars have been factored out into the binary and logging is now done by using tracings logging functions.
Additional command line parameters allow logging to files (log-file) as well remove the level prefix entirely (log-raw).
Tests have been adjusted accordingly and should succeeded (only tested on Windows though).
Please leave your thoughts about the suggested changes.
Thanks, I'll look on the weekend at the latest.
Hey,
I've not yet had time to go trough your reviews, but I've merged this PR with the main branch as well (hopefully) fixed the CI failure.
Overall there's probably more we could do but it's a big step in the right direction.
Yes there is a lot more.
The monitors also enable a lot of test writing (which tbh I was a bit too lazy to do with this PR).
(Edit: Noticed that you've noticed it as well in your review).
Since I'm busy working and studying now I'll probably work on your review in a few days.
But it seems you're doing a great job!
Thinking about this a bit more overnight: perhaps the monitor interface should look more like something that just accepts Event and Problem enums.
Then, the implementation used in testing can simply record the events/problems into a vec to inspect later.
A different implementation can report significant events to log events. Another monitor (or perhaps the same one with a flag set) can update nutmeg models.
pub enum Event {
StartListBlocks,
ListBlocks { block_count: usize },
FinishListBlocks { block_count: usize },
StartValidateBand { band_id: BandId },
}
pub enum Problem {
...
}
trait Monitor {
fn report_event(&self, event: &Event);
}
This would still have, to some extent, parallel state machines in the application code and the monitor. But, the states are not really that complex: validate just proceeds through a few phases. So that would be fine.
perhaps the monitor interface should look more like something that just accepts
EventandProblemenums.
I had the same thought at the beginning, but didn't go for it.
The reason why I didn't is that I actually tried it (first draft I had in mind):
When using an enum to propagate all status events we could just pass a callback functions which takes one argument (the progress enum). I've tried that pattern and imo DX wasn't really good. It just felt off, made code harder to understand and didn't look appealing to me. Maybe try it yourself and see how it goes, might just be me :) Side note: A problem is only an event which would mean intuitively you rather have a ProbleEvent than a separate callback for that.
In terms of testing, your right.
Saving events in a Vec<_> would be way more easy.
The more I think about it, the satisfied am I with the amount of callback methods currently supplied.
I'll think any experiment a bit more as well.
Hey, I've been pretty busy with my last exam for this semester but I had a couple of thoughts about the enum thing.
I intuitively used separate functions since I mainly use that kind of pattern when the state of a struct needs to be observed.
An example would be a server with rooms where each room has a room monitor which gets called on user join, leave, etc...
For such concept having extra methods works well because most likely the callback code will be a bit larger.
Adapting this approach for events, especially if these happen in a predefined order works but might not be the best.
Instead propagating the progress should just be seen as one "event" containing the current state.
These progress updates should not be used for error reporting.
Therefore I'll adjust the monitor style accordingly.
@sourcefrog any notes on the changes made?
Hi, I appreciate all the work you put into it.
At the moment this gets only a small amount of time from me because it's a spare time project and I'm trying to finish cargo-mutants on my weekends.
I'm not sure about merging it just as is. It is very big. It does at least indicate some interesting directions. I think the proliferation of monitor methods, the handling of bulk output, and the particular way that trace interacts with the nutmeg view is not quite what I'm comfortable with at the moment.
Hi, so I finally came back to this over the break to try to work out how to merge or otherwise resolve it. I think overall this is a really good step. The monitor seems fine as it is.
The main thing I'd want to change is to have a concept of output to stdout that is not a log message: listings whether json or text of files, versions, etc. I'll try to build on this to put that back in...
The main thing I'd want to change is to have a concept of output to stdout that is not a log message
If I' not mistaken you're referring to "show_index_json" within the "show" command? I agree, that piping the json output trought the logger is quite a crual task to to, but I haven't really had any better idea. Maybe use for the conserve binary some kind of logging abstraction?
PS: I've updated the PRs description and tile to a more (imo) accurate one as it has quite changed over time.
What might work for this kind of case:
- A library API like "write a list of versions as json to this
Write." - The CLI can call that, asking it to just write to stdout.
Both of these could then be extended:
- There are a few cases of "produce a list of things, render them as text or json, or maybe verbose text": versions, file listings. (Maybe just two for now?) Possibly those should change to iterators of objects that impl Serialize and Display.
- Maybe the CLI code that coordinates logging and so on should have a function that when called turns off trace to stdout and progress, to make sure that nothing interferes with this bulk output.
- Perhaps really trace and progress should go to stderr rather than stdout, so that you could potentially redirect output but still see error messages. (This may be annoying if it breaks a lot of existing tests making assertions about stdout.)
It occurs to me that it might be possible to separate out just the addition of the monitor concept from everything else. That would, at least, get the PR size down a bit, be a step forward itself, and also probably be possible to land without regressing any behavior by introducing log prefixes on bulk output and without needing to update so many tests...
Well, along with the monitor come the logging/ui changes. A core part of the new monitor system is to separate the whole display part. Therefore we could not just leave the old UI stuff within the conserve library. A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.
08d26ce81c6cab605dbefeb9b5db3e82ad665832 has a work-in-progress experiment with using the monitor pattern only for validate, including collecting a structured list of problems, which will make testing validation/corruption better.
A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.
I'm not sure I'm following here. I agree with using tracing for reporting errors, warning, debug info, status messages and so on. I just think that file listings, version listings, and so on are a different kind of thing and probably should not go through tracing.
I decided to instead merge #197 which builds on these ideas. Thanks for the thoughts and provocation!