fern Code refactoring + cleanup + maybe some new features

Code refactoring + cleanup + maybe some new features

Open piegamesde opened this issue 3 years ago • 14 comments

See #83.

At the moment, I'm trying to get rid of all large enums and replace them with traits or something else suitable.

I'm not sure where I'm actually heading with this, but I have a vague idea of exposing a lot of log_impl as a public API. The goal is to assimilate Output and Log, since at the moment they are 99% the same. I'd like to have outputs be individual structs instead of methods on the Output struct.

Jul 09 '21 17:07 piegamesde

Well, that turned weird quickly :sweat_smile:. It usually never takes me long to wreck the commit history, and I also tend to run into the limits of Rust's type system fairly quickly …

Needless to say, I'm unsatisfied with my new approach. However, I see multiple alternatives going forward from here:

Remove the logging specific max_level (and thus, LogExt). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn Dispatch into a glorified Log multiplexer, greatly reducing complexity.
Separate logger and builder. This means renaming LogExt to Output or something. But the key change would be to not let that trait extend Log, and instead have it provide a to_log method.

Jul 09 '21 21:07 piegamesde

Needless to say, I'm unsatisfied with my new approach. However, I see multiple alternatives going forward from here:

* Remove the logging specific max_level (and thus, `LogExt`). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn `Dispatch` into a glorified `Log` multiplexer, greatly reducing complexity.

* Separate logger and builder. This means renaming `LogExt` to `Output` or something. But the key change would be to not let that trait extend `Log`, and instead have it provide a `to_log` method.

Right - I'm down for option one. I believe I originally added max_level as part of a bigger plan to like, "optimize" the log chain during construction - thinking maybe fern could see a complex configuration, and through logical analysis, make it as simple as possible. That never really manifested, and while I still think it would be cool, it is not the direction the crate's been going in since then.

Jul 11 '21 07:07 daboross

I'm not sure where I'm actually heading with this, but I have a vague idea of exposing a lot of log_impl as a public API. The goal is to assimilate Output and Log, since at the moment they are 99% the same. I'd like to have outputs be individual structs instead of methods on the Output struct.

I definitely like the internal refactoring here, but I'm wondering about the external API. I originally designed Output to put almost everything you can do with this crate in one place - no need to search through docs of 10 different structs, each with multiple constructors, to figure out what's available. There's a little bit of this with log_file and log_reopen, but most of everything is still on Output.

That does make it a bit messy, but I'd like to think the experience of searching through docs on Output, especially with all the examples, is an overall nice experience.

I don't think there's anything else which actually keeps all the functions still up and available in the same way, besides just keeping a single struct - and that doesn't really match the new paradigm. One possibility I was thinking of for a refactor like this which would retain some of the flatness would be having all constructor functions exported as public functions of the fern crate - so the docs wouldn't be open, but you'd be able to see all available ways to make loggers under "Functions" on the fern root docs. What would you think of that?

The one thing I'm mostly against here is having the API be 5+ different structs with 2-3 constructors each. It's nice from an internal organization perspective, but in my opinion would not be that great to look through. It requires users to care about what internal fern struct they're making, and I don't want to require that.

Ideally, I want someone looking at fern to be able to choose from a flat list of things to log to. They should not care whether "a logging file with a static name", "a logging file whose name changes based on time" and "a logging file whose name rotates based on size" are the same struct or different structs - we represent them differently right now, but that's an implementation detail. I'm OK exposing that, but understanding that choice shouldn't be part of the main API (and if we didn't expose it at all, that would almost be better?)

That's a lot of rambling from me about API design. Thoughts on this?

I'm open to almost anything here, but I do hold some opinions on what's user-friendly, enumerated above. I'd like contradicting opinions, but I also wanted to write all that to give background on why I've designed it the way I have so far.

Jul 11 '21 08:07 daboross

Hope too many comments aren't bad! Well, if they are, that's what I've written anyways :p

Jul 11 '21 08:07 daboross

Thanks for having a look at this!

Remove the logging specific max_level (and thus, LogExt). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn Dispatch into a glorified Log multiplexer, greatly reducing complexity.

Right - I'm down for option one. I believe I originally added max_level as part of a bigger plan to like, "optimize" the log chain during construction - thinking maybe fern could see a complex configuration, and through logical analysis, make it as simple as possible. That never really manifested, and while I still think it would be cool, it is not the direction the crate's been going in since then.

I tried removing max_level, but it turned out to be necessary in order to know which global log level to set (the alternative would be to always Trace). I think solved this by still keeping OutputInner for the Dispatch variant. That way, our logging tree has Dispatches as inner nodes and Logers as leaves, which I find rather beautiful.

I definitely like the internal refactoring here, but I'm wondering about the external API. I originally designed Output to put almost everything you can do with this crate in one place - no need to search through docs of 10 different structs, each with multiple constructors, to figure out what's available.

Personally, I kind of like the style of having a list of structs for each thing I want to do and then a list of constructors for each configuration how I want to do it. But I can see why you're against this.

I suggest adding one convenience method for each to Output, and additionally keep all structs in a documented separate module. Thus there is an easy way to get started while keeping the possibility to do deep custom configuration. I'm not sure if this will actually turn out to work well, but I'll try it out and we'll see.

Jul 11 '21 11:07 piegamesde

I've thought some more about the public output API. Maybe making all those structs public isn't that a great idea. However, I can now point down my problem with the current way of doing things: it's inconsistent.

See, it's really cool to be able to call .chain(io::stdout()) or .chain(some_file), but then there is also Output::file() but no Output::stdout() and no Output::panic(). So I have to search multiple places depending on the output that I want, which is exactly what should be avoided.

Jul 11 '21 14:07 piegamesde

I know this isn't part of this PR (I'll factor it out later, but at the moment it makes dogfooding easier), but why do we have a copy of the colored crate in colors? It feels like kind of the same thing in some ways, but more inferior and harder to use in others.

Jul 15 '21 16:07 piegamesde

Just wanted to acknowledge that I've seen this, and haven't had time to review it in depth yet! Looking forward to it, just been busy.

A few comments just from reading this thread, without rereading the code:

I tried removing max_level, but it turned out to be necessary in order to know which global log level to set (the alternative would be to always Trace). I think solved this by still keeping OutputInner for the Dispatch variant. That way, our logging tree has Dispatches as inner nodes and Logers as leaves, which I find rather beautiful.

Right, I'd forgotten about that. If we can keep the transformation on Dispatch and nothing else, and keep the API consistent, that sounds good.

My one worry is that this could turn into a situation where chaining another dispatch is not as intuitive as chaining a "regular" logger, and that breaks some of the niceness of things.

There's also a potential of adding more non-Dispatch loggers in the future which chain things. I think the main one I've considered is a "dispatch-to-thread" logger which takes in one child logger, and handles records by formatting them to a String, sending that + metadata over a channel to a dedicated thread, and then giving that string + metadata to its child logger. That may or may not end up being a design we actually want, but things like that make me think it might be nice to leave room for non-Dispatch things as non-leaf nodes?

Personally, I kind of like the style of having a list of structs for each thing I want to do and then a list of constructors for each configuration how I want to do it. But I can see why you're against this.

I suggest adding one convenience method for each to Output, and additionally keep all structs in a documented separate module. Thus there is an easy way to get started while keeping the possibility to do deep custom configuration. I'm not sure if this will actually turn out to work well, but I'll try it out and we'll see.

I've thought some more about the public output API. Maybe making all those structs public isn't that a great idea. However, I can now point down my problem with the current way of doing things: it's inconsistent.

See, it's really cool to be able to call .chain(io::stdout()) or .chain(some_file), but then there is also Output::file() but no Output::stdout() and no Output::panic(). So I have to search multiple places depending on the output that I want, which is exactly what should be avoided.

Thanks for pointing that out - that's something I think I completely missed in designing the current interface.

I'm definitely interested in fixing this inconsistency. However, I'm not yet convinced that individual constructor methods are the way to do it - I think we could still have a solution which adds all possible outputs as methods in one place, such as on Output.

I've got two concerns with a solution which has individual constructors and Output methods:

First, if we only add a single convenience method to Output for each struct, but add more options in the struct constructors, I think this brings us back to an inconsistent interface wherein one has to look at each individual struct to figure out what all is actually possible.

Second, if we mirror everything, or even if we only mirror some of the individual methods in Output, where do the docs end up? I'd like to believe the detailed examples and descriptions of each logger are one of fern's strengths, and I'd ideally like to maintain each of those in one place, not split between two methods one wrapping the other.

I'm coming around somewhat to the idea of having just individual struct constructors. My main worry is still just exposing too many implementation details by how we split up the structures, and locking ourselves into a specific split even if another one comes up which makes more sense later.

I'd be down for a different compromise though - what if we had everything as individual constructors, but they all returned impl Log, rather than each individual struct?

That could let us change internal details in the future, if we want to, while keep existing constructors around. As an example, say we wanted to merge Stdout and Stderr into a Stream logger. We could then replace Stdout and Stderr with deprecated blank structs, and since Stdout::new and Stderr::new both return impl Log, they could both be changed to return a Stream.

Thanks so much for continuing to push this forward, and I hope my delayed responses are still useful! Like I said I haven't had a lot of time, but I'm happy to continue forward with this, and I'll see if I can carve out some time to look at code again soon.

Jul 20 '21 20:07 daboross

My main worry is still just exposing too many implementation details by how we split up the structures, and locking ourselves into a specific split even if another one comes up which makes more sense later. […] an example, say we wanted to merge Stdout and Stderr into a Stream logger.

I think that having all structs opaque should give us enough room to work with: in your example, we could simply change the inner of Stdout to delegate to the new Stream (and also deprecate it).

Btw I've just tried to unify the structs using generics, because they seemed so similar. Sadly, the stdout/stderr ones stick out because they don't use a Mutex to guard their value. I tried factoring out that part into a closure, but it failed because it would require to be generic over lifetimes, which Rust cannot (easily) do. It should be possible in theory, if we stopped using lock() on Stdout and Stderr. I don't know what the implications of this are or if they are optional anyways. Using more trait objects instead of generics would help too.

I've been thinking a lot about possible API designs and how to keep them backwards compatible. One thing that always sticks out as odd is the line_separator option. One single option doesn't really benefit from a full-blown builder pattern like DateBased does. I don't really know where to put it. If I was designing the API from scratch with this requirement, I'd probably go for an Option<String> (inspired by Gtk-rs), but it would require a method signature change (also it has some other downsides). On the other hand, without that option things would be a lot easier.

Some other random findings I should write down before I forget them again:

The current usage of Into<Output> effectively results in Rust-y method overloading, which is interesting. I still need to decide on whether I like this as pattern or not.
One important limitation though is that it cannot be used together with blanket implementations, so one could not use this to make chain accept any Log implementation.
I'd like to add one design constraint to the new outputs: they should be usable as independent loggers. That way, fern would double as an inter-operable collection of generic log helpers. This could be achieved by impl Log for Output among other things, but exposing the outputs as individual structs is an equal way to achieve that goal.

My one worry is that this could turn into a situation where chaining another dispatch is not as intuitive as chaining a "regular" logger, and that breaks some of the niceness of things.

Suppose I made .chain for Dispatch-like loggers only, and added another method, call it fanout, add_output or chain_logger or whatever to add "normal" log implementations. Would you consider this "nice enough"? (Don't worry about backwards compatibility on this one for now)

Jul 24 '21 17:07 piegamesde

Picking this up again. I noticed I had kind of painted myself into a corner and things ground to a halt. I thus lifted my requirement of keeping backwards compatibility and then progress came back. This is obviously a draft, but please generate the docs and have a look.

First of all, I moved a lot of things in the modules around. I'm still not sure whether the logger module should be called output instead.
The new public API focuses mainly on methods and on the builder pattern:
- Where possible, the only constructor is a method that returns impl Log + Into<Output>.
- There is the common case where may set an alternative line separator. I implemented that feature using a trait, keeping the types as opaque as possible.
- The Panic and Null loggers are zero-sized structs. I could equally well have made them methods instead. You decide.
- The Write internally unifies a lot of implementations. However at the moment this means that stdout and stderr are doubly wrapped in a Mutex. I still need some idea to make the Mutex optional without reintroducing all that code duplication.
- The DateBased is a more complex builder pattern, as before. The build method is implicit in the Into<Output> implementation (as before).
  - I do like the concept, but I'll probably add a build or into_log method that allows getting a Log type instead.
Btw I removed the line separator from the Sender logger. I think that should be part of the receiver.
The syslog module now contains method that explicitly creates the wrapper types (again, opaque)
The Output design has changed a little:
- The methods on the implementation got moved to the logger and syslog modules.
- The From implementations for non-loggers (i.e. io::StdOut or File) got removed. You now need to explicitly convert them to a Log using the methods provided. This is an attempt to increase consistency of the API at the cost of conciseness.
- Output implements From<T> for as many logger types as possible, so that it can be used with chain. I added a convenience chain_logger but the ultimate goal is to remove that once we have trait specialization.

Nov 10 '21 13:11 piegamesde

Awesome, thanks for being on top of this! Wanted to say I saw this, and I'm looking forward to reading your comments and code, and I'll reply in depth once I have. It may be more than a few days due to my current schedule and mental health, but I'll get to this ASAP.

Nov 15 '21 08:11 daboross

Sure, no need to rush. A quick note on the completeness state of the changes:

Documentation is a complete TODO
Tests run again, examples compile (except for doc examples / doc tests)
I have one API change planned based on experience from dog-fooding: fern::log_file will be merged into fern::logging::file. Same for log_reopen.
Once everybody's happy, we can talk about code formatting and making the history pretty. Also the formatter will probably not be part of the final changes (I'd like to iterate on it some more)

Nov 15 '21 10:11 piegamesde

Hi! Sorry for the 6 month delay here.

I'm hoping to get a full review soon. If you want to, we can then go back to back-and-forth & get it merged, or otherwise I'll probably make additional changes myself & merge it. I want to get these changes in, but especially given how unresponsive I've been here, I don't want to require any more of your time than necessary.

Apr 15 '22 10:04 daboross

Thanks. Do it however suits you best. I am still interested in this and ready to invest some more time. I'm using my fork here and there as dependency for a while now, and I'm really happy with the changes, although it still is not finished. The logging and output handling is okay IMO, but the formatters and coloring will need some more work. I think the best way forward would be to take these into a separate PR.

Apr 15 '22 20:04 piegamesde

fern fern copied to clipboard

Code refactoring + cleanup + maybe some new features

fern
fern copied to clipboard