lol-html icon indicating copy to clipboard operation
lol-html copied to clipboard

Add blanket `impl<T: Write> OutputSink for T`

Open jyn514 opened this issue 3 years ago • 7 comments

Currently, HtmlRewriter::try_new takes only closures and nothing else. For the common case of 'I have a buffer to write to' this causes the code to be more complicated than necessary:

    let mut output = vec![];

    // NOTE: never panics because encoding is always "utf-8".
    let mut rewriter = HtmlRewriter::try_new(settings.into(), |c: &[u8]| {
        output.extend_from_slice(c);
    })
    .unwrap();

If OutputSink took any io::Write impl, it could be simpler:

    let mut output = vec![];

    // NOTE: never panics because encoding is always "utf-8".
    let mut rewriter = HtmlRewriter::try_new(settings.into(), &mut output).unwrap();

jyn514 avatar Aug 02 '20 19:08 jyn514

~~This also has the benefit that you can perform streaming utf8-validation~~ Nope, I forgot that String only implements fmt::Write, not io::Write.

jyn514 avatar Aug 02 '20 19:08 jyn514

This isn't currently possible because there's no way to return an error from handle_chunk; the only thing to do is panic. @inikulin would it make sense to have that return an error? I'm imagining something like this:

pub trait OutputSink {
    type Error;

    fn handle_chunk(&mut self, chunk: &[u8]) -> Result<(), Self::Error>;
}

impl<W: io::Write> OutputSink for W {
    type Error = io::Error;

    fn handle_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.write_all(chunk)
    }
}

pub enum RewritingError<E> {
    MemoryLimitExceeded(MemoryLimitExceededError),
    ParsingAmbiguity(ParsingAmbiguityError),
    ContentHandlerError(Box<dyn StdError>),
    WriteError(E),
}

If you don't want to make RewritingError generic, you could add a Error: std::io::Error bound and use Box<dyn Error>, but I don't think the usability hit from a generic error will be very bad.

I guess the existing impl<F: FnMut(&[u8])> OutputSink for F could use type Error = Infallible? Or even better, this could add a new impl<E, F: FnMut(&[u8]) -> Result<(), E>> OutputSink for F.

jyn514 avatar Oct 18 '20 00:10 jyn514

@jyn514 sgtm, we already use Box<dyn Error> for handler errors, so let's probably keep it without generics.

inikulin avatar Oct 18 '20 06:10 inikulin

My second idea is also not currently possible because the compiler thinks for some reason that Write and FnMut can overlap.

error[E0119]: conflicting implementations of trait `transform_stream::dispatcher::OutputSink`:
  --> src/transform_stream/dispatcher.rs:76:1
   |
60 | impl<F: FnMut(&[u8])> OutputSink for F {
   | -------------------------------------- first implementation here
...
76 | impl<W: io::Write> OutputSink for W {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation

Not sure if this is possible to fix.

jyn514 avatar Oct 20 '20 22:10 jyn514

Some discussion about how to make this possible in https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/tell.20the.20trait.20system.20functions.20and.20Write.20are.20disjoint.3F. It would need changes in the standard library, so definitely a long-term improvement.

jyn514 avatar Oct 22 '20 14:10 jyn514

There has been some recent progress on this: https://zulip-archive.rust-lang.org/243200tlangmajorchanges/08462negativeimplsintegratedintocoherencelangteam96.html. The next steps are for the feature to be implemented and for the standard library to add impl !FnMut for &str {} somewhere. That may or may not require stabilizing the language feature, sometimes (like for const generics) the library will stabilize internal uses before stabilizing the ability for users to do it themselves.

jyn514 avatar Jun 09 '21 19:06 jyn514

Another alternative is for lol-html to add its own custom trait that's implemented for functions and Vec.

jyn514 avatar Jun 09 '21 19:06 jyn514