html5ever icon indicating copy to clipboard operation
html5ever copied to clipboard

Convert BufferQueue to use Interior Mutability

Open Taym95 opened this issue 1 year ago • 7 comments

Part of https://github.com/servo/servo/issues/32454

from @jdm comment

Taym95 avatar Jun 10 '24 14:06 Taym95

These changes make sense for BufferQueue's internals, but we should also be able to update the usages of &mut BufferQueue in the rest of the library. We can then verify that the panic from the testcase in servo/servo#32454 (comment) should not occur.

I tried to do that but Iterator is issue here, next(&mut sel)

impl Iterator for BufferQueue {
    type Item = char;

    /// Get the next character if one is available, removing it from the queue.
    ///
    /// This function manages the buffers, removing them as they become empty.
    fn next(&mut self) -> Option<char> {
        let (result, now_empty) = match self.buffers.borrow_mut().front_mut() {
            None => (None, false),
            Some(buf) => {
                let c = buf.pop_front_char().expect("empty buffer in queue");
                (Some(c), buf.is_empty())
            },
        };

        if now_empty {
            self.buffers.borrow_mut().pop_front();
        }

        result
    }
}

and it is neeed 2 places:

    fn eat(
        &mut self,
        input: & BufferQueue,
        pat: &str,
        eq: fn(&u8, &u8) -> bool,
    ) -> Option<bool> {
        if self.ignore_lf {
            self.ignore_lf = false;
            if self.peek(input) == Some('\n') {
                self.discard_char(input);
            }
        }

        input.push_front(mem::take(&mut self.temp_buf));
        match input.eat(pat, eq) {
            None if self.at_eof => Some(false),
            None => {
                self.temp_buf.extend(input);
                None
            },
            Some(matched) => Some(matched),
        }
    }
    ```
    
    ```
        fn eat(&mut self, input: & BufferQueue, pat: &str) -> Option<bool> {
        input.push_front(replace(&mut self.temp_buf, StrTendril::new()));
        match input.eat(pat, u8::eq_ignore_ascii_case) {
            None if self.at_eof => Some(false),
            None => {
                self.temp_buf.extend(input);
                None
            },
            Some(matched) => Some(matched),
        }
    }
    ``

Taym95 avatar Jun 17 '24 13:06 Taym95

Sorry, I missed your last comment when you wrote it. What's the code that actually uses BufferQueue as an iterator?

jdm avatar Jul 12 '24 02:07 jdm

Ah, I see:

error[E0277]: `&BufferQueue` is not an iterator
   --> html5ever/src/tokenizer/mod.rs:340:51
    |
340 |                 self.temp_buf.borrow_mut().extend(input);
    |                                            ------ ^^^^^ `&BufferQueue` is not an iterator
    |                                            |
    |                                            required by a bound introduced by this call
    |
    = help: the trait `Iterator` is not implemented for `&BufferQueue`, which is required by `&BufferQueue: IntoIterator`
    = note: `Iterator` is implemented for `&mut BufferQueue`, but not for `&BufferQueue`
    = note: required for `&BufferQueue` to implement `IntoIterator`
note: required by a bound in `extend`
   --> /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/iter/traits/collect.rs:448:5
help: consider dereferencing here
    |
340 |                 self.temp_buf.borrow_mut().extend(*input);
    |                                                   +

error[E0596]: cannot borrow `*input` as mutable, as it is behind a `&` reference
   --> html5ever/src/tokenizer/mod.rs:218:21
    |
218 |                     input.next();
    |                     ^^^^^ `input` is a `&` reference, so the data it refers to cannot be borrowed as mutable
    |
help: consider changing this to be a mutable reference
    |
210 |     pub fn feed(&self, input: &mut BufferQueue) -> TokenizerResult<Sink::Handle> {
    |                                +++

jdm avatar Jul 20 '24 19:07 jdm

Sorry, I missed your last comment when you wrote it. What's the code that actually uses BufferQueue as an iterator?

Sorry 🙏 , I am not getting notification from this repo, yes the code you shared! any ideas?

Taym95 avatar Jul 20 '24 21:07 Taym95

I suspect we'll need to use my idea of accepting an argument that can obtain a mutable reference on demand. We might want to try an enum that is either &mut or RefCell and add a method like: fn mutate<T, F: FnMut(&mut BufferQueue) -> T>(f: F) -> T which will call the provider function with the needed mutable reference.

jdm avatar Jul 20 '24 22:07 jdm

Actually, I think the easiest solution is to extract the next() method from the impl Iterator for BufferQueue so it's just a normal method (and can therefore be &self instead of &mut self). Then there are only two places in the code that actually try to use it as an iterator, and they can be rewritten.

jdm avatar Jul 26 '24 03:07 jdm

Thanks! I'll pull the latest change into my branch and run tests with it.

jdm avatar Jul 26 '24 12:07 jdm

Verified that these changes are correct via https://github.com/servo/html5ever/pull/548 and https://github.com/servo/servo/pull/32820, so let's merge this.

jdm avatar Aug 03 '24 04:08 jdm