serenity icon indicating copy to clipboard operation
serenity copied to clipboard

LibWeb: Fix all the Acid2 regressions!

Open awesomekling opened this issue 2 years ago • 6 comments

We had a perfect result on Acid2 in the past (http://acid2.acidtests.org)

It has since regressed:

image

We need to fix whatever the issues are, and add automated tests for everything along the way. :^)

awesomekling avatar Oct 26 '23 07:10 awesomekling

At least a meaningful amount of the rendering issue is due to <object> elements not falling back correctly. Specifically the object element <object data="data:application/x-unknown,BAD_PAYLOAD">. The other bad object element <object data="http://www.damowmow.com/404/" type="text/html"> falls back correctly.

Removing the faulty element, or adding the custom check:

if (resource_type.has_value() && *resource_type == "application/x-unknown"sv) {
    run_object_representation_fallback_steps();
    return;
}

At the top of: https://github.com/SerenityOS/serenity/blob/52340dc78ad35f625434ca521bd2af40aebae274/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L238

Produces: image (the "Hello World" dotted text box is unrelated and occurs on master)

Here is a minimized example I made to recreate the fallback issue in isolation in Ladybird:

<html><head>
  <object data="data:application/x-unknown,BAD_PAYLOAD">
    <object data="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8z8BQz0AEYBxVSF+FABJADveWkH6oAAAAAElFTkSuQmCC">
    </object>
  </object>
</body></html>

There's a few different ways you could go about fixing this, but after staring at different spec pages for hours I still can't figure out where the spec would want a bogus MIME type and resource to fail. My leading theories are:

  1. Check for navigate() failures on the child navigable that's created: https://github.com/SerenityOS/serenity/blob/52340dc78ad35f625434ca521bd2af40aebae274/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L259 The navigate calls document_load and fails to render anything because the resource type doesn't make any sense. Theoretically we could somehow check if the child navigable document is null and fallback then, but note that the spec makes no mention of any such error checking.
  2. Check for MIME types BEFORE we get to creating a child navigable. Run fallback if we don't know how to render according to some semi-hardcoded criterion. (e.g. not [image, pdf, text, markdown, etc.])

aduerig avatar Jan 11 '24 08:01 aduerig

EDIT: This was false, look below.

Note that when playing around with bogus `<object>` elements in Firefox, I noticed that the bad element DID shift the layout of the fallback elements even if it parent bad element never rendered. This might mean that the slight issues with the pixels at the bottom of the face I posted above might also be solved if we fallback correctly. Or it's some other bug.

aduerig avatar Jan 11 '24 08:01 aduerig

@aduerig I think what is supposed to happen is:

  • Perform the fetch, which succeeds because it's a data URL.
  • Enter step 3.9 (Handler)
  • Hit the "Otherwise" case
  • Fallback to content.

But, the first case is:

If the resource type is an XML MIME type, or if the resource type does not start with "image/"

and that seems to me like it would catch unsupported types too? That seems incorrect but I'm not very awake.

AtkinsSJ avatar Jan 11 '24 09:01 AtkinsSJ

Oh wait! There are a couple of notes about how the "otherwise" case should trigger if the resource type is unknown. That sounds like something we might not be doing correctly.

AtkinsSJ avatar Jan 11 '24 09:01 AtkinsSJ

Oh wait! There are a couple of notes about how the "otherwise" case should trigger if the resource type is unknown. That sounds like something we might not be doing correctly.

I agree that could be a place to determine if we know how to process a "resource type", but the rules in the spec seem to not talk about ANY decisions on if we know how to process a resource here.

At this point in the code if the Resource has "Content-Type" metadata, the spec seems to want to just push that along as a resource type. The Content-Type metadata is set based off of the mime-type by ResourceLoader: https://github.com/SerenityOS/serenity/blob/340936ad09b37d11583c92aa895ee22771208791/Userland/Libraries/LibWeb/Loader/ResourceLoader.cpp#L254

The mime-type for data URLs is set by this spec, as well as this spec. The data url parsing happens here: https://github.com/SerenityOS/serenity/blob/340936ad09b37d11583c92aa895ee22771208791/AK/URL.cpp#L423

Making decisions if we support a certain content type at the data URL parsing stage feels wrong, but the ResourceLoader location above does seem like a potential location.

There ARE some notes on checking if a user agent supports a mime type here, but we don't implement this part of the spec yet, and its a bit unclear to me even when this is supposed to be called. In fact, NONE of the ResourceLoader code calls any part of the MIME sniffing spec. Perhaps this section we are supposed to run inside of ResourceLoader?

aduerig avatar Jan 11 '24 19:01 aduerig

The bottom of the chin (2 extra pixels) is due to a CSS table formatting bug. Minimized reproducible example of the chin.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html><head>
  <style type="text/css">
   ul { display: table; padding: 0; margin: 0em 0em 0; background: red; }
   ul li { padding: 0; margin: 0; }
   ul li.first-part { display: table-cell; height: 1em; width: 1em; background: blue; }
   ul li.second-part { display: table; height: 1em; width: 1em; background: green; }
</style></head><body><ul class="table_parent"><li class="first-part"></li><li class="second-part"></li></ul></body></html>

Produces (left is firefox, right is ladybird): image

Relevant reading for this part is the acid2 explanation "After row 14".

The size of "first-part" is twice as large as "second-part" with padding.top and padding.bottom being the reason why it renders larger. The padding is from the cell.baseline being incorrect.

Relevant layout dump of the above code:

    BlockContainer <li.first-part> at (8,24) content-size 16x0 table-cell [0+0+0 16 0+0+0] [0+0+16 0 16+0+0] [BFC] children: not-inline
    BlockContainer <(anonymous)> at (24,8) content-size 16x16 table-cell [0+0+0 16 0+0+0] [0+0+0 16 0+0+0] [BFC] children: not-inline
      TableWrapper <(anonymous)> at (24,8) content-size 16x16 [0+0+0 16 0+0+0] [0+0+0 16 0+0+0] [BFC] children: not-inline
        Box <li.second-part> at (24,8) content-size 16x16 table-box [0+0+0 16 0+0+0] [0+0+0 16 0+0+0] [TFC] children: not-inline

Layout dump where "second-part" is just table-cell instead of table:

    BlockContainer <li.first-part> at (8,8) content-size 16x0 table-cell [0+0+0 16 0+0+0] [0+0+0 0 16+0+0] [BFC] children: not-inline
    BlockContainer <li.second-part> at (24,8) content-size 16x0 table-cell [0+0+0 16 0+0+0] [0+0+0 0 16+0+0] [BFC] children: not-inline

My theory here is that the content-height of "second-part" is 0 when it is initialized as a table-cell, but when it is changed to table, the content-height is suddenly 16 after being wrapped inside of a tablewrapper: (relevant spec) https://github.com/SerenityOS/serenity/blob/96675e61cd22ee08503f80c76533fd5db83f26aa/Userland/Libraries/LibWeb/Layout/TreeBuilder.cpp#L706

This causes the cells baseline to be wrong: https://github.com/SerenityOS/serenity/blob/96675e61cd22ee08503f80c76533fd5db83f26aa/Userland/Libraries/LibWeb/Layout/TableFormattingContext.cpp#L882 And then it messes with the sizing of "first-part", causing it to double in size.

I think its because the content-heights are mistakenly becoming non-zero, but I can't seem to see where that happens.

aduerig avatar Feb 13 '24 19:02 aduerig