nth-child selector not working as expected

Open bcardarella opened this issue 1 year ago • 7 comments

Floki: v0.36.0

I ran into unexpected behavior when using the nth-child selector today:

header = [
  {"tr",
   [
     {"class", "headerRow"},
     {"data-colval", "A"},
     {"data-coldb", "fleeta"},
     {"data-flt", "A"},
     {"data-chid", "631"}
   ],
   [
     {"td",
      [{"class", "fleet1"}, {"style", "text-align:left"}, {"colspan", "40"}],
      [
        {"span",
         [{"style", "display: flex; justify-content: space-between; gap:8px;"}],
         [
           {"span", [{"class", "headRowStart"}], ["Class: A "]},
           "Start: 18:45:00",
           {"span", [],
            ["Race Len.: ", {"span", [{"class", "raceLength"}], ["3.42"]}]},
           {"span", [], ["Course Desc: 22-19-17-HB"]},
           {"span", [], [{"span", [], ["Rating Type:"]}, "RANDOM LEG - Light"]},
           {"span", [],
            [
              "# of Racers: 9 ",
              {"input",
               [{"type", "hidden"}, {"name", "fleet_row"}, {"value", "A"}], []}
            ]},
           {"span", [],
            [
              "   # of Entries: 14    ",
              {"input",
               [{"type", "hidden"}, {"name", "fleet_row"}, {"value", "A"}], []}
            ]}
         ]}
      ]}
   ]}
]

Floki.find(header, "td span span:nth-child(1)")

This results in:

[
  {"span", [{"class", "headRowStart"}], ["Class: A "]},
  {"span", [{"class", "raceLength"}], ["3.42"]},
  {"span", [], ["Rating Type:"]}
]

However, in the browser, if I run a similar selector on the same fragment:

$0.querySelectorAll('td span span:nth-child(1)')

results correctly in:

<span class="headRowStart">
  <span class="condense" data-dbname="fleeta" data-hideval="A">−</span>
  <span class="expand" data-dbname="fleeta" data-hideval="A" style="display:none">+</span>
  Class: A
</span>

I realize Floki uses mochiweb under the hood. Before I go too far down the rabbit hole, I wanted to check whether this behavior is expected for Floki. If not, I will continue to dig and isolate where the issue is.

bcardarella • Mar 03 '24 11:03

:first-child is producing the same result, which I would expect but wanted to confirm

bcardarella • Mar 03 '24 11:03

Using the immediate children operator works:

Floki.find(header, "td > span > span:nth-child(1)")

and I believe that Floki is actually correct. So... is Chrome wrong?

bcardarella • Mar 03 '24 12:03

Can you double-check whether this is the actual HTML the browser is receiving? This is the raw HTML for the example you shared:

> header |> Floki.raw_html(pretty: true) |> IO.puts
<tr class="headerRow" data-colval="A" data-coldb="fleeta" data-flt="A" data-chid="631">
  <td class="fleet1" style="text-align:left" colspan="40">
    <span style="display: flex; justify-content: space-between; gap:8px;">
      <span class="headRowStart">
        Class: A
      </span>
      Start: 18:45:00
      <span>
        Race Len.:
        <span class="raceLength">
          3.42
        </span>
      </span>
      <span>
        Course Desc: 22-19-17-HB
      </span>
      <span>
        <span>
          Rating Type:
        </span>
        RANDOM LEG - Light
      </span>
      <span>
        # of Racers: 9
        <input type="hidden" name="fleet_row" value="A"/>
      </span>
      <span>
        # of Entries: 14
        <input type="hidden" name="fleet_row" value="A"/>
      </span>
    </span>
  </td>
</tr>

Putting this HTML in the browser and running the queries above gives the same results in Floki, Firefox and Chrome. Since <span class="raceLength">3.42</span> and <span>Rating Type:</span> are the first children for their parents, they are expected to be in the find response.
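The difference between the descendant and child combinators can be reproduced on a reduced version of the tree from the report. This is only a sketch: the `tree` below is a trimmed copy of `header`, and it assumes network access so that `Mix.install` can fetch Floki.

```elixir
# Assumption: run as a standalone script (elixir example.exs) with network access.
Mix.install([{:floki, "~> 0.36"}])

# Trimmed copy of the `header` tree from the report (illustrative only).
tree = [
  {"td", [],
   [
     {"span", [],
      [
        {"span", [{"class", "headRowStart"}], ["Class: A "]},
        "Start: 18:45:00",
        {"span", [], ["Race Len.: ", {"span", [{"class", "raceLength"}], ["3.42"]}]},
        {"span", [], [{"span", [], ["Rating Type:"]}, "RANDOM LEG - Light"]}
      ]}
   ]}
]

# Descendant combinator: any span that is the first element child of its
# parent and sits anywhere below a span that is itself below a td.
descendant = Floki.find(tree, "td span span:nth-child(1)")

# Child combinator: only first-child spans directly inside the td's own span.
child = Floki.find(tree, "td > span > span:nth-child(1)")

IO.inspect(descendant, label: "descendant combinator")
IO.inspect(child, label: "child combinator")
```

Text nodes such as "Race Len.: " are not counted by nth-child, which is why the raceLength span still qualifies as a first child under the descendant selector.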

ypconstante • Mar 04 '24 13:03

Unfortunately, yes, the HTML is from an actual site, and it is horrible.

bcardarella • Mar 04 '24 14:03

Source: https://regattaman.com/results.php?yr=2023&race_id=378&rnum=0&eid=378&sort=0&ssort=12&sdir=true&ssdir=true

I guess a case could be made for not supporting poorly written markup

bcardarella • Mar 04 '24 14:03

I think you're checking a different element from the one you shared

The HTML structure changes between tables: depending on the displayed data, there are nested spans. Comparing the same entries in Floki and Firefox, though, the results are the same.

You'll need to use td > span > span:nth-child(1) to avoid issues when there are nested spans
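To see why the child combinator helps when spans are nested, here is a small sketch. The `tree` is hypothetical: it assumes the live markup nests the condense/expand spans inside headRowStart, as in the browser output earlier in the thread. The `classes` helper is illustrative and not part of Floki.

```elixir
# Assumption: run as a standalone script with network access for Mix.install.
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical tree with extra spans nested inside headRowStart.
tree = [
  {"td", [],
   [
     {"span", [],
      [
        {"span", [{"class", "headRowStart"}],
         [
           {"span", [{"class", "condense"}], ["−"]},
           {"span", [{"class", "expand"}], ["+"]},
           "Class: A"
         ]},
        {"span", [], ["Race Len.: ", {"span", [{"class", "raceLength"}], ["3.42"]}]}
      ]}
   ]}
]

# Illustrative helper: collect the class attribute of each matched span.
classes = fn nodes ->
  for {"span", attrs, _children} <- nodes, {"class", class} <- attrs, do: class
end

# The descendant selector also picks up nested first-child spans.
descendant = Floki.find(tree, "td span span:nth-child(1)")

# The child selector stays at one fixed depth below the td.
child = Floki.find(tree, "td > span > span:nth-child(1)")

IO.inspect(classes.(descendant), label: "descendant combinator")
IO.inspect(classes.(child), label: "child combinator")
```

With this shape, the descendant selector matches the headRowStart, condense, and raceLength spans, while the child selector matches only headRowStart.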

ypconstante • Mar 04 '24 14:03

Yes I noted that above https://github.com/philss/floki/issues/550#issuecomment-1975140099

bcardarella • Mar 04 '24 17:03

Sorry for not being active here. And thank you @ypconstante for the research and replies! ❤️

@bcardarella thanks for the info as well! I believe there is nothing to do here, right? I'm closing now.

philss • Jun 06 '24 16:06