networkidle is no longer a valid waitUntil

Open brandonocasey opened this issue 1 year ago • 7 comments

I'm trying to use networkidle to wait for videos to load, but I get an error. It seems there are now networkidle0 and networkidle2, but no networkidle.

brandonocasey avatar Mar 16 '24 21:03 brandonocasey

Thank you @brandonocasey; @benoit74 our default option in zimit still references networkidle. With the latest browsertrix-crawler updates, there must have been a Puppeteer upgrade.

We need to find out which version is in use, as v18 (and 19) offer networkidle0 and networkidle2, but it seems that main has changed this method.
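To illustrate the mismatch, here is a minimal sketch of the kind of check that rejects the old value. It assumes the accepted values are the four Puppeteer lifecycle events I know of (load, domcontentloaded, networkidle0, networkidle2) and that several values can be combined with commas; neither assumption is confirmed for the crawler version in use. The names WAIT_UNTIL_CHOICES and parse_wait_until are hypothetical.

```python
# Assumed list of accepted values; the crawler's real list may differ.
WAIT_UNTIL_CHOICES = ("load", "domcontentloaded", "networkidle0", "networkidle2")


def parse_wait_until(value: str) -> list[str]:
    """Split a comma-separated waitUntil value and reject unknown events."""
    events = [item.strip() for item in value.split(",") if item.strip()]
    if not events:
        raise ValueError("waitUntil must name at least one event")
    invalid = [event for event in events if event not in WAIT_UNTIL_CHOICES]
    if invalid:
        raise ValueError(
            f"invalid waitUntil value(s) {invalid}; "
            f"expected one of {list(WAIT_UNTIL_CHOICES)}"
        )
    return events
```

With this check, "load,networkidle2" is accepted, while the bare "networkidle" from our current default raises an error.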

rgaudin avatar Mar 18 '24 07:03 rgaudin

Agreed

@brandonocasey thank you!

FYI, we usually do not rely on networkidle to wait for YouTube / Vimeo / HTML5 videos; the autoplay behavior is usually sufficient. (You might be dealing with a different kind of video or player, though, and using this option on purpose.)

benoit74 avatar Mar 18 '24 08:03 benoit74

I think that autoplay for Vimeo iframes is not working for me. The iframes are cross-origin and iframe.contentDocument is undefined, so the query to get the video elements and call play doesn't work. At least that is my understanding.

brandonocasey avatar Mar 20 '24 16:03 brandonocasey

@rgaudin what do you think of getting rid of the choices in the zimit input? The value is checked by the crawler anyway, only a few seconds after the scraper starts.

benoit74 avatar Apr 08 '24 06:04 benoit74

Hum, I don't know. It's not just a check, it's self-documentation as well. We can revisit where we set our cursor of required-browsertrix-knowledge for Zimit users. At first, I think we exposed only the important BT flags, but then we exposed most of them, except those that make little sense in our Zimit context. Maybe we could expose only those that are very important or for which there's Zimit value, but tweak the output of --help so it explains that BT flags are accepted and includes `crawler --help` in this output. We'd have the flexibility of not duplicating stuff while easily exposing all flags. WDYT?
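A rough sketch of what I have in mind, using argparse. This is not the current zimit code: build_parser is a hypothetical helper, the two declared flags are only an illustrative subset, and crawler_help stands for the text printed by the crawler's own --help, however we end up obtaining it.

```python
import argparse


def build_parser(crawler_help: str) -> argparse.ArgumentParser:
    """Declare only Zimit-specific flags and append the crawler help to --help."""
    parser = argparse.ArgumentParser(
        prog="zimit",
        description="Sketch: only a subset of flags is declared by zimit itself.",
        epilog=(
            "Any other flag is passed to the crawler unchanged. "
            "Crawler help follows:\n\n" + crawler_help
        ),
        # Keep the crawler help layout instead of re-wrapping it.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        # Avoid a crawler flag being mistaken for an abbreviated zimit flag.
        allow_abbrev=False,
    )
    parser.add_argument("--url", required=True, help="URL to start crawling from")
    parser.add_argument("--name", help="Name of the ZIM file")
    return parser


def split_args(parser: argparse.ArgumentParser, argv: list[str]):
    """Return (zimit_args, crawler_args); unknown flags go to the crawler."""
    return parser.parse_known_args(argv)
```

Calling split_args with "--url https://example.com --waitUntil networkidle2" would leave "--waitUntil networkidle2" in the second element, to be forwarded untouched. Value validation then stays on the crawler side, so a renamed choice no longer needs a zimit change.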

rgaudin avatar Apr 08 '24 08:04 rgaudin

I understand your doubts, but my problem is that:

  • deciding which flags are important enough to expose might be a never-ending discussion (e.g. I consider this waitFor flag important, since setting it to a specific value is mandatory to scrape Vimeo videos)
  • adding the flag with a description is already a lot of documentation for Zimit users, way better than nothing
  • the concerns we had lately are more about improper handling of flag values than about flags themselves being renamed / removed / added, if I remember correctly (which I may not, so ...)

benoit74 avatar Apr 08 '24 08:04 benoit74

If you don't know what values to input for the field, it's not very useful. I really think including the crawler's help in ours would help a lot. We can then start by removing the choices and progressively remove the fields we deem of no value for zimit to duplicate.

rgaudin avatar Apr 10 '24 16:04 rgaudin