hound
Wait for the page to load
I'm trying to scrape products from Amazon with some simple code, and when I take a screenshot I get this: https://ibb.co/nDmCf5
As you can see, nothing past the 8th product loads. I tried waiting a long time using sleep 120_000, but the result is still the same.
I'm not sure what to do next.
Waiting for one or more things on the page to be true is something I've had to do often. I wrote a little Retryer module that has a couple of methods.
You pass in a list of functions that hopefully run quickly. It retries them every few seconds until one of them returns a truthy value, and then it continues on. If the retries run out first, it returns the last results anyway, so check them before moving on.
def oscillate(function_list, retries \\ 60, retry_time \\ @retry_time) do
  # Allows you to check multiple things that might be true at a given time
  results =
    Enum.map(function_list, fn x ->
      try do
        x.()
      rescue
        _ -> false
      end
    end)

  if retries > 0 do
    if Enum.any?(results) do
      results
    else
      :timer.sleep(retry_time)
      oscillate(function_list, retries - 1, retry_time)
    end
  else
    results
  end
end
Example usage:
Retryer.oscillate(
  [
    fn -> "Thing one that should be truthy" end,
    fn -> "Thing 2 that should be truthy" end
  ]
)
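One thing the snippet above leaves out is the `@retry_time` module attribute, so it won't compile as pasted. Here is a minimal self-contained sketch of the same idea that runs without a browser. The module name `RetryerDemo`, the 2-second default, and the Agent-based counter are my assumptions for illustration, not part of the original module.

```elixir
defmodule RetryerDemo do
  # Assumed default delay between polls, in milliseconds. The original post
  # references @retry_time without showing its value.
  @retry_time 2_000

  def oscillate(function_list, retries \\ 60, retry_time \\ @retry_time) do
    # Run every check; a check that raises counts as false.
    results =
      Enum.map(function_list, fn check ->
        try do
          check.()
        rescue
          _ -> false
        end
      end)

    if retries > 0 and not Enum.any?(results) do
      :timer.sleep(retry_time)
      oscillate(function_list, retries - 1, retry_time)
    else
      # Either something was truthy or the retries ran out.
      results
    end
  end
end

# A counter standing in for "the page is still loading": the check only
# becomes true on the third poll.
{:ok, counter} = Agent.start_link(fn -> 0 end)

ready = fn ->
  Agent.get_and_update(counter, fn n -> {n + 1 >= 3, n + 1} end)
end

# A check that always raises, to show that exceptions are treated as false.
broken = fn -> raise "element not found" end

results = RetryerDemo.oscillate([ready, broken], 10, 10)
polls = Agent.get(counter, & &1)
IO.inspect({results, polls})
```

Because the function returns the results list whether or not anything succeeded, the caller should still check it (for example with `Enum.any?/1`) before assuming the page is ready.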
If one of the things you're looking for is in the visible page text, you can use oscillate_with_page, which pulls the page text on every iteration and passes it into the functions.
def oscillate_with_page(function_list, retries \\ 60, retry_time \\ @retry_time) do
  # Allows you to check multiple things that might be true at a given time,
  # but passes in the page text along with it
  vpt =
    try do
      Hound.Helpers.Page.visible_page_text()
    rescue
      # Fall back to an empty string if the page text can't be read
      _ -> ""
    end

  results =
    Enum.map(function_list, fn x ->
      try do
        x.(vpt)
      rescue
        _ -> false
      end
    end)

  if retries > 0 do
    if Enum.any?(results) do
      results
    else
      :timer.sleep(retry_time)
      oscillate_with_page(function_list, retries - 1, retry_time)
    end
  else
    results
  end
end
Example usage:
Retryer.oscillate_with_page(
  [
    # The question marks are escaped so they match literally
    fn(vpt) -> Regex.match?(~r/Is this on the page\?/, vpt) end,
    fn(vpt) -> Regex.match?(~r/How about this\?/, vpt) end,
    fn(_) -> "Some other condition that doesn't need the page text" end
  ]
)
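For the original question (waiting until more than 8 products show up), the condition is a count rather than a piece of text. Below is a rough standalone variant of the same polling idea, not the Retryer module itself. `ProductWait`, `wait_for_products`, and the `.s-result-item` selector mentioned in the comment are all hypothetical names I made up for the sketch. The counting function is passed in so the example runs without a Hound session.

```elixir
defmodule ProductWait do
  # `count_fun` is any zero-arity function returning the current number of
  # product elements. In a Hound session it could be something like:
  #   fn -> length(Hound.Helpers.Page.find_all_elements(:css, ".s-result-item")) end
  # (the CSS selector is an assumption, not taken from the original post).
  def wait_for_products(count_fun, min_count, retries \\ 60, retry_time \\ 1_000) do
    cond do
      safe_count(count_fun) >= min_count ->
        :ok

      retries <= 0 ->
        :timeout

      true ->
        Process.sleep(retry_time)
        wait_for_products(count_fun, min_count, retries - 1, retry_time)
    end
  end

  # A counting function that raises is treated as "nothing found yet".
  defp safe_count(count_fun) do
    try do
      count_fun.()
    rescue
      _ -> 0
    end
  end
end

# Stand-in for a page that renders 5 more products on each poll.
{:ok, page} = Agent.start_link(fn -> 0 end)
count_fun = fn -> Agent.get_and_update(page, fn n -> {n + 5, n + 5} end) end

status = ProductWait.wait_for_products(count_fun, 9, 10, 10)
IO.inspect(status)
```

Returning `:ok` or `:timeout` makes it harder to carry on by accident when the products never appeared, which is the one thing I'd watch for with the list-returning versions above.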
Hope that helps.