phantom-of-the-capitol

phantomjs alternative

Open wioux opened this issue 8 years ago • 20 comments

Should we find an alternative to phantomjs? The maintainer has stepped down.

wioux avatar Sep 22 '17 01:09 wioux

There is now firefox headless https://mykzilla.org/2017/08/30/headless-firefox-in-node-js-with-selenium-webdriver/ or I guess more popularly, chrome headless.

mfb avatar Sep 22 '17 01:09 mfb

We'd support a move. I heard chrome headless is much faster. That said, we have no development time to devote to this at the moment, unfortunately. And I don't feel a ton of urgency about it; we don't even keep up with phantom updates as it is.

j-ro avatar Sep 22 '17 03:09 j-ro

Do we still need to support webkit/watir? REQUIRES_WAITIR is empty and all the bioguide ids from REQUIRES_WEBKIT are house members so we can clear that out, but I'm not sure what the need for the alternative drivers was originally and whether it might come up again. We could really simplify parts of the app if we removed support for those drivers.

wioux avatar Sep 22 '17 22:09 wioux

I think that's probably fine over here, yeah...

j-ro avatar Sep 22 '17 22:09 j-ro

https://github.com/GoogleChrome/puppeteer

ghost avatar Sep 30 '17 09:09 ghost

Has anyone started work on this?

j-ro avatar Jan 13 '18 17:01 j-ro

Not yet @j-ro.

wioux avatar Jan 16 '18 19:01 wioux

Thanks @wioux, we haven't either, though it's starting to become more important for us. I'll let you know if it lands on my roadmap. Can you do the same, so we don't duplicate work?

j-ro avatar Jan 17 '18 04:01 j-ro

Definitely, I'll let you know.

wioux avatar Jan 17 '18 18:01 wioux

We're actually doing a bit of initial investigation work on this today, maybe tomorrow too. We'll let you know how it goes. There may be a drop-in replacement that works with Capybara; if so, this will be fairly easy.

j-ro avatar Jan 17 '18 18:01 j-ro

Update here -- we have chromedriver running, but it's probably not quite ready for prime time. It works, but we're seeing some hard-to-debug timeout errors, and it's missing some features like blacklists. We're going to run it as an optional switch for certain yamls since it helps in some cases, but we're not going to switch entirely. If there's a large appetite for the code we can put together a PR, but it's very much a WIP.

j-ro avatar Jan 23 '18 22:01 j-ro
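
For illustration only, the "optional switch for certain yamls" idea above could look roughly like the sketch below. This is not code from the thread or from the repository. It assumes a hypothetical optional top-level `driver` key in each form YAML, and the driver names `:poltergeist` and `:headless_chrome`, the `driver_for` helper, and the placeholder bioguide id are all invented for the example.

```ruby
require "yaml"

# Hypothetical names: neither the default nor the allow-list comes from the project.
DEFAULT_DRIVER = :poltergeist
KNOWN_DRIVERS = %i[poltergeist headless_chrome].freeze

# Picks the browser driver for a single form definition.
# Assumes an optional top-level "driver" key in the YAML (an invented convention);
# forms without the key keep using the default driver.
def driver_for(yaml_text)
  config = YAML.safe_load(yaml_text)
  config = {} unless config.is_a?(Hash)

  requested = config["driver"]
  return DEFAULT_DRIVER if requested.nil?

  driver = requested.to_s.to_sym
  unless KNOWN_DRIVERS.include?(driver)
    raise ArgumentError, "unknown driver: #{requested}"
  end

  driver
end

# Placeholder form definitions; "X000000" is not a real bioguide id.
legacy_form = "bioguide: X000000\n"
opted_in_form = "bioguide: X000000\ndriver: headless_chrome\n"

puts driver_for(legacy_form)
puts driver_for(opted_in_form)
```

Keeping the default in one constant means existing YAMLs stay on the old driver until someone opts a specific form in, which matches the "not going to switch entirely" approach described in the comment.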

Hey @j-ro, this is becoming more important for us. Have you found a solution you like?

k-stewart avatar May 10 '18 21:05 k-stewart

No, we're still with phantom. Chromedriver works but not as consistently, and it doesn't have many hooks and options to debug and tune. We haven't looked at it since January, so maybe that's changed, but we're not planning a switch.

j-ro avatar May 10 '18 22:05 j-ro

Ok, thanks for the insight. I'll see if anything's changed since then.

k-stewart avatar May 13 '18 03:05 k-stewart

Worth a shot -- it didn't really take us very long at all to drop in Chromedriver -- the hard part was getting it to work reliably.

j-ro avatar May 13 '18 03:05 j-ro

I'll chime in with my experience, as I have worked with puppeteer, phantomjs, and various selenium webdriver implementations like chromedriver and geckodriver. Puppeteer provides a high-level API that is quite easy to work with for basic scraping, and they publish extensive documentation as well. If you need to get something done quickly, I think this is a strong contender. It is a JavaScript-only API as far as I know. Selenium webdriver implementations give you more flexibility in which browser you run the automation in, but they require more programming and configuration to get working. The API is also implemented in several programming languages. Firefox's headless documentation also recommends using selenium webdriver for test automation.

ghost avatar Jul 19 '19 16:07 ghost

Just discovered @k-stewart's work in #141 as well.

ghost avatar Jul 19 '19 17:07 ghost

Hi @efx. Our contact-congress work has moved over to EFForg/congress_forms_api to fix this and other issues. Sorry we didn't properly archive this repo -- I'm going to do that now.

wioux avatar Jul 19 '19 17:07 wioux

Thanks @wioux. I had found this repository from EFF's homepage, so we should probably update those link(s) as well.

ghost avatar Jul 22 '19 15:07 ghost

> Hi @efx. Our contact-congress work has moved over to EFForg/congress_forms_api to fix this and other issues. Sorry we didn't properly archive this repo -- I'm going to do that now.

This repo is still not archived. We were about to roll out a system we have been working on for a while based on phantom of the capitol before noticing your comment :(

danielmroberts avatar Jun 02 '23 14:06 danielmroberts