Claudiu

Results: 30 comments by Claudiu

Hmm! Interesting. Thanks to @bitstein there's an option to pick which webdriver to use. I tried it briefly with PhantomJS and it didn't seem to work - maybe HtmlUnit will...

Ah nice! I wasn't aware of WARC. I'm not entirely sure it's appropriate as right now I'm only recording the messages and not the entire HTML contents. I could modify...

Hmm it should work, yes, although I haven't tested it. But if it works off the file system, it should work off a hosted site. At the link you gave,...

Nice, glad it worked out, and thanks for figuring that bug out. I've updated the `master` branch with the fix.

Ah so in the process of building the static site, the renderer attempts to get the raw email data (as in the bytes sent out by a mail client) and...

Ah wow, that's due to a very special "From" field, namely:

```
"kelly <[email protected]>" <[email protected]>
```

Or, when unescaped:

```
"kelly "
```

It didn't expect to see nested emails...
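
For reference, a minimal sketch of how Python's standard `email.utils.parseaddr` treats a "From" value of this shape. The addresses are made-up placeholders (the real ones aren't shown above), and this is not necessarily how the renderer parses the header.

```python
from email.utils import parseaddr

# Hypothetical placeholder addresses; the real ones are not shown above.
from_header = '"kelly <kelly@example.com>" <list@example.org>'

# parseaddr splits a header value into (display name, address).
# The quoted display name here itself contains an angle-bracketed address.
name, addr = parseaddr(from_header)

print(repr(name))
print(repr(addr))
```

The display name should come back with the inner address still embedded as plain text, so code that later scans the name for `<...>` has to allow for that.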

Hmm haven't seen this one. What does the browser window look like at that point in time? It's expecting the JSON data which the browser renders with a pre tag....

Hmm try putting a `sleep()` or an `input()` right before the offending line:

```
File "/home//yahoo-groups-backup/yahoo_groups_backup/scraper.py", line 77, in _load_json_url
    return json.loads(self.br.find_by_tag("pre")[0].text)
```

That should leave it open so you...
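
A rough sketch of that idea. `load_json_from_pre` and its `pause`/`prompt` parameters are made-up names for illustration, not functions in the scraper. In the real code the text would come from the `pre` element shown in the traceback.

```python
import json


def load_json_from_pre(pre_text, pause=False, prompt=input):
    """Parse JSON text taken from a <pre> element, optionally pausing first.

    Hypothetical helper: pause=True blocks until Enter is pressed, which
    leaves the browser window open for inspection before parsing runs.
    """
    if pause:
        prompt("Check the browser window, then press Enter to continue... ")
    return json.loads(pre_text)


# Example with a literal string standing in for the <pre> element's text.
data = load_json_from_pre('{"ok": true, "items": [1, 2]}')
print(data["items"])
```

With `pause=True` the call waits at the prompt, so the browser window can be checked before `json.loads` runs and possibly fails.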

Ooh, that's unfortunate. That's the thing about hacks, they only work for so long . . . this will indeed have to be fixed in the code. I probably won't...

Is the pretrained model available?