Aron Ahmadia
I'll close this once the commits have landed on the docker branch and been pushed.
Punting to 0.5; this is already fixed on explorer.continuum.io.
Looks like this is still on the to-do?
I agree with Brittain: this looks like a symptom of JAVA_HOME not being set. I think Nutch itself could handle this case more robustly.
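A minimal sketch of the kind of up-front check that would make this failure easier to diagnose. The helper name `check_java_home` is hypothetical and not part of Nutch; it assumes only that a usable JAVA_HOME points at a directory with a `java` launcher under `bin/`.

```python
import os


def check_java_home(env=None):
    """Return None if JAVA_HOME looks usable, else a message describing the problem.

    Hypothetical helper for illustration; not part of Nutch.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        return "JAVA_HOME does not point to a directory: " + java_home
    # Assumption: the launcher lives under bin/ (java.exe on Windows).
    candidates = [os.path.join(java_home, "bin", name) for name in ("java", "java.exe")]
    if not any(os.path.isfile(path) for path in candidates):
        return "no java executable found under " + os.path.join(java_home, "bin")
    return None
```

Calling this before launching a crawl and printing the returned message would turn a confusing downstream error into an explicit one.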
Discussed with Katrina in Flowdock. The ideal would be one index per project, plus a `crawl_id` field. I don't think Nutch can do the latter, but I'll look at what...
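To make the proposed layout concrete, here is a rough sketch of one index per project with a `crawl_id` field on each document. The function names and the `crawl-` index prefix are assumptions for illustration only. Neither Nutch nor ACHE is known to produce this, and no Elasticsearch client calls are shown.

```python
def index_name(project):
    """One index per project. The 'crawl-' prefix is an illustrative assumption.

    Elasticsearch index names must be lowercase, hence lower().
    """
    return "crawl-" + project.lower()


def tag_document(doc, crawl_id):
    """Return a copy of a crawled document with the crawl_id field added."""
    tagged = dict(doc)
    tagged["crawl_id"] = crawl_id
    return tagged
```

With this shape, all crawls for a project share an index, and a single crawl can be selected by filtering on `crawl_id`.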
I'm going to punt this to 0.5, since we can't control ACHE for this yet and it's a little late in 0.4 to be mucking with our data model.
Nutch and ACHE can both take the output of their crawls and re-emit it in the common Elasticsearch schema. I don't think this is a high-priority feature.
Another example. Error codes are not properly passed through these shell scripts.
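For illustration, a minimal sketch of a wrapper that does pass exit codes through. The names `run_step` and `main` are hypothetical and do not come from the scripts in question. The idea is simply to stop at the first failing command and return its exit code instead of carrying on and reporting success.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command; raises CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


def main(steps):
    """Run steps in order; return the first non-zero exit code, or 0 if all pass."""
    for cmd in steps:
        try:
            run_step(cmd)
        except subprocess.CalledProcessError as exc:
            print("step failed with exit code %d: %r" % (exc.returncode, exc.cmd),
                  file=sys.stderr)
            return exc.returncode
    return 0
```

A caller would finish with `sys.exit(main(steps))` so that the failing code reaches whatever invoked the script. In the shell scripts themselves, `set -e` is the usual way to get similar stop-on-failure behavior.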
@brittainhard - can you point me to the conda recipe for elasticnutch? We can raise an issue to move this over to a Salt install.
Thanks, found it. @brittainhard - did you keep a record of how this was transformed from a Nutch install? Your recipe doesn't point to a specific commit from the...