Chris Mattmann
Hey @allenpope not directly in Tika, but we could develop something at the workshop that combines Tika and datavis interactively - that would be awesome! I smell another session (with...
OK this sounds like we are centering on some concrete goals for this session (or sub-goals at least if @lewismc agrees as the lead proposer): 1. get a source of...
Yep, agree @allenpope. We'll get there; it may start out as static, though. I am going to do some pre-hacking this week.
Thanks @allenpope good points. We can start simple with Wordies/clouds, and then move to something more quantitative. I'll do some research on this.
Awesome, love it @allenpope. @curtlisle, check it out too. I just added Apache OCW to the list (cc @lewismc).
Thanks @snowangelwmy please contact @pzimdars to get your ACADIS Nutch crawler deployed on AWS, ok?
Begin by downloading Nutch.
- NASA AMD: http://gcmd.gsfc.nasa.gov/KeywordSearch/Keywords.do?Portal=amd&KeywordPath=Parameters%7CCRYOSPHERE&MetadataType=0&lbnode=mdlb2
- NSF ACADIS: https://www.aoncadis.org/home.htm
- NSIDC Arctic Data Explorer: http://nsidc.org/acadis/search/
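Assuming the three portals above are meant as the seed URLs for the `urls/` directory that the crawl command reads, a seed file could be set up roughly as follows. The file name `seed.txt` is an assumption (it follows the usual Nutch tutorial convention); the URLs are copied from the list above.

```shell
# Create the seed directory passed to the crawl script (assumed to be ./urls).
mkdir -p urls

# A quoted heredoc delimiter keeps '&' and '%' in the URLs literal.
# seed.txt is a hypothetical file name; one URL per line.
cat > urls/seed.txt <<'EOF'
http://gcmd.gsfc.nasa.gov/KeywordSearch/Keywords.do?Portal=amd&KeywordPath=Parameters%7CCRYOSPHERE&MetadataType=0&lbnode=mdlb2
https://www.aoncadis.org/home.htm
http://nsidc.org/acadis/search/
EOF

# Sanity check: count the seed URLs.
wc -l < urls/seed.txt
```

Run this from the Nutch install directory so that `urls/` sits next to `bin/` and `conf/`.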
Update properties in conf/nutch-default.xml:

```
http.agent.name = NSF DataViz Hackathon Crawler [email protected]
http.agent.host = localhost
http.content.limit = -1
plugin.includes delete indexer-solr
```
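Reading "delete indexer-solr" as "remove the `indexer-solr` entry from the pipe-separated `plugin.includes` value", the edit can be sketched on a sample string. The `plugins` value below is an assumed example of a Nutch 1.x default, not taken from the notes above; check the actual value in your own config before editing it.

```shell
# Assumed example value for plugin.includes; your Nutch version may differ.
plugins='protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)'

# In sed's basic regex syntax an unescaped '|' is a literal character,
# so this drops the "|indexer-solr" entry and leaves the rest untouched.
trimmed=$(printf '%s\n' "$plugins" | sed 's/|indexer-solr//')

printf '%s\n' "$trimmed"
```

The same `sed` expression could be applied to the config file itself, but doing the edit by hand is safer for a one-off change.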
```
./bin/crawl urls/ crawl http://localhost 3
```
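The four positional arguments in the crawl command are easier to read when named. The sketch below assumes the older Nutch 1.x argument order (seed directory, crawl directory, Solr URL, number of rounds); the variable names are illustrative and not part of Nutch.

```shell
SEED_DIR="urls/"             # directory holding the seed URL file(s)
CRAWL_DIR="crawl"            # where crawldb, linkdb and segments are written
SOLR_URL="http://localhost"  # assumed placeholder for the Solr URL argument
ROUNDS=3                     # number of generate/fetch/parse/update rounds

# Print the command first to check it; drop the leading "echo" to run it.
echo ./bin/crawl "$SEED_DIR" "$CRAWL_DIR" "$SOLR_URL" "$ROUNDS"
```

Newer Nutch releases changed the crawl script's options, so check the usage message printed by `./bin/crawl` with no arguments before relying on this order.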