Shinsuke Sugaya

Results: 201 comments by Shinsuke Sugaya

Did you try the following example? https://github.com/codelibs/elasticsearch-river-web#register-crawl-data

> How to check whether crawl data inserted into elastic search index or not?

Please use Elasticsearch's Search API.
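As a minimal sketch of such a check, the helper below sends a Search API request and prints the first few documents. The function name `check_crawl_data` and its two arguments (host and index name) are hypothetical and only for illustration; substitute the index your river writes to.

```shell
# Hypothetical helper: fetch a few documents from an index via the Search API.
# $1 = Elasticsearch host:port, $2 = index name (both are placeholders you supply).
check_crawl_data() {
  curl -s -XGET "http://$1/$2/_search?size=3&pretty"
}
```

For example, `check_crawl_data localhost:9200 [INDEX_NAME]`. If `hits.total` in the response is greater than 0, crawl data has been indexed.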

It's better to check Elasticsearch's log files.

Could you attach the river registration command you ran?

> "includeFilter" : ["https://www.google.com/.*"],

How about the following setting?

```
"includeFilter" : ["https://www.google.com.*"],
```
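The difference between the two patterns can be illustrated with a rough sketch. It uses `grep -E -x` as a stand-in for the filter's regex matching, which is an assumption: River Web's own matching may differ in detail, but for these simple patterns the result should be the same. The helper name `matches_filter` is hypothetical.

```shell
# Hypothetical helper: succeeds when the URL ($2) fully matches the pattern ($1).
# grep -E -x is only a stand-in for the crawler's own regex engine.
matches_filter() {
  printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# Compare both patterns against a start URL with and without a trailing path.
for url in "https://www.google.com" "https://www.google.com/search?q=test"; do
  for pattern in 'https://www.google.com/.*' 'https://www.google.com.*'; do
    if matches_filter "$pattern" "$url"; then
      result="included"
    else
      result="excluded"
    fi
    printf '%-28s %-40s %s\n' "$pattern" "$url" "$result"
  done
done
```

The pattern ending in `/.*` requires a slash after the host name, so a start URL of `https://www.google.com` with no trailing slash is excluded, while the pattern ending in `.*` includes it.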

> But i got connection refused. is any proxy need to set?

I think it depends on your network environment. If Google checks the User-Agent header, your crawl requests may be refused.

Is the "url" field set to "not_analyzed" in the mapping? See #14.

Could you provide info to reproduce it?

```
$ curl -XGET 'localhost:9200/_river/[RIVER_NAME]/_meta'
```

Please check whether the url field is a not_analyzed field.

Please check the actual mapping, not dynamic_templates.
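One way to see the actual mapping is Elasticsearch's Get Mapping API. The sketch below wraps that request in a hypothetical helper, `show_mapping`, whose host and index arguments are placeholders for your own values.

```shell
# Hypothetical helper: print the mapping Elasticsearch actually applied to an index.
# $1 = Elasticsearch host:port, $2 = index name (both are placeholders you supply).
show_mapping() {
  curl -s -XGET "http://$1/$2/_mapping?pretty"
}
```

For example, `show_mapping localhost:9200 [INDEX_NAME]`. In the output, look at the `url` field and confirm that it contains `"index" : "not_analyzed"`.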

River Web supports XML sitemap files in the [sitemaps.org](http://www.sitemaps.org/) format.