h1. ElasticSearch Mock Solr Plugin
|_.Mock Solr Plugin|_.elasticsearch|_.Lucene/Solr|
|master|0.20.2 -> 0.20.X|3.6.2|
|1.1.4|0.20.2 -> 0.20.X|3.6.2|
|1.1.3|0.19.3 -> 0.20.1|3.6.0|
|1.1.2|0.19.0 -> 0.19.2|3.5.0|
|1.1.1|0.18.6 -> 0.18.7|3.5.0|
|1.1.0|0.18.0 -> 0.18.5|3.5.0|
h2. Use Solr clients/tools with ElasticSearch
This plugin lets you use tools that were built for Solr against ElasticSearch.
The idea for this plugin came when I wanted to use Nutch with ElasticSearch. Instead of extending Nutch itself, I thought it would be nice to be able to use any Solr client with ElasticSearch. Some projects we can now use are Nutch, Apache ManifoldCF, and any tool using SolrJ. It should also be possible to use non-Java tools that write to Solr using the XML update and request handlers.
h3. Supported Solr features
* Update handlers
** XML Update Handler (i.e. /update)
** JavaBin Update Handler (i.e. /update/javabin)
* Search handler (i.e. /select)
** Basic Lucene queries using the q parameter
** start, rows, and fl parameters
** sorting
** filter queries (fq parameters)
** hit highlighting (hl, hl.fl, hl.snippets, hl.fragsize, hl.simple.pre, hl.simple.post)
** faceting (facet, facet.field, facet.query, facet.sort, facet.limit)
* XML and JavaBin request and response formats
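As a rough illustration of the search handler parameters listed above, the following sketch assembles a Solr-style select URL by hand. The class name @SelectUrlSketch@ and its methods are hypothetical and not part of the plugin. It assumes the handler is reached by appending /select to the /_solr base URL, as a Solr client would do, and it uses placeholder field names and values. A plain map holds one value per key, so repeated fq parameters would need a different structure.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the plugin: builds a Solr-style /select URL.
final class SelectUrlSketch {

    // Joins parameters into "<base>/select?k=v&k=v", URL-encoding keys and values.
    static String selectUrl(String solrBase, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(solrBase).append("/select");
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Parameter names come from the feature list; values are placeholders.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "name:doc1");
        params.put("start", "0");
        params.put("rows", "10");
        params.put("fl", "id,name");
        params.put("fq", "price:[10 TO 20]");
        params.put("hl", "true");
        params.put("hl.fl", "name");
        params.put("facet", "true");
        params.put("facet.field", "name");
        System.out.println(
            selectUrl("http://localhost:9200/testindex/testtype/_solr", params));
    }
}
```

The resulting URL could then be fetched with any HTTP client; SolrJ users do not need this, since SolrQuery sets the same parameters.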
h3. How do you build this plugin?
Use Maven to build the package:
mvn package
Then install the plugin:
# if you've built it locally
$ES_HOME/bin/plugin -url file:./target/releases/elasticsearch-mocksolrplugin-*.zip -install mocksolrplugin
h3. How to use this plugin.
Just point your Solr client/tool at your ElasticSearch instance and append /_solr to the URL.
http://localhost:9200/${index}/${type}/_solr
* ${index} - the ES index you want to index/search against. Default "solr".
* ${type} - the ES type you want to index/search against. Default "docs".
Example paths:
// Will search/index against index "solr" and type "docs"
http://localhost:9200/_solr
// Will search/index against index "testindex" and type "docs"
http://localhost:9200/testindex/_solr
// Will search/index against index "testindex" and type "testtype"
http://localhost:9200/testindex/testtype/_solr
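The path rules above can be sketched as a small helper that builds the URL to hand to a Solr client. The class @SolrEndpointSketch@ and its @endpoint@ method are hypothetical names used only for illustration. Passing null for the index or type leaves that segment out, so the plugin falls back to the defaults described above.

```java
// Hypothetical helper, not part of the plugin: builds the /_solr endpoint URL.
final class SolrEndpointSketch {

    // host  - base address of the ElasticSearch node, e.g. "http://localhost:9200"
    // index - ES index name, or null to rely on the default ("solr")
    // type  - ES type name, or null to rely on the default ("docs")
    static String endpoint(String host, String index, String type) {
        if (index == null && type != null) {
            // The URL layout is index first, then type, so a type alone is ambiguous.
            throw new IllegalArgumentException("a type requires an index");
        }
        StringBuilder sb = new StringBuilder(host);
        if (index != null) {
            sb.append('/').append(index);
        }
        if (type != null) {
            sb.append('/').append(type);
        }
        return sb.append("/_solr").toString();
    }

    public static void main(String[] args) {
        String host = "http://localhost:9200";
        System.out.println(endpoint(host, null, null));
        System.out.println(endpoint(host, "testindex", null));
        System.out.println(endpoint(host, "testindex", "testtype"));
    }
}
```

Run as written, this prints the three example paths shown above.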
Use the client/tool as you would with Solr.
h3. Example SolrJ Indexing
CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:9200/testindex/testtype/_solr");
server.setRequestWriter(new BinaryRequestWriter());
// both XML and JavaBin response formats are supported
//server.setParser(new XMLResponseParser());
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
doc1.addField( "price", 10 );
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField( "id", "id2", 1.0f );
doc2.addField( "name", "doc2", 1.0f );
doc2.addField( "price", 20 );
Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
docs.add( doc2 );
server.add( docs );
server.commit();
// deletes work as well
//server.deleteById("id2");
//server.commit();
Perform a search and verify the documents were indexed.
h3. Example SolrJ Searching
CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:9200/testindex/testtype/_solr");
String qstr = "id:[* TO *]";
SolrQuery query = new SolrQuery();
query.setQuery(qstr);
QueryResponse response = server.query(query);
for (SolrDocument doc : response.getResults()) {
    for (String field : doc.getFieldNames()) {
        System.out.println(field + " = " + doc.getFieldValue(field));
    }
    System.out.println();
}
h3. Example using Nutch
At a minimum, use the following type mapping for ElasticSearch.
curl -XPUT 'http://localhost:9200/testindex'
curl -XPUT 'http://localhost:9200/testindex/testtype/_mapping' -d '{
  "testtype" : {
    "properties" : {
      "id" : {
        "type" : "string",
        "store" : "yes"
      },
      "digest" : {
        "type" : "string",
        "store" : "yes",
        "index" : "no"
      },
      "boost" : {
        "type" : "float",
        "store" : "yes",
        "index" : "no"
      },
      "tstamp" : {
        "type" : "date",
        "store" : "yes",
        "index" : "no"
      }
    }
  }
}'
Follow the Nutch tutorial at http://wiki.apache.org/nutch/NutchTutorial
* Follow steps 1 through 3.1
* For step 3.1 use:
bin/nutch crawl urls -solr http://localhost:9200/testindex/testtype/_solr -depth 3 -topN 5
h3. Notes
ElasticSearch does not require a schema, and all the data you send through the plugin will be indexed by default. You can use the ElasticSearch PUT Mapping API to define your field types and what should be stored, analyzed, etc. Data indexed via the mock XML Update Handler will most likely be detected by ElasticSearch as strings, so it is a good idea to mimic your Solr schema with an ElasticSearch type mapping.
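To illustrate why XML updates tend to arrive as strings, here is a minimal sketch that builds a standard Solr XML add message from a map of field values. The class @XmlUpdateSketch@ and its methods are hypothetical and not part of the plugin, and the field names are placeholders. The point is that every value, including the integer price, is serialized as element text with no type information attached.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the plugin: builds a Solr XML <add> message.
final class XmlUpdateSketch {

    // Serializes one document as <add><doc><field name="...">value</field>...</doc></add>.
    static String addMessage(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              // String.valueOf drops the Java type: 10 becomes the text "10".
              .append(escape(String.valueOf(e.getValue())))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    // Escapes the characters that are unsafe in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<String, Object>();
        doc.put("id", "id1");
        doc.put("name", "doc1");
        doc.put("price", Integer.valueOf(10));
        System.out.println(addMessage(doc));
    }
}
```

A message like this would be posted to the /update handler under the /_solr URL. If price should be numeric, defining it in a type mapping before indexing avoids relying on type detection.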