spark-solr
Unable to push data to Solr using the connector on an existing schema
We have an existing Solr collection with a predefined schema. Most of the fields have the stored parameter set to true, but there are certain fields where we explicitly set stored=false. When we try to push data to Solr using the spark-solr connector, we get the following error:
org.apache.solr.api.ApiBag$ExceptionWithErrObject: error processing commands, errors: [{add-field={name=taxonomy, indexed=true, multiValued=true, docValues=true, stored=true, type=string}, errorMessages=[Field 'item_id_channel' already exists.
]}],
at org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:92)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
The error says that item_id_channel already exists, but this error is only raised for fields for which we have defined stored=false in the Solr schema. I understand that the connector tries to create the schema fields again for some reason, but it sets the stored parameter to true, which clashes with the predefined definition of this field in Solr.
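To confirm which fields are affected before writing, one option is to read the existing definitions from the Solr Schema API and list the columns whose field is defined with stored=false. The following is only a rough sketch: the base URL passed as solr_url (for example http://localhost:8983/solr), the helper names, and the sample field list are assumptions for illustration and are not part of spark-solr.

```python
import json
from urllib.request import urlopen


def fetch_schema_fields(solr_url, collection):
    """Fetch field definitions via the Solr Schema API (needs a reachable Solr).

    solr_url is an assumed base URL such as "http://localhost:8983/solr".
    showDefaults=true asks Solr to include properties inherited from the field type.
    """
    url = f"{solr_url}/{collection}/schema/fields?showDefaults=true&wt=json"
    with urlopen(url) as resp:
        return json.load(resp)["fields"]


def non_stored_fields(fields, columns):
    """Return the columns whose existing Solr field is defined with stored=false."""
    by_name = {f["name"]: f for f in fields}
    return [c for c in columns if c in by_name and by_name[c].get("stored") is False]


# Hypothetical sample mimicking the "fields" array of a Schema API response.
sample_fields = [
    {"name": "taxonomy", "type": "string", "stored": True},
    {"name": "item_id_channel", "type": "string", "stored": False},
]

# Columns of the DataFrame we intend to write (also hypothetical).
print(non_stored_fields(sample_fields, ["taxonomy", "item_id_channel", "new_col"]))
# prints ['item_id_channel']
```

Against a live collection, the sample list would be replaced with the result of fetch_schema_fields(solr_url, collection). This only identifies the clashing fields; it does not change how the connector generates its add-field commands.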
My question is: is there a way to tell the connector (perhaps through some option) that we want stored to be set to false for certain fields? And is there a generic way to define other Solr parameters for the fields?
Good point. Or maybe a way to disable the schema update altogether (see Issue 246) and let us manage the schema ourselves.