Destination Solr Document fails to preserve field order

Open cwsusa opened this issue 4 years ago • 1 comments

Examining a topic in Kafka shows the fields in the order desired for the Solr update.

Yet this connector generates a SolrInputDocument field list in seemingly random order (neither front to back nor back to front).

User Impact: SolrDocuments in the destination cloud are inconsistent with design specs.

Performance Impact: Solr Cloud will suffer performance impacts from a misordered document schema. High-performance inverted indexes are frequently designed to minimize the time needed to find a field occurrence of a term in constrained, field-based queries.

Suggested Remedy: I am not sure about the code involved, but the SinkRecord(?) handling needs to retrieve the topic fields, in order, into an ordered map. JSON objects are notorious for unpredictable ordering. A LinkedHashMap is one pattern that can be used to carry the original topic field ordering into a SolrInputDocument.
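
To illustrate the ordered-map pattern, here is a minimal sketch in plain Java. It is not the connector's code: the class `FieldOrderSketch`, the helper `toOrderedFields`, and the sample field names are all hypothetical, and the sketch assumes the field names and values can be read from the record in their original order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not taken from kafka-connect-solr.
class FieldOrderSketch {

    // Copies parallel arrays of field names and values into a LinkedHashMap.
    // LinkedHashMap iterates in insertion order, so the caller's field
    // order is preserved when the map is later walked.
    static Map<String, Object> toOrderedFields(String[] names, Object[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names and values must have the same length");
        }
        Map<String, Object> ordered = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            ordered.put(names[i], values[i]);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Assumed sample fields, listed in the order they appear in the topic.
        String[] names = {"id", "title", "body", "created"};
        Object[] values = {"doc-1", "Example", "Some text", "2019-11-03"};

        // Iteration visits the fields in the same order they were inserted.
        for (Map.Entry<String, Object> field : toOrderedFields(names, values).entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```

Walking such a map front to back and adding each entry to the destination document would, under these assumptions, keep the topic's field order; a plain HashMap makes no such ordering guarantee.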

cwsusa, Nov 03 '19 20:11