cassandra-lucene-index

Lucene Index Empty after upgrade

Open hagir7 opened this issue 5 years ago • 0 comments

I would appreciate some help. I am trying to upgrade Cassandra from 2.1.11 (plugin version 2.1.11.2) to 2.1.19 (plugin version 2.1.19.1), and I have a Lucene index that comes along with this upgrade. I couldn't find compatibility information between these two versions, and I was losing the index on upgrade, so I resorted to dropping the index, upgrading, and then recreating the index. However, the index is always empty after the upgrade.

This is my keyspace:

CREATE KEYSPACE mwl WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'}  AND durable_writes = true;
CREATE TABLE mwl.mwl (
    anum text,
    anum_universal_id text,
    iid text,
    id_universal_id text,
    version int,
    current_version int static,
    event text,
    event_type text,
    fully_qualified_anum text,
    fully_qualified_id text,
    isr text,
    lucene text,
    birth_date text,
    name text,
    requested_procedure_ids set<text>,
    version_uuid timeuuid,
    PRIMARY KEY ((anum, anum_universal_id, iid, id_universal_id), version)
) WITH CLUSTERING ORDER BY (version DESC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
CREATE CUSTOM INDEX mwl_lucene_idx ON mwl.mwl (lucene) USING 'com.stratio.cassandra.lucene.Index';

After upgrading to 2.1.19, I recreate the Lucene index with the CREATE CUSTOM INDEX statement shown in step 4 below.

I tested this with a single Cassandra node:

  1. In 2.1.11, I created the keyspace and index and inserted some data. The index was populated: checking with SELECT count(*) FROM mwl.mwl WHERE lucene='{filter:{type:"wildcard",field:"fully_qualified_anum",value:"acn**"},refresh:true}'; returns a non-zero count.

  2. I drop the Lucene index in 2.1.11 by running: DROP INDEX mwl.mwl_lucene_idx;

  3. I upgrade my single node: nodetool upgradesstables, then nodetool drain, then stop Cassandra, replace it with the new version and plugin, start Cassandra, run nodetool upgradesstables again, and finally run nodetool version and nodetool status to verify that everything looks good.

  4. Once my single node is up and running, I verify that the table is populated. Then I run:

CREATE CUSTOM INDEX IF NOT EXISTS ON mwl.mwl (lucene) USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = 
{'refresh_seconds':'60',
'indexing_threads':'0',
'indexing_queues_size':'50',
'schema':'{fields:{anum:{type:"string",indexed:true,case_sensitive:false},
anum_universal_id:{type:"string",indexed:true,case_sensitive:false},
iid:{type:"string",indexed:true,case_sensitive:false},
id_universal_id:{type:"string",indexed:true,case_sensitive:false},
name:{type:"string",indexed:true,case_sensitive:false},
fully_qualified_id:{type:"string",indexed:true,case_sensitive:false},
fully_qualified_anum:{type:"string",indexed:true,case_sensitive:false},
event_type:{type:"string",indexed:true,case_sensitive:false},
event:{type:"string",indexed:true,case_sensitive:false},
requested_procedure_ids:{type:"string",indexed:true,case_sensitive:false},
birth_date:{type:"string",indexed:true}}}'};
  5. Now the index is empty: SELECT count(*) FROM mwl.mwl WHERE lucene='{filter:{type:"wildcard",field:"fully_qualified_anum",value:"acn**"},refresh:true}'; returns 0.
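
For reference, the upgrade sequence from step 3 can be sketched as the shell script below. This is only an illustration under stated assumptions: the `service cassandra stop` / `service cassandra start` commands assume a service-managed install, the `run` wrapper and `DRY_RUN` variable are hypothetical helpers added for this sketch, and the step that swaps the Cassandra binaries and plugin jar is left as a placeholder comment because it depends on the installation.

```shell
#!/bin/sh
# Sketch of the single-node upgrade sequence described in step 3.
# DRY_RUN defaults to 1, so commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise
# execute it and abort the script on the first failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

upgrade_node() {
    run nodetool upgradesstables       # rewrite SSTables before upgrading
    run nodetool drain                 # flush memtables, stop accepting writes
    run sudo service cassandra stop    # assumption: service-managed install
    # Placeholder: replace Cassandra 2.1.11 with 2.1.19 and plugin
    # 2.1.11.2 with 2.1.19.1 here (installation-specific, not shown).
    run sudo service cassandra start   # assumption: service-managed install
    run nodetool upgradesstables       # rewrite SSTables on the new version
    run nodetool version               # confirm the running version
    run nodetool status                # confirm the node state
}

upgrade_node
```

Run as-is, the script only prints the commands in order; setting `DRY_RUN=0` would execute them. The index is dropped before this sequence (step 2) and recreated after it (step 4).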

I turned on debug logging with <logger name="com.stratio.cassandra" level="DEBUG"/> and I see no errors there. In fact, I see rows being added, but the index is still empty at the end. I am not sure when or why the data gets lost. I also have some other regular Cassandra indexes, and they are not affected. It seems that the only data affected is the Lucene index.

Any help is much appreciated.

hagir7, Mar 28 '19 17:03