Inconsistent Data When Querying Elasticsearch
Look at this test I've performed in Elassandra with Python.
I created a function to query data using Cassandra driver:
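# Imports assumed from the names used below; `timer` is taken to be timeit.default_timer
from datetime import timedelta
from timeit import default_timer as timer

import pandas as pd
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# `session` is an already-connected driver session defined elsewhere
def process_query_cassandra(query, fetch_size=5000, consistency_level=ConsistencyLevel.LOCAL_ONE):
    start = timer()
    paging_state = None
    rows = []
    # The statement is identical for every page, so build it once
    statement = SimpleStatement(query, fetch_size=fetch_size, consistency_level=consistency_level)
    while True:
        # Fetch one page, resuming from the previous paging state
        results = session.execute(statement, paging_state=paging_state)
        paging_state = results.paging_state
        rows.extend(results.current_rows)
        if paging_state is None:  # no more pages
            break
    df = pd.DataFrame(rows)
    end = timer()
    return df, timedelta(seconds=end - start)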
Table f0101 has 872390 rows.
When I query using CQL only, results are OK:
query1 = """
select *
from "dlfinjdep"."f0101"
ALLOW FILTERING
"""
Running Cassandra #1 (22-06-01 12:43)
Rows: 872390
seconds: 0:03:17.609349
Running Cassandra #2 (22-06-01 12:46)
Rows: 872390
seconds: 0:03:04.289089
However, when I query the Elasticsearch index through CQL, I get fewer rows, and the count changes from one run to the next:
query2 = """
select *
from "dlfinjdep"."f0101"
WHERE es_query='{"query":{"match_all":{}}}'
AND es_options='indices=dlfinjdep-f0101-index'
ALLOW FILTERING
"""
Running Elastic #1 (22-06-01 12:50)
Rows: 841350
seconds: 0:03:49.136313
Running Elastic #2 (22-06-01 12:54)
Rows: 834372
seconds: 0:03:33.985948
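The two Elasticsearch runs are 31040 and 38018 rows short of the 872390 rows returned by plain CQL. To see which rows differ rather than only how many, the primary-key values from both result sets could be compared. The following is a minimal sketch, not part of the original test: `compare_keys` is a hypothetical helper, and the toy key lists stand in for the real key column of f0101, whose name is not given here.

```python
from collections import Counter


def compare_keys(cassandra_keys, elastic_keys):
    """Compare primary-key values returned by the two query paths.

    Both arguments are iterables of hashable, sortable key values.
    """
    cass = set(cassandra_keys)
    es_counts = Counter(elastic_keys)
    es = set(es_counts)
    return {
        # Keys present in the CQL scan but absent from the Elasticsearch result
        "missing_from_es": sorted(cass - es),
        # Keys returned by Elasticsearch that the CQL scan did not return
        "unexpected_in_es": sorted(es - cass),
        # Keys that Elasticsearch returned more than once
        "duplicated_in_es": sorted(k for k, n in es_counts.items() if n > 1),
    }


# Toy data only; real input would be the key column of each DataFrame
report = compare_keys([1, 2, 3, 4, 5], [1, 2, 2, 5])
print(report)
```

With the DataFrames returned by process_query_cassandra, the same call would take the key column from each result (for example `df_cql["pk"]` and `df_es["pk"]`, where `pk` is a placeholder name). Duplicates matter here because a row count that looks only slightly short can hide both missing and repeated rows.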
Which version of Elassandra are you using?