arcadedb icon indicating copy to clipboard operation
arcadedb copied to clipboard

WHERE clause in SELECT does not work with property called "status"

Open vic0824 opened this issue 2 years ago • 8 comments

ArcadeDB Version: v23.1.1-SNAPSHOT (build d46511029703bb08716a959c4b137e4a57dea931/1673614865087/main)

JDK Version: openjdk version "11.0.12" 2021-07-20 LTS

OpenJDK Runtime Environment Corretto-11.0.12.7.1 (build 11.0.12+7-LTS) OpenJDK 64-Bit Server VM Corretto-11.0.12.7.1 (build 11.0.12+7-LTS, mixed mode)

OS: CentOS release 6.9 (Final)

Expected behavior

It is possible to assign the name "status" to a property, because it is not a reserved keyword. A property called "status" can be used in the WHERE clause of a SELECT statement.

Actual behavior

A property called "status" can be created in a Document Type, and documents can be inserted, however a SELECT which uses the "status" property in the WHERE clause does not return any record even when it should.

Steps to reproduce

  • create a database
  • create a document type containing a property called "status" and a property called "my_status"
  • insert a document that has a value for both properties
  • execute a select which contains a WHERE clause that selects the "status" value inserted previously
  • verify that no records are returned
  • execute a select which contains a WHERE clause that selects the "my_status" value inserted previously
  • verify that one records is returned

The attached test class produces the following output (irrelevant ArcadeDB start-up warnings omitted):

2023-01-16 10:39:54.646 INFO  [ArcadeDBServer] <ArcadeDB_0> ArcadeDB Server v23.1.1-SNAPSHOT (build d46511029703bb08716a959c4b137e4a57dea931/1673614865087/main) is starting up...
2023-01-16 10:39:54.653 INFO  [ArcadeDBServer] <ArcadeDB_0> Starting ArcadeDB Server with plugins [] ...
2023-01-16 10:39:54.742 INFO  [ArcadeDBServer] <ArcadeDB_0> - JMX Metrics Started...
[...]
2023-01-16 10:39:55.660 INFO  [HttpServer] <ArcadeDB_0> - Starting HTTP Server (host=0.0.0.0 port=2480-2489)...
2023-01-16 10:39:55.693 INFO  [undertow] starting server: Undertow - 2.3.2.Final
2023-01-16 10:39:55.700 INFO  [xnio] XNIO version 3.8.8.Final
2023-01-16 10:39:55.707 INFO  [nio] XNIO NIO Implementation Version 3.8.8.Final
2023-01-16 10:39:55.977 INFO  [threads] JBoss Threads version 3.5.0.Final
2023-01-16 10:39:56.009 INFO  [HttpServer] <ArcadeDB_0> - HTTP Server started (host=0.0.0.0 port=2480)
[...]

2023-01-16 10:39:56.025 INFO  [ArcadeDBServer] <ArcadeDB_0> Available query languages: [java, js, sql]
2023-01-16 10:39:56.027 INFO  [ArcadeDBServer] <ArcadeDB_0> ArcadeDB Server started in 'development' mode (CPUs=8 MAXRAM=3.93GB)
2023-01-16 10:39:56.027 INFO  [ArcadeDBServer] <ArcadeDB_0> Studio web tool available at http://localhost:2480 
insert record: sqlString = UPDATE Order SET processor = ?, vstart = ?, vstop = ?, status = ?, my_status = ? UPSERT RETURN AFTER WHERE processor = ? AND vstart = ? AND vstop = ?, result = {"processor":"SIR1LRM-7.1","vstart":"20220319_002905.423534","vstop":"20220319_003419.571172","status":"PENDING","my_status":"PENDING","@cat":"d","@type":"Order","@rid":"#3:0"}
Test 1: select with status = 'PENDING' using prepared statement
sqlString = SELECT from Order WHERE status = ? ORDER BY @rid ASC LIMIT 10, parameter = PENDING
result: 
Test 2: select with my_status = 'PENDING' using prepared statement
sqlString = SELECT from Order WHERE my_status = ? ORDER BY @rid ASC LIMIT 10, parameter = PENDING
result: @rid = #3:0, processor = SIR1LRM-7.1, vstart = 20220319_002905.423534, vstop = 20220319_003419.571172, pstart = null, pstop = null, status = PENDING, my_status = PENDING, node = null


2023-01-16 10:39:56.153 INFO  [ArcadeDBServer] <ArcadeDB_0> Shutting down ArcadeDB Server...
2023-01-16 10:39:56.154 INFO  [undertow] stopping server: Undertow - 2.3.2.Final
2023-01-16 10:39:56.460 INFO  [ArcadeDBServer] <ArcadeDB_0> - Stop JMX Metrics
2023-01-16 10:39:56.460 INFO  [ArcadeDBServer] <ArcadeDB_0> ArcadeDB Server is down

ArcadeDBTest.zip

vic0824 avatar Jan 16 '23 09:01 vic0824

I can't reproduce it with the last version

I followed all your steps:

create vertex type TestStatus
create property TestStatus.status string
create property TestStatus.my_status string
create vertex TestStatus set status = 'active', my_status = 'active'
select from TestStatus where status = 'active'

And the record is returned.

Can you try with backtick on status?

select from TestStatus where `status` = 'active'

By the way, STATUS was an old keyword in OrientDB grammar we inherited, but it's not used in ArcadeDB so I removed it.

lvca avatar Jan 16 '23 15:01 lvca

It does not work with backtick either. I'll pull the latest version and repeat the test.

vic0824 avatar Jan 16 '23 16:01 vic0824

Hold on, if I repeat your test, it passes, so maybe it is something specific to prepared statements.

vic0824 avatar Jan 16 '23 16:01 vic0824

Finally got to the bottom of this, it was driving me crazy. The problem does not occur because of the property name, but because the where clause uses an indexed property. The index is associated to the type, so if the record is inserted into a bucket, apparently it is not indexed. If a select is performed on the type, using the indexed property in the where clause, it returns only the records that were inserted into the type, it does not return any record that was inserted into a specific bucket. If a select is performed without a where clause, or with a where clause that uses a non-indexed property, it forces the engine to perform a full scan of all the buckets, and therefore returns all the expected records.

I attach an updated test case and this is the output:

insert record into bucket: INSERT INTO BUCKET:O202203 SET p1 = 'PENDING', p2 = 'PENDING', result = {"p1":"PENDING","p2":"PENDING","@cat":"d","@type":"Order","@rid":"#3:0"}
insert record into type: INSERT INTO Order SET p1 = 'PENDING', p2 = 'PENDING', result = {"p1":"PENDING","p2":"PENDING","@cat":"d","@type":"Order","@rid":"#1:0"}
Test 1: select from type with p1 = 'PENDING'
sqlString = SELECT from Order WHERE p1 = 'PENDING', result: @rid = #1:0, p1 = PENDING, p2 = PENDING, 
Test 2: select from bucket with p1 = 'PENDING'
sqlString = SELECT from bucket:O202203 WHERE p1 = 'PENDING', result: 
Test 3: select from type with p2 = 'PENDING'
sqlString = SELECT from Order WHERE p2 = 'PENDING', result: @rid = #1:0, p1 = PENDING, p2 = PENDING, @rid = #3:0, p1 = PENDING, p2 = PENDING, 

Is this the expected behavior? I'm afraid I have completely misunderstood how ArcadeDB indexes work. I am creating buckets to partition the records based on pre-defined ranges of a property (date-time), and I create an index to speed up those queries for which it is not possible to know the target bucket a priori. If this is not possible, I'll have to remove the index and test the performances of the second type of query, but I imagine the performances will degrade when the size of database will increase, because the engine has to perform a full scan every time, whereas with the buckets the vertical scalability is assured, because every query requires at most the full scan of a bucket, which has a size that does not depend on how many other records are present in other buckets. ArcadeDBTest.zip

vic0824 avatar Jan 17 '23 10:01 vic0824

@vic0824 first of all, thanks for the test case. I was able to reproduce the same your result. I haven't debugged it but I can imagine when you created the index, ArcadeDB created 1 index per bucket. When you added a new bucket, there wasn't an index associated with it. That's why it's not working.

Now, I'm not sure if this should be automatic: adding a new bucket to a type with indexes, should create new bucket-indexes and attach to the schema, so it's all transparent. Or maybe you want to have separate behavior with some buckets not indexed?

I recreated the index after you added the bucket with:

      Bucket bucket = database.getSchema().createBucket("O202203");
      database.getSchema().getType("Order").addBucket(bucket);

      database.getSchema().dropIndex("Order[p1]");
      database.getSchema().getType("Order").createTypeIndex(Schema.INDEX_TYPE.LSM_TREE, false, "p1");

And works. This is the result.

insert record into bucket: INSERT INTO BUCKET:O202203 SET p1 = 'PENDING', p2 = 'PENDING', result = {"p1":"PENDING","p2":"PENDING","@cat":"d","@type":"Order","@rid":"#3:0"}
insert record into type: INSERT INTO Order SET p1 = 'PENDING', p2 = 'PENDING', result = {"p1":"PENDING","p2":"PENDING","@cat":"d","@type":"Order","@rid":"#1:0"}
Test 1: select from type with p1 = 'PENDING'
sqlString = SELECT from Order WHERE p1 = 'PENDING', result: @rid = #3:0, p1 = PENDING, p2 = PENDING, @rid = #1:0, p1 = PENDING, p2 = PENDING, 
Test 2: select from bucket with p1 = 'PENDING'
sqlString = SELECT from bucket:O202203 WHERE p1 = 'PENDING', result: @rid = #3:0, p1 = PENDING, p2 = PENDING, 
Test 3: select from type with p2 = 'PENDING'
sqlString = SELECT from Order WHERE p2 = 'PENDING', result: @rid = #1:0, p1 = PENDING, p2 = PENDING, @rid = #3:0, p1 = PENDING, p2 = PENDING, 

lvca avatar Jan 18 '23 03:01 lvca

There is no API to add a bucket-index to a type index, so I think this should be 100% automatic: when you add a new bucket to a type, all the indexes must be updated.

lvca avatar Jan 18 '23 03:01 lvca

Same thing for the dropBucket(): it must handle the indexes correctly.

lvca avatar Jan 18 '23 03:01 lvca

I agree, it should be 100% automatic.

vic0824 avatar Jan 18 '23 07:01 vic0824