OpenMLDB icon indicating copy to clipboard operation
OpenMLDB copied to clipboard

Window union will process the future data in window for online request mode

Open tobegit3hub opened this issue 1 year ago • 1 comments

Bug Description

Now we use online request mode to process data and get incorrect result.

Steps to Reproduce

Create the test database and tables.

CREATE DATABASE IF NOT EXISTS db1;

CREATE TABLE IF NOT EXISTS db1.t1(
    ingestionTime timestamp,
    new_user_id string,
    index(key=(new_user_id), ttl=(0m, 0), ttl_type=absandlat, ts=`ingestionTime`)
);

CREATE TABLE IF NOT EXISTS db1.t2(
    ingestionTime timestamp,
    new_user_id string,
    index(key=(new_user_id), ttl=(0m, 0), ttl_type=absandlat, ts=`ingestionTime`)
);

Insert the test data.

set @@execute_mode='online';

insert into db1.t2 values (100, "abc");
insert into db1.t2 values (101, "abc");

Deploy the SQL.

use db1;

deploy demo1 select
    new_user_id as new_user_id,
    count(`new_user_id`) over w as window_count,
from
    db1.t1
    window w as (
UNION (select * from db1.t2) partition by `new_user_id` order by `ingestionTime` rows between 10 preceding and current row INSTANCE_NOT_IN_WINDOW);

Now we use online request mode with the data (99, "abc"). Since the data in t2 is larger than request row, we should get count of window is 1.

curl http://127.0.0.1:9080/dbs/db1/deployments/demo1 -X POST -d '{ "input": [ [ 99, "abc" ] ] }'
{"code":0,"msg":"ok","data":{"data":[["abc",1]]}}

Test with other data and always get result of 3.

curl http://127.0.0.1:9080/dbs/db1/deployments/demo1 -X POST -d '{ "input": [ [ 200, "abc" ] ] }'
{"code":0,"msg":"ok","data":{"data":[["abc",3]]}}

Currently we can not reproduce the issue with command line to create table with indexes.

tobegit3hub avatar Feb 06 '24 10:02 tobegit3hub

I can't reproduce the issue with this case:

  - id: 41
    inputs:
      - columns: ["id int", "val string", "ts timestamp"]
        indexs: ["index1:val:ts"]
        rows:
          - [3, 'abc', 99]
          - [4, 'abc', 100]
          - [5, 'abc', 101]
      - columns: ["id int", "val string", "ts timestamp"]
        indexs: ["index1:val:ts"]
        rows:
          - [1, 'abc', 100]
          - [2, 'abc', 101]
    sql: |
      select id, val, count(val) over w as agg
      from {0}
      window w as (
        union (select * from {1})
        partition by val order by ts
        rows between 2 preceding and current row
        instance_not_in_window
      )
    expect:
      order: id
      columns: ["id int","val string","agg int64"]
      rows:
        - [3, 'abc', 1]
        - [4, 'abc', 2]
        - [5, 'abc', 3]

Any extra info ?

aceforeverd avatar Feb 06 '24 14:02 aceforeverd

May be the issue in macos when index is incorrect.

tobegit3hub avatar Jul 03 '24 03:07 tobegit3hub