
Executing the join statement causes the container to stop

Open IvanGao01 opened this issue 3 years ago • 6 comments

Describe the bug

Executing the join statement causes the container to stop

To Reproduce

Steps to reproduce the behavior:

  1. Setup docker container.
    docker run --name cnosdb -d  --env cpu=2 --env memory=4 -p 31007:31007 cnosdb/cnosdb:v2.0.1
    
  2. Load data to container.
    # The sample data is in the attachment
    docker cp oceanic_station_20w.txt cnosdb:/
    
  3. Enter the container and import the data.
    docker exec -it cnosdb sh
    cnosdb-cli
    create database oceanic_station;
    \c oceanic_station
    \w oceanic_station_20w.txt
    
  4. Execute the SQL.
    SELECT * FROM air INNER JOIN sea ON sea.station = air.station;
    
  5. After some time, the container stops.

Expected behavior

The query should return its result without causing the container to stop. For example:

+---------------------+------------+------------+-------------+----------+---------------------+-------------+-------------+
| time                | station    | visibility | temperature | pressure | time                | station     | temperature |
+---------------------+------------+------------+-------------+----------+---------------------+-------------+-------------+
| 2022-01-28 13:24:00 | XiaoMaiDao | 50         | 65          | 66       | 2022-01-28 13:30:00 | XiaoMaiDao  | 78          |
| 2022-01-28 13:24:00 | XiaoMaiDao | 50         | 34          | 56       | 2022-01-28 13:33:00 | XiaoMaiDao  | 76          |
| 2022-01-28 13:30:00 | XiaoMaiDao | 65         | 79          | 77       | 2022-01-28 13:39:00 | XiaoMaiDao  | 79          |
|      ... ...        |   ... ...  |  ... ...   |  ... ...    | ... ...  |       ... ...       |   ... ...   |   ... ...   |
+---------------------+------------+------------+-------------+----------+---------------------+-------------+-------------+

Additional context

Environment: MacBook Pro (13-inch, M1, 2020). The attachment is here: oceanic_station_20w.txt

IvanGao01 avatar Nov 24 '22 09:11 IvanGao01

In debug mode, the query ran for 1 hour. In release mode, it ran for 100 seconds with the number of partitions set to 8, and for 90 seconds with the number of partitions set to 16.

ZuoTiJia avatar Nov 29 '22 03:11 ZuoTiJia

@IvanGao01 @ZuoTiJia Please confirm whether this issue still occurs. If it does not, I will close it.

yukkit avatar Apr 14 '23 11:04 yukkit

version: bae0cd09f1a0adf1849000d13909d296930ef379

machine: 192.168.0.55 memory: 16G

start mode: nohup ./target/release/cnosdb run --config config_8902.toml -M singleton &

data:

https://github.com/cnosdb/cnosdb/files/10083169/oceanic_station_20w.txt
\w oceanic_station_20w.txt

sql: SELECT * FROM air INNER JOIN sea ON sea.station = air.station;

error:

error sending request for url (http://localhost:8902/api/v1/sql?tenant=cnosdb&db=public&chunked=false): connection closed before message completed

Caused by:
    connection closed before message completed

sys log:

dmesg|grep cnosdb
[5270386.180177] Out of memory: Kill process 8211 (cnosdb) score 931 or sacrifice child
[5270386.180541] Killed process 8211 (cnosdb), UID 0, total-vm:52215152kB, anon-rss:15586780kB, file-rss:0kB, shmem-rss:0kB
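
One possible contributor to this memory growth, offered as an assumption and not as a confirmed diagnosis, is the size of the join result itself. An inner equi-join pairs every left row with every right row that shares the key, so a key with few distinct values (such as a station name) can multiply the row count. The sketch below uses a hypothetical helper, `inner_join_rows`, and made-up key lists; the real row counts of `air` and `sea` are not stated in this thread.

```rust
use std::collections::HashMap;

/// Count how many times each key occurs.
fn count_keys<'a>(keys: &[&'a str]) -> HashMap<&'a str, u64> {
    let mut counts = HashMap::new();
    for &k in keys {
        *counts.entry(k).or_insert(0u64) += 1;
    }
    counts
}

/// Rows produced by an inner equi-join on a single key column:
/// for each key present on both sides, left_count * right_count.
/// (Hypothetical helper for illustration; not part of cnosdb or DataFusion.)
fn inner_join_rows<'a>(left_keys: &[&'a str], right_keys: &[&'a str]) -> u64 {
    let left = count_keys(left_keys);
    let right = count_keys(right_keys);
    left.iter()
        .map(|(k, l)| l * right.get(k).copied().unwrap_or(0))
        .sum()
}
```

For instance, if both sides held 1,000 rows for the same single station, `inner_join_rows` would report 1,000,000 output rows, so the output grows with the product of the per-key counts and not with their sum.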

lutengda avatar Aug 01 '23 06:08 lutengda

DataFusion checks the memory pool during execution, but this still does not prevent the OOM. The HTTP chunked mode currently has no effect.

ZuoTiJia avatar Oct 24 '23 10:10 ZuoTiJia

arrow-rs version: 42.0.0

arrow_select::take::take_bytes

    if array.null_count() == 0 && indices.null_count() == 0 {
        for (i, offset) in offsets.iter_mut().skip(1).enumerate() {
            let index = indices.value(i).to_usize().ok_or_else(|| {
                ArrowError::ComputeError("Cast to usize failed".to_string())
            })?;

            let s = array.value(index);

            let s: &[u8] = s.as_ref();
            length_so_far += T::Offset::from_usize(s.len()).unwrap();
            values.extend_from_slice(s);
            *offset = length_so_far;
        }
        nulls = None
    } else if indices.null_count() == 0 {
            length_so_far += T::Offset::from_usize(s.len()).unwrap();

With overflow-checks = true, this addition may panic when the offset overflows. With overflow-checks = false, the overflow goes undetected and may cause unexpected behavior, such as OOM.
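
The difference between the two settings can be illustrated with a minimal sketch. It assumes 32-bit offsets (`i32`, as used for `Utf8` arrays) and uses two hypothetical helpers, `checked_offsets` and `wrapping_total`, which are not part of arrow-rs. The first accumulates lengths with `checked_add`, so an overflow becomes an error; the second shows what a plain `+=` amounts to when overflow checks are disabled, namely a silent wrap-around.

```rust
/// Accumulate byte lengths into i32 offsets, reporting overflow as an error
/// instead of panicking or wrapping. (Hypothetical helper, not arrow-rs code.)
fn checked_offsets(lengths: &[usize]) -> Result<Vec<i32>, String> {
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    let mut length_so_far: i32 = 0;
    offsets.push(length_so_far);
    for &len in lengths {
        // A single value longer than i32::MAX cannot be represented at all.
        let len = i32::try_from(len)
            .map_err(|_| format!("length {} does not fit in i32", len))?;
        // checked_add returns None on overflow regardless of build profile.
        length_so_far = length_so_far
            .checked_add(len)
            .ok_or_else(|| "offset overflow".to_string())?;
        offsets.push(length_so_far);
    }
    Ok(offsets)
}

/// What `+=` effectively does with overflow-checks = false: wrap silently.
fn wrapping_total(lengths: &[i32]) -> i32 {
    lengths.iter().fold(0i32, |acc, &l| acc.wrapping_add(l))
}
```

Here `checked_offsets(&[3, 4, 5])` yields the offsets `[0, 3, 7, 12]`, while a total that exceeds `i32::MAX` yields an `Err`. By contrast, `wrapping_total(&[i32::MAX, 1])` returns `i32::MIN`, a negative running length that looks valid to code that does not check it.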

yukkit avatar Oct 26 '23 03:10 yukkit

link https://github.com/apache/arrow-datafusion/issues/7931

yukkit avatar Oct 26 '23 04:10 yukkit