starrocks
[Feature] Introduce a JNI connector
What type of PR is this:
- [ ] bug
- [x] feature
- [ ] enhancement
- [ ] refactor
- [ ] others
Introduce a JNI connector to provide quick access to data sources in the Java ecosystem.
This feature provides an interface that lets developers focus on their own Java reader: they do not need to deal with JNI details and only have to write a small amount of C++ wrapper code.
Notes for review: Reviewers should read the comments in OffHeapTable.java to understand the memory layout used to exchange data with the C++ code in BE.
/**
* We use off-heap memory to store Hudi MOR table data,
* laid out in a custom memory format that the StarRocks BE (written in C++) can parse.
*
* Off-heap table memory layout details:
* 1. Each data column is stored contiguously in off-heap memory.
* 2. Different data columns are stored at different locations in off-heap memory.
* 3. Each data column has a null indicator column that records whether a given row of that column is null.
* 4. Introduce a meta column to save the memory addresses of different data columns,
* the memory addresses of null indicator columns and number of rows.
*
* Meta column layout:
* Meta column start address: | number of rows |
* | null indicator start address of fixed length column-A |
* | data column start address of the fixed length column-A |
* | ... |
* | null indicator start address of variable length column-B |
* | offset column start address of the variable length column-B |
* | length column start address of the variable length column-B |
* | data column start address of the variable length column-B |
* | ... |
*
* Null indicator column layout:
* Null column start address: | 1-byte boolean | 1-byte boolean | 1-byte boolean | ... |
* Row index: -------row 0-------------row 1------------row 2----- ... -
*
* Data column layout:
* Data columns are divided into two storage types: fixed length and variable length.
*
* For fixed length columns like BOOLEAN/INT/LONG, we use a single-level addressing method.
* (1) Get data column start address from meta column.
* (2) Use column start address to read the data of fixed length.
* Fixed length column memory layout:
* Data column start address of fixed length column: | X-bytes | X-bytes | X-bytes | ... |
* INT column of 4 bytes for example:
* Fixed length column start address: | 4-bytes INT | 4-bytes INT | 4-bytes INT | ... |
* Row index: ----row 0---------row 1---------row 2----- ... -
*
*
* For variable length columns like STRING/DECIMAL, we use a two-level addressing method.
* (1) Get offset column start address and length column start address from meta column.
* (2) Get the field start memory address from offset column at a row index.
* (3) Get the field length from length column at a row index.
* (4) Use the field start address and the field length to read the data of variable length.
* Variable length column memory layout:
* Offset column start address of variable length column: | 4-bytes INT | 4-bytes INT | 4-bytes INT | ... |
* Length column start address of variable length column: | 4-bytes INT | 4-bytes INT | 4-bytes INT | ... |
* Data column start address of variable length column: | X-bytes | Y-bytes | Z-bytes | ... |
* STRING column for example:
* Offset column start address: | 4-bytes INT | 4-bytes INT | 4-bytes INT | ... |
* Row index: ----row 0---------row 1---------row 2----- ... -
* Length column start address: | 4-bytes INT | 4-bytes INT | 4-bytes INT | ... |
* Row index: ----row 0---------row 1---------row 2----- ... -
* Variable length column start address: | (length of row 0)-bytes | (length of row 1)-bytes | ... |
*                                       ^                         ^
*                                       |                         |
* Row 0 starts at: column start address + offset of row 0
* Row 0 ends at:   column start address + offset of row 0 + length of row 0
*/
- Suggest renaming java-utils to java-native-utils.
- Suggest splitting this PR into multiple stages.
We should add java-utils-xxx.jar and jni-connector-xxx.jar to the JNI class path:
void JVMFunctionHelper::_init() {
    std::string home = getenv("STARROCKS_HOME");
    std::vector<std::string> class_paths = {home + "/lib/udf-extensions-jar-with-dependencies.jar",
                                            home + "/lib/starrocks-jdbc-bridge-jar-with-dependencies.jar"};
    // Iterate by const reference to avoid copying each path string.
    for (const auto& path : class_paths) {
        _add_class_path(path);
    }
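One way the suggestion could look is sketched below as a standalone helper. It is only an illustration: `build_class_paths` and `starrocks_home_or_empty` are hypothetical names, the `xxx` in the two new jar names is the placeholder from the comment above and is left as-is, and placing the new jars under `$STARROCKS_HOME/lib` is an assumption based on where the existing jars live.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: build the list of jars to put on the JNI class path.
// Assumption: the two new jars sit next to the existing ones under <home>/lib.
std::vector<std::string> build_class_paths(const std::string& home) {
    const std::vector<std::string> jars = {
        "udf-extensions-jar-with-dependencies.jar",
        "starrocks-jdbc-bridge-jar-with-dependencies.jar",
        "java-utils-xxx.jar",     // "xxx" is the placeholder used in the review comment
        "jni-connector-xxx.jar",  // "xxx" is the placeholder used in the review comment
    };
    std::vector<std::string> paths;
    paths.reserve(jars.size());
    for (const auto& jar : jars) {
        paths.push_back(home + "/lib/" + jar);
    }
    return paths;
}

// std::getenv returns nullptr when the variable is unset, so guard before
// constructing a std::string from it.
std::string starrocks_home_or_empty() {
    const char* home = std::getenv("STARROCKS_HOME");
    return home != nullptr ? std::string(home) : std::string();
}
```

Inside `_init()`, each entry of `build_class_paths(starrocks_home_or_empty())` would then be passed to `_add_class_path`, exactly as the existing loop does.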
run starrocks_admit_test







