carbondata
[TEMP] [WIP] Transaction manager support
Why is this PR needed?
This PR is based on the discussion and design document in https://issues.apache.org/jira/browse/CARBONDATA-4171
What changes were proposed in this PR?
- Added a new ACID module.
- Defined a transaction manager interface (can support database-level and table-level scope) and implemented a file-based transaction manager that writes the transaction log file.
- Defined a segment store interface (can support database-level and table-level scope) and implemented a file-based segment store that writes the table status together with the transaction id (multiple table status files).
- Handled the basic transaction conflict scenario for the file-based segment store.
- Added lock support for the transaction log file.
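
To make the changes above easier to picture, here is a minimal in-memory sketch of a transaction manager paired with a versioned segment store. It is not the code from this PR: every class and method name (`Txn`, `SegmentEntry`, `VersionedSegmentStore`, `SketchTransactionManager`) is hypothetical, the store keeps snapshots in a map instead of writing table status files, and the conflict rule is the simplest one possible (a commit fails if any other transaction committed after this one began).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// All names below are hypothetical; this is not the API introduced by the PR.
final class SegmentEntry {
    final String segmentId;
    final long transactionId; // id of the transaction that added the segment

    SegmentEntry(String segmentId, long transactionId) {
        this.segmentId = segmentId;
        this.transactionId = transactionId;
    }
}

final class Txn {
    final long id;
    final long baseVersion; // store version observed when the transaction began

    Txn(long id, long baseVersion) {
        this.id = id;
        this.baseVersion = baseVersion;
    }
}

// Stand-in for a file-based segment store: one snapshot per committed transaction id.
final class VersionedSegmentStore {
    private final Map<Long, List<SegmentEntry>> versions = new HashMap<>();
    private long latestVersion = 0L; // 0 means "nothing committed yet"

    synchronized long latestVersion() {
        return latestVersion;
    }

    // Returns a copy of the snapshot written by the given transaction id.
    synchronized List<SegmentEntry> read(long version) {
        List<SegmentEntry> found = versions.get(version);
        List<SegmentEntry> copy = new ArrayList<>();
        if (found != null) {
            copy.addAll(found);
        }
        return copy;
    }

    // Basic conflict check: reject the commit if the store moved past baseVersion.
    synchronized boolean commit(Txn txn, List<SegmentEntry> added) {
        if (txn.baseVersion != latestVersion) {
            return false;
        }
        List<SegmentEntry> next = read(latestVersion);
        next.addAll(added);
        versions.put(txn.id, next);
        latestVersion = txn.id;
        return true;
    }
}

final class SketchTransactionManager {
    private final AtomicLong nextId = new AtomicLong(1);
    private final VersionedSegmentStore store = new VersionedSegmentStore();

    Txn begin() {
        return new Txn(nextId.getAndIncrement(), store.latestVersion());
    }

    boolean commit(Txn txn, List<String> newSegmentIds) {
        List<SegmentEntry> added = new ArrayList<>();
        for (String segmentId : newSegmentIds) {
            added.add(new SegmentEntry(segmentId, txn.id));
        }
        return store.commit(txn, added);
    }

    List<SegmentEntry> currentSegments() {
        return store.read(store.latestVersion());
    }
}

final class TransactionSketchDemo {
    public static void main(String[] args) {
        SketchTransactionManager manager = new SketchTransactionManager();
        Txn first = manager.begin();
        Txn second = manager.begin(); // starts from the same base version as `first`
        System.out.println("first committed: " + manager.commit(first, Arrays.asList("0")));
        System.out.println("second committed: " + manager.commit(second, Arrays.asList("1")));
        System.out.println("visible segments: " + manager.currentSegments().size());
    }
}
```

In this sketch the second transaction is rejected because the first one committed in between; a caller would begin a new transaction and retry. Keeping one snapshot per transaction id mirrors the idea of multiple table status files.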
TODO:
- Write wrappers for the segment store interface (segment status updater and reader) and replace them in the current code.
- Remove the table status lock everywhere and change the segment id logic.
- Add the segment id as the return value of each transaction operation, to fill in the affected segments.
- Similar to the segment store interface (table status), handle the update status file and the segment file as well.
- Try to decouple the data and metadata operations for each transaction command (maybe by writing new classes), so that the metadata can also be supported for other open formats.
- Test optimistic concurrency by running parallel transactions (update, compaction) and cover all the scenarios.
On top of these, we can provide time travel SQL support, multi-format support, and cross-table transaction support.
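
As a rough illustration of the parallel-transaction test mentioned in the TODO list, the sketch below runs several writers against a shared, versioned snapshot and retries each commit until it succeeds. It assumes a purely in-memory stand-in for the table status (`TableSnapshot`) and uses `AtomicReference.compareAndSet` in place of the file-based commit; both class names are hypothetical and nothing here is taken from the CarbonData code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable stand-in for one version of the table status.
final class TableSnapshot {
    final long version;
    final List<String> segments;

    TableSnapshot(long version, List<String> segments) {
        this.version = version;
        this.segments = Collections.unmodifiableList(new ArrayList<>(segments));
    }

    // Builds the next version with one more segment; the current one is left untouched.
    TableSnapshot withSegment(String segment) {
        List<String> next = new ArrayList<>(segments);
        next.add(segment);
        return new TableSnapshot(version + 1, next);
    }
}

final class OptimisticCommitCheck {
    private final AtomicReference<TableSnapshot> current =
        new AtomicReference<>(new TableSnapshot(0, new ArrayList<String>()));

    // Optimistic commit: publish only if no other writer committed since we read `base`.
    void commitWithRetry(String segment) {
        while (true) {
            TableSnapshot base = current.get();
            TableSnapshot next = base.withSegment(segment);
            if (current.compareAndSet(base, next)) {
                return;
            }
            // Conflict detected: loop, re-read the latest snapshot, and try again.
        }
    }

    // Starts `writers` transactions at the same moment and returns the final snapshot.
    static TableSnapshot runParallel(int writers) {
        final OptimisticCommitCheck check = new OptimisticCommitCheck();
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        final CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < writers; i++) {
            final String segment = "segment_" + i;
            pool.execute(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                check.commitWithRetry(segment);
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return check.current.get();
    }
}
```

A test built along these lines would check that no commit is lost: after `runParallel(n)` the final snapshot should be at version `n` and contain `n` distinct segments. Real scenarios such as update versus compaction would need conflict rules that depend on which segments each transaction touches, which this sketch does not model.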
Does this PR introduce any user interface change?
- No
Is any new test case added?
- No
- Yes
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3706/
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5450/