paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
### Purpose `PaimonSplitScan` is built for the internal scans used by update, delete, and merge-into operations. It is used to generate deletion vectors, collect touched files, etc. The main usage is to select some metadata columns...
### Purpose Field IDs should never change. - Add immutable IDs for system fields - Make all fields immutable, even during internal computation ### Tests ###...
Bumps [com.google.protobuf:protobuf-java](https://github.com/protocolbuffers/protobuf) from 3.19.6 to 3.25.5. Release notes Sourced from com.google.protobuf:protobuf-java's releases. Protocol Buffers v3.20.3 Java Refactoring java full runtime to reuse sub-message builders and prepare to migrate parsing logic...
### Purpose The scan.push-down option makes the source code very complicated; it should be supported by Flink SQL instead of Paimon. In this PR, we remove it. If users have such a requirement,...
### Purpose Support format table creation, and add an implementation of format table for SparkCatalog. ### Tests Added `SparkCatalogWithHiveTest` ### API and Format No API changed. ### Documentation https://cwiki.apache.org/confluence/display/PAIMON/PIP-27%3A+Introduce+Format+Table+in+Paimon+Catalog
### Purpose Add a Table.uuid method that returns a unique table ID; this can be used for features such as RBAC support. - for the filesystem catalog, the uuid is db + tableName, it is not a...
### Purpose Currently Paimon writer operators in Flink have states, which record the `commitUser` (for all writers) and the list of active buckets (for lookup or full-compaction changelog producer). These...
### Purpose - change the type of param `older_than` from TimeStampType to StringType, because TimeStampType is not easy to understand and use. - add a new param `time_retained`, which allows calling `expire_snapshots` periodically. `older_than`...
### Purpose Searching the files table in Flink takes too long, and collecting the files on a single executor may also run out of memory. It is better to execute the search in a distributed manner. ### Tests ### API and Format ###...
### Search before asking - [X] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar. ### Motivation Flink SQL supports the `CREATE TABLE LIKE` syntax to create a new Paimon...