
[SS][SPARK-46928] Add support for ListState in Arbitrary State API v2.

Open sahnib opened this issue 2 years ago • 1 comments

What changes were proposed in this pull request?

This PR adds a ListState implementation to State API v2. Because a list holds multiple values for a single key, we use the RocksDB merge operator to persist those values.

Changes include

  1. A new encoder/decoder to encode multiple values inside a single byte[] array (stored in RocksDB). The encoding scheme is compatible with the RocksDB StringAppendOperator merge operator.
  2. Support for merge operations in ChangelogCheckpointing v2.
  3. An extension of StateStore to support the merge operation and to read multiple values for a single key (via an Iterator). Note that these changes are currently supported only for RocksDB.
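
To make the idea behind items 1 and 2 concrete, here is a minimal, self-contained sketch in Python. It is not the code from this PR. It assumes a layout in which each value is prefixed with a 4-byte length and the append-style merge operator joins operands with a single delimiter byte. The names `encode_value`, `merge`, `decode_values`, and `replay`, the `","` delimiter, the big-endian length prefix, and the changelog record shape are all hypothetical choices made for illustration.

```python
import struct

# Assumption: the append-style merge operator inserts one delimiter byte
# between the existing value and each newly merged operand.
DELIMITER = b","


def encode_value(value: bytes) -> bytes:
    """Prefix a single value with its length (4 bytes, big-endian here)."""
    return struct.pack(">i", len(value)) + value


def merge(existing, operand: bytes) -> bytes:
    """Mimic an append-style merge: existing + delimiter + operand."""
    if existing is None:
        return operand
    return existing + DELIMITER + operand


def decode_values(blob: bytes):
    """Yield each value stored in a merged byte array, in insertion order."""
    pos = 0
    while pos < len(blob):
        (size,) = struct.unpack_from(">i", blob, pos)
        pos += 4
        yield blob[pos:pos + size]
        # Skip the value, then the delimiter that the merge step added.
        pos += size + len(DELIMITER)


def replay(records):
    """Apply hypothetical changelog records (op, key, value) to a dict."""
    store = {}
    for op, key, value in records:
        if op == "put":
            store[key] = value
        elif op == "merge":
            store[key] = merge(store.get(key), value)
        elif op == "delete":
            store.pop(key, None)
    return store
```

Because every value carries its own length, the decoder never has to search for the delimiter, so a value may itself contain the delimiter byte. Replaying a `put` followed by a `merge` for the same key, then passing the stored bytes to `decode_values`, yields both values in the order they were written.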

Why are the changes needed?

These changes are needed to support list values in the State Store. They are part of the work to add a new stateful streaming operator for arbitrary state management, which provides the new features listed in the SPIP JIRA: https://issues.apache.org/jira/browse/SPARK-45939

Does this PR introduce any user-facing change?

Yes. This PR introduces a new state type (ListState) that users can use in their Spark streaming queries.

How was this patch tested?

  1. Added a new test suite for ListState to ensure the state produces correct results.
  2. Added test cases for input validation.
  3. Added tests for the merge operator with RocksDB.
  4. Added tests for the merge operator with changelog checkpointing.
  5. Added tests for reading merged values in RocksDBStateStore.

Was this patch authored or co-authored using generative AI tooling?

No

sahnib avatar Jan 31 '24 01:01 sahnib

cc: @HeartSaVioR PTAL, thanks!

sahnib avatar Feb 06 '24 21:02 sahnib

CI failed in Run / Build modules: pyspark-sql, pyspark-resource, pyspark-testing, but none of the test failures are relevant to this change.

HeartSaVioR avatar Feb 21 '24 05:02 HeartSaVioR

Thanks! Merging to master.

HeartSaVioR avatar Feb 21 '24 05:02 HeartSaVioR