spark icon indicating copy to clipboard operation
spark copied to clipboard

[SPARK-38909][CORE][YARN] Encapsulate `LevelDB` used to store remote/external shuffle state as `DB`

Open LuciferYang opened this issue 2 years ago • 12 comments

What changes were proposed in this pull request?

ExternalShuffleBlockResolver, YarnShuffleService and RemoteBlockPushResolver use LevelDB directly, this is not conducive to extending the use of RocksDB in this scenario. This pr is encapsulated for expansibility. It will be the pre-work of SPARK-38888, the main change as follows:

  • Introduce DB interface and implementation of corresponding LevelDB now, RocksDB implementation will be involved in the future
  • Introduce DBBackend enum and new config SHUFFLE_SERVICE_DB_BACKEND to specify a disk-based store used in shuffle service local db, only LEVELDB is supported now, RocksDB implementation will be involved in the future
  • Introduce DBIterator to traverse the items in the DB, only LevelDBIterator is supported now, RocksDBIterator implementation will be involved in the future
  • Introduce DBProvider to initialization DB, it is only used to generate LevelDB now, RocksDB implementation will be involved in the future
  • Change StoreVersion to top-level class, it will be reused by RocksDB implementation
  • Replace the directly use of LevelDB in ExternalShuffleBlockResolver, YarnShuffleService and RemoteBlockPushResolver with DB
  • fix corresponding UTs and test tools

Why are the changes needed?

This is pre work of SPARK-38888

Does this PR introduce any user-facing change?

  • Add new config SHUFFLE_SERVICE_DB_BACKEND(spark.shuffle.service.db.backend), the new config use to use to specify a disk-based store used in shuffle service local db. Only LEVELDB is supported after this pr.

How was this patch tested?

Pass GA

LuciferYang avatar Apr 14 '22 13:04 LuciferYang

Although I couldn't take a look at this, you can ask YARN committers' help if you want to make this proceed, @LuciferYang .

dongjoon-hyun avatar Aug 02 '22 00:08 dongjoon-hyun

thanks @dongjoon-hyun

LuciferYang avatar Aug 02 '22 02:08 LuciferYang

I found RemoteBlockPushResolver begin to use LevelDB, more refactor work is required before reopen this pr

LuciferYang avatar Aug 02 '22 06:08 LuciferYang

Got it. That's bad.

dongjoon-hyun avatar Aug 02 '22 06:08 dongjoon-hyun

This is a recent change, to add missing functionality for preserving merge data across restarts. Please tag me when you have an updated PR @LuciferYang, will help review it.

mridulm avatar Aug 02 '22 06:08 mridulm

Merge with master and re-trigger GA

LuciferYang avatar Aug 04 '22 03:08 LuciferYang

This is a recent change, to add missing functionality for preserving merge data across restarts. Please tag me when you have an updated PR @LuciferYang, will help review it.

@dongjoon-hyun @mridulm Conflict is resolved and GA passed. But I think we should make sure that SPARK-38888 is worth to do before reviewing, because this pr is the pre-work of SPARK-38888.

LuciferYang avatar Aug 04 '22 12:08 LuciferYang

@dongjoon-hyun @mridulm Conflict is resolved and GA passed. But I think we should make sure that SPARK-38888 is worth to do before reviewing, because this pr is the pre-work of SPARK-38888.

I agree with @dongjoon-hyun's assessment in the jira, and I dont have strong opinion. If rest of the codebase is moving towards an abstraction for db, I would prefer we did the same for yarn shuffle as well though from a consistency point of view.

mridulm avatar Aug 04 '22 19:08 mridulm

Actually, instead of providing wrapper layer, the final goal is the removal of LevelDB in favor of RocksDB.

dongjoon-hyun avatar Aug 04 '22 20:08 dongjoon-hyun

@dongjoon-hyun

Actually, instead of providing wrapper layer, the final goal is the removal of LevelDB in favor of RocksDB.

I am sure in future, we might have another SPIP to remove RocksDB in favor of something else :-) If we can abstract it away, the future evolution might be easier !

mridulm avatar Aug 09 '22 04:08 mridulm

nit: Btw, from a completeness point of view, DB.get makes sense to include :-)

mridulm avatar Aug 09 '22 05:08 mridulm

nit: Btw, from a completeness point of view, DB.get makes sense to include :-)

OK ~

LuciferYang avatar Aug 09 '22 05:08 LuciferYang

Also cc @tgravescs do you have time to help review this pr? Thanks ~

LuciferYang avatar Aug 15 '22 04:08 LuciferYang

changes look fine to me

tgravescs avatar Aug 18 '22 14:08 tgravescs

Merged to master. Thanks for working on this @LuciferYang ! Thanks for reviews @dongjoon-hyun, @zhouyejoe and @tgravescs :-)

mridulm avatar Aug 18 '22 15:08 mridulm

Thanks @mridulm @tgravescs @zhouyejoe @dongjoon-hyun !!! I will do the RocksDB related work next week due to some other work needs to be completed this week ~

LuciferYang avatar Aug 18 '22 15:08 LuciferYang