elasticdl icon indicating copy to clipboard operation
elasticdl copied to clipboard

some refactor work of parameter server

Open QiJune opened this issue 6 years ago • 0 comments

Step 1:

  • [ ] embedding table Python data structure
  • [ ] tensor proto message and Python data structure
  • [ ] a new PS, which binds to a k8s service
  • [ ] parameter sharding hash function

Step 2:

  • [ ] KVStore Python data structure
  • [ ] define PS service interface in proto
  • [ ] refine current OptimizerWrapper to support KVStore

Step 3:

  • [ ] refine worker related code to cooperate with new PS design
  • [ ] evaluation support
  • [ ] checkpoint support

Step 4:

  • [ ] extend one-PS node to multi-PS node
  • [ ] PS replica support

Step 5:

  • [ ] fault tolerate feature and test

QiJune avatar Sep 17 '19 11:09 QiJune