Platform-Independent Model Serialization - Port existing code
After the new serialization technique has been implemented, the old technique must be phased out. Both methods should work for a period of time to give client code a chance to switch over. Users need enough time to use a conversion method to re-serialize their models using the new protocol.
Details and sample PRs here: https://github.com/numenta/nupic/wiki/Serialization#new-format
- [x] Use newer pycapnp dependency and clean up README
- [x] Enable passing pycapnp builders/readers to C++
- [x] Port misc. components (sparse matrix, RNG, etc)
  - Some of these are done, such as the sparse matrix implementations and RNG, but there are likely others. I think it makes the most sense to port the rest when one of the algorithms needs them.
- [x] Integrate Cap'n Proto serialization with encoders
- [x] Port spatial pooling
- [x] Port Python temporal memory
- [x] Port C++ temporal memory
- [x] Port TP.py
- [ ] Port C++ Cells4 and Python TP10X2.py
- [x] Port C++ CLA classifier
- [x] Port Python CLA classifier
- [x] Port C++ Network API (core components, not region implementations) - numenta/nupic.core#441
- [x] Port Python Network API (core components, not region implementations) - #2241
- [x] Port C++ Region implementations (separate for each region) - numenta/nupic.core#458
- [ ] Port Python Region implementations (separate for each region)
- [ ] Port CLAModel
- [ ] Port KNN classifier
- [ ] Create conversion utility to switch from old model serialization format to new one
- [ ] Make old serialization method obviously deprecated
- [ ] Remove old serialization function
- [x] Centralize all .capnp files to same directory (in nupic.core)
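
As a rough illustration of the phase-out described at the top of this issue (the old method keeps working but warns, and a conversion helper re-serializes old data), here is a minimal Python sketch. Everything in it is hypothetical: `ToyModel`, `saveLegacy`, `loadLegacy`, and `convertLegacyToNew` are made-up names, `pickle` is only assumed as a stand-in for the old technique, and JSON stands in for the Cap'n Proto format so the snippet runs without pycapnp.

```python
import functools
import json
import pickle
import warnings


def deprecated(replacement):
    """Decorator that emits a DeprecationWarning naming the replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "%s is deprecated; use %s instead" % (func.__name__, replacement),
                DeprecationWarning,
                stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


class ToyModel(object):
    """Hypothetical model used for illustration; not the real CLAModel."""

    def __init__(self, state=None):
        self.state = dict(state or {})

    @deprecated("write")
    def saveLegacy(self):
        # Old technique (assumed here to be pickle-based): still works, but warns.
        return pickle.dumps(self.state)

    @classmethod
    def loadLegacy(cls, payload):
        return cls(pickle.loads(payload))

    def write(self):
        # New technique; JSON is only a stand-in for Cap'n Proto.
        return json.dumps(self.state, sort_keys=True)

    @classmethod
    def read(cls, payload):
        return cls(json.loads(payload))


def convertLegacyToNew(legacyPayload):
    """Re-serialize a legacy payload using the new format."""
    return ToyModel.loadLegacy(legacyPayload).write()
```

Keeping the legacy loader available without a warning lets users convert existing checkpoints even after the legacy save path is marked deprecated.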
@scottpurdy I think we talked about you putting together some kind of instructions on how NuPIC contributors could help port code to use capnp. Didn't you mention there was a wiki or README somewhere?
@rhyolight - added link to wiki page in issue description.
Yep, there is a lot of work to do. One relief is that once a C++ module is ported, porting its Python counterpart gets easier (and vice versa). I say this because I think the most difficult part (for those not familiar with some components) is figuring out which fields should be serialized. For example, I'll pick "Port Python CLA classifier" to get started because I can check which fields were chosen in the C++ CLA classifier serialization (although there will be some subtle differences due to optimizations in C++).
@david-ragazzi - good observation! It is important that we create some unit tests that validate that nothing changes after serialization. As long as you are confident in the tests it should be easy to ensure that you have all of the right fields serialized.
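
A minimal sketch of such a round-trip test, under stated assumptions: `ToyClassifier` is a hypothetical stand-in for whichever component is being ported, the `write(proto)` / `read(proto)` method names are assumed for illustration, and a plain dict takes the place of a Cap'n Proto builder/reader so the example runs without pycapnp.

```python
import unittest


class ToyClassifier(object):
    """Hypothetical stand-in for a component being ported."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.iteration = 0
        self.history = []

    def learn(self, value):
        self.iteration += 1
        self.history.append(value)

    def infer(self):
        # Exponentially weighted estimate over everything learned so far.
        estimate = 0.0
        for value in self.history:
            estimate += self.alpha * (value - estimate)
        return estimate

    def write(self, proto):
        # `proto` is a plain dict standing in for a Cap'n Proto builder.
        proto["alpha"] = self.alpha
        proto["iteration"] = self.iteration
        proto["history"] = list(self.history)

    @classmethod
    def read(cls, proto):
        instance = cls(alpha=proto["alpha"])
        instance.iteration = proto["iteration"]
        instance.history = list(proto["history"])
        return instance


def roundTrip(original):
    """Serialize `original` and build a new instance from the result."""
    proto = {}
    original.write(proto)
    return ToyClassifier.read(proto)


class SerializationTest(unittest.TestCase):

    def testRoundTripPreservesBehavior(self):
        original = ToyClassifier(alpha=0.5)
        for value in (1.0, 2.0, 3.0):
            original.learn(value)

        restored = roundTrip(original)

        # The state matches immediately after deserialization...
        self.assertEqual(vars(original), vars(restored))
        # ...and both instances keep producing identical output afterwards.
        for value in (4.0, 5.0):
            original.learn(value)
            restored.learn(value)
            self.assertEqual(original.infer(), restored.infer())
```

Comparing behavior after further learning, not just the fields, helps catch a field that was left out of the schema but only matters on later iterations.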
@scottpurdy @oxtopus While I work on the Python serialization, could you guys work on the C++ component serialization in parallel? That way, when a C++ module is ready I only need to port it to Python, and when a Python module is ready you port it to C++. This will speed up the work a lot!
@david-ragazzi - I'm not sure what resources we will have for this in the immediate future (I'm going on vacation Thursday) but that is great if you just want to do the Python implementation. For cases where there are both Python and C++ versions that will share serialization schema, make sure that if you create the schema you put it in nupic.core (it is fine to submit a PR with just the schema and no implementation yet if you are just doing the Python).
Thanks. I will do this.
