utensor_cgen
utensor_cgen copied to clipboard
F/no more cstrings
Replaces all cstrings (except for those used by sdtensor) with integers.
Massive savings, much wow.
@dboyliao can you suggest a cleaner way to add this functionality to all ops? It's pretty messy at the moment, but I think we can inject a simple tensor_name transformer that decorates each op with a "generate sref func" that gets called on snippet render.
@neil-tan @dboyliao @Knight-X
This PR uses static const uint32_t
in place of #define
s. The former has the implicit benefit of debug-ability since compilation with debug will output named symbols, and furthermore takes up the same size and number of instructions as the latter.
I think your approach is a good start except that I prefer it's implemented as an optimization pass. The reason for this is:
- it make the ugraph easier to debug since the tensor name will be the same as original model file before the opt pass.
- A good case for developing a code generator planner or a better way to store meta data (such as this tensor name -> int mapping)
However, I'll work on auto testing code generator first before I dig into this.
BTW, I think ubit
has been merged into develop
AFAIK.
Can you change the merge base to develop
?
@mbartling what's the run time to go with this? I'm getting a runtime error with this repo:
foundSimple MNIST end-to-end uTensor cli example (device)
[Error] ./uTensor/src/uTensor/core/context.cpp:70 @push Tensor "-122657359" not found
Checked out to this PR and generated the code
@mbartling As discussed, using the setup above:
- Add
#include "models/deep_mlp_weight.hpp"
to main.cpp or the model file - In main.cpp
ctx.get("y_pred:0")
becomesctx.get(sref_y_pred_0)
- In the model file:
ctx.add(input_0, "x:0", 2)
becomesctx.add(input_0, sref_x_0, 2)
works with current runtime dev branch.
@dboyliao Do we already have these changes in the current dev branch? If so we can close this PR
AFAIK, not yet.
But I think you can still merge it because it's targeting ubit
branch if it will make your work easier somehow. I can approve this PR.
As for develop
, since it went through a major refactoring, I think this PR need extra work to be merged into develop
.
Besides, it's using legacy api.
Not of high priority considering we are moving toward rearch.
How do you think?
I still think this idea should be implemented for the reach, even if it means we close this PR. It will generate the absolute smallest models and is really a way to differentiate between debug/release build.
Thoughts?
Sure, I agree with you. But the new codegen code base has changed a lot comparing this PR. I need some time to think about how to integrate this PR to the next release code base. Probably will do it after next release.
Actually, on second thought it really isn't necessary for the rearch since param names are bound to the operators in the input out name enums. So this is more a debug convenience. I think we can put it off for a while
Feel free to close :)