proposed numpy accelerated version
First of all, thanks for this great implementation!
When playing with it I realized that my computer RAM was getting quite full and that every call to encodeBin or decodeBin will have to be casted by PyBind11, which introduced lots of overhead in memory and computation.
The following proposed modifications focus on:
- Move most of the computation to C++ side for speedup (for example move the for loops in encode and decode, which will result in faster loops and less calls to pybind)
- Avoid copying data from and to C++. This is done by sending references to Numpy arrays rather than having to cast Python variables to C++ and viceversa.
I've added a file called Changes_and_performance_comparison.md in which this is depicted and some statistics are presented, showing a huge improve in computation and memory use. Also some demos to easily replicate.
Hope you like this modifications!!
Hi @a-mruiz, thank you very much for filing this PR. Sadly, we cannot accept it yet. We still see the following issues:
- This line limits the number of contexts to exactly two. Moreover, the user is not free to build his own context models anymore, since you are always selecting the context just based on the last encoded bin. In practice, context models are much more complex than this simple model.
- We know that tracing is resource extensive. However, there is the
RWTH_ENABLE_TRACINGmacro allowing to switch it off. In larger simulations, switching it off definitely makes sense. For debugging/learning/modeling purposes, the flag allows to get a closer look at the internal behavior.
To solve our main issue mentioned in 1., one could think of passing a function handle for context modeling taking the last encoded $n$ bits as input and yielding a context for the bin to be encoded as output. This function handle has to be passed to both, encoder and decoder. Further, it needs to be asserted that the function does not use any future bin for context modeling, as this would definitely not work at the decoder side.
If you are further interested in the CABAC functionality, we could provide other literature such as this paper.
cc: @christianrohlfing
I will close this now, as the discussion is not active anymore.