Myungchul Keum comments

Results: 28 comments by Myungchul Keum

conda-forge packaging

I opened a PR to conda-forge: https://github.com/conda-forge/staged-recipes/pull/19495

The conda-forge PR has been more or less rejected for now. python-soxr links libsoxr statically for easy installation, so it does not need a separate libsoxr package. The conda-forge maintainer is requesting (optional)...

conda-forge packaging

@bmcfee Yes, that's right. I've just been too lazy to implement it. 😉 I'll handle this in the near future.

conda-forge packaging

Python-SoXR is now on conda-forge. 🎉 Package: https://anaconda.org/conda-forge/soxr-python Feedstock: https://github.com/conda-forge/soxr-python-feedstock Note: The conda package is named `soxr-python` to follow conda-forge's naming scheme.

Known issues

I added more error handling in 5a980b3. I hope this is sufficient.

Known issues

The *-manylinux_i686 build was added in 20c26bb (v0.2.5). The pp37-win_amd64 build was added in 5b8e32e (v0.2.7); that problem was fixed by the pip 21.3 update. https://github.com/pypa/cibuildwheel/issues/817

Make RNN faster by changing loop order

I just reordered the loops so that memory is accessed sequentially and auto-vectorization is used more. It gives about a 2% speedup for Opus encoding. On MSVC, using `/fp:fast` (an option similar to `-ffast-math` on GCC) results in assembly like...
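To illustrate the kind of loop swap described above, here is a minimal sketch. It assumes a column-major weight layout indexed as `weights[j * col_stride + i]`; the function names `gemv_accum_row_outer` and `gemv_accum_col_outer` are hypothetical and are not the actual Opus code. In the first version the inner loop strides through memory and forms a reduction, which compilers usually vectorize only with relaxed floating-point options. In the second version the inner loop walks one column contiguously and updates independent outputs, which is typically easier to auto-vectorize.

```c
#include <stddef.h>

/* Hypothetical "before" version: outer loop over rows.
   Consecutive inner iterations read weights col_stride elements apart,
   and the inner loop is a reduction into out[i]. */
static void gemv_accum_row_outer(float *out, const float *weights,
                                 int rows, int cols, int col_stride,
                                 const float *x)
{
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            out[i] += weights[j * col_stride + i] * x[j];
}

/* Hypothetical "after" version: the two loops are swapped.
   The inner loop now reads one column of weights sequentially and
   updates out[0..rows-1], with no dependency between iterations. */
static void gemv_accum_col_outer(float *out, const float *weights,
                                 int rows, int cols, int col_stride,
                                 const float *x)
{
    for (int j = 0; j < cols; j++) {
        const float *col = weights + (size_t)j * col_stride;
        const float xj = x[j];
        for (int i = 0; i < rows; i++)
            out[i] += col[i] * xj;
    }
}
```

Both versions add the terms for each `out[i]` in the same order (increasing `j`), so the swap changes only the memory access pattern, not the arithmetic. Any actual speedup depends on the compiler, its flags, and the matrix sizes.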

Make RNN faster by changing loop order

After merging 9791b22b2c83980f6b4386c870cad58557c78007, I tested it again.

### Profiling on MSVC 2015

↓ Before swapping the 2 loops in `gemm_accum()`

![img003](https://user-images.githubusercontent.com/8174871/50731785-b29c1300-11b0-11e9-99c9-d35131cce61a.png)

Computing the NN takes 2~3% of the whole encoding time.

↓ After...

Make RNN faster by changing loop order

And... `gemm_accum()` should be renamed to `gemv_accum()` because it computes a matrix-vector product.

Make RNN faster by changing loop order

Isn't it intended to be fixed-point after all? I'd rather try to implement a fixed-point RNN instead... Has anything already been done toward a fixed-point implementation?