peyer
Thanks for your great work. I am new to Rust; do you have any plans to provide a C++/Python wrapper for amx?
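**I'm submitting a ...** (check one with "x") I am building EasyPR.sln with VS2013 on Windows 7 and want to avoid configuring the header files and dependency libraries by hand. How can I use NuGet to download OpenCV automatically and install the corresponding library and header files? ``` [ ] bug report [x] help wanted [ ] feature request ``` **Current behavior** **Expected/desired behavior** **Reproduction of the...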
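Looking at the APU library files that MACE currently provides, the mt67xx series only has libraries for the 6779 (P90) and 6785 (G90). Does the 6771 (P60) support the APU? If it does, does it also need to go through apu-frontend and apu-platform?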
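From Qualcomm's official [blog](https://developer.qualcomm.com/blog/accelerate-your-models-our-opencl-ml-sdk) I learned that, starting with the Adreno 660, Qualcomm added the cl_qcom_ml_ops extension, which is somewhat faster than the current open-source OpenCL implementations. I ran adb pull /system/vendor/lib64/libOpenCL.so on a Xiaomi 11 phone and inspected the library locally with nm -D. The clQueryMLInterfaceVersionsQCOM symbol is indeed exported, but calling it returns CL_OUT_OF_HOST_MEMORY. Is this because Xiaomi has not yet updated the Adreno driver?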
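A quick sanity check that might help here is to confirm whether the device itself advertises cl_qcom_ml_ops, rather than only checking that the symbol is exported by libOpenCL.so. The sketch below assumes the space-separated extension string has already been read from the device (for example via clGetDeviceInfo with CL_DEVICE_EXTENSIONS). The helper name `has_extension` and the sample string are hypothetical and for illustration only.

```python
def has_extension(extensions: str, name: str) -> bool:
    """Return True if `name` is a whole token in a space-separated
    OpenCL extension string (the format CL_DEVICE_EXTENSIONS uses)."""
    # Split on whitespace so that a longer name sharing the same prefix
    # (a substring match) is not mistaken for the extension itself.
    return name in extensions.split()


# Hypothetical extension string, NOT taken from a real device.
sample_extensions = "cl_khr_fp16 cl_qcom_ml_ops cl_khr_3d_image_writes"

if __name__ == "__main__":
    print(has_extension(sample_extensions, "cl_qcom_ml_ops"))
```

If the extension is missing from the string reported by the phone, an outdated vendor driver would be one plausible explanation, though this check alone cannot confirm it.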
Thanks for your great work! I just want to compare the latency of running a convolution layer between anakin_sass and cudnn, and I have successfully run a convolution layer with cudnn. However,...
For convenience, here is an example. If the input tensor of the conv op has shape [1, 32, 110, 94] ([N, C, H, W] order), the shape of...
@Superjomn Hi, I just want to compare the performance of anakin_sass with that of [email protected] for the convolution operator, and I have successfully compared them on the fp32 data format. However, when I...
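I found a bug while testing bolt's OpenCL backend. bolt mixes data layouts such as NCHW and NCHWC4, mixes buffer, image1d, image2d, and image3d for the blobs passed between layers under OpenCL, and also reuses memory when allocating. This may be why one of my models triggers a bug in the depth2space_ocl layer: the kernel for depth2space declares its input arg as a buffer, but after memory reuse an image3d is passed for that arg, so set_arg fails with CL_INVALID_MEM_OBJ. I am still tracking down whether it is the memory reuse code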
@BUG1989 Thanks for your great work! I have tested with mtcnn: the performance drops only a little, but the speed improves about 4x, amazing! However, when I test with...
@clancylian Thanks for your great work! Could you share your retrained mxnet model?