Gemfield issues

Results: 16 issues of Gemfield

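[BUG] Converted TNN model runs slower on opencl (GPU) than on cpu on Huawei Kirin processors

**Bug description** We converted an ESP network to a TNN model and deployed it on a Huawei phone and on a phone with a Snapdragon processor. On the Snapdragon phone, GPU/opencl inference is 2x as fast as cpu inference, whereas on the Huawei Kirin 980 phone, GPU/opencl inference is slower than cpu inference (dropping from 13 fps to 10 fps).

**How to reproduce** Steps:
1. Configure the ESP network in config.py and turn on the switch for TNN model conversion;
2. Run test.py to output the TNN model;
3. Integrate it into an Android project, then install it on a Huawei Kirin 980 phone;
4. Measure the fps with camera input.

**Expected result** On the Huawei Kirin 980 phone, GPU/opencl inference should be at least as fast as cpu inference.

**Screenshots** If necessary, please add screenshots.

**If you are using MLab HomePod, please fill in**
- Host cpu/ram/cuda device: intel i9-9820X/32GB/RTX2080ti
- Host OS/kernel version/GPU driver: ubuntu 20.04/5.4.0-74-generic/460.80...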

The coreml converter raises an error on numpy 1.20

The coreml converter raises an error on numpy 1.20

The tensorrt converter raises an error

After the tensorrt converter switch is turned on, the conversion logic raises an error.

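With tensorboard enabled, deepvac logs are printed twice

With tensorboard enabled, deepvac logs are printed twice. This is probably a conflict between the log handler of the tensorboard package and the log handler of the deepvac package.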
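The suspected handler conflict can be illustrated with the standard `logging` module alone. The sketch below is not taken from deepvac or tensorboard: the logger name `demo_app` and the helpers `build_logger` and `count_lines` are hypothetical, and a handler attached to the root logger stands in for whatever a third-party package might install. It shows that a record is emitted twice when a logger has its own handler and also propagates to a root handler, and once when propagation is switched off.

```python
import io
import logging


def build_logger(name, stream, propagate):
    """Create a logger that owns one StreamHandler writing to `stream`."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.StreamHandler(stream))
    logger.propagate = propagate
    return logger


def count_lines(propagate):
    """Log one message and return how many lines ended up in the stream."""
    stream = io.StringIO()
    root = logging.getLogger()
    saved_handlers = root.handlers[:]
    # Assumption: stand-in for a third-party package that attaches
    # its own handler to the root logger on import.
    root.handlers = [logging.StreamHandler(stream)]
    try:
        logger = build_logger("demo_app", stream, propagate)
        logger.info("epoch 1 done")
    finally:
        # Restore the root logger so the demo has no side effects.
        root.handlers = saved_handlers
    return len(stream.getvalue().splitlines())


duplicated = count_lines(propagate=True)     # own handler + root handler
deduplicated = count_lines(propagate=False)  # own handler only
print(duplicated, deduplicated)
```

If the duplication in deepvac has this cause, setting `propagate = False` on the package logger (or removing one of the two handlers) would be one way to get a single copy of each message; whether that applies here has not been verified against the deepvac source.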

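Issues introduced by upstream PyTorch

DeepVAC divides these issues into two categories:
- blocking issues;
- issues that can be worked around.

# Blocking issues
- In DDP mode, a training job cannot also enable trace and script. Solution: wait for upstream PyTorch to add the feature;
- Quantization-aware training (QAT) does not support graph mode, so the network has to be modified by hand, as described in https://zhuanlan.zhihu.com/p/349019936. Solution: wait for upstream PyTorch to add the feature;
- The quantized model produced with script_model_dir + static_quantize_dir raises an error at runtime (trace_model_dir + static_quantize_dir seems to work). Solution: wait for a fix from upstream PyTorch;
- The emit upsample problem under graph-mode quantization;

# Issues that can be worked around
- Static libraries are not installed into the install directory;
- Problems with the nccl_static and kineto libraries;
- Under static compilation, exported variables cannot include cuda shared libraries;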

The latest nvidia-container-toolkit caused inconsistent cuda version and 804 error.

### 1. Issue or feature description docker image gemfield/homepod:2.0-pro (Dockerfile: https://github.com/DeepVAC/MLab/blob/main/docker/homepod/Dockerfile.pro) installed official pytorch conda package on nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04. With below nvidia-container-toolkit version: ```bash gemfield@ai01:~$ dpkg -l | grep container |...

Can we install PAI on an existing k8s cluster?

If so, where can I find the documentation?

Fix torch tensor usage on newer pytorch version

This fix can help pytorch_classifiers run on 0.4.1 version.

[New feature] PyTorch updates since MLab HomePod 2.0 pro

- 0dc40474fe Peter Bell Tue Jul 6 19:05:39 2021 -0700 Migrate glu from the THC to ATen (CUDA) (#61153); note: glu is GatedLinearUnit;
- a69e947ffd Freey0 Wed Jul 7 07:42:49 2021 -0700 avg_pool3d_backward: Port...

Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired

Warning messages flood the output during model inference: - Environment: MLab HomePod 2.0 pro - Error message: ```bash [W ___torch_mangle_579.py:74] Warning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old...