xingjinglu

Results 6 comments of xingjinglu

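> @zdx3145 @zhangscth @DolbyYu @glbfor @kimy0909 @onexuan @homwen I have tried all of the methods discussed above and the problem is still not solved. Does anyone have another approach?

If multiple modules call the OpenMP build of ncnn and OpenMP is linked statically, the OpenMP runtime is initialized more than once, which causes an abort. I recommend linking the OpenMP library dynamically instead, which usually resolves the problem.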

I hope there will be some docs on how matmul is implemented on Volta.

@demiguo I have the same issue. Have you solved the problem?

Solved. You only need to change the configuration of the Xcode project under project/ios/, as follows:

1) Preprocessor Macros: $(inherited) MNN_CODEGEN_REGISTER=1 MNN_METAL_ENABLED=1 ENABLE_ARMV82=1 MNN_COREML_ENABLED=1 USE_LZ4_FLAG=1 MNN_USE_SPARSE_COMPUTE=1 MNN_LOW_MEMORY=1
2) Add the following files: source/backend/arm82/asm/arm64/low_memory/ source/backend/cpu/arm/arm64/low_memory
3) Set this compiler flag for the files added above: -march=armv8.2-a+fp16

For details, see https://github.com/xingjinglu/MNN/tree/memory_compress/project/ios/MNN.xcodeproj
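The macros from step 1 could also be kept in an .xcconfig file instead of being typed into the build settings UI. This is only a sketch: it assumes `GCC_PREPROCESSOR_DEFINITIONS`, the Xcode build setting key behind "Preprocessor Macros", and a hypothetical file name `MNN.xcconfig`. The macro values are copied from the comment above. The per-file flag from step 3 is not covered here and would still be set per source file under Build Phases > Compile Sources.

```xcconfig
// MNN.xcconfig (hypothetical file name)
// Same macro list as step 1, expressed as an xcconfig setting.
GCC_PREPROCESSOR_DEFINITIONS = $(inherited) MNN_CODEGEN_REGISTER=1 MNN_METAL_ENABLED=1 ENABLE_ARMV82=1 MNN_COREML_ENABLED=1 USE_LZ4_FLAG=1 MNN_USE_SPARSE_COMPUTE=1 MNN_LOW_MEMORY=1
```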

I have encountered the same problem and solved it. The reason is that, on the main branch of triton, the default version of ptxas, cuobjdump, and nvdisasm in triton is cuda-12.x (which is set...

> > upgrading the nvidia driver
>
> could you give a link about how to upgrade the driver?

There is a solution in the following issue: https://github.com/openai/triton/issues/1955#issuecomment-1929908209