SPIR-V fastmath mode
Using the `absval` op as an example.

Before:
; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 11
; Bound: 54
; Schema: 0
OpCapability Shader
%1 = OpExtInstImport "GLSL.std.450"
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %main "main" %gl_GlobalInvocationID
OpExecutionMode %main LocalSize 32 1 1
OpSource GLSL 450
OpSourceExtension "GL_EXT_shader_8bit_storage"
OpSourceExtension "GL_EXT_shader_explicit_arithmetic_types_int64"
OpName %main "main"
OpName %gi "gi"
OpName %gl_GlobalInvocationID "gl_GlobalInvocationID"
OpName %n "n"
OpName %parameter "parameter"
OpMemberName %parameter 0 "n"
OpName %p "p"
OpName %v "v"
OpName %bottom_top_blob "bottom_top_blob"
OpMemberName %bottom_top_blob 0 "bottom_top_blob_data"
OpName %_ ""
OpDecorate %gl_GlobalInvocationID BuiltIn GlobalInvocationId
OpDecorate %n SpecId 0
OpDecorate %parameter Block
OpMemberDecorate %parameter 0 Offset 0
OpDecorate %_runtimearr_v4float ArrayStride 16
OpDecorate %bottom_top_blob Block
OpMemberDecorate %bottom_top_blob 0 Offset 0
OpDecorate %_ Binding 0
OpDecorate %_ DescriptorSet 0
%void = OpTypeVoid
%3 = OpTypeFunction %void
%uint = OpTypeInt 32 0
%_ptr_Function_uint = OpTypePointer Function %uint
%v3uint = OpTypeVector %uint 3
%_ptr_Input_v3uint = OpTypePointer Input %v3uint
%gl_GlobalInvocationID = OpVariable %_ptr_Input_v3uint Input
%uint_0 = OpConstant %uint 0
%_ptr_Input_uint = OpTypePointer Input %uint
%n = OpSpecConstant %uint 0
%bool = OpTypeBool
%19 = OpSpecConstantOp %bool IEqual %n %uint_0
%parameter = OpTypeStruct %uint
%_ptr_PushConstant_parameter = OpTypePointer PushConstant %parameter
%p = OpVariable %_ptr_PushConstant_parameter PushConstant
%int = OpTypeInt 32 1
%int_0 = OpConstant %int 0
%_ptr_PushConstant_uint = OpTypePointer PushConstant %uint
%float = OpTypeFloat 32
%v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
%_runtimearr_v4float = OpTypeRuntimeArray %v4float
%bottom_top_blob = OpTypeStruct %_runtimearr_v4float
%_ptr_StorageBuffer_bottom_top_blob = OpTypePointer StorageBuffer %bottom_top_blob
%_ = OpVariable %_ptr_StorageBuffer_bottom_top_blob StorageBuffer
%_ptr_StorageBuffer_v4float = OpTypePointer StorageBuffer %v4float
%main = OpFunction %void None %3
%5 = OpLabel
%gi = OpVariable %_ptr_Function_uint Function
%20 = OpVariable %_ptr_Function_uint Function
%v = OpVariable %_ptr_Function_v4float Function
%14 = OpAccessChain %_ptr_Input_uint %gl_GlobalInvocationID %uint_0
%15 = OpLoad %uint %14
OpStore %gi %15
%16 = OpLoad %uint %gi
OpSelectionMerge %22 None
OpBranchConditional %19 %21 %31
%21 = OpLabel
%29 = OpAccessChain %_ptr_PushConstant_uint %p %int_0
%30 = OpLoad %uint %29
OpStore %20 %30
OpBranch %22
%31 = OpLabel
OpStore %20 %n
OpBranch %22
%22 = OpLabel
%32 = OpLoad %uint %20
%33 = OpUGreaterThanEqual %bool %16 %32
OpSelectionMerge %35 None
OpBranchConditional %33 %34 %35
%34 = OpLabel
OpReturn
%35 = OpLabel
%45 = OpLoad %uint %gi
%47 = OpAccessChain %_ptr_StorageBuffer_v4float %_ %int_0 %45
%48 = OpLoad %v4float %47
OpStore %v %48
%49 = OpLoad %v4float %v
%50 = OpExtInst %v4float %1 FAbs %49
OpStore %v %50
%51 = OpLoad %uint %gi
%52 = OpLoad %v4float %v
%53 = OpAccessChain %_ptr_StorageBuffer_v4float %_ %int_0 %51
OpStore %53 %52
OpReturn
OpFunctionEnd
After:
; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 11
; Bound: 55
; Schema: 0
OpCapability Shader
OpCapability FloatControls2
OpExtension "SPV_KHR_float_controls2"
%1 = OpExtInstImport "GLSL.std.450"
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %main "main" %gl_GlobalInvocationID
OpExecutionModeId %main FPFastMathDefault %float %uint_458752
OpExecutionMode %main LocalSize 32 1 1
OpSource GLSL 450
OpSourceExtension "GL_EXT_shader_8bit_storage"
OpSourceExtension "GL_EXT_shader_explicit_arithmetic_types_int64"
OpName %main "main"
OpName %gi "gi"
OpName %gl_GlobalInvocationID "gl_GlobalInvocationID"
OpName %n "n"
OpName %parameter "parameter"
OpMemberName %parameter 0 "n"
OpName %p "p"
OpName %v "v"
OpName %bottom_top_blob "bottom_top_blob"
OpMemberName %bottom_top_blob 0 "bottom_top_blob_data"
OpName %_ ""
OpDecorate %gl_GlobalInvocationID BuiltIn GlobalInvocationId
OpDecorate %n SpecId 0
OpDecorate %parameter Block
OpMemberDecorate %parameter 0 Offset 0
OpDecorate %_runtimearr_v4float ArrayStride 16
OpDecorate %bottom_top_blob Block
OpMemberDecorate %bottom_top_blob 0 Offset 0
OpDecorate %_ Binding 0
OpDecorate %_ DescriptorSet 0
%void = OpTypeVoid
%3 = OpTypeFunction %void
%uint = OpTypeInt 32 0
%_ptr_Function_uint = OpTypePointer Function %uint
%v3uint = OpTypeVector %uint 3
%_ptr_Input_v3uint = OpTypePointer Input %v3uint
%gl_GlobalInvocationID = OpVariable %_ptr_Input_v3uint Input
%uint_0 = OpConstant %uint 0
%_ptr_Input_uint = OpTypePointer Input %uint
%n = OpSpecConstant %uint 0
%bool = OpTypeBool
%19 = OpSpecConstantOp %bool IEqual %n %uint_0
%parameter = OpTypeStruct %uint
%_ptr_PushConstant_parameter = OpTypePointer PushConstant %parameter
%p = OpVariable %_ptr_PushConstant_parameter PushConstant
%int = OpTypeInt 32 1
%int_0 = OpConstant %int 0
%_ptr_PushConstant_uint = OpTypePointer PushConstant %uint
%float = OpTypeFloat 32
%v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
%_runtimearr_v4float = OpTypeRuntimeArray %v4float
%bottom_top_blob = OpTypeStruct %_runtimearr_v4float
%_ptr_StorageBuffer_bottom_top_blob = OpTypePointer StorageBuffer %bottom_top_blob
%_ = OpVariable %_ptr_StorageBuffer_bottom_top_blob StorageBuffer
%_ptr_StorageBuffer_v4float = OpTypePointer StorageBuffer %v4float
%uint_458752 = OpConstant %uint 458752
%main = OpFunction %void None %3
%5 = OpLabel
%gi = OpVariable %_ptr_Function_uint Function
%20 = OpVariable %_ptr_Function_uint Function
%v = OpVariable %_ptr_Function_v4float Function
%14 = OpAccessChain %_ptr_Input_uint %gl_GlobalInvocationID %uint_0
%15 = OpLoad %uint %14
OpStore %gi %15
%16 = OpLoad %uint %gi
OpSelectionMerge %22 None
OpBranchConditional %19 %21 %31
%21 = OpLabel
%29 = OpAccessChain %_ptr_PushConstant_uint %p %int_0
%30 = OpLoad %uint %29
OpStore %20 %30
OpBranch %22
%31 = OpLabel
OpStore %20 %n
OpBranch %22
%22 = OpLabel
%32 = OpLoad %uint %20
%33 = OpUGreaterThanEqual %bool %16 %32
OpSelectionMerge %35 None
OpBranchConditional %33 %34 %35
%34 = OpLabel
OpReturn
%35 = OpLabel
%45 = OpLoad %uint %gi
%47 = OpAccessChain %_ptr_StorageBuffer_v4float %_ %int_0 %45
%48 = OpLoad %v4float %47
OpStore %v %48
%49 = OpLoad %v4float %v
%50 = OpExtInst %v4float %1 FAbs %49
OpStore %v %50
%51 = OpLoad %uint %gi
%52 = OpLoad %v4float %v
%53 = OpAccessChain %_ptr_StorageBuffer_v4float %_ %int_0 %51
OpStore %53 %52
OpReturn
OpFunctionEnd
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.
:white_check_mark: futz12
:x: nihui
You have signed the CLA already but the status is still pending? Let us recheck it.
Codecov Report
:x: Patch coverage is 81.90476% with 19 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 95.59%. Comparing base (a514cf5) to head (157ff17).
:warning: Report is 2 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pipelinecache.cpp | 33.33% | 10 Missing :warning: |
| src/gpu.cpp | 91.95% | 7 Missing :warning: |
| src/pipeline.cpp | 0.00% | 2 Missing :warning: |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master    #6223      +/-   ##
==========================================
- Coverage   95.89%   95.59%   -0.30%
==========================================
  Files         837      838       +1
  Lines      264994   265097     +103
==========================================
- Hits       254105   253424     -681
- Misses      10889    11673     +784
```
:umbrella: View full report in Codecov by Sentry.
Binary size change of libncnn.so (bytes):
| architecture | base size | pr size | difference |
|---|---|---|---|
| x86_64 | 15124728 | 15133360 | +8632 :warning: |
| armhf | 6155744 | 6160304 | +4560 :warning: |
| aarch64 | 9453192 | 9453856 | +664 :warning: |
Thank you for your work. Please write up your implementation notes and experience, the difficulties you encountered and how you solved them, as an article in the discussion section; it will serve as a knowledge summary. https://github.com/Tencent/ncnn/discussions