FillArrays.jl
FillArrays.jl copied to clipboard
Fix some Base.@propagate_inbounds annotations
I noticed the compiler wasn't quite reducing things as it should, so here are some edits to (hopefully) improve performance. Thanks to @mbauman and @YingboMa for Slack discussion.
Suppose we have
julia> Z = Zeros(10)
10-element Zeros{Float64}
julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)
Then
Before
julia> @code_llvm f(Z,3)
; @ REPL[5]:1 within `f`
define double @julia_f_2043([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
%2 = alloca [1 x i64], align 8
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:37 within `_fill_getindex`
%3 = getelementptr inbounds [1 x i64], [1 x i64]* %2, i64 0, i64 0
store i64 %1, i64* %3, align 8
; ││┌ @ abstractarray.jl:659 within `checkbounds` @ abstractarray.jl:644
; │││┌ @ abstractarray.jl:718 within `checkindex`
; ││││┌ @ int.jl:471 within `<=`
%4 = icmp slt i64 %1, 1
; ││││└
; ││││┌ @ range.jl:686 within `last`
; │││││┌ @ Base.jl:42 within `getproperty`
%5 = getelementptr inbounds [1 x [1 x [1 x i64]]], [1 x [1 x [1 x i64]]]* %0, i64 0, i64 0, i64 0, i64 0
; ││││└└
; ││││┌ @ int.jl:471 within `<=`
%6 = load i64, i64* %5, align 8
%7 = icmp slt i64 %6, %1
; │││└└
; │││ @ abstractarray.jl:659 within `checkbounds`
%8 = or i1 %4, %7
br i1 %8, label %L12, label %L17
L12: ; preds = %top
%9 = call nonnull {}* @j_throw_boundserror_2045([1 x [1 x [1 x i64]]]* nocapture nonnull readonly %0, [1 x i64]* nocapture readonly %2) #0
call void @llvm.trap()
unreachable
L17: ; preds = %top
; └└└
ret double 0.000000e+00
}
After
julia> @code_llvm f(Z,3)
; @ REPL[5]:1 within `f`
define double @julia_f_1017([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
ret double 0.000000e+00
}
Thanks! Any chance you can add tests?
Codecov Report
Merging #155 (9878b6a) into master (269fd41) will not change coverage. The diff coverage is
100.00%.
@@ Coverage Diff @@
## master #155 +/- ##
=======================================
Coverage 99.21% 99.21%
=======================================
Files 4 4
Lines 633 633
=======================================
Hits 628 628
Misses 5 5
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/FillArrays.jl | 99.68% <100.00%> (ø) |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Only way I can think of offhand is to redirect stdout, capture it in a string and check its length. But I'm not sure how to do that. Any ideas?
I mean, there's
help?> redirect_stdout
search: redirect_stdout redirect_stdio redirect_stdin redirect_stderr
redirect_stdout([stream]) -> stream
Create a pipe to which all C and Julia level stdout output will be redirected. Return a stream representing the pipe ends. Data written to
stdout may now be read from the rd end of the pipe.
│ Note
│
│ stream must be a compatible objects, such as an IOStream, TTY, Pipe, socket, or devnull.
See also redirect_stdio.
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
redirect_stdout(f::Function, stream)
Run the function f while redirecting stdout to stream. Upon completion, stdout is restored to its prior setting.
TTY looks like closest, but I have no idea how to create one
help?> Base.TTY
No documentation found.
Summary
≡≡≡≡≡≡≡≡≡
mutable struct Base.TTY
Fields
≡≡≡≡≡≡≡≡
handle :: Ptr{Nothing}
status :: Int64
buffer :: IOBuffer
cond :: Base.GenericCondition{Base.Threads.SpinLock}
readerror :: Any
sendbuf :: Union{Nothing, IOBuffer}
lock :: ReentrantLock
throttle :: Int64
Supertype Hierarchy
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
Base.TTY <: Base.LibuvStream <: IO <: Any
@dlfivefifty Maybe something like this?
julia> using FillArrays, Test
julia> function llvm_lines(a)
f(x,j) = @inbounds x[j]
io = IOBuffer()
code_llvm(io, f, (typeof(a), Int))
# countlines(io)
count(==('\n'), String(take!(io)))
end
llvm_lines (generic function with 1 method)
julia> @test llvm_lines(Zeros(10)) < 10
Test Passed
Expression: llvm_lines(Zeros(10)) < 10
Evaluated: 5 < 10
julia> @test llvm_lines(Ones(10)) < 10
Test Passed
Expression: llvm_lines(Ones(10)) < 10
Evaluated: 5 < 10
julia> @test llvm_lines(Fill(2.0,10)) < 10
Test Failed at REPL[65]:1
Expression: llvm_lines(Fill(2.0, 10)) < 10
Evaluated: 12 < 10
ERROR: There was an error during testing
Need to check on that last one again.
EDIT: Mostly it's just comments, looks ok
julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)
julia> @code_llvm f(Fill(2.0, 10), 3)
; @ REPL[66]:1 within `f`
define double @julia_f_1329({ double, [1 x [1 x i64]] }* nocapture nonnull readonly align 8 dereferenceable(16) %0, i64 signext %1) #0 {
top:
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:38 within `_fill_getindex`
; ││┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:127 within `getindex_value`
; │││┌ @ Base.jl:42 within `getproperty`
%2 = getelementptr inbounds { double, [1 x [1 x i64]] }, { double, [1 x [1 x i64]] }* %0, i64 0, i32 0
; └└└└
%3 = load double, double* %2, align 8
ret double %3
}
Oh and I can make debuginfo=:none, then we get 6 lines at most.
Hmm, this works interactively but fails in the test environment with
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1337
Expression: llvm_lines(Zeros(10)) < 10
Evaluated: 20 < 10
Stacktrace:
[1] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
[2] macro expansion
@ ~/git/FillArrays.jl/test/runtests.jl:1337 [inlined]
[3] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
[4] top-level scope
@ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1338
Expression: llvm_lines(Ones(10)) < 10
Evaluated: 20 < 10
Stacktrace:
[1] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
[2] macro expansion
@ ~/git/FillArrays.jl/test/runtests.jl:1338 [inlined]
[3] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
[4] top-level scope
@ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1339
Expression: llvm_lines(Fill(2.0, 10)) < 10
Evaluated: 22 < 10
Stacktrace:
[1] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
[2] macro expansion
@ ~/git/FillArrays.jl/test/runtests.jl:1339 [inlined]
[3] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
[4] top-level scope
@ ~/git/FillArrays.jl/test/runtests.jl:1329
Test Summary: | Fail Total
Inbounds optimization | 3 3
ERROR: LoadError: Some tests did not pass: 0 passed, 3 failed, 0 errored, 0 broken.
in expression starting at /home/chad/git/FillArrays.jl/test/runtests.jl:1328
ERROR: Package FillArrays errored during testing
I guess ]test must use a lower optimization setting? Not sure how we can test in that case.
You'd need PerformanceTestTools.jl to actually disable bounds checks in test.
Maybe it's not worth the hassle and we just merge?
Let me take one more pass at it. If this doesn't work I'll roll it back to just the code updates
And thanks @YingboMa , I hadn't heard of that one :)
Nope, no luck. I left the code commented out in case we can come back to it
That's not how you use PerformanceTestTools.jl. See https://github.com/YingboMa/FastBroadcast.jl/blob/master/test/runtests.jl#L84