FillArrays.jl icon indicating copy to clipboard operation
FillArrays.jl copied to clipboard

Fix some Base.@propagate_inbounds annotations

Open cscherrer opened this issue 4 years ago • 12 comments

I noticed the compiler wasn't quite reducing things as it should, so here are some edits to (hopefully) improve performance. Thanks to @mbauman and @YingboMa for Slack discussion.

Suppose we have

julia> Z = Zeros(10)
10-element Zeros{Float64}

julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)

Then

Before

julia> @code_llvm f(Z,3)
;  @ REPL[5]:1 within `f`
define double @julia_f_2043([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
  %2 = alloca [1 x i64], align 8
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:37 within `_fill_getindex`
    %3 = getelementptr inbounds [1 x i64], [1 x i64]* %2, i64 0, i64 0
    store i64 %1, i64* %3, align 8
; ││┌ @ abstractarray.jl:659 within `checkbounds` @ abstractarray.jl:644
; │││┌ @ abstractarray.jl:718 within `checkindex`
; ││││┌ @ int.jl:471 within `<=`
       %4 = icmp slt i64 %1, 1
; ││││└
; ││││┌ @ range.jl:686 within `last`
; │││││┌ @ Base.jl:42 within `getproperty`
        %5 = getelementptr inbounds [1 x [1 x [1 x i64]]], [1 x [1 x [1 x i64]]]* %0, i64 0, i64 0, i64 0, i64 0
; ││││└└
; ││││┌ @ int.jl:471 within `<=`
       %6 = load i64, i64* %5, align 8
       %7 = icmp slt i64 %6, %1
; │││└└
; │││ @ abstractarray.jl:659 within `checkbounds`
     %8 = or i1 %4, %7
     br i1 %8, label %L12, label %L17

L12:                                              ; preds = %top
     %9 = call nonnull {}* @j_throw_boundserror_2045([1 x [1 x [1 x i64]]]* nocapture nonnull readonly %0, [1 x i64]* nocapture readonly %2) #0
     call void @llvm.trap()
     unreachable

L17:                                              ; preds = %top
; └└└
  ret double 0.000000e+00
}

After

julia> @code_llvm f(Z,3)
;  @ REPL[5]:1 within `f`
define double @julia_f_1017([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
  ret double 0.000000e+00
}

cscherrer avatar Aug 11 '21 23:08 cscherrer

Thanks! Any chance you can add tests?

dlfivefifty avatar Aug 12 '21 01:08 dlfivefifty

Codecov Report

Merging #155 (9878b6a) into master (269fd41) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #155   +/-   ##
=======================================
  Coverage   99.21%   99.21%           
=======================================
  Files           4        4           
  Lines         633      633           
=======================================
  Hits          628      628           
  Misses          5        5           
Impacted Files Coverage Δ
src/FillArrays.jl 99.68% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Aug 12 '21 01:08 codecov[bot]

Only way I can think of offhand is to redirect stdout, capture it in a string and check its length. But I'm not sure how to do that. Any ideas?

I mean, there's

help?> redirect_stdout
search: redirect_stdout redirect_stdio redirect_stdin redirect_stderr

  redirect_stdout([stream]) -> stream

  Create a pipe to which all C and Julia level stdout output will be redirected. Return a stream representing the pipe ends. Data written to
  stdout may now be read from the rd end of the pipe.

  │ Note
  │
  │  stream must be a compatible objects, such as an IOStream, TTY, Pipe, socket, or devnull.

  See also redirect_stdio.

  ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  redirect_stdout(f::Function, stream)

  Run the function f while redirecting stdout to stream. Upon completion, stdout is restored to its prior setting.

TTY looks like closest, but I have no idea how to create one

help?> Base.TTY
  No documentation found.

  Summary
  ≡≡≡≡≡≡≡≡≡

  mutable struct Base.TTY

  Fields
  ≡≡≡≡≡≡≡≡

  handle    :: Ptr{Nothing}
  status    :: Int64
  buffer    :: IOBuffer
  cond      :: Base.GenericCondition{Base.Threads.SpinLock}
  readerror :: Any
  sendbuf   :: Union{Nothing, IOBuffer}
  lock      :: ReentrantLock
  throttle  :: Int64

  Supertype Hierarchy
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  Base.TTY <: Base.LibuvStream <: IO <: Any

cscherrer avatar Aug 12 '21 01:08 cscherrer

@dlfivefifty Maybe something like this?

julia> using FillArrays, Test

julia> function llvm_lines(a)
           f(x,j) = @inbounds x[j]
           io = IOBuffer()
           code_llvm(io, f, (typeof(a), Int))
           # countlines(io)
           count(==('\n'), String(take!(io)))
       end
llvm_lines (generic function with 1 method)

julia> @test llvm_lines(Zeros(10)) < 10
Test Passed
  Expression: llvm_lines(Zeros(10)) < 10
   Evaluated: 5 < 10

julia> @test llvm_lines(Ones(10)) < 10
Test Passed
  Expression: llvm_lines(Ones(10)) < 10
   Evaluated: 5 < 10

julia> @test llvm_lines(Fill(2.0,10)) < 10
Test Failed at REPL[65]:1
  Expression: llvm_lines(Fill(2.0, 10)) < 10
   Evaluated: 12 < 10
ERROR: There was an error during testing

Need to check on that last one again.

EDIT: Mostly it's just comments, looks ok

julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)

julia> @code_llvm f(Fill(2.0, 10), 3)
;  @ REPL[66]:1 within `f`
define double @julia_f_1329({ double, [1 x [1 x i64]] }* nocapture nonnull readonly align 8 dereferenceable(16) %0, i64 signext %1) #0 {
top:
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:38 within `_fill_getindex`
; ││┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:127 within `getindex_value`
; │││┌ @ Base.jl:42 within `getproperty`
      %2 = getelementptr inbounds { double, [1 x [1 x i64]] }, { double, [1 x [1 x i64]] }* %0, i64 0, i32 0
; └└└└
  %3 = load double, double* %2, align 8
  ret double %3
}

cscherrer avatar Aug 12 '21 14:08 cscherrer

Oh and I can make debuginfo=:none, then we get 6 lines at most.

cscherrer avatar Aug 12 '21 14:08 cscherrer

Hmm, this works interactively but fails in the test environment with

Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1337
  Expression: llvm_lines(Zeros(10)) < 10
   Evaluated: 20 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1337 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1338
  Expression: llvm_lines(Ones(10)) < 10
   Evaluated: 20 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1338 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1339
  Expression: llvm_lines(Fill(2.0, 10)) < 10
   Evaluated: 22 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1339 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Test Summary:         | Fail  Total
Inbounds optimization |    3      3
ERROR: LoadError: Some tests did not pass: 0 passed, 3 failed, 0 errored, 0 broken.
in expression starting at /home/chad/git/FillArrays.jl/test/runtests.jl:1328
ERROR: Package FillArrays errored during testing

I guess ]test must use a lower optimization setting? Not sure how we can test in that case.

cscherrer avatar Aug 12 '21 14:08 cscherrer

You'd need PerformanceTestTools.jl to actually disable bounds checks in test.

YingboMa avatar Aug 12 '21 15:08 YingboMa

Maybe it's not worth the hassle and we just merge?

dlfivefifty avatar Aug 12 '21 16:08 dlfivefifty

Let me take one more pass at it. If this doesn't work I'll roll it back to just the code updates

cscherrer avatar Aug 12 '21 16:08 cscherrer

And thanks @YingboMa , I hadn't heard of that one :)

cscherrer avatar Aug 12 '21 16:08 cscherrer

Nope, no luck. I left the code commented out in case we can come back to it

cscherrer avatar Aug 12 '21 16:08 cscherrer

That's not how you use PerformanceTestTools.jl. See https://github.com/YingboMa/FastBroadcast.jl/blob/master/test/runtests.jl#L84

YingboMa avatar Aug 12 '21 16:08 YingboMa