FaceDetection.jl
Optimise memory usage: GPU array storage and other small changes
See #24205 in JuliaLang/julia: https://github.com/JuliaLang/julia/issues/
Namely after "smart" feature selection
Done in commit e9116987.
#25 addresses this broader issue of memory efficiency
Commit 61b03826 addresses this
Points to consider and possibly implement:
- [ ] GPU processing
- [ ] Use ArrayFire.jl's `load_image`
- [ ] Use CUDA.jl
- [ ] Use
- [ ] Use 4 x ones or -ones using `Int8` (ask @dmipeck)
- [ ] `pmap`
- [ ] `Mmap`
- [ ] Use `SharedArrays`
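As a rough illustration of the `Int8` item in the list above, the sketch below compares the storage needed for an array of ±1 values held as `Float64` versus `Int8`. It assumes the stored values only ever need to be +1 or -1; the array length `n` and the variable names are hypothetical and not taken from the package. Note that the overview further down reports that `get_vote` later had to return `Float64` for correctness, so this saving may not apply everywhere.

```julia
# Assumption: the stored values are only ever +1 or -1, so Int8 can hold them.
n = 1_000_000  # hypothetical number of stored values

votes_f64 = ones(Float64, n)  # 8 bytes per element
votes_i8 = -ones(Int8, n)     # 1 byte per element; negation keeps the Int8 element type

# sizeof on an Array reports the number of bytes used by its elements.
println("Float64 storage (bytes): ", sizeof(votes_f64))
println("Int8 storage (bytes):    ", sizeof(votes_i8))
println("Ratio:                   ", sizeof(votes_f64) ÷ sizeof(votes_i8))
```

Under that assumption the element data shrinks by a factor of eight, which is the kind of saving the checklist item appears to be after.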
Commit 70d7d74d addresses this
An overview so far.
These benchmark results come from running the tests on commits made since we changed the algorithm to run sequentially.
| Commit | Benchmark Time of Tests (seconds) | % Time Improvement Since Previous Listed Commit | Number of Allocations | Memory Allocation | % Memory Improvement Since Previous Listed Commit |
|---|---|---|---|---|---|
| a4689195 | 30.689 | — a | 371318354 | 6.38 GiB | — a |
| da9c833e | 7.768 | 74.69 | 104464112 | 2.51 GiB | 60.66 |
| b3aec6b8 | 5.025 | 35.31 | 28589987 | 713.89 MiB | 72.22 |
| 3e9be4ad | 5.242 | -4.32 b | 46688538 | 990.05 MiB | -38.68 b |
a: I did not benchmark before this commit, though it probably wouldn't be hard to check out an older version and rewrite some tests for it.
b: In this commit, I had to change the output of the `get_vote` function from `Int8` to `Float64` for correctness, which is why the benchmarks regress relative to the previous listed commit.
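For anyone wanting to collect figures like those in the table, the following is a minimal sketch using Base's `@timed` macro, which reports elapsed time and allocated bytes (as a named tuple on Julia 1.5 and later). The `workload` function is a hypothetical stand-in; the numbers in the table came from the package's tests, and the exact tooling used to produce them is not stated here.

```julia
# Hypothetical stand-in workload; the real measurements ran the package's tests.
function workload(n)
    total = 0.0
    for _ in 1:n
        total += sum(rand(100))  # allocates a fresh 100-element vector each iteration
    end
    return total
end

workload(1)                      # warm-up call so compilation is not measured
stats = @timed workload(10_000)  # fields include value, time, bytes and gctime

println("time (s):     ", stats.time)
println("memory (MiB): ", stats.bytes / 2^20)
```

Running the warm-up call first keeps compilation time and compilation allocations out of the measurement.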
NB: the percentage time improvement since the previous listed commit can be calculated very easily:
```julia
julia> improvement(a, b) = ((a - b) / a) * 100
improvement (generic function with 1 method)

julia> improvement(30.689, 7.768)
74.68799895728111
```
That is to say, there was a 74.7% improvement between times 30.689 s and 7.768 s.
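The same formula can be applied to the whole time column at once. The sketch below is illustrative only: it redefines `improvement` so the block is self-contained, and the vector name `times` is hypothetical. The values are the benchmark times from the table.

```julia
# Percentage improvement going from measurement a to measurement b.
improvement(a, b) = ((a - b) / a) * 100

# Benchmark times (seconds) from the table, oldest commit first.
times = [30.689, 7.768, 5.025, 5.242]

# Compare each commit with the previous listed one.
improvements = [improvement(times[i], times[i + 1]) for i in 1:length(times) - 1]

println(round.(improvements; digits = 2))  # roughly 74.69, 35.31 and -4.32
```

A negative value indicates a regression, as with the last commit in the table.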