CNN-on-flash
CNN functions for dense matrices resident in flash storage
The goal of this project is to run Convolutional Neural Network layers on flash-resident matrices. Currently, GEMM using NEON on ARM CPUs is implemented.
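The core idea behind a flash-resident GEMM is blocking: the large operands are split into square submatrices (GEMM_BLK_SIZE on a side), and only a few blocks need to be in memory at any time. The sketch below illustrates only the blocking scheme, with the matrices held in RAM for simplicity; in the actual project the block reads come from flash and the inner block multiply is dispatched to NEON kernels.

```cpp
#include <cstddef>
#include <vector>

// Multiply A (n x n) by B (n x n) into C (n x n), all row-major,
// one BLK x BLK block pair at a time. BLK plays the role of
// GEMM_BLK_SIZE; n is assumed to be a multiple of BLK.
void blocked_gemm(const std::vector<float>& A,
                  const std::vector<float>& B,
                  std::vector<float>& C,
                  std::size_t n, std::size_t BLK) {
    for (std::size_t bi = 0; bi < n; bi += BLK)
        for (std::size_t bj = 0; bj < n; bj += BLK)
            for (std::size_t bk = 0; bk < n; bk += BLK)
                // Block update: C[bi:,bj:] += A[bi:,bk:] * B[bk:,bj:].
                // In CNN-on-flash these two source blocks would be
                // fetched from flash by the IO threads.
                for (std::size_t i = bi; i < bi + BLK; ++i)
                    for (std::size_t k = bk; k < bk + BLK; ++k) {
                        float a = A[i * n + k];
                        for (std::size_t j = bj; j < bj + BLK; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

Because each block triple is independent apart from the accumulation into C, block fetches and block multiplies can be overlapped, which is why the build exposes separate IO and compute thread counts.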
References
This project is based on BLAS-on-flash and runs on top of the Arm Compute Library.
- BLAS-on-flash https://github.com/microsoft/BLAS-on-flash
- Arm Compute Library https://github.com/ARM-software/ComputeLibrary
Requirements
- Ubuntu 16.04
- Arm Compute Library 19.02
- built with the NEON option enabled
Setting options
Set the options in CMakeLists.txt as you want.
vim CMakeLists.txt
- PROGRAM_BUDGET: memory budget of the GEMM, in bytes
- GEMM_BLK_SIZE: number of rows and columns of each submatrix
- N_IO_THR: number of IO threads
- N_COMPUTE_THR: number of compute threads
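As a sketch, the corresponding lines in CMakeLists.txt might look as follows; the values shown here are illustrative, not recommendations, and should be tuned for your board.

```cmake
# Illustrative values only -- tune for your device.
set (PROGRAM_BUDGET 1073741824)  # memory budget for the GEMM, in bytes (1 GiB here)
set (GEMM_BLK_SIZE 512)          # rows/cols of each submatrix block
set (N_IO_THR 4)                 # number of IO threads
set (N_COMPUTE_THR 4)            # number of compute threads
```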
Build instructions
git clone
vim CMakeLists.txt
- modify set (ACL_ROOT [arm_compute_library_path])
mkdir bin && cd bin
cmake ..
make
cd ..
Execution
gemm execution
cd misc
chmod +x gemm.sh
./exec.sh [A_row] [B_row] [B_col]
Example experiment result
Example case with
- size of input and output matrices = 4096x4096
- GEMM_BLK_SIZE = 512
- various memory budgets
- run on an Odroid-XU4 (Exynos 5422)

Inference time and maximum memory usage are shown in the following graph.

A more detailed explanation of the method and results can be found in the BLAS-on-flash paper and this paper.
License
CNN-on-flash is open-source software licensed under the MIT license.