[onert] User Scenario of On-Device Compiler on ONERT
Let's consider a user scenario with the on-device compiler on onert.
Q. What functionality do ONE's users (developers) expect from the nnfw API (including the config file)?
- (allow users to) enable/disable On-Device Compilation (ODC) in the app
- (allow users to) choose when ODC will be triggered
- choose how ODC is done
- partition information(?)
- Successful Workflow
```mermaid
flowchart TD
subgraph "nnpkg"
n1(F32 circle)
end
subgraph "nnpkg2"
n2(Quantized circle)
end
subgraph "nnpkg3"
n3(tvn)
end
nnpkg-->|1. collect rep. data|nnpkg-->|2. ondevice quantization|nnpkg2-->|3. ondevice compilation|nnpkg3
```
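The three steps in the flowchart can be sketched as a simple pipeline. Every function name below is a hypothetical stand-in for illustration only, not an onert API:

```python
# Hypothetical sketch of the ODC pipeline from the flowchart above.
# None of these functions exist in onert; they only name the steps.

def collect_rep_data(circle_path: str) -> str:
    """Step 1: run the F32 circle model on-device and record
    representative input data (e.g. min/max ranges) for calibration."""
    return circle_path + ".minmax"

def quantize_on_device(circle_path: str, rep_data: str) -> str:
    """Step 2: produce a quantized circle using the recorded ranges."""
    return circle_path.replace(".circle", ".q8.circle")

def compile_on_device(q_circle_path: str) -> str:
    """Step 3: compile the quantized circle into a backend binary (tvn)."""
    return q_circle_path.replace(".q8.circle", ".tvn")

rep = collect_rep_data("model.circle")
q_model = quantize_on_device("model.circle", rep)
tvn = compile_on_device(q_model)
print(tvn)  # model.tvn
```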
- What if ... ?
  - accuracy drop
  - long compilation time
  - unable to compile due to memory consumption, unsupported ops, etc.
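One way to guard against the failure cases above is a fallback path: if ODC fails, keep running the original circle model. A minimal sketch, assuming a compile step that raises on failure (hypothetical code, not onert's actual behavior):

```python
def choose_model(compile_fn, circle_path: str) -> str:
    """Try on-device compilation; on any failure (unsupported op,
    out of memory, ...) fall back to the original circle model."""
    try:
        return compile_fn(circle_path)   # e.g. produces "model.tvn"
    except (RuntimeError, MemoryError):
        return circle_path               # keep running the F32 circle

# Hypothetical compilers for illustration:
ok = lambda p: p.replace(".circle", ".tvn")
def broken(p):
    raise RuntimeError("unsupported op")

print(choose_model(ok, "model.circle"))      # model.tvn
print(choose_model(broken, "model.circle"))  # model.circle
```

Accuracy drop and compilation time are harder to guard automatically; they may need a policy decision (e.g. a time budget, or shipping pre-recorded calibration data).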
Conclusion from offline discussion
- Let's minimize the user-level API
- Enable ODC via the config file; without this flag, ODC is disabled.
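A minimal sketch of how such a flag could be read, assuming config.cfg holds one `key value` pair per line (the parsing code is illustrative, not onert's actual implementation):

```python
def odc_enabled(config_text: str) -> bool:
    """Return True only when 'OnDeviceCompilation 1' appears in the config.
    A missing flag (or '0') means ODC stays disabled -- the default."""
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "OnDeviceCompilation":
            return parts[1] == "1"
    return False  # no flag -> ODC disabled

print(odc_enabled("OnDeviceCompilation 1"))  # True
print(odc_enabled(""))                       # False
```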
Here is a user scenario, plus ONE's internal workflow in detail, with a quantized circle:
- Assumption A1. model.q8.circle is a compilable circle with (trix-compatible) q8 quantization
- The user implements the app with the nnfw API as usual, but the nnpackage for this app looks like:

```
nnpackage
├── metadata
│   ├── MANIFEST
│   └── config.cfg
└── model.circle
```
config.cfg

```
...
OnDeviceCompilation 1
...
```
- Compile model.circle into model.tvn
  - 2-1. Compilation is done before the first run
- Reconstruct the nnpkg with the new tvn binary
  - 3-1. Reconstruction spec: where to place the new tvn binary, and how to update the MANIFEST?
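One possible shape for the reconstruction step, sketched in Python. The `models` / `model-types` keys follow the usual nnpackage MANIFEST convention, but the exact keys and placement policy are exactly the open questions of 3-1, so treat this as an assumption:

```python
import json

def reconstruct_manifest(manifest_text: str) -> str:
    """Rewrite MANIFEST so it points at the newly compiled tvn binary
    instead of the original circle model. Assumes nnpackage-style
    'models' / 'model-types' arrays; the real spec is still undecided (3-1)."""
    manifest = json.loads(manifest_text)
    manifest["models"] = ["model.tvn"]
    manifest["model-types"] = ["tvn"]
    return json.dumps(manifest, indent=2)

before = json.dumps({
    "major-version": "1", "minor-version": "0", "patch-version": "0",
    "models": ["model.circle"], "model-types": ["circle"],
})
print(reconstruct_manifest(before))
```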
Is Apple doing on-device compiling in iOS? : https://developer.apple.com/documentation/coreml/mlmodel/3931181-compilemodel
Need to investigate more.
MLModel.CompileModel(NSUrl, NSError) Method (CoreML) | Microsoft Docs gives more info than Apple :)
After investigating the web docs, IMHO compilation in Core ML is like compilation (nnfw_prepare in nnfw.h) in onert, not on-device compilation for an NPU.
The https://github.com/hollance/neural-engine repo shows many details (though they are just guesses). According to the repo, Core ML generates an execution plan (e.g., which part of the mlmodel will run on the GPU/CPU/NPU) during compilation. That is similar to backend assignment in ONERT.