Roadmap

Open mydmdm opened this issue 3 years ago • 5 comments

nn-Meter is not only a latency predictor but also a critical component in hardware-aware model design. It enables existing NAS (neural architecture search) and other efficient model design tasks to be specialized for the target hardware platform.

Multiple aspects will be covered in this and related repos, including:

  • latency prediction and pre-trained predictors
    • the IR converter, kernel detection tools
    • builtin kernel predictors and pre-trained weights
  • algorithm integration (mainly in NNI): integrating latency prediction into existing NAS and compression algorithms
  • model latency dataset: the collected latencies of thousands of model architectures, along with data loaders and an improved GNN predictor

Release Plan

version 1.0-alpha

  • Date: 2021 August
  • Latency prediction
    • [x] basic framework and utilities for latency prediction (e.g., config management, artifacts downloading, builtin predictors)
    • [x] basic CI workflow with integrated test
    • [x] documentation and examples
  • Algorithm integration
    • [x] initial multi-trial NAS example

version 1.0-beta

  • Date: 2021 November
  • Algorithm integration
    • [x] SPOS / Proxyless NAS in NNI
    • [x] ~~SPOS: first integrate nn-meter in the evolution search~~ (move to 2.0)
    • [x] Proxyless NAS: predict the block latency in the search space, provide the lookup table
  • Dataset
    • [x] make model-latency dataset public
    • [x] reference design of an improved GNN latency predictor
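
As a rough illustration of the Proxyless NAS item above, a block-latency lookup table can be combined with architecture weights to give an expected latency for the whole search space. The following is a minimal sketch under assumptions: the table entries, the op names, and the `expected_latency` helper are hypothetical and are not nn-Meter or NNI API. In practice the table values would come from a latency predictor rather than being hard-coded.

```python
import math

# Hypothetical lookup table: (layer index, candidate op name) -> latency in ms.
# The numbers are made up for illustration only.
LATENCY_TABLE = {
    (0, "mbconv3_k3"): 1.0,
    (0, "mbconv6_k5"): 3.0,
    (1, "mbconv3_k3"): 2.0,
    (1, "mbconv6_k5"): 4.0,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(arch_logits, table):
    """Sum, over layers, of the probability-weighted candidate latencies.

    arch_logits maps layer index -> {op name: architecture logit}.
    """
    total = 0.0
    for layer, op_logits in arch_logits.items():
        ops = list(op_logits)
        probs = softmax([op_logits[op] for op in ops])
        total += sum(p * table[(layer, op)] for p, op in zip(probs, ops))
    return total


# Equal logits mean each candidate op is equally likely in every layer.
uniform_logits = {
    0: {"mbconv3_k3": 0.0, "mbconv6_k5": 0.0},
    1: {"mbconv3_k3": 0.0, "mbconv6_k5": 0.0},
}
uniform_latency = expected_latency(uniform_logits, LATENCY_TABLE)

# Strongly preferring the cheaper op in both layers approaches 1.0 + 2.0 ms.
cheap_logits = {
    0: {"mbconv3_k3": 50.0, "mbconv6_k5": -50.0},
    1: {"mbconv3_k3": 50.0, "mbconv6_k5": -50.0},
}
cheap_latency = expected_latency(cheap_logits, LATENCY_TABLE)
```

Because the expected latency is a smooth function of the architecture logits, a term like this can be added to the training loss so that the search is nudged toward faster blocks.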

version 2.0

  • Date: 2021 ~~November~~ December
  • Algorithm integration
    • [x] SPOS: first integrate nn-meter in the evolution search
  • Latency predictor building tools
    • [x] fusion rule detection
    • [x] adaptive data sampler
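
The fusion rule detection item can be pictured as a latency comparison: two operators are probably fused by the backend when running them together is noticeably faster than the sum of running them separately. Below is a minimal sketch, assuming a hypothetical `is_fused` helper and an illustrative threshold `alpha`. The latencies are made-up numbers, and the exact criterion used in nn-Meter may differ.

```python
def is_fused(lat_a, lat_b, lat_ab, alpha=0.5):
    """Heuristic fusion test (an assumption, not nn-Meter's exact rule).

    lat_a, lat_b: latencies of the two operators measured separately.
    lat_ab: latency of the two operators measured back to back in one model.
    Returns True when the saving exceeds alpha times the cheaper operator.
    """
    saving = lat_a + lat_b - lat_ab
    return saving > alpha * min(lat_a, lat_b)


# Made-up measurements in milliseconds.
# Running the pair costs about the same as the first op alone: likely fused.
conv_relu_fused = is_fused(lat_a=10.0, lat_b=2.0, lat_ab=10.1)
# Running the pair costs about the sum of both ops: likely not fused.
conv_pool_fused = is_fused(lat_a=10.0, lat_b=3.0, lat_ab=12.9)
```

Running such a test over each pair of operator types on a device gives a table of which pairs that device's backend fuses.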

mydmdm avatar Aug 15 '21 09:08 mydmdm

Hello,

the paper mentions methods for:

  1. detecting the fusion rules on a device
  2. adaptive sampling for creating the latency dataset

Will these be added to the repository? If so, do you have a rough time frame for when they will be available?

gmimsgit avatar Aug 26 '21 12:08 gmimsgit

@gmimsgit Hi, we plan to add the fusion rule detection and adaptive sampling algorithms. We will start on them once version 1.0-beta is finished.

Lynazhang avatar Aug 27 '21 06:08 Lynazhang

@Lynazhang Thanks for the quick answer.

I appreciate the effort put into polishing the code base, as it allowed me to get started quickly. The fusion rule detection and adaptive sampling are especially interesting, as I am currently trying to predict/benchmark a new device. The paper has been very helpful in this regard and I would love to try out the implementation.

If it is not an inconvenience, would it be possible to get the current state of the code?

gmimsgit avatar Aug 27 '21 08:08 gmimsgit

Hi, I'm wondering if you would share your modifications to TFLite that implement the GPU operator-level profiling?

liuyibox avatar Sep 20 '21 05:09 liuyibox

Hi @liuyibox, we will soon share a patch for the GPU operator-level profiling.

Lynazhang avatar Oct 30 '21 08:10 Lynazhang