Hyeongseok Oh
This commit revisits IOTensor:
- OperandInfo (shape, type, dynamic flag) is maintained in its own `_info` field
- Remove the `orig_info()` and `orig_layout()` getters
- setTensor methods update its own...
- Revisit IOTensor and the Executor interface
- Revisit IPortableTensor
- APIs for input/output
- Run with a float input/output buffer
- Revise the tensor classes: IPortableTensor, IOTensor, UserTensor
- Revise Executor I/O setting
- Merge the execute methods: use option, input & output tensors
- Revise MultiModelExecutor to be type-aware
- Remove IOTensor...
### What?

We can consider a "hidden switching" mechanism: allocating a backend automatically based on a user requirement such as best performance or lowest memory usage. The runtime then needs a mechanism to estimate which backend to allocate...
### What?

If the user loads a quantized model, the user can set float-type input buffers and data, and a float-type output buffer. The runtime then quantizes the input when reading the input data, and dequantizes the output...
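The boundary conversion described above can be sketched as follows. This is a minimal illustration of affine uint8 quantization, not the runtime's actual API; the helper names and the scale/zero-point values are assumptions.

```python
def quantize(x, scale, zero_point):
    """Map a float value into uint8 range using affine quantization."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # clamp to uint8

def dequantize(q, scale, zero_point):
    """Recover an approximate float value from a uint8 value."""
    return (q - zero_point) * scale

# Illustrative quantization parameters (assumed, not from the model).
scale, zero_point = 1.0 / 255, 0

# Float data from the user's input buffer is quantized before inference...
float_input = [0.0, 0.25, 1.0]
quantized = [quantize(v, scale, zero_point) for v in float_input]

# ...and quantized results are dequantized back into the user's float buffer.
recovered = [dequantize(q, scale, zero_point) for q in quantized]
```

The round trip is lossy: each recovered value differs from the original by at most half a quantization step, which is why this is acceptable for I/O convenience but not a bit-exact conversion.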
- Block quantization for LLMs: FullyConnected, Gather
- Decide the quantization type via the circle-quantizer parameter `--block_quantize_weights` (Q4_0, Q8_0)
- Skip quantization via the circle-quantizer parameter `--skipsize_block_quantize` (default: 0)

---

Caution: It's for...
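As a rough illustration of what a block-quantized format like Q8_0 means: weights are split into fixed-size blocks, each stored as int8 values plus one shared float scale. The sketch below follows the commonly used Q8_0 convention (32 values per block, scale = max|w| / 127); it is an assumption for illustration, not the circle-quantizer implementation.

```python
BLOCK_SIZE = 32  # Q8_0 conventionally groups weights into blocks of 32

def q8_0_quantize_block(block):
    """Quantize one block of floats to int8 with a single shared scale."""
    amax = max(abs(v) for v in block)
    scale = amax / 127 if amax else 1.0
    qs = [round(v / scale) for v in block]
    return scale, qs

def q8_0_dequantize_block(scale, qs):
    """Reconstruct approximate float weights from one quantized block."""
    return [q * scale for q in qs]

# 32 example weights in [-1.6, 1.5]
weights = [i / 10 for i in range(-16, 16)]
scale, qs = q8_0_quantize_block(weights)
restored = q8_0_dequantize_block(scale, qs)
```

Because the scale is per-block rather than per-tensor, outlier weights in one block do not degrade the precision of every other block, which is why this layout works well for LLM weight matrices.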
Since #14009, some FullyConnected tests are skipped. There may be an issue supporting hybrid quantization in the FullyConnected kernel.
- Introduce and change APIs to set layout and data type
- Add a compile pass to insert a permute OP for input/output layout and data-type conversion
- Add...
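The effect of an inserted permute OP can be sketched as a plain index rearrangement. The example below shows an NHWC-to-NCHW permutation on nested lists; it is an illustrative sketch assuming those two layouts, not the runtime's permute kernel.

```python
def permute_nhwc_to_nchw(tensor):
    """Rearrange an NHWC-ordered nested list into NCHW order,
    as a permute OP inserted at a model input/output boundary would do."""
    n = len(tensor)
    h = len(tensor[0])
    w = len(tensor[0][0])
    c = len(tensor[0][0][0])
    return [[[[tensor[b][y][x][ch]
               for x in range(w)]
              for y in range(h)]
             for ch in range(c)]
            for b in range(n)]

# NHWC tensor of shape (1, 2, 2, 3) holding values 0..11
nhwc = [[[[0, 1, 2], [3, 4, 5]],
         [[6, 7, 8], [9, 10, 11]]]]
nchw = permute_nhwc_to_nchw(nhwc)  # shape (1, 3, 2, 2)
```

Inserting this as a compile-time pass keeps the user-facing buffers in the layout the user requested while the graph internals keep their preferred layout.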
- Change fields to support multi-model
- Add an I/O layout field to move layout & data-type conversion support into the compile phase

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh

---

Related issues: #13645 #13646
### What

Let's support per-model `CompileOptions` in a multi-model package, e.g. scheduler settings and backend mapping.

### Why

Currently, `CompileOptions` settings are not fully supported by `MultiModelCompiler`, and...