Bruce Lai

Results 10 issues of Bruce Lai

### 1. System information - OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.5 LTS - TensorFlow installation (pip package or built from source): pip - TensorFlow library (version,...

Support RVV F32-IGEMM with MR=1 & 7, NR=4v.

Support RVV F32-GEMM with MR=1 & 7, NR=4v.

RVV F32-vbinary `vopc` (one tensor + one scalar) u-kernels were provided in the previous PR. This PR further enables the RVV F32-vbinary `vop` u-kernels, which take two tensor inputs.
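To make the `vopc` / `vop` distinction concrete, here is a minimal scalar reference sketch in plain Python. It is not the RVV implementation; the function names `vbinary_vopc` and `vbinary_vop` are hypothetical and exist only for illustration.

```python
import operator
from typing import Callable, List


def vbinary_vopc(op: Callable[[float, float], float],
                 a: List[float], b_scalar: float) -> List[float]:
    """'vopc' flavor: apply op between each element of a and one scalar."""
    return [op(x, b_scalar) for x in a]


def vbinary_vop(op: Callable[[float, float], float],
                a: List[float], b: List[float]) -> List[float]:
    """'vop' flavor: apply op elementwise between two same-length tensors."""
    if len(a) != len(b):
        raise ValueError("both inputs must have the same number of elements")
    return [op(x, y) for x, y in zip(a, b)]


if __name__ == "__main__":
    print(vbinary_vopc(operator.add, [1.0, 2.0], 0.5))
    print(vbinary_vop(operator.mul, [1.0, 2.0], [3.0, 4.0]))
```

A vectorized u-kernel would process several elements per iteration instead of one, but the input/output contract is the same as in this sketch.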

This project contains several RVV kernels, but they aren't enabled by default. For example, XNNPACK has an RVV H-swish kernel [here](https://github.com/google/XNNPACK/blob/master/src/f32-vhswish/rvv.c.in), but it isn't enabled by default: https://github.com/google/XNNPACK/blob/9c65f03020c7fe960314d96fd33b1bbda24361b8/src/configs/unary-elementwise-config.c#L818-L821 Is...

`check_regression_llvm-cpu_lowering_config.mlir` now runs correctly on RISC-V, but it exceeds the 60-second timeout. As discussed in https://github.com/openxla/iree/issues/10462, this PR reduces the test case. Time reduction on my machine: 252 ->...

### Goal Enable x32-packw to speed up dynamic fully connected layers for LLM models. ### Background The GEMM u-kernel uses the input and packed_weight (weight and bias) to calculate the output values. Our GEMM...
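As a rough illustration of what "packed weight (weight and bias)" can mean, the sketch below packs bias and weights into one flat buffer, tiled by `nr` output channels, and then runs a reference GEMM over that buffer. The layout (bias for a tile first, then one row of `nr` weights per `k`, zero-padded in the last tile), the `[N][K]` weight orientation, and the names `pack_weights` and `gemm_packed` are assumptions made for this example, not details confirmed by the issue text.

```python
from typing import List


def pack_weights(weights: List[List[float]], bias: List[float],
                 nr: int) -> List[float]:
    """Pack [N][K] weights plus N biases into tiles of nr output channels.

    Assumed layout per tile: nr biases, then for each k, nr weights.
    Channels past N in the last tile are zero-padded.
    """
    n = len(weights)
    k = len(weights[0]) if n else 0
    packed: List[float] = []
    for n0 in range(0, n, nr):
        cols = range(n0, n0 + nr)
        packed.extend(bias[c] if c < n else 0.0 for c in cols)
        for kk in range(k):
            packed.extend(weights[c][kk] if c < n else 0.0 for c in cols)
    return packed


def gemm_packed(inputs: List[List[float]], packed: List[float],
                n: int, k: int, nr: int) -> List[List[float]]:
    """Reference GEMM that reads the packed buffer sequentially."""
    out = []
    for row in inputs:
        out_row: List[float] = []
        offset = 0
        for n0 in range(0, n, nr):
            acc = packed[offset:offset + nr]  # accumulators start at the bias
            offset += nr
            for kk in range(k):
                for j in range(nr):
                    acc[j] += row[kk] * packed[offset + j]
                offset += nr
            out_row.extend(acc[:min(nr, n - n0)])  # drop padded channels
        out.append(out_row)
    return out
```

The point of packing in this sketch is that the inner loop walks the buffer linearly, so a kernel never has to gather bias and weights from separate arrays.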

## Background I compiled my models to deploy on platforms with 2 CPUs. The vmfb ran with `--task_topology_group_count=2`. ## Observation Two workers equally share the dispatch tasks. If one worker...

This PR enables RVV GEMM/IGEMM/X32-PACKW in the GEMM config, which in turn enables the RVV implementations in the operator API.

This PR is based on @JerryShih's PR and @terryheo's last comment on https://github.com/tensorflow/tensorflow/pull/48099#issuecomment-809005824. We add the document `tensorflow/lite/g3doc/guide/build_cmake_riscv.md` to demonstrate: - how to download the prebuilt toolchain - how to build...

awaiting review
comp:lite
size:M