JiangHao

Results 15 comments of JiangHao

I tried to integrate with the WebGPU backend of TF.js: you have to build TF.js and then replace the [path](https://github.com/NALLEIN/webml-polyfill/blob/WebGPU-backend-test/package.json#L49) in package.json with your own path. This branch could run in...
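For illustration only, the replaced dependency could end up looking like the snippet below; the package name and the `file:` path are placeholders, not the actual values on that line of the branch:

```json
{
  "dependencies": {
    "@tensorflow/tfjs": "file:/path/to/your/local/tfjs/build"
  }
}
```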

Conv2d with relu6 produces wrong results when running the image classification model. I will report the error to Xu Xin and tfjs. |Failed test case in CTS | | ----------...
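As a rough way to reproduce this kind of mismatch, here is a sketch that compares the fused relu6 path of the public `tf.fused.conv2d` API against an unfused reference; the shapes and strides are made up, not the failing CTS case:

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch: compare fused conv2d + relu6 against an unfused reference on the
// same backend. Shapes and strides are arbitrary, not the failing CTS case.
async function checkFusedRelu6() {
  const x = tf.randomNormal([1, 8, 8, 3]) as tf.Tensor4D;
  const filter = tf.randomNormal([3, 3, 3, 4]) as tf.Tensor4D;

  const fused = tf.fused.conv2d({
    x, filter, strides: [1, 1], pad: 'same', activation: 'relu6'
  });
  // Unfused reference: conv2d followed by clipping the output to [0, 6].
  const reference = tf.clipByValue(tf.conv2d(x, filter, [1, 1], 'same'), 0, 6);

  const maxDiff = (await tf.max(tf.abs(tf.sub(fused, reference))).data())[0];
  console.log('max |fused - reference| =', maxDiff);
}
```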

> When debugging the handpose model, I found that (fused) conv2d may not work properly with certain shape/stride/padding. > > With the below two PRs applied, certain cases may work: >...

> It's great to see that. Thanks @NALLEIN. > Two comments: > 1) “used Conv2DMMProgram”? Or did you use Conv2DNaiveProgram? Conv2DNaiveProgram should be the slow way. > I used Conv2DMMProgram and...

After using “Conv2DNaiveProgram” to perform the fusedConv operation, most of the models can run correctly in webml-polyfill. FusedConv2d with relu, bias, and prelu still work incorrectly. There are some problems:...
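A sketch of the kind of check this implies, comparing the fused variants reported as incorrect (relu, bias, prelu) against unfused references; all shapes are illustrative placeholders, not the real model's:

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch: check the fused variants reported as incorrect (relu, bias, prelu)
// against unfused references. Shapes are illustrative placeholders.
async function checkFusedVariants() {
  const x = tf.randomNormal([1, 8, 8, 3]) as tf.Tensor4D;
  const filter = tf.randomNormal([3, 3, 3, 4]) as tf.Tensor4D;
  const bias = tf.randomNormal([4]) as tf.Tensor1D;
  const alpha = tf.randomNormal([4]) as tf.Tensor1D;  // prelu slopes
  const conv = tf.conv2d(x, filter, [1, 1], 'same');

  const cases: Array<[string, tf.Tensor, tf.Tensor]> = [
    ['relu',
      tf.fused.conv2d({x, filter, strides: [1, 1], pad: 'same', activation: 'relu'}),
      tf.relu(conv)],
    ['bias',
      tf.fused.conv2d({x, filter, strides: [1, 1], pad: 'same', bias}),
      tf.add(conv, bias)],
    ['prelu',
      tf.fused.conv2d({x, filter, strides: [1, 1], pad: 'same',
                       activation: 'prelu', preluActivationWeights: alpha}),
      tf.prelu(conv, alpha)],
  ];
  for (const [name, fused, ref] of cases) {
    const diff = (await tf.max(tf.abs(tf.sub(fused, ref))).data())[0];
    console.log(`${name}: max |fused - reference| = ${diff}`);
  }
}
```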

> For the speed, after merging [tensorflow/tfjs#3095](https://github.com/tensorflow/tfjs/issues/3095) and [tensorflow/tfjs#3049](https://github.com/tensorflow/tfjs/pull/3049), I found that the handpose model runs faster. > > The fusedConv2D may work correctly with [tensorflow/tfjs#3095](https://github.com/tensorflow/tfjs/issues/3095), so if you manually...

I tested the inference time of image-classification models with a workload of 200 iterations; the results are as follows: TFLite Model | Inference Time of WebGL (ms) | Inference Time...
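For context, a timing loop along these lines could be used to produce such numbers; the model and input names are placeholders, and the exact methodology (warm-up, readback) is an assumption, not the actual workload code:

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch of a simple inference-time measurement: one warm-up run (to exclude
// shader compilation), then the mean wall-clock time over N iterations.
async function measure(model: tf.GraphModel, input: tf.Tensor, iterations = 200) {
  const warmup = model.predict(input) as tf.Tensor;
  await warmup.data();          // force the GPU work to complete
  warmup.dispose();

  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    const out = model.predict(input) as tf.Tensor;
    await out.data();           // readback ensures the iteration has finished
    out.dispose();
  }
  return (performance.now() - start) / iterations;  // mean ms per inference
}
```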

> @NALLEIN said there is a new update of https://www.npmjs.com/package/webgpu, please investigate it and update the latest status. The current version of tfjs-backend-webgpu is [0.0.1-alpha.0](https://www.npmjs.com/package/@tensorflow/tfjs-backend-webgpu)
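For reference, consuming that alpha package looks roughly like the sketch below; it assumes a browser that exposes WebGPU and that the package registers a 'webgpu' backend on import:

```ts
import * as tf from '@tensorflow/tfjs';
// Side-effect import: registers the 'webgpu' backend with the tf runtime.
import '@tensorflow/tfjs-backend-webgpu';

async function initWebGPUBackend() {
  // Fails if the browser does not expose WebGPU (navigator.gpu).
  await tf.setBackend('webgpu');
  await tf.ready();
  console.log('Active backend:', tf.getBackend());
}
```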

> @NALLEIN, I just checked with @axinging, if you have any new op implementations for the TF.js WebGPU backend, please feel free to submit your PR to the TF.js repo....

I fixed the problem that relu with multiple outputs may cause the following ops to fail to find the corresponding input tensor. Relu needs to be executed separately and there are consecutive relu...
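Roughly the idea behind such a fix, as a sketch with hypothetical graph types (the real webml-polyfill structures and the actual change differ in detail): record which node produces each tensor id so that ops following a relu, even one whose output fans out to several consumers or a chain of consecutive relus executed as separate nodes, can resolve their inputs:

```ts
// Hypothetical graph types for illustration only; the actual webml-polyfill
// structures and the actual fix differ.
interface OpNode {
  op: string;        // e.g. 'conv2d', 'relu'
  inputs: number[];  // tensor ids consumed by this node
  outputs: number[]; // tensor ids produced by this node
}

// Map every produced tensor id to its producing node, so that downstream ops
// can always find the tensor a relu produced, regardless of how many
// consumers share it or whether several relus appear back to back.
function buildProducerMap(nodes: OpNode[]): Map<number, OpNode> {
  const producers = new Map<number, OpNode>();
  for (const node of nodes) {
    for (const out of node.outputs) {
      producers.set(out, node);
    }
  }
  return producers;
}
```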