Yang, Bo

Results 342 comments of Yang, Bo

2018-04-02 22:27 GMT+08:00 raver119 : > I see. > > This benchmark isn't comparing apples to apples. Thanks for your time. > You are wrong. The benchmark is comparing apples...

2018-04-03 0:41 GMT+08:00 raver119 : > You are wrong. The benchmark is comparing apples to apples: immutable > operations vs immutable operations > > Few messages above you've said that...

2018-04-03 0:49 GMT+08:00 raver119 : > It's not about ND4j implementation. It's about what YOU've implemented. > Suppose a data scientist Alice reads a paper and wants to reproduce an...

OK, today I learnt that anyone who reads the ND4J documentation should never use immutable operators. I am so curious how fast `a.muli(b).addi(c)` is. It must be super faster...

@raver119 The in-place version of the ND4J operation is indeed super fast. It is 1.44 times faster than ND4J's immutable version when performing `a * b + c` 100 times...
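
For readers unfamiliar with the distinction being benchmarked, here is a minimal sketch in plain Java. It does not use ND4J at all: the helpers `mul`, `add`, `muli` and `addi` are hypothetical stand-ins on `float[]`, named after the ND4J methods only for illustration. It shows why the two styles differ in allocation behaviour (the immutable style creates a new array per operation, the in-place style overwrites its first operand) and says nothing about actual timings.

```java
import java.util.Arrays;

public class InPlaceVsImmutable {
    // Immutable style: every operation allocates a fresh result array.
    public static float[] mul(float[] x, float[] y) {
        float[] out = new float[x.length];
        for (int i = 0; i < x.length; i++) out[i] = x[i] * y[i];
        return out;
    }

    public static float[] add(float[] x, float[] y) {
        float[] out = new float[x.length];
        for (int i = 0; i < x.length; i++) out[i] = x[i] + y[i];
        return out;
    }

    // In-place style: overwrites x and returns it, so calls can be chained.
    public static float[] muli(float[] x, float[] y) {
        for (int i = 0; i < x.length; i++) x[i] *= y[i];
        return x;
    }

    public static float[] addi(float[] x, float[] y) {
        for (int i = 0; i < x.length; i++) x[i] += y[i];
        return x;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f};
        float[] b = {4f, 5f, 6f};
        float[] c = {7f, 8f, 9f};

        // a * b + c with two temporary arrays; a is left unchanged.
        float[] immutableResult = add(mul(a, b), c);
        System.out.println(Arrays.toString(immutableResult));

        // Same arithmetic with no new arrays; a itself now holds the result.
        float[] inPlaceResult = addi(muli(a, b), c);
        System.out.println(Arrays.toString(inPlaceResult));
        System.out.println(inPlaceResult == a);
    }
}
```

The in-place chain is only equivalent when the caller no longer needs the original contents of `a`, which is the trade-off behind the disagreement above.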

Ubuntu 16.04 and CUDA 8.0 from this docker image: https://github.com/ThoughtWorksInc/scala-cuda/tree/sbt-openjdk8-cuda8.0-opencl-ubuntu16.04

`blockingAwait` is marked red in IntelliJ, which is a bug in IntelliJ's typer. The bug does not affect actual compilation.

I was using 0.8 because the CUDA backend of ND4J 0.9.x is broken in sbt, even when compiling from a clean docker image. https://github.com/deeplearning4j/nd4j/issues/2767

``` shell
sbt 'benchmarks/Jmh/run Issue137'
```
The first run of the command may fail due to a bug in `sbt-jmh`, but retrying should work. Run `sbt 'benchmarks/Jmh/run -help'` for more...

2018-04-03 5:13 GMT+08:00 Justin Long : > What's different on your local branch? > There were different workarounds for different OpenCL bugs from different vendors, but we now detect vendor...