convexbrain
A benchmark result of LP
* https://github.com/convexbrain/Totsu/tree/1f5200599ffd8bdf15e6ce672bcc1c2f0bbc11bb/experimental/benchmark_lp
* **F32CUDA is faster than FloatGeneric.**
* CPU
  * Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
* RAM: 32.0 GB
* ...
A benchmark result of QP
* https://github.com/convexbrain/Totsu/tree/884e36b4fd32d696ddca046af755ad8a2d120a61/experimental/benchmark_qp
* **F32CUDA is slower than FloatGeneric.** 😠

Proceed to profiling using this benchmark.
A profiling result of the QP benchmark
* Many memory accesses occur when projecting onto the cone.
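For context, cone projection is applied to the variable vector once per solver iteration, so even the simplest case is a full pass over memory. Below is a minimal Rust sketch of projection onto the nonnegative orthant; it illustrates the operation and is not Totsu's actual code.

```rust
/// Projection onto the nonnegative orthant: x[i] <- max(x[i], 0).
/// Even this simplest cone walks the whole slice every time it is
/// called, so the projection step is memory-bound.
fn proj_nonneg_orthant(x: &mut [f32]) {
    for v in x.iter_mut() {
        if *v < 0.0 {
            *v = 0.0;
        }
    }
}

fn main() {
    let mut x = vec![1.0_f32, -2.0, 3.0, -4.0];
    proj_nonneg_orthant(&mut x);
    assert_eq!(x, vec![1.0, 0.0, 3.0, 0.0]);
}
```

Projections onto other cones (zero cone, second-order cone) similarly walk whole slices, so if the solver data lives in device memory but the projection runs on the host, every iteration pays a device-host round trip on top of the passes themselves.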
https://github.com/convexbrain/Totsu/tree/b56407463b691a3f2418510bc43e8a72d5186fc1/experimental/benchmark_qp
* CUDA-izing the projection onto cones as much as possible (see the sketch after this list).
* 200 vars (100 primals, 100 duals).
* 400 vars (200 primals, 200 duals).
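As an illustration of what can be moved onto the device, here is a hedged Rust sketch (not Totsu's actual CUDA code) of the second-order-cone projection split into a reduction plus an elementwise scale. Both stages correspond to standard GPU kernel patterns (a reduction kernel and a map kernel), so the vector can stay in device memory instead of being copied back to the host for the projection.

```rust
/// Projection onto the second-order cone { (t, u) : ||u||_2 <= t },
/// written as two device-friendly stages.
fn proj_second_order_cone(t: &mut f32, u: &mut [f32]) {
    // Stage 1: reduction -> the norm of u (a single scalar result).
    let norm_u = u.iter().map(|v| v * v).sum::<f32>().sqrt();

    // Stage 2: decide on two scalars, then one elementwise pass over u.
    if norm_u <= *t {
        // (t, u) is already inside the cone: nothing to write.
    } else if norm_u <= -*t {
        // The projection is the origin.
        *t = 0.0;
        u.iter_mut().for_each(|v| *v = 0.0);
    } else {
        // Scale onto the cone boundary: P(t, u) = s * (||u||, u)
        // with s = (t + ||u||) / (2 ||u||).
        let s = 0.5 * (1.0 + *t / norm_u);
        *t = s * norm_u;
        u.iter_mut().for_each(|v| *v *= s);
    }
}

fn main() {
    let (mut t, mut u) = (0.0_f32, vec![3.0_f32, 4.0]);
    proj_second_order_cone(&mut t, &mut u);
    // ||u|| = 5, t = 0 => s = 0.5, projected point (2.5, [1.5, 2.0]).
    assert!((t - 2.5).abs() < 1e-6);
}
```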
https://github.com/convexbrain/Totsu/tree/77f0e5cc10e7a2d29567352f88135a99ed620be1/experimental/benchmark_qp
* FxHashMap instead of HashMap (a sketch of the swap follows below).
* 200 vars (100 primals, 100 duals).
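`FxHashMap` comes from the `rustc_hash` crate and is a drop-in replacement for `std::collections::HashMap` with a much cheaper (non-DoS-resistant) hash function, which helps when keys are small and lookups are frequent. A minimal sketch of the swap; the key/value types here are illustrative, not Totsu's internals:

```rust
// Cargo.toml: rustc_hash = "1"
use rustc_hash::FxHashMap;

/// Build a small index map; same API as std HashMap, faster hashing
/// for integer-like keys.
fn build_index_map(indices: &[usize]) -> FxHashMap<usize, usize> {
    // FxHashMap uses a custom hasher, so construct it with ::default()
    // instead of ::new().
    let mut map = FxHashMap::default();
    for (pos, &idx) in indices.iter().enumerate() {
        map.insert(idx, pos);
    }
    map
}

fn main() {
    let map = build_index_map(&[10, 20, 30]);
    assert_eq!(map.get(&20), Some(&1));
}
```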
https://github.com/convexbrain/Totsu/tree/13b8d378f79445c53b9c9f77fbf4389029423d12/experimental/benchmark_qp
* Intermittent criteria checks (see the sketch after this list).
* 200 vars (100 primals, 100 duals).
* The benefit of CUDA appears from about 800 variables.
* The number of iterations is not monotonically increasing, probably because those QPs are generated...
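The idea behind intermittent criteria checks is that evaluating the termination criteria costs extra norm computations (and, with CUDA, device-to-host readbacks), so it pays to run the check only every N iterations. A hedged sketch of the pattern; the function and parameter names are illustrative, not Totsu's API:

```rust
/// Run the iteration loop, but evaluate the expensive convergence
/// criteria only every `check_interval` iterations.
fn solve_with_intermittent_checks<S, C>(
    max_iter: usize,
    check_interval: usize,
    mut step: S,
    mut converged: C,
) -> Option<usize>
where
    S: FnMut(),
    C: FnMut() -> bool,
{
    for it in 0..max_iter {
        step(); // one solver iteration (matrix ops, cone projection, ...)
        if (it + 1) % check_interval == 0 && converged() {
            return Some(it + 1); // iterations actually run
        }
    }
    None // criteria never satisfied within max_iter
}

fn main() {
    use std::cell::Cell;
    // Toy example: "converges" once a counter reaches 25; with
    // check_interval = 10 the loop reports 30 iterations.
    let k = Cell::new(0);
    let n = solve_with_intermittent_checks(
        1000,
        10,
        || k.set(k.get() + 1),
        || k.get() >= 25,
    );
    assert_eq!(n, Some(30));
}
```

The trade-off is that the solver may run up to `check_interval - 1` extra iterations past the point where the criteria are first satisfied.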