Pavan Yalamanchili
A user reported this issue on the forums: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/arrayfire-users/yaVHXY_SQOc/mJ1Xmh1JGgAJ So far it has only been reported on the R9 380.
Do this inside a for loop with offsets. This will be needed for GFOR.

------------------------------------------

CUDA
- [x] matmul
- ~[ ] dot~
- [ ] lu
- [ ]...
Right now OpenCL kernels are launched with a fixed 256 threads per work group. Instead, query the supported sizes from `DeviceManager` and choose the launch size from those values. Add the following queryable information to opencl::DeviceManager...
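A minimal sketch of how the launch size could be derived from queried limits rather than hard-coded. The `DeviceLimits` struct and `pickWorkGroupSize` function are hypothetical names for illustration and are not part of `opencl::DeviceManager`. The two fields are assumed to be filled from the standard OpenCL queries `CL_DEVICE_MAX_WORK_GROUP_SIZE` and `CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE`.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical holder for limits a device manager might expose.
struct DeviceLimits {
    std::size_t maxWorkGroupSize;   // assumed: CL_DEVICE_MAX_WORK_GROUP_SIZE
    std::size_t preferredMultiple;  // assumed: CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
};

// Clamp the requested size (256 today) to the device maximum, then round
// down to the preferred multiple when the clamped size is large enough.
std::size_t pickWorkGroupSize(const DeviceLimits &limits,
                              std::size_t requested = 256) {
    std::size_t size = std::min(requested, limits.maxWorkGroupSize);
    if (limits.preferredMultiple > 1 && size >= limits.preferredMultiple) {
        size -= size % limits.preferredMultiple;
    }
    // Never return zero, even if the query reported nothing useful.
    return std::max<std::size_t>(size, 1);
}
```

With this shape, a device reporting a maximum below 256 gets a smaller group instead of a failed launch, while devices that support 256 or more keep the current behaviour.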
This is split off from the issue mentioned here: https://github.com/arrayfire/arrayfire/issues/1656

The problems mentioned in the issue include:
- no standard fp16 data type
- performance issues for certain hardware supporting...
Put the C functions in af/device.h.

Some options to consider:
- **OPTION 1**: Use af_set_stream like af_set_device, i.e. all functions after calling af_set_stream would use the same stream.
- **OPTION...
Directories to port:
- [x] benchmarks
- [ ] computer_vision
- [x] financial
- [ ] getting_started
- [x] graphics
- [x] helloworld
- [ ] image_processing
- [ ]...
As seen here, the results can be orders of magnitude off if the scaling happens up front: https://gist.github.com/pavanky/dc64fc83e0fa298e942a3e0ca07ccd15
Add optional parameters to install the ArrayFire libraries and/or print error messages if the ArrayFire libraries are not found.
This can be implemented in the wrapper, handled upstream, or ideally a combination of both. It should help with integration with other Python modules.