
ArmPL TPL support for KokkosBlas and KokkosBatched

Open e10harvey opened this issue 4 years ago • 3 comments

  • [x] Check for ArmPL version 21 TPL installations
  • [x] Add CMake TPL support. (@e10harvey) - #880
  • [x] Add KokkosBlas TPL support: gemm, iamax, scal, copy (Kokkos copy). (@vqd8a) - #880
    • [ ] Collect performance in Adelus
      • [x] gemm
      • [ ] iamax
      • [ ] scal
      • [ ] copy
  • [ ] Add KokkosBatched TPL support to cover routines used by https://github.com/kokkos/kokkos-kernels/blob/master/perf_test/batched/KokkosBatched_Test_BlockTridiagDirect.cpp. (@e10harvey, @vqd8a)
    • [ ] LU (@vqd8a)
    • [ ] TRSM (@vqd8a)
    • [x] GEMM (@e10harvey) - #1256
    • [ ] TRSV (@e10harvey)
    • [ ] GEMV (@e10harvey)
  • [ ] Collect BlockTridiagDirect performance in KokkosKernels

e10harvey • Jan 12 '21 22:01

@e10harvey Thanks for adding the CMake for ARMPL. I would like to add two comments:

  1. For ARMPL's BLAS, could you please also enable KOKKOSKERNELS_ENABLE_TPL_BLAS in the KokkosKernels_config.h when KOKKOSKERNELS_ENABLE_TPL_ARMPL is defined, so that we can use the current BLAS TPL support in Kokkos Kernels?
  2. It looks to me that the single-threaded ARMPL (libarmpl.so) is the one being found:
=======================
KokkosKernels ETI Types
   Devices:  <OpenMP,HostSpace>
   Scalars:  double
   Ordinals: int
   Offsets:  int;size_t
   Layouts:  LayoutLeft

KokkosKernels TPLs
   ARMPL:       /opt/arm/armpl-20.3.0_A64FX_RHEL-8_gcc_aarch64-linux/lib/libamath.so;/opt/arm/armpl-20.3.0_A64FX_RHEL-8_gcc_aarch64-linux/lib/libarmpl.so
=======================

Can we also find multi-threaded ARMPL (libarmpl_mp.so) when OpenMP is enabled?
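
To make the request concrete, here is a minimal sketch of the selection logic in C++. The real change would live in the CMake TPL detection, which is not shown in this thread; `armpl_library_name` and `check_armpl_selection` are hypothetical helpers introduced only for illustration, and the two base names are taken from the library files mentioned above (libarmpl.so and libarmpl_mp.so).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Kokkos Kernels: choose which ArmPL
// library base name to look for, depending on whether OpenMP is enabled.
// "armpl"    -> libarmpl.so    (single-threaded)
// "armpl_mp" -> libarmpl_mp.so (multi-threaded, OpenMP)
inline std::string armpl_library_name(bool openmp_enabled) {
  return openmp_enabled ? "armpl_mp" : "armpl";
}

// Small self-check of the two cases discussed in this comment.
inline void check_armpl_selection() {
  assert(armpl_library_name(true) == "armpl_mp");
  assert(armpl_library_name(false) == "armpl");
}
```

With the configuration printed above (Devices: OpenMP), this logic would pick `armpl_mp` instead of `armpl`.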

vqd8a • Feb 03 '21 18:02

> 1. For ARMPL's BLAS, could you please also enable `KOKKOSKERNELS_ENABLE_TPL_BLAS` in the `KokkosKernels_config.h` when `KOKKOSKERNELS_ENABLE_TPL_ARMPL` is defined, so that we can use the current BLAS TPL support in Kokkos Kernels?

Yes.
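
A rough sketch of what that implication could look like at the preprocessor level, assuming nothing about how `KokkosKernels_config.h` is actually generated. The two macro names come from the request above; `blas_tpl_enabled` is a hypothetical helper, and the first `#define` only simulates a build in which ArmPL was found.

```cpp
// Assumption for this sketch: the build system defines this macro when
// ArmPL has been found. Here it is defined by hand to simulate that case.
#define KOKKOSKERNELS_ENABLE_TPL_ARMPL

// ArmPL provides a BLAS interface, so enabling the ArmPL TPL also
// switches on the existing BLAS TPL code path.
#if defined(KOKKOSKERNELS_ENABLE_TPL_ARMPL) && !defined(KOKKOSKERNELS_ENABLE_TPL_BLAS)
#define KOKKOSKERNELS_ENABLE_TPL_BLAS
#endif

// Hypothetical helper: reports whether the BLAS TPL path is active.
constexpr bool blas_tpl_enabled() {
#if defined(KOKKOSKERNELS_ENABLE_TPL_BLAS)
  return true;
#else
  return false;
#endif
}

// Compile-time check that the implication holds in this configuration.
static_assert(blas_tpl_enabled(), "ArmPL should imply the BLAS TPL path");
```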

> Can we also find multi-threaded ARMPL (libarmpl_mp.so) when OpenMP is enabled?

Yes.

I will flag you in the PR for this.

e10harvey • Feb 03 '21 18:02

@e10harvey Thanks, Evan.

vqd8a • Feb 03 '21 21:02