
OpMaker uses BLAS from `numpy`

Open JackTemaki opened this issue 3 years ago • 7 comments

Running compile_native_op.sh always links the BLAS lib from NumPy into the binary:

  • Without any flag, it loads the BLAS lib from NumPy.
  • With --no_search_for_numpy_blas, it still uses the one from NumPy via find_sgemm_libs_from_runtime().

Apart from giving an explicit path to a lib, there are no alternatives.

To use e.g. the intel_mkl.so that TensorFlow is linked against together with RASR, we could:

  • link RASR against the TensorFlow MKL and introduce a flag that disables BLAS linking in the native op
  • add functionality to search for the BLAS implementation used by TensorFlow and automatically link against that.

JackTemaki avatar Feb 14 '22 15:02 JackTemaki

Note that find_sgemm_libs_from_runtime does not necessarily take the BLAS from NumPy. I assume that if multiple libs in the runtime provide BLAS functions (e.g. TensorFlow loaded MKL and NumPy loaded something else), the result is not well defined.

albertz avatar Feb 16 '22 10:02 albertz

Why do you want to mix BLAS anyway? Why not just MKL consistently everywhere, including in NumPy? This is probably better anyway and also would solve your problem here.

albertz avatar Feb 16 '22 10:02 albertz

You mentioned another problem with RASR: that RASR also loads both LAPACK and MKL, which also makes it non-deterministic which BLAS function gets executed. Does it matter in this case which BLAS you link to for the native op?

albertz avatar Feb 16 '22 10:02 albertz

> Note that find_sgemm_libs_from_runtime does not necessarily take the BLAS from NumPy. I assume that if multiple libs in the runtime provide BLAS functions (e.g. TensorFlow loaded MKL and NumPy loaded something else), the result is not well defined.

It does, because the code is written that way:

libs = find_sgemm_libs_from_runtime()
if libs:
  numpy_libs = [fn for fn in libs if "/numpy/.libs/" in fn]
  if numpy_libs:
    # Prefer Numpy; move to front.
    libs = numpy_libs + [fn for fn in libs if fn not in numpy_libs]

You cannot run the script without NumPy installed, since it is a dependency of TensorFlow, so the NumPy lib will always be listed first.

> Why do you want to mix BLAS anyway? Why not just MKL consistently everywhere, including in NumPy? This is probably better anyway and also would solve your problem here.

Then you would need to compile NumPy from source, as there is no official package with MKL support.

https://pypi.org/project/numpy-mkl/ and https://pypi.org/project/intel-numpy/ are both outdated, so the official approach would be this: https://www.intel.com/content/www/us/en/developer/articles/technical/build-numpy-with-mkl-and-icc.html

> You mentioned another problem with RASR: that RASR also loads both LAPACK and MKL, which also makes it non-deterministic which BLAS function gets executed. Does it matter in this case which BLAS you link to for the native op?

@curufinwe reported this, but for me it definitely mattered which BLAS I linked for the native op.

JackTemaki avatar Feb 16 '22 13:02 JackTemaki
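
Editorial sketch, not part of the original comments: one way the hard-coded NumPy preference in the snippet quoted above could be made configurable. The function name `order_blas_libs` and the `prefer_markers` parameter are hypothetical and do not exist in RETURNN; only the `"/numpy/.libs/"` path marker is taken from the quoted code, and the example paths are made up.

```python
from typing import List, Sequence


def order_blas_libs(
    libs: Sequence[str],
    prefer_markers: Sequence[str] = ("/numpy/.libs/",),
) -> List[str]:
    """
    Hypothetical helper (not a RETURNN function).

    Reorders candidate BLAS libs so that paths containing an earlier marker
    come before paths containing a later marker. Paths matching no marker
    keep their original relative order at the end.
    """

    def rank(path: str) -> int:
        for i, marker in enumerate(prefer_markers):
            if marker in path:
                return i
        return len(prefer_markers)

    # sorted() is stable, so ties keep the order of the runtime scan.
    return sorted(libs, key=rank)


if __name__ == "__main__":
    candidates = [
        "/opt/example/lib/libmkl_rt.so",  # made-up path
        "/opt/example/site-packages/numpy/.libs/libopenblas.so",  # made-up path
    ]
    print(order_blas_libs(candidates))  # default mirrors the quoted snippet
    print(order_blas_libs(candidates, prefer_markers=("mkl",)))
    print(order_blas_libs(candidates, prefer_markers=()))  # no reordering
```

With the default marker this behaves like the quoted snippet; passing an empty tuple leaves the order from the runtime scan untouched, which is roughly what one might expect a flag like --no_search_for_numpy_blas to do.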

> > Note that find_sgemm_libs_from_runtime does not necessarily take the BLAS from NumPy. I assume that if multiple libs in the runtime provide BLAS functions (e.g. TensorFlow loaded MKL and NumPy loaded something else), the result is not well defined.
>
> It does, because the code is written that way: ...

No, find_sgemm_libs_from_runtime does not do this:

def find_sgemm_libs_from_runtime():
  """
  Looks through all libs via :func:`collect_proc_maps_exec_files`,
  and searches for all which have the ``sgemm`` symbol.
  Currently only works on Linux (because collect_proc_maps_exec_files).

  :return: list of libs (their path)
  :rtype: list[str]
  """
  if not os.path.exists("/proc"):
    return None
  global _find_sgemm_lib_from_runtime_cached
  if _find_sgemm_lib_from_runtime_cached is not None:
    return _find_sgemm_lib_from_runtime_cached
  dummy_numpy_gemm_call()  # make sure that Numpy is loaded and Numpy sgemm is available
  fns = collect_proc_maps_exec_files()
  fns_with_sgemm = []
  for fn in fns:
    out = find_sym_in_exec(fn, "sgemm_")
    if out:
      fns_with_sgemm.append(fn)
  _find_sgemm_lib_from_runtime_cached = fns_with_sgemm
  return fns_with_sgemm

What you describe is independent of find_sgemm_libs_from_runtime.

albertz avatar Feb 16 '22 13:02 albertz
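
Editorial sketch, not part of the original comments: the docstring above says the candidate libs come from the process's mapped files. The implementation of `collect_proc_maps_exec_files` is not shown in this thread, so the following is only an assumption about the general technique (reading `/proc/self/maps` on Linux), not RETURNN's actual code. Both function names are hypothetical.

```python
import os
from typing import List


def exec_files_from_maps(maps_text: str) -> List[str]:
    """
    Hypothetical helper: return the distinct file paths that are mapped
    with execute permission in a /proc/<pid>/maps dump, in first-seen order.
    """
    seen: List[str] = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping without a path
        perms, path = parts[1], parts[5].strip()
        if "x" not in perms:
            continue  # only executable mappings can hold code like sgemm_
        if not path.startswith("/"):
            continue  # skip pseudo-entries such as [vdso] or [stack]
        if path not in seen:
            seen.append(path)
    return seen


def exec_files_of_current_process() -> List[str]:
    """Returns an empty list on systems without /proc (e.g. non-Linux)."""
    maps_fn = "/proc/self/maps"
    if not os.path.exists(maps_fn):
        return []
    with open(maps_fn) as f:
        return exec_files_from_maps(f.read())


if __name__ == "__main__":
    for fn in exec_files_of_current_process():
        print(fn)
```

The order of the result follows the order of the mappings, which depends on load order. That is consistent with the point made above: if several loaded libs export `sgemm_`, nothing in this step decides which one is "the" BLAS.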

However, the code you are referring to, which moves the NumPy libs to the front (# Prefer Numpy; move to front.), should maybe be changed.

Did you try to do this? Does it select the right BLAS then?

albertz avatar Feb 16 '22 13:02 albertz

> Then you would need to compile NumPy from source ...

Why is this a problem? You also compile RASR and TensorFlow from source to get MKL support. It is just consistent.

albertz avatar Feb 16 '22 13:02 albertz