cutlass [BUG] tensor.hpp must be included before mma

Describe the bug I find that <cute/tensor.hpp> must be included before <cute/atom/mma_atom.hpp>, otherwise we cannot compile.

Steps/Code to reproduce bug

Compile with nvcc test.cu -I include/ -std=c++17

#include <cute/atom/mma_atom.hpp>
// or we can only include <cute/atom/mma_atom.hpp> here
#include <cute/tensor.hpp>

int main() {}

the output errors:

include/cute/algorithm/gemm.hpp(83): error: "gemm" has already been declared in the current scope
  gemm(MMA_Atom<MMA> const& mma,
  ^

include/cute/algorithm/gemm.hpp(81): warning #1835-D: attribute "always_inline" does not apply here
  __inline__ __attribute__((always_inline)) __attribute__((host)) __attribute__((device))
                            ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

include/cute/algorithm/gemm.hpp(81): warning #1835-D: attribute "__host__" does not apply here
  __inline__ __attribute__((always_inline)) __attribute__((host)) __attribute__((device))
                                                           ^

include/cute/algorithm/gemm.hpp(83): error: incomplete type is not allowed
  gemm(MMA_Atom<MMA> const& mma,
  ^

include/cute/algorithm/gemm.hpp(83): error: identifier "MMA_Atom" is undefined
  gemm(MMA_Atom<MMA> const& mma,
       ^

include/cute/algorithm/gemm.hpp(83): error: type name is not allowed
  gemm(MMA_Atom<MMA> const& mma,
                ^

include/cute/algorithm/gemm.hpp(83): error: expected an expression
  gemm(MMA_Atom<MMA> const& mma,
                     ^

include/cute/algorithm/gemm.hpp(84): error: type name is not allowed
       Tensor<TA, ALayout> const& A,
       ^

include/cute/algorithm/gemm.hpp(84): error: expected a ")"
       Tensor<TA, ALayout> const& A,
                           ^

include/cute/algorithm/gemm.hpp(87): error: expected a ";"
  {
  ^

At end of source: error: expected a ";"

At end of source: error: expected a "}"
include/cute/algorithm/gemm.hpp(59): note #3196-D: to match this "{"
  {
  ^

10 errors detected in the compilation of "test.cu".

Exchange the two header files and we can compile normally

Expected behavior

It seems mma_atom.hpp already includes tensor.hpp and these two files are protected by #pragma once. So I think we don't need to include <cute/tensor.hpp> explicitly before <cute/atom/mma_atom.hpp>.

https://github.com/NVIDIA/cutlass/blob/a75b4ac483166189a45290783cb0a18af5ff0ea5/include/cute/atom/mma_atom.hpp#L36

Environment details (please complete the following information): nvcc: V12.2.128 cutlass: a75b4ac483 (tag:v3.3.0)

Dec 07 '23 07:12 mammoth831

PR please?

@mhoemmen @thakkarV

Dec 07 '23 14:12 hwu36

@ezhulenev

Dec 07 '23 14:12 hwu36

It seems mma_atom.hpp already include[s] tensor.hpp and these two files are protected by #pragma once. So I think we don't need to ... include <cute/tensor.hpp> [explicitly] before <cute/atom/mma_atom.hpp>.

I agree, but something else might be wrong here. The point of #pragma once is to prevent multiple includes of the same header. It looks like some other header might not have correct #pragma once protection. Note how the multiple definition errors are for functions in the header file include/cute/algorithm/gemm.hpp.

Dec 07 '23 15:12 mhoemmen

CC @ccecka

Dec 08 '23 16:12 thakkarV

Same issue

Dec 11 '23 04:12 andylolu2

the dependency can be summarized as the following table:

File	Includes	Depends	Define	End includes
tensor.hpp	NA	NA	Tensor	gemm.hpp
gemm.hpp	tensor.hpp/mma_atom.hpp	Tensor/MMA_Atom/ThrMMA	gemm func	NA
mma_atom.hpp	tensor.hpp	Tensor	MMA_Atom/ThrMMA	NA

when we include tensor.hpp, after the preprocess, the code as:

Tensor;
// include <gemm.hpp>
//  include <mma_atom.hpp> =>
    MMA_Atom;
    ThrMMA;
// end include <mma_atom.hpp>
gemm(MMA_Atom);
gemm(ThrMMA);

when we include mma_atom.hpp, after the preprocess, the code as:

# include <tensor.hpp> =>
Tensor;
// include <gemm.hpp> =>
// since <mma_atom.hpp> have already include by #pragma once, it will not include
// MMA_Atom;
// ThrMMA;
gemm(MMA_Atom); // fail for MMA_Atom not defined
gemm(ThrMMA);  // fail for ThrMMA not defined
MMA_Atom;
ThrMMA;

Solution1: we can simply forward declare the MMA_Atom/ThrMMA in gemm.hpp to make it compiled. but it may hurt. Because when we update MMA_Atom/ThrMMA's template parameter we have to update the forward declaration(in seperate files). Solution2: make sure gemm function is included after Tensor and MMA_Atom/ThrMMA:

include gemm.hpp at the end of mma_atom.hpp
include mma_atom.hpp at the end of tensor.hpp

Dec 23 '23 13:12 reed-lau

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Jan 22 '24 14:01 github-actions[bot]

@reed-lau It sounds like you would prefer Solution 2; is that right?

Jan 22 '24 17:01 mhoemmen

@reed-lau It sounds like you would prefer Solution 2; is that right?

yes.

Jan 31 '24 02:01 reed-lau

cutlass
cutlass copied to clipboard

[BUG] tensor.hpp must be included before mma_atom.hpp

cutlass cutlass copied to clipboard

[BUG] tensor.hpp must be included before mma_atom.hpp

cutlass
cutlass copied to clipboard