composable_kernel
Generate add_device_xxxxx_instances() declarations by CMake or script
When adding a new type of device operator instance, we also have to add the corresponding add_device_xxxx_instances() declarations to the header by hand. This is error-prone and time-consuming.
// file: library/include/ck/library/tensor_operation_instance/gpu/gemm.hpp
namespace ck {
namespace tensor_operation {
namespace device {
namespace instance {
void add_device_gemm_dl_f16_f16_f16_km_kn_mn_instances(
    std::vector<std::unique_ptr<
        DeviceGemm<Col, Row, Row, F16, F16, F16, PassThrough, PassThrough, PassThrough>>>&
        instances);

void add_device_gemm_dl_f16_f16_f16_km_nk_mn_instances(
    std::vector<std::unique_ptr<
        DeviceGemm<Col, Col, Row, F16, F16, F16, PassThrough, PassThrough, PassThrough>>>&
        instances);

void add_device_gemm_dl_f16_f16_f16_mk_kn_mn_instances(
    std::vector<std::unique_ptr<
        DeviceGemm<Row, Row, Row, F16, F16, F16, PassThrough, PassThrough, PassThrough>>>&
        instances);
Because the source file names map closely onto the variable and function names, I suggest generating the add_device_xxxx_instances() declarations (and probably the definitions too) with CMake or a script at configuration time.
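As a rough illustration of the idea, here is a minimal Python sketch that derives a declaration from a file stem. It is not existing composable_kernel tooling. The stem convention `device_gemm_<variant>_<a>_<b>_<c>_<la>_<lb>_<lc>_instance` is an assumption, the layout and data-type tables are inferred only from the three declarations quoted above, and the helper names `declaration_for` and `render_header` are hypothetical.

```python
import re

# Layout and data-type tags, inferred from the three declarations quoted above.
# A real generator would need the full set of tags used in the repository.
LAYOUTS = {"mk": "Row", "km": "Col", "kn": "Row", "nk": "Col", "mn": "Row"}
DTYPES = {"f16": "F16"}

_DTYPE = "|".join(DTYPES)

# Assumed stem convention (hypothetical):
#   device_gemm_<variant>_<a>_<b>_<c>_<la>_<lb>_<lc>_instance
STEM_RE = re.compile(
    rf"^device_gemm_(?P<variant>[a-z0-9]+)_"
    rf"(?P<a>{_DTYPE})_(?P<b>{_DTYPE})_(?P<c>{_DTYPE})_"
    rf"(?P<la>mk|km)_(?P<lb>kn|nk)_(?P<lc>mn)_instance$"
)


def declaration_for(stem: str) -> str:
    """Build one add_device_..._instances() declaration from a file stem."""
    m = STEM_RE.match(stem)
    if m is None:
        raise ValueError(f"unrecognized instance file stem: {stem!r}")
    layouts = ", ".join(LAYOUTS[m[k]] for k in ("la", "lb", "lc"))
    dtypes = ", ".join(DTYPES[m[k]] for k in ("a", "b", "c"))
    return (
        f"void add_{stem}s(\n"
        f"    std::vector<std::unique_ptr<\n"
        f"        DeviceGemm<{layouts}, {dtypes}, "
        f"PassThrough, PassThrough, PassThrough>>>&\n"
        f"        instances);"
    )


def render_header(stems) -> str:
    """Wrap the declarations for all stems in the namespaces used by gemm.hpp."""
    namespaces = ["ck", "tensor_operation", "device", "instance"]
    opening = "\n".join(f"namespace {ns} {{" for ns in namespaces)
    closing = "\n".join(f"}} // namespace {ns}" for ns in reversed(namespaces))
    body = "\n\n".join(declaration_for(s) for s in sorted(stems))
    return f"{opening}\n\n{body}\n\n{closing}\n"


if __name__ == "__main__":
    # Example stems; in practice these would come from the instance source files.
    print(render_header([
        "device_gemm_dl_f16_f16_f16_km_kn_mn_instance",
        "device_gemm_dl_f16_f16_f16_km_nk_mn_instance",
        "device_gemm_dl_f16_f16_f16_mk_kn_mn_instance",
    ]))
```

A script along these lines could be run from CMake at configure time (for example via `execute_process`), with the list of stems taken from the instance source files, so that adding a new source file updates the header without manual edits.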