He Jia
@cloudhan Thanks for the reply. May I ask which toolchain configurations need to be added? Do you mean adding more templates to rules_cuda?
I have some questions about how to support Ray Compiled Graphs. Reinforcement learning is a very dynamic process. Google's MPMD framework (Pathway) took a static approach when designing IFRT IR, resulting in Pathway...
According to the demo, rebuild the model with mpi_rank=0 and mpi_size=1: https://github.com/tensorflow/recommenders-addons/blob/653330c2bf08a670dd0ebfd38e309936051f8c47/demo/dynamic_embedding/movielens-1m-keras-with-horovod/movielens-1m-keras-with-horovod.py#L670
Have you tried rebuilding the model and enabling inference mode? If that works, I will close this issue.
This issue was solved by commit 1a5dfca.
Here is the simple demo code, nothing special. Could the problem be that the CUDA driver version (535) on the Kube host machine is too old? ```python from dataclasses import dataclass from functools import...
@hauturier Usually this is caused by the absl version on the system being inconsistent with the one TensorFlow uses. Please rebuild TFRA from source.
Judging from the error message, you could check whether the process was killed by the system for running out of memory.
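One rough way to start the out-of-memory check is to look at the exit status of the failed process. This is a minimal sketch for a POSIX host. The helper names `was_sigkilled` and `run_and_check` are made up for illustration and are not part of any project discussed here. Python's `subprocess` reports a child terminated by signal N as return code -N, so -9 means SIGKILL. A SIGKILL alone does not prove the OOM killer was responsible; the kernel log (for example `dmesg`) would still need to confirm it.

```python
import signal
import subprocess
import sys


def was_sigkilled(returncode: int) -> bool:
    """Return True if a subprocess return code means "terminated by SIGKILL".

    On POSIX, subprocess reports death-by-signal N as the negative value -N.
    """
    return returncode == -signal.SIGKILL


def run_and_check(cmd):
    """Run cmd (hypothetical helper) and return (returncode, killed_by_sigkill)."""
    proc = subprocess.run(cmd)
    return proc.returncode, was_sigkilled(proc.returncode)


if __name__ == "__main__":
    # Placeholder command; substitute the training or build command that failed.
    code, killed = run_and_check([sys.executable, "-c", "pass"])
    if killed:
        print("Process received SIGKILL; check the kernel log for OOM-killer entries.")
    else:
        print(f"Process exited with code {code}; no SIGKILL observed.")
```

If the flag comes back true, the usual next step is to lower memory use (smaller batch size, fewer parallel build jobs) or raise the container memory limit.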
@LeiWang1999 Thank you for replying. Where can I include the header? I checked tilelang/jit/adapter/libgen.py, and there is only a lib path. Is it COMPOSABLE_KERNEL_INCLUDE_DIR?
@LeiWang1999 The header is not necessary. Any method that lets me run the corresponding __global__ function is fine. Can I find the symbol of the generated function...