OccupyMars2025
## It seems that `sp_out.to_dense().numpy()` causes the error, while `sp_out = paddle.incubate.sparse.reshape(sp_x, new_shape)` appears to compute correctly; let me test this
## How to fix it? ## The Chinese comment may be the cause of the error, so I translated the Chinese comment into English: https://www.cnblogs.com/VVingerfly/p/13751289.html
## but in AI Studio a dense tensor has no problems, so maybe the cause is that `paddle.reshape` and `paddle.incubate.sparse.reshape` are operating on the same paddle tensor. ## How to fix it...
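One way to test the shared-tensor hypothesis is to give each branch its own copy of the input, shown here as a plain NumPy sketch (`origin`, `dense_input`, and `sparse_input` are hypothetical names; in the real test each copy would be wrapped with `paddle.to_tensor(...)` and converted with `.to_sparse_coo(...)` respectively):

```python
import numpy as np

# Sketch of the "shared input" hypothesis: give the dense and sparse
# reshape branches independent copies so they cannot interfere through
# a shared buffer.
origin = np.random.rand(2, 3)
dense_input = origin.copy()    # would feed paddle.reshape
sparse_input = origin.copy()   # would feed paddle.incubate.sparse.reshape

# Independent buffers, identical contents:
print(dense_input is sparse_input)                # False
print(np.array_equal(dense_input, sparse_input))  # True
```

If the error disappears with independent copies, the two reshape paths really were interfering through the shared tensor.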
## You need to add `if paddle.is_compiled_with_cuda():`
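The guard could be folded into a small device-selection helper; a sketch, assuming only `paddle.is_compiled_with_cuda()` from Paddle's public API (`select_test_devices` is a hypothetical helper name):

```python
def select_test_devices(cuda_available):
    """Return the list of devices the test should iterate over."""
    devices = ["cpu"]
    if cuda_available:  # guard GPU cases behind the compile-time check
        devices.append("gpu")
    return devices

# In the real test this would be driven by Paddle itself:
#   for device in select_test_devices(paddle.is_compiled_with_cuda()):
#       paddle.set_device(device)
#       ... run the forward/backward checks ...
print(select_test_devices(False))  # ['cpu']
print(select_test_devices(True))   # ['cpu', 'gpu']
```

This keeps CPU coverage on every build while skipping GPU cases on CPU-only installs.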
# It may seem that the `numpy()` method of a CUDA dense tensor reports the error, but I found that actually `sp_out = paddle.incubate.sparse.reshape(sp_x, new_shape)` causes the error, in which `sp_x`...
# The following picture shows that at least the forward COO kernel of sparse reshape on CPU works correctly.
## This is the reason for `dense_x.grad.numpy() * mask`:

```python
dense_out.backward()
sp_out.backward()
np.testing.assert_allclose(sp_x.grad.to_dense().numpy(),
                           dense_x.grad.numpy() * mask,
                           # dense_x.grad.numpy(),
                           rtol=1e-05)
```
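The need for `* mask` can be seen with plain NumPy (a sketch; `mask` here plays the same role as the mask used to build the sparse input): reshape is an element-wise identity, so the upstream gradient flows back unchanged on the dense path, but a COO tensor only stores, and therefore only receives gradient at, its non-zero positions.

```python
import numpy as np

np.random.seed(0)
mask = (np.random.rand(2, 3) > 0.5).astype(np.float64)

# reshape is element-wise identity, so an upstream gradient of ones
# flows back unchanged on the dense path:
dense_grad = np.ones(6).reshape(2, 3)

# the sparse path only holds values where mask == 1, so its gradient is
# the dense gradient restricted to those positions:
sparse_grad = dense_grad * mask

# the raw comparison without * mask fails wherever mask == 0:
print(np.allclose(sparse_grad, dense_grad))         # False
print(np.allclose(sparse_grad, dense_grad * mask))  # True
```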
## There seems to be numerical instability in the backward computation on CPU. Run the test case multiple times: sometimes the two grad tensors have the same values and sometimes they do not...
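One common source of such flaky comparisons (an assumption on my part, not confirmed above) is that `rtol` alone is meaningless when the reference value is zero; adding an absolute tolerance `atol` makes the check stable for near-zero gradients:

```python
import numpy as np

a = np.array([1e-12, 1.0])   # e.g. a gradient that should be exactly zero
b = np.array([0.0, 1.0])

# The check is |a - b| <= atol + rtol * |b|: when b == 0 the rtol term
# vanishes, so without atol any rounding noise in `a` fails the check.
print(np.allclose(a, b, rtol=1e-05, atol=0.0))    # False
print(np.allclose(a, b, rtol=1e-05, atol=1e-08))  # True
```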
## All test cases for the CPU forward COO kernel pass, so my forward computation logic is correct, but the GPU forward COO kernel doesn't work.
## Key points to keep in mind from the [PHI operator library design doc (飞桨高可复用算子库 PHI 设计文档)](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/phi/design_cn.md):
- Decide whether a cross-device data copy is needed.
- Split compilation by training vs. inference scenario; for example, inference does not compile backward kernels, nor forward kernels that have Intermediate outputs.
- In the long run, support a unified, intuitive way of writing cross-device kernels without introducing unnecessary template parameters. Explanation: below the operator library sits the Kernel Primitive API module, whose long-term vision is that each operation needs only one kernel to adapt to multiple devices, with the device-specific code living only in the Kernel Primitive API implementation; in the future, when reusing kernels requires passing fairly complex template parameters, the parameters need to be kept as simple as possible.
- For Tensor, ...