Paddle [Typing][debug] Temporary PR for monitoring full type annotation coverage; do not merge

PR Category

Others

PR Types

Others

Description

Temporary PR for monitoring full type annotation coverage; do not merge.

Related PR https://github.com/PaddlePaddle/Paddle/issues/65008

@SigureMo

Jun 24 '24 03:06 megemini

Your PR has been submitted. Thanks for your contribution! Please wait for the CI results first. See the Paddle CI Manual for details.

Jun 24 '24 03:06 paddle-bot[bot]

Question 1: How should `EagerParamBase` be handled?

Consider the following code:

            >>> from paddle import LazyGuard
            >>> from paddle.nn import Linear

            >>> with LazyGuard():
            ...     # w and b are initialized lazily and have no memory.
            ...     net = Linear(10, 10)
            ...
            >>> for param in net.parameters():
            ...     # Initialize param and allocate memory explicitly.
            ...     param.initialize()

Type checking fails here, because:

In [7]: type(param)
Out[7]: paddle.base.framework.EagerParamBase

Now compare the signature of `parameters` in `Layer`:

    def parameters(self, include_sublayers: bool = True) -> list[Tensor]:

In other words, `with LazyGuard` changes (adds to) the attributes of the returned Tensor.

I can think of two solutions so far:

Add the extra attributes of EagerParamBase to tensor.prototype.pyi
Have parameters return list[Tensor] | list[EagerParamBase]
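To make the mismatch concrete, here is a minimal sketch that uses stand-in classes instead of the real Paddle types. `Tensor`, `EagerParamBase` and `parameters` below are simplified placeholders, and the `initialized` flag is invented for illustration. It shows why a call to `initialize()` is rejected when the annotation says `list[Tensor]`, and how a caller could narrow the type with `isinstance` regardless of which option is chosen.

```python
from __future__ import annotations


class Tensor:
    """Stand-in for paddle.Tensor (hypothetical, heavily simplified)."""


class EagerParamBase(Tensor):
    """Stand-in for the parameter type seen under LazyGuard."""

    def __init__(self) -> None:
        self.initialized = False

    def initialize(self) -> None:
        # The real method allocates memory; this sketch only flips a flag.
        self.initialized = True


def parameters() -> list[Tensor]:
    # Annotated like Layer.parameters, but the items are EagerParamBase at runtime.
    return [EagerParamBase(), EagerParamBase()]


params = parameters()
for param in params:
    # A bare `param.initialize()` is rejected by a checker here,
    # because the static type of `param` is Tensor.
    if isinstance(param, EagerParamBase):
        param.initialize()  # accepted: narrowed to EagerParamBase
```

The narrowing costs one extra line per call site, which is why the thread looks for a fix on the annotation side instead.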

P.S. I will keep collecting problems like this in this PR.

Jun 24 '24 09:06 megemini

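Question 2: How should `[abstract]` errors be handled?

The problem is in python/paddle/distributed/fleet/utils/fs.py:

class LocalFS(FS) inherits from FS but does not implement all of its abstract methods, such as "cat", "download", "upload" and "upload_dir".

mypy error message: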

<string>:2:12: error: Cannot instantiate abstract class "LocalFS" with abstract attributes "cat", "download", "upload" and "upload_dir"  [abstract]

Possible solutions:

Add disable_error_code = "abstract" to the pyproject.toml config file
Modify the LocalFS source code to implement the methods above
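A minimal sketch of the situation, using stand-in classes rather than the real fs.py. `FS` and `LocalFS` below are simplified placeholders with made-up method bodies. The sketch assumes the base class marks methods with `abc.abstractmethod` without using `ABCMeta`; in that case Python still allows instantiation at runtime, while a static checker is expected to report `[abstract]`. The trailing comment shows a per-line ignore as one narrow way to silence it.

```python
import abc


class FS:
    """Stand-in base class (hypothetical): abstract methods, but no ABCMeta."""

    @abc.abstractmethod
    def ls_dir(self, fs_path: str) -> list:
        raise NotImplementedError

    @abc.abstractmethod
    def cat(self, fs_path: str) -> str:
        raise NotImplementedError


class LocalFS(FS):
    """Implements only part of the interface, like the case described above."""

    def ls_dir(self, fs_path: str) -> list:
        return []


# Runs fine because FS does not use ABCMeta, but a static checker
# is expected to flag this instantiation with [abstract].
client = LocalFS()  # type: ignore[abstract]
print(client.ls_dir("."))  # []
```

A per-line ignore keeps the rest of the project checked for `[abstract]`, whereas a project-wide `disable_error_code` would hide the same mistake everywhere.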

Jun 24 '24 10:06 megemini

Question 3: No other method may be placed between `@property` and `@xxx.setter`

The following test code exposes the problem:

                >>> import paddle.distributed.fleet as fleet
                >>> strategy = fleet.DistributedStrategy()
                >>> strategy.dgc = True
                >>> strategy.recompute = True
                >>> strategy.recompute_configs = {"checkpoints": ["x"]}
                >>> strategy.save_to_prototxt("dist_strategy.prototxt")

                >>> strategy.load_from_prototxt("dist_strategy.prototxt")

Error:

<string>:4:1: error: Property "recompute" defined in "DistributedStrategy" is read-only  [misc]

After tracing this down, the behavior of mypy when it checks a property can be reproduced with the following code:

from __future__ import annotations

import decorator # type: ignore

from typing import TYPE_CHECKING, Callable, TypeVar
from typing_extensions import ParamSpec

non_auto_func_called = True

_InputT = ParamSpec("_InputT")
_RetT = TypeVar("_RetT")
_RetT1 = TypeVar("_RetT1")
_RetT2 = TypeVar("_RetT2")


def __non_auto_func_called__(
    func: Callable[_InputT, _RetT]
) -> Callable[_InputT, _RetT]:
    def __impl__(*args: _InputT.args, **kwargs: _InputT.kwargs) -> _RetT:
        global non_auto_func_called
        non_auto_func_called = False
        return func(*args, **kwargs)

    return __impl__

def wrap_decorator(
    decorator_func: Callable[[Callable[_InputT, _RetT1]], Callable[_InputT, _RetT2]]
) -> Callable[[Callable[_InputT, _RetT1]], Callable[_InputT, _RetT2]]:
    @decorator.decorator
    def __impl__(
        func: Callable[_InputT, _RetT1], *args: _InputT.args, **kwargs: _InputT.kwargs
    ) -> _RetT2:
        wrapped_func = decorator_func(func)
        return wrapped_func(*args, **kwargs)

    return __impl__


is_strict_auto = wrap_decorator(__non_auto_func_called__)


class A:
    def __init__(self, rec: int) -> None:
        self._rec = rec

    @property
    def recompute(self) -> int:
        return self._rec

    @recompute.setter
    @is_strict_auto
    def recompute(self, rec: int) -> None:
        self._rec = rec

class B:
    def __init__(self, rec: int) -> None:
        self._rec = rec

    @property
    def recompute(self) -> int:
        return self._rec

    def tmp(self):
        """Another method cannot be placed between the getter and setter of `recompute`:
        test_func_wrap.py:70:6: error: Name "recompute" already defined on line 59  [no-redef]
        test_func_wrap.py:70:6: error: "Callable[[B], int]" has no attribute "setter"  [attr-defined]
        """
        pass

    @recompute.setter
    def recompute(self, rec: int) -> None:
        self._rec = rec

a = A(1)
a.recompute = 3

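As class B in the code above shows, `def tmp` cannot be inserted there.

However, DistributedStrategy in python/paddle/distributed/fleet/base/distributed_strategy.py has many properties whose @property and corresponding @xxx.setter are defined apart from each other, which causes the error. For example: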

    @property
    def recompute(self):
...
    @recompute.setter
    @is_strict_auto
    def recompute(self, flag):

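In testing, the problem goes away once the two are placed next to each other.

Solution:

Reorder @property and @xxx.setter in the source code so that each pair is adjacent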
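The reordering can be sketched as follows. `Strategy`, `describe` and this version of `is_strict_auto` are simplified stand-ins written for illustration, not the code in distributed_strategy.py. The point is only the layout: the getter and its setter are adjacent, and unrelated methods come after the complete property definition.

```python
import functools


def is_strict_auto(func):
    """Stand-in decorator (hypothetical): simply delegates to the wrapped function."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


class Strategy:
    def __init__(self) -> None:
        self._recompute = False

    # Getter and setter sit next to each other, with nothing in between.
    @property
    def recompute(self) -> bool:
        return self._recompute

    @recompute.setter
    @is_strict_auto
    def recompute(self, flag: bool) -> None:
        self._recompute = flag

    # Unrelated methods go after the complete property definition.
    def describe(self) -> str:
        return f"recompute={self._recompute}"


strategy = Strategy()
strategy.recompute = True
print(strategy.describe())  # recompute=True
```

Runtime behavior is the same in either order; only the static checker cares about adjacency.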

Jun 24 '24 10:06 megemini

Add the extra attributes of EagerParamBase to tensor.prototype.pyi

Could we use composition here? Protocol fits the idea of composition very well, much like interfaces in Java and TypeScript or traits in Rust.

from typing import Protocol

class Lazyable(Protocol):  # the lazy-only methods mentioned above
    def initialize(self): ...

class IrValue(Protocol):  # methods related to Value
    def is_dense_tensor_type(self): ...

class TensorBase(Protocol):
    def xxx(self): ...  # all of the existing methods

class Tensor(TensorBase, Lazyable, IrValue): ...  # composition of the three
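For readers less familiar with `typing.Protocol`, here is a small runnable sketch of the same composition idea. `TensorLike`, `FakeParam` and `warm_up` are hypothetical names introduced only for this illustration. It shows that a class satisfies a composed protocol structurally, without inheriting from it, and that listing `Protocol` again in the bases keeps the composed class a protocol.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Lazyable(Protocol):
    def initialize(self) -> None: ...


@runtime_checkable
class IrValue(Protocol):
    def is_dense_tensor_type(self) -> bool: ...


class TensorLike(Lazyable, IrValue, Protocol):
    """Composition: listing Protocol again keeps the result a protocol."""


class FakeParam:
    """Hypothetical class; it never inherits from the protocols above."""

    def initialize(self) -> None:
        print("initialized")

    def is_dense_tensor_type(self) -> bool:
        return True


def warm_up(value: TensorLike) -> bool:
    # Both methods are visible to a checker through the composed protocol.
    value.initialize()
    return value.is_dense_tensor_type()


print(warm_up(FakeParam()))  # prints "initialized", then True
print(isinstance(FakeParam(), Lazyable))  # True
```

In a stub file the composed `Tensor` can also be an ordinary class that inherits the protocols, as in the snippet above; the sketch only demonstrates how the pieces combine.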

Jun 24 '24 10:06 SigureMo

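However, DistributedStrategy in python/paddle/distributed/fleet/base/distributed_strategy.py has many properties whose @property and corresponding @xxx.setter are defined apart from each other, which causes the error. For example:

That is absurd, truly absurd... Still, we can simply change the code, and it will actually read better afterwards. I have to say, though, mypy is really weak here...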

Jun 24 '24 11:06 SigureMo

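Add the extra attributes of EagerParamBase to tensor.prototype.pyi

Could we use composition here? Protocol fits the idea of composition very well, much like interfaces in Java and TypeScript or traits in Rust.

from typing import Protocol

class Lazyable(Protocol):  # the lazy-only methods mentioned above
    def initialize(self): ...

class IrValue(Protocol):  # methods related to Value
    def is_dense_tensor_type(self): ...

class TensorBase(Protocol):
    def xxx(self): ...  # all of the existing methods

class Tensor(TensorBase, Lazyable, IrValue): ...  # composition of the three

Sure, the effect should be the same. Neither approach solves the question of when to expose the methods unique to EagerParamBase, though, since they are bound dynamically. I'll update tensor.prototype.pyi then.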

Jun 24 '24 12:06 megemini

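when to expose the methods unique to EagerParamBase

Nothing we can do about that. Static typing cannot cover every runtime trick; we only need to make sure the methods people need are hinted correctly.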

Jun 24 '24 12:06 SigureMo

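This PR is quite important. Over the weekend, when I was wondering what to work on, I wanted to run the full check locally to make sure the problems are converging. Overall, the other tasks can be pushed forward steadily by developers, but that work will inevitably affect examples we are not monitoring.

(Although when I ran it locally it failed, and I did not follow up...)

For future monitoring, I think a complete setup would be:

If example code is modified, run the mypy check on that example
If an API (including its type hints) is modified, run the mypy check on all example code, provided the run time stays manageable. I think 10 min is acceptable, since API changes are very rare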
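The two rules above could be expressed as a small selection step in a CI script. This is only a sketch under assumptions: `select_examples` and its inputs are hypothetical, and how a real pipeline detects "API changed" versus "example changed" is left open.

```python
from __future__ import annotations


def select_examples(
    changed_examples: set[str],
    api_changed: bool,
    all_examples: set[str],
) -> set[str]:
    """Return the names of the examples that should be type checked."""
    if api_changed:
        # An API or type-hint change can break any example, so check everything.
        return set(all_examples)
    # Otherwise only the examples that were edited need a re-check.
    return changed_examples & all_examples


all_examples = {"paddle.randn", "paddle.nn.Linear", "paddle.abs"}
print(sorted(select_examples({"paddle.abs"}, False, all_examples)))  # ['paddle.abs']
print(len(select_examples(set(), True, all_examples)))  # 3
```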

For now, though, we can work together on resolving the errors reported here.

Jun 24 '24 17:06 SigureMo

Question 4: Should `dtype` support `float`?

Consider this example:

import paddle
tensor = paddle.randn([512, 512, 512], "float")

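Does `float` here resolve to float32, float64, or some other type depending on the runtime environment?

DTypeLike currently only includes the floatXX names.

Possible solutions:

Add float to DTypeLike
Change the example code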
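The trade-off can be illustrated with a `Literal` alias. `FloatDTypeName` below is a hypothetical, shortened stand-in for the real DTypeLike alias, and `is_known_dtype_name` is a made-up helper that mirrors at runtime what a checker does statically: names outside the alias are rejected.

```python
from typing import Literal, get_args

# Hypothetical, shortened stand-in for the real DTypeLike alias.
FloatDTypeName = Literal["float16", "float32", "float64"]


def is_known_dtype_name(name: str) -> bool:
    """Runtime mirror of what a checker does with the Literal type."""
    return name in get_args(FloatDTypeName)


print(is_known_dtype_name("float32"))  # True
print(is_known_dtype_name("float"))    # False
```

Adding `"float"` to the alias would make the checker accept it everywhere, while changing the one example keeps the alias limited to names with an explicit width.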

Jun 25 '24 07:06 megemini

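Question 4: Should dtype support float?

It is only used in one place, right? And that example is not even testing this behavior, so I don't think we need to add it. Let's change the example instead; I don't think we should encourage usage with such ambiguous semantics.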

Jun 25 '24 09:06 SigureMo

Question 5: `distributed` has many broken code examples

For example:

2024-06-26 01:14:21 --------------------
2024-06-26 01:14:21 >>> Type hints with api paddle.distributed.sharding.save_group_sharded_model:1 start ...
2024-06-26 01:14:21 import paddle
2024-06-26 01:14:21 from paddle.nn import Linear
2024-06-26 01:14:21 from paddle.distributed import fleet
2024-06-26 01:14:21 from paddle.distributed.sharding import group_sharded_parallel, save_group_sharded_model
2024-06-26 01:14:21 fleet.init(is_collective=True)
2024-06-26 01:14:21 group = paddle.distributed.new_group([0, 1])
2024-06-26 01:14:21 model = Linear(1000, 1000)
2024-06-26 01:14:21 clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=1.0)
2024-06-26 01:14:21 optimizer = paddle.optimizer.AdamW(learning_rate=0.001, parameters=model.parameters(), weight_decay=0.00001, grad_clip=clip)
2024-06-26 01:14:21 model, optimizer, scaler = group_sharded_parallel(model, optimizer, "p_g", scaler=scaler)
2024-06-26 01:14:21 img, label = data
2024-06-26 01:14:21 label.stop_gradient = True
2024-06-26 01:14:21 img.stop_gradient = True
2024-06-26 01:14:21 out = model(img)
2024-06-26 01:14:21 loss = paddle.nn.functional.cross_entropy(input=out, label=label)
2024-06-26 01:14:21 loss.backward()
2024-06-26 01:14:21 optimizer.step()
2024-06-26 01:14:21 optimizer.clear_grad()
2024-06-26 01:14:21 save_group_sharded_model(model, optimizer, output=output_dir)
2024-06-26 01:14:21 >>> Results ...
2024-06-26 01:14:21 >>> mypy normal_report is ...
2024-06-26 01:14:21 <string>:10:83: error: Cannot determine type of "scaler"  [has-type]
2024-06-26 01:14:21 <string>:11:14: error: Name "data" is not defined  [name-defined]
2024-06-26 01:14:21 <string>:19:1: error: "save_group_sharded_model" gets multiple values for keyword argument "output"  [misc]
2024-06-26 01:14:21 <string>:19:51: error: Name "output_dir" is not defined  [name-defined]
2024-06-26 01:14:21 Found 4 errors in 1 file (checked 1 source file)

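There is no environment for verifying these examples (CI does not seem to test them either), so there are many errors here. How should we handle them?

Also, was Question 2 overlooked? 🫠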

Jun 26 '24 12:06 megemini

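There is no environment for verifying these examples (CI does not seem to test them either), so there are many errors here. How should we handle them?

Fix what can be fixed and leave the rest as is. Or set up a blacklist mechanism and skip some APIs for now.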
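A blacklist of this kind could be as small as a list of name patterns consulted before each check. The sketch below is an assumption about how it might look, not an existing mechanism: `SKIPPED_API_PATTERNS` and `should_check` are hypothetical names, and which APIs to list is a project decision.

```python
from __future__ import annotations

import fnmatch

# Hypothetical skip patterns; the entry below is only an illustration.
SKIPPED_API_PATTERNS = ["paddle.distributed.sharding.*"]


def should_check(api_name: str, skipped: list[str] = SKIPPED_API_PATTERNS) -> bool:
    """Return False for APIs whose examples are skipped for now."""
    return not any(fnmatch.fnmatchcase(api_name, pattern) for pattern in skipped)


apis = ["paddle.abs", "paddle.distributed.sharding.save_group_sharded_model"]
print([name for name in apis if should_check(name)])  # ['paddle.abs']
```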

Jun 26 '24 12:06 SigureMo

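Question 2: How should [abstract] errors be handled?

Oh, I did not notice that a decision was needed here.

Add disable_error_code = "abstract" to the pyproject.toml config file

Does it support file level? I think disabling it only at the file level would be more appropriate.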

Jun 26 '24 12:06 SigureMo

Does it support file level? I think disabling it only at the file level would be more appropriate.

https://github.com/PaddlePaddle/Paddle/pull/65496 adds an ignore in fs.py. Local testing is OK.

Jun 26 '24 13:06 megemini

Question 6: Should `weight_attr` be annotated with multiple types?

Take this example:

import paddle
import paddle.nn as nn
linear = nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal())
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

The weight_attr in nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal()) can actually accept an Initializer or several other types, for example:

# Method 1: use an Initializer
import paddle
import paddle.nn as nn
linear = nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal())
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

# Method 2: use ParamAttr
import paddle
import paddle.nn as nn
from paddle import ParamAttr
weight_attr = ParamAttr(initializer=nn.initializer.KaimingNormal())
linear = nn.Linear(2, 4, weight_attr=weight_attr)
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

# Method 3: use a str
import paddle
import paddle.nn as nn
from paddle import ParamAttr
linear = nn.Linear(2, 4, weight_attr='weight')
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

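However, the weight_attr annotations on Layer classes only say ParamAttr.

Possible solutions:

Re-annotate the Layer-related weight_attr as a Union
Standardize the examples on Method 2 above, i.e. convert to ParamAttr first and then pass it in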
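The first option can be sketched with stand-in classes. `ParamAttr` and `Initializer` below are simplified placeholders rather than the Paddle classes, and `ParamAttrLike` and `to_param_attr` are hypothetical names. The sketch shows a Union alias covering the three usages above, plus a normalization step that performs the Method 2 conversion on behalf of the caller.

```python
from __future__ import annotations

from typing import Union


class Initializer:
    """Stand-in for paddle.nn.initializer.Initializer (hypothetical)."""


class ParamAttr:
    """Stand-in for paddle.ParamAttr (hypothetical, simplified)."""

    def __init__(
        self,
        name: str | None = None,
        initializer: Initializer | None = None,
    ) -> None:
        self.name = name
        self.initializer = initializer


# Hypothetical alias name covering the three usages shown above.
ParamAttrLike = Union[ParamAttr, Initializer, str]


def to_param_attr(arg: ParamAttrLike) -> ParamAttr:
    """Normalize any accepted form to ParamAttr."""
    if isinstance(arg, ParamAttr):
        return arg
    if isinstance(arg, Initializer):
        return ParamAttr(initializer=arg)
    return ParamAttr(name=arg)


print(to_param_attr("weight").name)  # weight
```

With an alias like this, only the alias needs updating if more accepted forms turn up later.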

Jun 28 '24 06:06 megemini

Re-annotate the Layer-related weight_attr as a Union

Go with this one. It looks like undocumented behavior, but it is used quite a lot.

Jun 28 '24 06:06 SigureMo

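Re-annotate the Layer-related weight_attr as a Union

Go with this one. It looks like undocumented behavior, but it is used quite a lot.

Yeah. The main problem now is that it is unclear exactly which parameters can be used this way ... ...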

Jun 28 '24 06:06 megemini

Update 20240703

Tested with mypy == 1.10.1

Mypy 1.10.1 Fix error reporting on cached run after uninstallation of third party library (Shantanu, PR https://github.com/python/mypy/pull/17420)

No noticeable changes, so the version in the main repository will not be updated.

Jul 02 '24 17:07 megemini
