
Can we replace only some nn.Linear layers with te.Linear and keep the others unchanged?

Open zigzagcai opened this issue 8 months ago • 5 comments

zigzagcai avatar Mar 20 '25 11:03 zigzagcai

I'm not sure what you mean - whether you want to run some Linear layers in FP8 and the rest in higher precision, or, for example, run the forward pass in FP8 and the backward pass in high precision. Both of these scenarios will be possible once this PR is merged (hopefully this week).

pggPL avatar Mar 20 '25 12:03 pggPL

I'm not sure what you mean - whether you want to run some Linear layers in FP8 and the rest in higher precision, or, for example, run the forward pass in FP8 and the backward pass in high precision. Both of these scenarios will be possible once this PR is merged (hopefully this week).

Thank you! I mean running some layers in FP8 and others in high precision.

zigzagcai avatar Mar 20 '25 14:03 zigzagcai

Yes, you can do that. You can either leave some layers as nn.Linear, or you can nest the fp8_autocast context manager, something like this:

from transformer_engine.pytorch import fp8_autocast

with fp8_autocast(enabled=True):
    y = te_linear1(x)  # will compute in FP8
    with fp8_autocast(enabled=False):
        z = te_linear2(y)  # will compute in high precision
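The first option (leaving some layers as nn.Linear) can also be done programmatically when the model is already built. A minimal sketch, assuming a hypothetical swap_linears helper (not part of TransformerEngine) that replaces only the named submodules:

```python
import torch.nn as nn

def swap_linears(model, names, factory):
    """Replace the nn.Linear children whose names are in `names` with
    modules built by factory(in_features, out_features); recurse into
    everything else. This helper is illustrative, not a TE API."""
    for name, child in model.named_children():
        if name in names and isinstance(child, nn.Linear):
            setattr(model, name, factory(child.in_features, child.out_features))
        else:
            swap_linears(child, names, factory)
    return model

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)  # will be swapped
        self.fc2 = nn.Linear(32, 16)  # stays as nn.Linear (high precision)

# In practice the factory would be transformer_engine.pytorch.Linear,
# so only fc1 runs in FP8 under fp8_autocast:
# model = swap_linears(MLP(), {"fc1"}, te.Linear)
```

Layers left as plain nn.Linear compute in high precision regardless of the surrounding fp8_autocast context, since only TE modules participate in FP8 execution.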

ptrendx avatar Mar 24 '25 16:03 ptrendx

Both of these scenarios will be possible once this https://github.com/NVIDIA/TransformerEngine/pull/1441 is merged (hopefully this week).

Hi @pggPL, it looks like the original PR has been closed and split into 4 PRs. May I know when we can expect these changes to be merged into TE?

lengerfulluse avatar Mar 27 '25 17:03 lengerfulluse

I want to merge them as soon as possible. There was a temporary shortage of reviewers due to other, higher-priority deadlines, but I hope they will be merged soon.

pggPL avatar Mar 31 '25 09:03 pggPL