AITemplate

facebookincubator

Add log1p elementwise op

Open 22quinn opened this issue 11 months ago • 5 comments

Summary: log1p(x) is more precise than log(1+x) when x is close to 0. For fp32, we use the CUDA log1pf implementation. For other precision types, the input is first converted to float, log1pf is computed, and the output is then converted back to the original precision.

CUDA log1pf documentation (single-precision math API; the double-precision counterpart is log1p): https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__SINGLE.html
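
To illustrate the precision argument and the widen/compute/narrow approach described in the summary, here is a minimal host-side sketch in Python. It is not the AITemplate kernel: `round_to` and `log1p_via_fp32` are hypothetical helper names, and `math.log1p` followed by rounding to fp32 is only a stand-in for CUDA's `log1pf`.

```python
import math
import struct


def round_to(fmt, value):
    """Round a Python float to the nearest value representable in a
    narrower IEEE 754 format ('e' = fp16, 'f' = fp32)."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def log1p_via_fp32(x_half):
    """Hypothetical emulation of the non-fp32 path: widen the input to
    fp32, compute log1p there, then narrow the result back to fp16."""
    widened = round_to("f", x_half)               # fp16 -> fp32 is exact
    result = round_to("f", math.log1p(widened))   # stand-in for log1pf
    return round_to("e", result)                  # fp32 -> fp16


# Near zero, forming 1 + x first rounds away most of x's digits.
x = 1e-10
naive = math.log(1.0 + x)
precise = math.log1p(x)
print(f"log(1 + x) = {naive!r}")
print(f"log1p(x)   = {precise!r}")

# Half-precision input routed through the fp32 computation.
x_half = round_to("e", 1e-3)
y_half = log1p_via_fp32(x_half)
print(f"fp16 input {x_half!r} -> fp16 output {y_half!r}")
```

Since log1p(x) is approximately x - x^2/2 for small x, the `log1p` result should stay very close to `x`, while the `log(1 + x)` result carries the rounding error introduced when `1 + x` is formed.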

Differential Revision: D54176180

Feb 26 '24 16:02 22quinn

This pull request was exported from Phabricator. Differential Revision: D54176180

Feb 26 '24 16:02 facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D54176180

Mar 01 '24 01:03 facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D54176180

Mar 01 '24 01:03 facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D54176180

Mar 01 '24 01:03 facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D54176180

Mar 01 '24 05:03 facebook-github-bot