feat: Implement FP8 functionality
Description
This PR adds support for the FP8 and BF16 datatypes. It also implements a converter for FP8 quantized ops.
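For readers unfamiliar with FP8, the numerics behind a quantize op can be sketched in plain Python. This is a minimal reference sketch and not the converter added in this PR: it assumes the E4M3 format (3 mantissa bits, largest finite value 448) and a per-tensor scale of `amax / 448`. The helper names `round_to_e4m3` and `fake_quantize_fp8` are hypothetical.

```python
import math

E4M3_MAX = 448.0     # largest finite magnitude in the E4M3 format (assumed here)
MANTISSA_BITS = 3    # explicit mantissa bits in E4M3
MIN_NORMAL_EXP = -6  # exponent of the smallest normal E4M3 value


def round_to_e4m3(value: float) -> float:
    """Round to the nearest E4M3-representable value, saturating at +/-448."""
    if value == 0.0 or math.isnan(value):
        return value
    magnitude = min(abs(value), E4M3_MAX)
    # frexp returns magnitude = m * 2**exp with m in [0.5, 1), so the binade is exp - 1.
    # Subnormals share the spacing of the smallest normal binade.
    exponent = max(math.frexp(magnitude)[1] - 1, MIN_NORMAL_EXP)
    step = 2.0 ** (exponent - MANTISSA_BITS)
    # Python's round() breaks ties to even, matching round-to-nearest-even.
    rounded = min(round(magnitude / step) * step, E4M3_MAX)
    return math.copysign(rounded, value)


def fake_quantize_fp8(values, amax):
    """Hypothetical helper: scale into FP8 range, round, then scale back."""
    scale = amax / E4M3_MAX
    return [round_to_e4m3(v / scale) * scale for v in values]
```

A reference like this is mainly useful for checking a converter's output against expected values within a tolerance; the real op's signature and scaling convention may differ.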
Type of change
Please delete options that are not relevant and/or add your own.
- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing functionality to not work as expected)
- This change requires a documentation update
Checklist:
- [ ] My code follows the style guidelines of this project (You can use the linters)
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas and hacks
- [ ] I have made corresponding changes to the documentation
- [ ] I have added tests to verify my fix or my feature
- [ ] New and existing unit tests pass locally with my changes
- [ ] I have added the relevant labels to my PR so that relevant reviewers are notified
@peri044 I remember we already removed the cuDNN dependency on release/2.3, but it is still imported here: https://github.com/pytorch/TensorRT/blob/3f6999d6ab2f9b62c63b78e8405cacf216370214/py/torch_tensorrt/__init__.py#L9
This causes an error when I import torch-trt:
>>> import torch_tensorrt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scratch.zewenl_sw/docker_workspace/TRT10/TensorRT/py/torch_tensorrt/__init__.py", line 7, in <module>
    from torch_tensorrt._version import (  # noqa: F401
ImportError: cannot import name '__cudnn_version__' from 'torch_tensorrt._version' (/home/scratch.zewenl_sw/docker_workspace/TRT10/TensorRT/py/torch_tensorrt/_version.py)
Can you take a look?
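For reference, this failure mode can be reproduced in isolation: a `from module import name` statement raises `ImportError` when the module loads but no longer defines the name. A minimal sketch follows, in which the `fake_version` stub module and its single attribute are hypothetical stand-ins for `_version.py`, not its real contents.

```python
import sys
import types

# Hypothetical stand-in for a version module whose cuDNN field has been removed.
stub = types.ModuleType("fake_version")
stub.__version__ = "0.0.0"
sys.modules["fake_version"] = stub


def try_import_cudnn_version():
    """Return the ImportError message, or None if the import succeeds."""
    try:
        from fake_version import __cudnn_version__  # noqa: F401
    except ImportError as err:
        return str(err)
    return None


message = try_import_cudnn_version()
```

The fix is to drop the stale name from the import list in `__init__.py` so it matches what the version module actually defines.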
Fixed it now.
Cool, thanks! Did you also implement a unit test for torch.ops.trt.quantize_fp8.default?
@peri044 Thanks for the comments. I have refactored based on your suggestions.