
feat: Implement FP8 functionality

Open peri044 opened this issue 1 year ago • 4 comments

Description

This PR adds FP8 & BF16 datatype support. It also implements converters for FP8 quantized ops.
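
Not part of the PR itself, but as a rough sketch of how the new datatypes could be exercised from the user-facing API, assuming this PR wires FP8/BF16 through the existing `enabled_precisions` argument of `torch_tensorrt.compile`:

```python
import torch
import torch_tensorrt

# Small placeholder model; any traceable nn.Module would do.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
).eval().cuda()

inputs = [torch.randn(8, 128, device="cuda")]

# Assumption: with this PR, float8_e4m3fn and bfloat16 can be passed as
# enabled precisions so TensorRT may select FP8/BF16 kernels.
trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=inputs,
    enabled_precisions={torch.float8_e4m3fn, torch.bfloat16},
)

print(trt_model(*inputs).shape)
```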

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • [ ] My code follows the style guidelines of this project (You can use the linters)
  • [ ] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have added tests to verify my fix or my feature
  • [ ] New and existing unit tests pass locally with my changes
  • [ ] I have added the relevant labels to my PR so that the relevant reviewers are notified

peri044 avatar Apr 18 '24 23:04 peri044

@peri044 I remember we've already removed the cuDNN dependency on release/2.3, but it still remains here: https://github.com/pytorch/TensorRT/blob/3f6999d6ab2f9b62c63b78e8405cacf216370214/py/torch_tensorrt/init.py#L9

This causes an error when I import torch-trt:

>>> import torch_tensorrt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scratch.zewenl_sw/docker_workspace/TRT10/TensorRT/py/torch_tensorrt/__init__.py", line 7, in <module>
    from torch_tensorrt._version import (  # noqa: F401
ImportError: cannot import name '__cudnn_version__' from 'torch_tensorrt._version' (/home/scratch.zewenl_sw/docker_workspace/TRT10/TensorRT/py/torch_tensorrt/_version.py)

Can you take a look?

zewenli98 avatar May 17 '24 21:05 zewenli98

Fixed it now.
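
For reference, the fix presumably amounts to dropping `__cudnn_version__` from the imports in `py/torch_tensorrt/__init__.py`; a minimal sketch (the exact set of remaining names is an assumption):

```python
# py/torch_tensorrt/__init__.py (sketch)
# Only import the version fields that _version.py still defines;
# __cudnn_version__ went away together with the cuDNN dependency.
from torch_tensorrt._version import (  # noqa: F401
    __version__,
    __tensorrt_version__,
)
```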

peri044 avatar May 17 '24 21:05 peri044

Cool, thanks! And did you implement the unit test for torch.ops.trt.quantize_fp8.default?
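
For what it's worth, a minimal standalone check could look roughly like the sketch below; the `(input, scale)` signature and the quantize-dequantize semantics assumed for `torch.ops.trt.quantize_fp8` are guesses, since the real test would target the converter through the dynamo conversion test harness:

```python
import pytest
import torch
import torch_tensorrt  # assumption: importing this registers the torch.ops.trt ops


@pytest.mark.skipif(not torch.cuda.is_available(), reason="requires CUDA")
def test_quantize_fp8_roundtrip():
    # Assumed signature: quantize_fp8(input, scale); the op added by this PR
    # may differ, so treat this as illustrative only.
    x = torch.randn(4, 16, device="cuda")
    scale = torch.tensor(1.0, device="cuda")

    out = torch.ops.trt.quantize_fp8(x, scale)

    # Reference: simulate FP8 quantize/dequantize by casting through
    # float8_e4m3fn and scaling back.
    ref = (x / scale).to(torch.float8_e4m3fn).to(torch.float32) * scale

    torch.testing.assert_close(out.to(torch.float32), ref, rtol=1e-1, atol=1e-1)
```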

zewenli98 avatar May 17 '24 21:05 zewenli98

@peri044 Thanks for the comments. I have refactored based on your suggestions.

zewenli98 avatar May 21 '24 23:05 zewenli98