InternImage
Implement CPU operator for DCNv3
Purpose
- Make small-scale vision models that use DCNv3 easier to deploy on different devices (the same precompiled package can fall back to CPU)
Work
- Provide a CPU fallback by modifying dcnv3.h
- Modify setup.py to provide a CPU-only compilation method and enable O2 optimization
- Fully implement the CPU operators
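As a rough illustration of the CPU-only build path, the source and flag selection in setup.py might look like the sketch below. The helper name `select_build_config`, the `cpu/` and `cuda/` directory layout, and the `WITH_CUDA` macro are assumptions made for illustration, not confirmed details of this PR. The extension class is returned by name so that the sketch runs without PyTorch installed.

```python
import glob
import os


def select_build_config(src_dir, use_cuda):
    """Pick sources, macros, and flags for a CPU-only or CUDA build.

    Hypothetical helper: the directory layout and macro name are assumptions.
    """
    # Binding sources and CPU operator sources are compiled in both modes.
    sources = sorted(glob.glob(os.path.join(src_dir, "*.cpp")))
    sources += sorted(glob.glob(os.path.join(src_dir, "cpu", "*.cpp")))

    config = {
        # Name of the torch.utils.cpp_extension class to instantiate.
        "extension": "CppExtension",
        "sources": sources,
        "define_macros": [],
        # O2 optimization for the host (C++) compiler.
        "extra_compile_args": {"cxx": ["-O2"]},
    }

    if use_cuda:
        # Add the CUDA kernels and switch to the CUDA-aware extension class.
        cuda_sources = sorted(glob.glob(os.path.join(src_dir, "cuda", "*.cu")))
        config["extension"] = "CUDAExtension"
        config["sources"] = sources + cuda_sources
        config["define_macros"] = [("WITH_CUDA", None)]
        config["extra_compile_args"]["nvcc"] = []
    return config
```

In a real setup.py, `use_cuda` would typically be derived from something like `torch.cuda.is_available()` together with `CUDA_HOME` from `torch.utils.cpp_extension`, and the returned name would be mapped to `CppExtension` or `CUDAExtension` from the same module.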
Effects
- Changes to the original code and compilation methods are kept to a minimum
- The accuracy tests pass (these are the original tests, with the CUDA interfaces replaced by the corresponding CPU interfaces)
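An accuracy check of this kind usually comes down to comparing the operator output element by element against a reference output. The sketch below shows one way to do that on flat lists of floats. The function name `compare_outputs` and the tolerance defaults are assumptions for illustration; the actual tests operate on tensors and may use different thresholds.

```python
def compare_outputs(candidate, reference, rtol=1e-2, atol=1e-3):
    """Return (ok, max_abs_err, max_rel_err) for two flat sequences of floats.

    Hypothetical helper: the tolerances are illustrative defaults.
    """
    if len(candidate) != len(reference):
        raise ValueError("outputs must have the same number of elements")

    ok = True
    max_abs_err = 0.0
    max_rel_err = 0.0
    for c, r in zip(candidate, reference):
        abs_err = abs(c - r)
        max_abs_err = max(max_abs_err, abs_err)
        # Relative error is only defined where the reference is non-zero.
        if r != 0.0:
            max_rel_err = max(max_rel_err, abs_err / abs(r))
        # allclose-style criterion: |c - r| <= atol + rtol * |r|
        if abs_err > atol + rtol * abs(r):
            ok = False
    return ok, max_abs_err, max_rel_err
```

With tensors, the same criterion is available as `torch.allclose(candidate, reference, rtol=..., atol=...)`. Reporting the maximum absolute and relative errors alongside the pass/fail flag makes it easier to see how close a failing run is to the threshold.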
Issues
- On a multi-core x86 CPU with SIMD support, the operator runs at only 0.27x the speed of the PyTorch CPU version