# Upgrade Ditto

## Summary
This PR modernizes the Ditto codebase to work with newer PyTorch versions and improves cross-platform compatibility. The changes eliminate the dependency on Apex (whose build requires Visual Studio 2019 tools that can no longer be installed for free), move to a newer Python version (Python 3.12.11 on Windows), and leverage native PyTorch features for automatic mixed precision training.
## Motivation
When working with the original Ditto implementation, I encountered several blockers:
- Apex dependency: Required Microsoft Visual Studio 2019 tools to build, which are no longer freely available, preventing the use of automatic mixed precision training
- Limited model compatibility: Could not load newer pre-trained language models like `microsoft/deberta-v3-small`
- Outdated dependencies: Newer PyTorch versions natively support features like the `AdamW` optimizer and automatic mixed precision, eliminating the need for external dependencies
## Changes

### Cross-Platform Compatibility
- ✅ Encode loaded data using UTF-8 to handle diverse character sets
- ✅ Normalize path separators for seamless operation across Windows, Linux, and macOS
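Both fixes live in the data-loading path. Here is a minimal sketch of the pattern, using a hypothetical `read_examples` helper (the actual Ditto loader names differ):

```python
import os

def read_examples(path):
    """Hypothetical loader illustrating the two portability fixes."""
    # Normalize separators so the same path string works on Windows, Linux, and macOS
    path = os.path.normpath(path)
    # Read with an explicit UTF-8 encoding instead of the platform default
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```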
### Modernized PyTorch Integration
- ✅ Import `AdamW` from `torch.optim` instead of `transformers`
- ✅ Replace Apex with native `torch.amp` for automatic mixed precision
- ✅ Implement gradient scaling by default when AMP is enabled
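To illustrate the pattern these changes follow, here is a self-contained sketch of a training step; the linear model, learning rate, and random data are placeholders, not Ditto's actual values:

```python
import torch
from torch import nn
from torch.optim import AdamW  # previously imported from transformers; Apex no longer needed

use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = nn.Linear(16, 2).to(device)           # stand-in for the Ditto model
optimizer = AdamW(model.parameters(), lr=3e-5)
scaler = torch.amp.GradScaler(device, enabled=use_amp)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 16, device=device)
y = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
with torch.amp.autocast(device_type=device, enabled=use_amp):
    loss = loss_fn(model(x), y)
# Gradient scaling guards against fp16 underflow; it is a no-op when AMP is disabled
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```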
### Enhanced Mixed Precision Support
- ✅ Add AMP support in model evaluation for faster inference on modern GPUs
- ✅ Ensure AMP consistency between training and evaluation steps
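A sketch of the evaluation-side counterpart, mirroring the training setup above (again with a stand-in model rather than Ditto's):

```python
import torch
from torch import nn

use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"
model = nn.Linear(16, 2).to(device).eval()    # stand-in for the trained model

x = torch.randn(8, 16, device=device)
# Run inference under the same autocast setting used during training
with torch.no_grad(), torch.amp.autocast(device_type=device, enabled=use_amp):
    preds = model(x).argmax(dim=-1)
```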
### Command-Line Interface Improvements
- ✅ Rename the `--fp16` argument to `--amp` for clarity and accuracy
- ✅ Add an explicit `--use_gpu` argument for GPU training control
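A minimal sketch of how the renamed flags might be wired up; the help strings are assumptions, not the PR's exact wording:

```python
import argparse

parser = argparse.ArgumentParser()
# Renamed from --fp16: torch.amp covers more than fp16, so "amp" is the accurate name
parser.add_argument("--amp", action="store_true", help="enable automatic mixed precision")
parser.add_argument("--use_gpu", action="store_true", help="train on GPU when available")
args = parser.parse_args()
```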
### Environment Updates
- ✅ Upgraded to Python 3.12.11
- ✅ Created `updated_requirements.txt` with modern library versions
- ✅ Added a `.gitignore` file for better project hygiene