mhabedank

Results 8 issues of mhabedank

## Changes
- Added comprehensive testing matrix across multiple environments:
  - Operating systems: Ubuntu, Windows, and macOS
  - Python versions: 3.10, 3.11, 3.12, and 3.13
- Implemented efficient caching strategy...

This PR tries to solve the dependency problems we run into when moving to modern Python versions. Things currently done:
- moved from the outdated build system and requirements files to...

The [BERTTokenizer](https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L1109) uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.

help wanted
dependency

The [NgramTokenizer](https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L135) uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.

help wanted
dependency
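As a rough illustration of what a torchtext-free n-gram step could look like, here is a minimal pure-Python sketch. It assumes the desired behavior is a whitespace split followed by emitting every token and then every n-gram up to a maximum size, joined by spaces. The names `ngrams_iterator` and `ngram_tokenize` are hypothetical and are not taken from the Ludwig code base.

```python
from typing import Iterator, List


def ngrams_iterator(tokens: List[str], max_n: int) -> Iterator[str]:
    """Yield every token, then every n-gram for n = 2..max_n, joined by spaces."""
    yield from tokens
    for n in range(2, max_n + 1):
        # Zipping n shifted views of the list gives each window of n consecutive tokens.
        for window in zip(*(tokens[i:] for i in range(n))):
            yield " ".join(window)


def ngram_tokenize(text: str, max_n: int = 2) -> List[str]:
    """Hypothetical stand-in: whitespace split followed by n-gram expansion."""
    return list(ngrams_iterator(text.split(), max_n))
```

A sketch like this only needs the standard library, so it would not add a new dependency in place of torchtext. Whether it matches the existing tokenizer's output exactly would need to be checked against the current tests.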

**Describe the bug**
Could not install Ludwig on Kaggle.

**To Reproduce**
Steps to reproduce the behavior:
1. Create a notebook
2. Run `!pip install ludwig`

**Expected behavior**
Ludwig...

The [CLIPTokenizer](https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L1071) uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.

help wanted
dependency

The [GPT2BPETokenizer](https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L1085) uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.

help wanted
dependency

The [SentencePieceTokenizer](https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L1014) uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.

help wanted
dependency