pass base_model instead of model_type
fix passing model type to cce_patch
File "/workspace/ml-cross-entropy/cut_cross_entropy/transformers/patch.py", line 202, in cce_patch
return PATCH_FNS[model_type](model_type_or_model, patch_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/ml-cross-entropy/cut_cross_entropy/transformers/kimi_linear.py", line 106, in patch_kimi_linear
model_config = AutoConfig.from_pretrained(maybe_model, trust_remote_code=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1328, in from_pretrained
config_dict, unused_kwargs = PreTrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 616, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 674, in _get_config_dict
resolved_config_file = cached_file(
^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 326, in cached_file
file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 499, in cached_files
raise OSError(
OSError: kimi_linear is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`
Motivation and Context
https://github.com/axolotl-ai-cloud/ml-cross-entropy/pull/29/files https://github.com/axolotl-ai-cloud/axolotl/pull/3257/files
Summary by CodeRabbit
Release Notes
- New Features
  - Added support for specifying a custom model path in the cut cross entropy integration configuration.
  - Improved remote code trust handling with conditional logic for model path selection.
📝 Walkthrough
The PR adds support for configurable model path selection in the Cut Cross Entropy integration, introducing a new optional configuration field and updating the patch initialization logic to conditionally use a preferred model path when available and trust remote code settings apply.
Changes
| Cohort / File(s) | Summary |
|---|---|
| Cut Cross Entropy Integration: `src/axolotl/integrations/cut_cross_entropy/args.py`, `src/axolotl/integrations/cut_cross_entropy/__init__.py` | Added new optional `cut_cross_entropy_model_path` field to configuration args. Updated `pre_model_load` to compute model identifier (preferring `cut_cross_entropy_model_path` over `base_model`) and pass it to `cce_patch` as `model_name_or_path` when `trust_remote_code` is enabled. |
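The selection logic summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Cfg` dataclass and the `build_cce_patch_kwargs` helper are hypothetical names, and the sketch only assumes the behavior described in the summary (prefer `cut_cross_entropy_model_path`, fall back to `base_model`, and pass the result as `model_name_or_path` only when `trust_remote_code` is enabled).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cfg:
    """Hypothetical stand-in for the axolotl config; only the fields discussed here."""

    base_model: str
    model_config_type: str
    trust_remote_code: bool = False
    cut_cross_entropy_model_path: Optional[str] = None


def build_cce_patch_kwargs(cfg: Cfg) -> dict:
    """Return the extra keyword arguments to forward to cce_patch (illustrative only)."""
    if not cfg.trust_remote_code:
        # Without remote code, the model type alone is assumed to be sufficient,
        # so no path or repo id is forwarded.
        return {}
    # Prefer the explicit override; otherwise fall back to the base model.
    model_id = cfg.cut_cross_entropy_model_path or cfg.base_model
    return {"model_name_or_path": model_id}
```

Forwarding a real path or repo id rather than the bare model type string is what avoids the `OSError` in the traceback, where `AutoConfig.from_pretrained` was handed `kimi_linear` and tried to resolve it as a model identifier.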
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~12 minutes
- New optional configuration field is straightforward, but verify it integrates properly with existing validation
- Conditional logic in `pre_model_load` is focused and easy to follow, but confirm the fallback behavior when `trust_remote_code` is false works as intended
- Ensure the model identifier selection logic (preferring `cut_cross_entropy_model_path` over `base_model`) aligns with intended usage patterns
Suggested reviewers
- winglian
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: switching from passing model_type to passing base_model to the cce_patch function. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
Codecov Report
:x: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...axolotl/integrations/cut_cross_entropy/__init__.py | 0.00% | 2 Missing :warning: |
| src/axolotl/integrations/cut_cross_entropy/args.py | 0.00% | 1 Missing :warning: |