pass base_model instead of model_type

Open ved1beta opened this issue 1 month ago • 2 comments

Fix the value passed to cce_patch: passing the model type fails because the patch function tries to load a config from it.

  File "/workspace/ml-cross-entropy/cut_cross_entropy/transformers/patch.py", line 202, in cce_patch
    return PATCH_FNS[model_type](model_type_or_model, patch_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/ml-cross-entropy/cut_cross_entropy/transformers/kimi_linear.py", line 106, in patch_kimi_linear
    model_config = AutoConfig.from_pretrained(maybe_model, trust_remote_code=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1328, in from_pretrained
    config_dict, unused_kwargs = PreTrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 616, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 674, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 326, in cached_file
    file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 499, in cached_files
    raise OSError(
OSError: kimi_linear is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`

Motivation and Context

https://github.com/axolotl-ai-cloud/ml-cross-entropy/pull/29/files
https://github.com/axolotl-ai-cloud/axolotl/pull/3257/files

Summary by CodeRabbit

Release Notes

  • New Features
    • Added support for specifying a custom model path in the cut cross entropy integration configuration.
    • Improved remote code trust handling with conditional logic for model path selection.

ved1beta · Nov 14 '25 10:11

Walkthrough

The PR adds configurable model path selection to the Cut Cross Entropy integration. It introduces a new optional configuration field and updates the patch initialization logic so that, when trust_remote_code is enabled, a model path (the new field if set, otherwise base_model) is passed to the patch.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Cut Cross Entropy Integration: src/axolotl/integrations/cut_cross_entropy/args.py, src/axolotl/integrations/cut_cross_entropy/__init__.py | Added new optional cut_cross_entropy_model_path field to configuration args. Updated pre_model_load to compute model identifier (preferring cut_cross_entropy_model_path over base_model) and pass it to cce_patch as model_name_or_path when trust_remote_code is enabled. |
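
The selection logic summarized above could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `CCEConfig` and `resolve_cce_model_path` are hypothetical names, and only the field names `base_model`, `trust_remote_code`, and `cut_cross_entropy_model_path` come from the summary. Returning `None` when `trust_remote_code` is off is an assumption about the fallback behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CCEConfig:
    """Hypothetical stand-in for the axolotl config; not the real args class."""

    base_model: str
    trust_remote_code: bool = False
    cut_cross_entropy_model_path: Optional[str] = None


def resolve_cce_model_path(cfg: CCEConfig) -> Optional[str]:
    """Pick the identifier to hand to the patch as a model path.

    Assumption: a path is only passed when trust_remote_code is enabled;
    otherwise None is returned and the caller keeps its default behavior.
    """
    if not cfg.trust_remote_code:
        return None
    # Prefer the explicit override; fall back to base_model when it is unset.
    return cfg.cut_cross_entropy_model_path or cfg.base_model


# Placeholder identifiers, for illustration only.
cfg = CCEConfig(base_model="org/some-model", trust_remote_code=True)
print(resolve_cce_model_path(cfg))
```

With an override set, the override wins; with `trust_remote_code` disabled, nothing is passed. Those are the two cases the review notes below ask to verify.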

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • New optional configuration field is straightforward but verify it integrates properly with existing validation
  • Conditional logic in pre_model_load is focused and easy to follow, but confirm the fallback behavior when trust_remote_code is false works as intended
  • Ensure the model identifier selection logic (preferring cut_cross_entropy_model_path over base_model) aligns with intended usage patterns

Suggested reviewers

  • winglian

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: switching from passing model_type to passing base_model to the cce_patch function. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai[bot] · Nov 14 '25 10:11

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...axolotl/integrations/cut_cross_entropy/__init__.py | 0.00% | 2 Missing ⚠️ |
| src/axolotl/integrations/cut_cross_entropy/args.py | 0.00% | 1 Missing ⚠️ |

codecov[bot] · Nov 16 '25 06:11