
[fix][1760] Added fix for the missing `context` key issue in dolly!

Open pytholic opened this issue 1 year ago • 1 comments

What is the type of this PR?

  • [ ] Refactor (refactored code that neither fixes a bug nor adds a feature)
  • [ ] Feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Test (including new or correcting previous tests)
  • [ ] Style (changes to code formatting and styling)
  • [ ] Optimization (performance improvements)
  • [ ] Documentation Update (updates to the documentation, i.e. README, comments, docstrings, etc.)
  • [ ] Revert (reverts a previous commit)

Motivation and Context

To fix the issue where a `context` key error was thrown by the Dolly dataloader.

Modifications

The following changes were made in this PR:

  • Updated _transform in dolly.py so that samples without a context key no longer raise an error (a sketch follows this list)
  • Added a test for it
  • Updated the printed output slightly
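
For illustration, here is a minimal sketch of the kind of change, assuming the transform previously indexed the context field directly. This is a hypothetical sketch, not the exact dolly.py code:

# Hypothetical sketch: fall back to an empty string when a sample has no
# "context" key instead of raising a KeyError.
def _transform(item: dict) -> dict:
    item["input"] = item.get("context", "")  # previously item["context"], which raised KeyError
    item["output"] = item.get("response", "")
    return item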

How Has This Been Tested?

I added the following test to simulate the original issue (a rough sketch follows the list below). The code has been tested on macOS.

  • tests/data/test_dolly.py::test_dolly_missing_keys
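
Roughly, the test feeds the transform a sample without a context key and checks that it no longer raises. A hedged sketch, reusing the hypothetical _transform from the sketch above (the real test in tests/data/test_dolly.py may differ):

# Hypothetical sketch of the regression test; the actual test may build the
# full Dolly datamodule instead of calling the transform directly.
def test_dolly_missing_keys():
    sample = {"instruction": "Summarize the text.", "response": "A short summary."}
    transformed = _transform(sample)  # must not raise KeyError
    assert transformed["input"] == ""  # missing context falls back to an empty string
    assert transformed["output"] == "A short summary."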

Previous behavior: (screenshot)

After the fix: (screenshot)

Checklist:

  • [ ] My code follows the style guidelines of this project
  • [x] I have performed a self-review of my code
  • [x] I have commented on my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [x] My changes generate no new warnings
  • [x] I have added tests that prove my fix is effective or that my feature works
  • [ ] New and existing unit tests pass locally with my changes
  • [ ] Any dependent changes have been merged and published in downstream modules

Notes

  1. I noticed that some tests related to the tokenizer are failing. Note that these are not related to my updates (CC: @rasbt). (screenshot)

  2. I updated the printing logs a bit. The instruction was being printed twice, which wasn't ideal, and I have also added some newline characters. I applied it only to lora.py; if desirable, it can be applied to other scripts as well. A toy illustration follows the before/after images below.

Before: (screenshot)

After: (screenshot)
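
For reference, a toy illustration of the kind of formatting change described, hypothetical and not the actual lora.py code:

# Toy illustration: print the instruction once and separate the sections with
# blank lines instead of repeating the instruction inside the generated output.
instruction = "Recommend a movie to watch on the weekend."
response = "Sure! Here is a recommendation: ..."
print(f"Instruction:\n{instruction}\n")
print(f"Response:\n{response}\n")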

pytholic avatar Oct 02 '24 18:10 pytholic

Thanks for the PR!

I noticed that some tests related to the tokenizer are failing. Note that these are not related to my updates (CC: @rasbt ).

Do you mean the tests in the litGPT (+Thunder) workflow? You can safely ignore them for now. They are related to some dev version issues that are not directly related to LitGPT.

rasbt avatar Oct 03 '24 18:10 rasbt

Do you mean the tests in the litGPT (+Thunder) workflow? You can safely ignore them for now. They are related to some dev version issues that are not directly related to LitGPT.

@rasbt I was referring to the following tests in cpu-tests:

tests/test_tokenizer.py::test_tokenizer_against_hf[Llama-3.2-1B] XFAIL   [ 88%]
tests/test_tokenizer.py::test_tokenizer_against_hf[Llama-3.2-1B-Instruct] XFAIL [ 88%]
tests/test_tokenizer.py::test_tokenizer_against_hf[Llama-3.2-3B] XFAIL   [ 88%]
tests/test_tokenizer.py::test_tokenizer_against_hf[Llama-3.2-3B-Instruct] XFAIL [ 88%]

I just realized they are xfail in CI, but for some reason, they failed instead of xfailed on my setup. I am not sure why.

pytholic avatar Oct 04 '24 05:10 pytholic

@rasbt I am guessing in test_tokenizer.py, xfail is for cases where models are gated?

pytholic avatar Oct 04 '24 05:10 pytholic

I am guessing in test_tokenizer.py, xfail is for cases where models are gated?

Correct.

I just realized they are xfail in CI, but for some reason, they failed instead of xfailed on my setup. I am not sure why.

When you create a PR from a forked repo, LitGPT's HF_TOKEN is not shared, and thus these tests are just skipped. But when you run the tests on your machine and have exported this token (and you have access to all the gated repos), then these tests are executed.

If you merged the PR, then another round of CI actions would be executed, but this time with HF_TOKEN exported, and you'd get a failure.
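
To illustrate the pattern (a hypothetical sketch, not the actual test_tokenizer.py code): a test for a gated checkpoint can be conditionally marked as an expected failure when no token is available, while a run with HF_TOKEN exported executes it for real.

import os
import pytest

# Hypothetical illustration of conditionally expecting failure for gated
# checkpoints: without an exported HF_TOKEN the test is marked xfail; with a
# token (and granted access to the repo) it runs normally.
@pytest.mark.xfail(
    not os.getenv("HF_TOKEN"),
    reason="gated Hugging Face repo requires an exported HF_TOKEN",
)
def test_tokenizer_against_hf():
    ...  # would download the gated tokenizer and compare outputs here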

@rasbt It's important for every PR from the community that affects tokenizers (or gated models) in one way or another to:

  • run tests locally with changes from this branch
  • or create a copy-branch in the repo, not in the fork

Andrei-Aksionov avatar Oct 04 '24 08:10 Andrei-Aksionov

@Andrei-Aksionov Sure, good point.

Looks good!

(screenshot)

rasbt avatar Oct 04 '24 15:10 rasbt

Thanks for the PR. This looks great overall. (I would just undo the formatting changes please.)

@rasbt Thanks for checking it out. Sorry for the formatting part, my bad. I am taking an emergency flight atm. I will check the PR feedback and take care of it as soon as I find some time.

pytholic avatar Oct 04 '24 15:10 pytholic

No worries and no rush. I hope all is well, and safe travels.

rasbt avatar Oct 04 '24 15:10 rasbt

I took care of the style thing 😊

Thanks a lot @rasbt 🙏🏻

pytholic avatar Oct 04 '24 16:10 pytholic

@rasbt @Andrei-Aksionov I was wondering: if we have access and can run the gated tests locally, then ideally those tests should pass. I ran into issues with the Llama 3.2 tokenizer tests. Do you guys think we need to investigate this?

pytholic avatar Oct 04 '24 16:10 pytholic

@rasbt @Andrei-Aksionov I was wondering: if we have access and can run the gated tests locally, then ideally those tests should pass. I ran into issues with the Llama 3.2 tokenizer tests. Do you guys think we need to investigate this?

Thanks for bringing it up, I'm fixing it here: #1772

rasbt avatar Oct 04 '24 17:10 rasbt