
Gemma LoRA

[Open] solitude-alive opened this pull request 1 year ago • 2 comments

Context

  • Add LoRA fine-tuning support for the Gemma model (a hedged usage sketch follows below).
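
For reference, a minimal usage sketch of the builder this PR is expected to add. The function name (`lora_gemma_2b`) and its parameters are assumptions modeled on the existing Llama2 LoRA builders in torchtune and are not confirmed by this thread.

```python
# Hedged usage sketch: builder name and parameters are assumptions,
# mirroring the existing Llama2 LoRA builders in torchtune.
from torchtune.models.gemma import lora_gemma_2b

model = lora_gemma_2b(
    lora_attn_modules=["q_proj", "v_proj"],  # attention projections to adapt with LoRA
    apply_lora_to_mlp=True,                  # also wrap the MLP projections
    lora_rank=8,
    lora_alpha=16,
)
```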

Changelog

  • ...

Test plan

  • ....
[screenshot: 2024-04-18 7:37 PM]

It works with apply_lora_to_mlp = True and apply_lora_to_output = False, but not with apply_lora_to_output = True. The current Gemma model has no separate output_layer; the output is computed as output = F.linear(h, self.tok_embeddings.weight).float(), so I am not sure whether apply_lora_to_output = True needs to be supported at all.
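
For clarity, here is a minimal sketch of the tied-embedding output described above; `TiedOutputDecoder` is a made-up name, not the torchtune module. Because the logits reuse `tok_embeddings.weight` rather than going through a separate output `nn.Linear`, there is no standalone output module for a LoRA adapter to wrap when `apply_lora_to_output = True`.

```python
# Minimal illustration (hypothetical class, not the torchtune module) of a
# decoder whose output projection is tied to the token-embedding weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedOutputDecoder(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, embed_dim)
        # ... attention / MLP layers omitted ...

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.tok_embeddings(tokens)  # [batch, seq, embed_dim]
        # ... transformer layers would run here ...
        # The output projection reuses the embedding matrix (weight tying),
        # mirroring `output = F.linear(h, self.tok_embeddings.weight).float()`.
        return F.linear(h, self.tok_embeddings.weight).float()
```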

solitude-alive avatar Apr 18 '24 11:04 solitude-alive

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/789

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit 5a966865bafd418f1f621ead4d8ebd9c9fe628d6 with merge base 41341fd1df326c0f6665c4e0a79965c3a7e8d836 (image): :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Apr 18 '24 11:04 pytorch-bot[bot]

This is the LoRA single-device test: [screenshot]

This is the QLoRA single-device test: [screenshot: 2024-04-19 1:20 PM]

solitude-alive avatar Apr 19 '24 05:04 solitude-alive

Do you also mind sharing what your peak memory allocated was for both the LoRA and QLoRA runs? And to be extra sure, do you mind confirming that the checkpoints save correctly? :) Left some small comments, otherwise it looks nearly ready to go.

RdoubleA avatar Apr 19 '24 15:04 RdoubleA

LoRA single device:

INFO:torchtune.utils.logging:Memory Stats after model init:
{'peak_memory_active': 5.547595264, 'peak_memory_alloc': 5.547595264, 'peak_memory_reserved': 5.559549952}

Step 100 | peak_memory_active:14.325987328 peak_memory_alloc:14.325987328 peak_memory_reserved:24.664604672 

QLoRA single device:

INFO:torchtune.utils.logging:Memory Stats after model init:
{'peak_memory_active': 4.887322112, 'peak_memory_alloc': 4.887322112, 'peak_memory_reserved': 5.802819584}

Step 100 | peak_memory_active:11.555596288 peak_memory_alloc:11.555596288 peak_memory_reserved:24.301797376 
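
For reference, a minimal sketch of how peak-memory figures like the ones above can be collected with standard PyTorch CUDA APIs; this is not necessarily the exact torchtune logging helper, and the assumption that the values above are in GB is mine.

```python
import torch

def peak_memory_stats_gb(device: torch.device) -> dict:
    # Collect peak CUDA memory counters and convert bytes to GB
    # (1 GB = 1e9 bytes here, matching the magnitudes reported above).
    stats = torch.cuda.memory_stats(device)
    return {
        "peak_memory_active": stats["active_bytes.all.peak"] / 1e9,
        "peak_memory_alloc": torch.cuda.max_memory_allocated(device) / 1e9,
        "peak_memory_reserved": torch.cuda.max_memory_reserved(device) / 1e9,
    }
```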

LoRA checkpoints save: [screenshot]

QLoRA checkpoints save: [screenshot: 2024-04-20 12:11 AM]
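
As a quick way to confirm that saved checkpoints like the ones above are usable, a small sanity check; the file path is illustrative only and this is not a torchtune utility.

```python
import torch

# Load a saved adapter checkpoint (path is illustrative) and confirm it
# actually contains LoRA adapter tensors.
state = torch.load("output/adapter_0.pt", map_location="cpu")
lora_keys = [k for k in state if "lora" in k.lower()]
assert lora_keys, "no LoRA adapter weights found in the checkpoint"
print(f"{len(lora_keys)} LoRA tensors, e.g. {lora_keys[:3]}")
```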

solitude-alive avatar Apr 19 '24 16:04 solitude-alive

Codecov Report

Attention: Patch coverage is 39.53488%, with 26 lines in your changes missing coverage. Please review.

Project coverage is 28.08%. Comparing base (e72c9a6) to head (5a96686). Report is 7 commits behind head on main.

Files                                            Patch %   Lines
torchtune/models/gemma/_component_builders.py    21.87%    25 Missing :warning:
torchtune/models/gemma/_model_builders.py        90.00%    1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #789      +/-   ##
==========================================
- Coverage   28.11%   28.08%   -0.03%     
==========================================
  Files         153      153              
  Lines        6260     6291      +31     
==========================================
+ Hits         1760     1767       +7     
- Misses       4500     4524      +24     

:umbrella: View full report in Codecov by Sentry.

codecov-commenter avatar Apr 20 '24 02:04 codecov-commenter