Liger-Kernel
Solar pro implementation
Summary
Implements #537.
Details
I don't know if modeling_solar.py and configuration_solar.py are in the right place.
I also changed `labels is not None` to `self.training and (labels is not None)` in solar.py, so that it matches how llama.py is written.
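To illustrate what the condition change does, here is a minimal sketch of the branch selection. `select_loss_path` and the branch names are hypothetical and are not part of solar.py or llama.py. It assumes the first branch is the fused loss path and that the other branches compute logits the usual way.

```python
from typing import Optional, Sequence


def select_loss_path(training: bool, labels: Optional[Sequence[int]]) -> str:
    """Hypothetical helper: name the branch a forward pass would take.

    Mirrors the condition `self.training and (labels is not None)`.
    """
    if training and (labels is not None):
        # Assumed: fused loss path, taken only while training with labels.
        return "fused_loss"
    if labels is not None:
        # Evaluation with labels: compute logits, then the regular loss.
        return "logits_and_loss"
    # No labels at all: only logits are produced.
    return "logits_only"


if __name__ == "__main__":
    for training in (True, False):
        for labels in ([1, 2, 3], None):
            print(training, labels, select_loss_path(training, labels))
```

With the old condition (`labels is not None` alone), evaluation with labels would also have taken the first branch. With the new one, only training does.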
Testing Done
- Hardware Type: NVIDIA GeForce MX230
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
- [x] write new tests
what other tests should I add?
Please add fp16 and fp32 convergence tests as well. See https://github.com/linkedin/Liger-Kernel/pull/685 and https://github.com/linkedin/Liger-Kernel/pull/692
@vaibhavjindal done! I don't have the hardware to test it but I think it should work.