Differences between Lightning Attention1 and Lightning Attention2 code implementations
Hello, I have two questions:
- In this repository, I noticed that the implementations of Lightning Attention1 and Lightning Attention2 appear to be identical.
- The implementation of Lightning Attention2 in this repository differs from the code provided at https://github.com/OpenNLPLab/lightning-attention. When I benchmarked the two implementations (see the timing sketch below), I found that this repository's version of Lightning Attention2 is less computationally efficient than the one from that GitHub repository.
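
For reference, here is a minimal sketch of the kind of timing comparison I mean (it assumes PyTorch on a CUDA device; `naive_linear_attention` is only an illustrative placeholder, and the actual kernels from the two codebases, with their real import paths and any extra arguments such as decay slopes, would be swapped in):

```python
import time
import torch

def naive_linear_attention(q, k, v):
    # Illustrative placeholder so the script runs as-is; replace with
    # this repo's Lightning Attention2 and the kernel from
    # https://github.com/OpenNLPLab/lightning-attention to compare them.
    return (q @ k.transpose(-2, -1)).tril() @ v

def benchmark(attn_fn, *args, n_warmup=10, n_iters=50):
    """Average per-call latency, with CUDA synchronization around the timer."""
    for _ in range(n_warmup):
        attn_fn(*args)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_iters):
        attn_fn(*args)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters

# (batch, heads, seq_len, head_dim); adjust to the sizes you tested.
q = torch.randn(2, 8, 2048, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

print(f"{benchmark(naive_linear_attention, q, k, v) * 1e3:.3f} ms/iter")
```

Synchronizing before and after the timed loop matters here: CUDA kernel launches are asynchronous, so unsynchronized timings can make either implementation look arbitrarily fast.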