Jimmy Alvarez

Results: 1 comment by Jimmy Alvarez

Thanks for the contribution. Has this implementation been finished, or is it being closed? I haven't seen any argument related to sliding window attention in megatron/training/arguments.py.