Jimmy Alvarez
Thanks for the contribution. Has this implementation been finished, or is it being closed? I haven't seen any argument related to the sliding attention window in megatron/training/argument.py.