Jamie DeAntonis

Results 12 comments of Jamie DeAntonis

Yes, we got it to converge. On our task, compared with regular attention, it (i) was a step worse in terms of loss, (ii) had noticeably lower memory utilization on...

As an update, I realized I can get past this by setting

```python
import os

os.environ["SB_DISABLE_QUIRKS"] = "disable_jit_profiling"
```

That still seems like a bug though, that the default settings cause an...