Masaki Kozuki

Results 167 comments of Masaki Kozuki

Thank you for your answers. I'll look into my implementation ASAP. However, I've been a bit busy these days, so could you either share your implementation of Hinge and/or LS...

Thank you for your report and suggestion! I'm sorry for my late response and for the trouble my mistake caused. To be honest, I did not test or use FID scores.

> Oh, I did not mean to blame you but just wanted to help your repo and other users.

Thank you for your kind words. I added the link to...

Sorry for my late response. That is likely my mistake. The complete implementation is in the pfnet-research/sngan_projection repository.

I think it's expected that all the all-gathers are launched at the beginning. To reduce the overhead, the rate limiting we do for zero3 could also be needed for zero2,...

In zero2 the forward trace would consist of a sequence of all-gathers followed by a sequence of computations, which would explain the long idling in the compute stream (at least on...

The failures as of https://github.com/Lightning-AI/lightning-thunder/commit/f724a886639bc616c93879435efcbbeb4c8ac2fe look related to https://github.com/Lightning-AI/lightning-thunder/issues/432.

At a glance, https://github.com/NVIDIA/TransformerEngine/commit/07291027ed353287149e9df6030862e1e815f32f could be related: https://github.com/NVIDIA/TransformerEngine/pull/839

You needn't; the discussion is ongoing in the PR I referenced :)

The rebase onto https://github.com/Lightning-AI/lightning-thunder/pull/222/commits/b855247e171527d4bc523cd4e2ca44b8461460c4 seems to introduce a bug, unintelligible at a glance, where comms occur even under `no_sync`. The change in this PR doesn't look that...