Shaden Smith

12 comments by Shaden Smith

In addition to `to_sequential`, there may be another way we could accomplish this while keeping the normal `PipelineModule`, if that would be useful. If we short-circuit this condition and use...

This still needs FP32 and ZeRO. And unit tests :-).

The whole MEX interface more or less needs a complete rewrite, anyway. I would like to do all of those at once.

Thanks for the helpful report! I will look into this.

Hi there, CSF author here. There's a newer CSF [paper](http://shaden.io/pub-files/smith2017knl.pdf) that makes the actual data structure a little more obvious (Figure 2). CSF is a good format for performance and...

If you have worked with CSR before for matrices, you can think of CSF as a higher-order interpretation of that. Essentially, one `rowptr` points into the next `rowptr` to encode...
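
To make the CSR analogy concrete, here is a minimal sketch of a three-mode tensor stored as nested pointer arrays. It is an illustration only, not the layout of any particular library: the names `build_csf`, `iterate_csf`, `fids`, and `fptr` are hypothetical, the mode order is fixed to (i, j, k), and the input is assumed to be COO coordinates with no duplicates.

```python
def build_csf(coords, vals):
    """Build a 3-level CSF-like structure from COO entries (i, j, k).

    fids[l] holds the index stored at each node of tree level l.
    fptr[l][n] .. fptr[l][n + 1] is the range of children of node n
    at level l, i.e. positions in fids[l + 1] (like a CSR rowptr).
    """
    entries = sorted(zip(coords, vals), key=lambda e: e[0])
    fids = [[], [], []]
    fptr = [[], []]
    prev = None
    for (i, j, k), _ in entries:
        new_i = prev is None or i != prev[0]
        new_j = new_i or j != prev[1]
        if new_i:
            # New top-level slice: its children start at the current end of level 1.
            fids[0].append(i)
            fptr[0].append(len(fids[1]))
        if new_j:
            # New fiber: its children start at the current end of level 2.
            fids[1].append(j)
            fptr[1].append(len(fids[2]))
        fids[2].append(k)
        prev = (i, j, k)
    # Close the last range at each level, as with the final CSR rowptr entry.
    fptr[0].append(len(fids[1]))
    fptr[1].append(len(fids[2]))
    values = [v for _, v in entries]
    return fids, fptr, values


def iterate_csf(fids, fptr, values):
    """Walk the tree and yield ((i, j, k), value) for every nonzero."""
    for a in range(len(fids[0])):
        for b in range(fptr[0][a], fptr[0][a + 1]):
            for c in range(fptr[1][b], fptr[1][b + 1]):
                yield (fids[0][a], fids[1][b], fids[2][c]), values[c]


coords = [(0, 0, 0), (0, 0, 2), (0, 1, 1), (2, 1, 0)]
vals = [1.0, 2.0, 3.0, 4.0]
fids, fptr, values = build_csf(coords, vals)
roundtrip = list(iterate_csf(fids, fptr, values))
```

In this sketch the first pointer array selects a range in the second level, and the second pointer array selects a range of leaves, so repeated `i` and `(i, j)` prefixes are stored once rather than once per nonzero.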

@jaywonchung, thanks so much for this fantastic find!! Great work, findings, and report; we super appreciate it!

Hi @eddy16112, thanks for your interest in 3D parallelism! At this time we have not adapted BERT to support pipeline parallelism. Only the GPT code path is supported.

Hi there, thanks for sharing and for the ping. Can you share the config that also reproduces the issue, and which DeepSpeed version you are using? This issue was [fixed for...