Results 2 issues of Mikhail Sirotenko

Thanks for creating this comparison page. I think it will be useful for many people. A few comments: 1. CNTK Multi-GPU. The paper you mentioned only presents results for fully-connected networks....

# What does this PR do? This is a minor issue. In the example of how to restore a partial checkpoint, when calling `flax.serialization.from_state_dict` we should assign the return...