
Add example to README.md

conceptofmind opened this issue 3 years ago

Hi @marcvanzee,

There was another issue with the number of commits in the original pull request after I resolved the initial branch conflicts. The troubleshooting section of the Flax documentation says to run:

$ git rebase main && git reset --soft main && git commit
$ git push --force

I squashed the commits down to fewer than 5, but this seemed to create multiple additional conflicts on my end during the rebase and push.

Since I am only making a minor edit to the markdown file, I figured it would be better to end up with a single commit after following the instructions above. I deleted and closed the previous pull request. This pull request has only a single commit with the 4-line edit and no noticeable branch conflicts.

I apologize for the inconvenience.

Thank you for your time and consideration,

Enrico Shippole

Checklist

  • [x] This change is discussed in a GitHub discussion.
  • [x] The documentation and docstrings adhere to the documentation guidelines.
  • [x] This change includes necessary high-coverage tests.

conceptofmind · Jul 22 '22

Hey @conceptofmind! #2367 will move the Community Examples to docs/examples.rst so they are visible in the new Examples section of the documentation; this change should be ported there (depending on which PR lands first).

Moreover, to ensure quality for our users we are going to implement a new Contributing Policy for the Community Examples, which you can find here: #2300. I am a bit worried that the implementations in vit-flax, while the code looks very clean and understandable, contain neither performance metrics for fully trained models nor a demonstration of numerical equivalence w.r.t. a reference implementation.

cgarciae · Aug 05 '22

Hi @cgarciae,

I greatly appreciate you keeping me updated.

I read through the new Contributing Policy. I can provide tutorial notebooks and training scripts for each of the models on CIFAR and ImageNet. Where any model's code is difficult to understand, I can add comments detailing that part of the architecture. I did some basic testing to assert that each of the models has dimensionality similar to the originals when given an image of input shape (1, 224, 224, 3) or (1, 256, 256, 3), and that each model outputs a shape of (1, 1000) for ImageNet-1k.
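
For reference, a minimal version of that shape check might look like the following (the ViT constructor and its arguments are placeholders for whichever vit-flax variant is under test):

import jax
import jax.numpy as jnp

# Placeholder constructor; swap in any vit-flax variant
model_flax = ViT(num_classes=1000)
dummy = jnp.ones((1, 224, 224, 3))
variables = model_flax.init(jax.random.PRNGKey(0), dummy)
logits = model_flax.apply(variables, dummy)
assert logits.shape == (1, 1000)  # ImageNet-1k logits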

As a basic step toward numerical equivalence, I can match the total number of trainable parameters of each vit-flax model against the original PyTorch, TensorFlow, and JAX research counterparts with something like:

import jax
import numpy as np
from tensorflow.keras import backend as K

# `params` is the Flax parameter pytree; `model_torch` and `model`
# are the PyTorch and TensorFlow models being compared against.

# Flax: sum the sizes of every leaf array in the parameter pytree
n_params_flax = sum(
    jax.tree_leaves(jax.tree_map(lambda x: np.prod(x.shape), params))
)
# PyTorch: count the elements of every trainable parameter tensor
n_params_torch = sum(
    p.numel() for p in model_torch.parameters() if p.requires_grad
)
# TensorFlow: count the parameters of every trainable weight
n_params_tensorflow = np.sum(
    [K.count_params(w) for w in model.trainable_weights]
)
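
If the three models are built with matching hyperparameters, the counts should agree exactly, so the comparison itself reduces to an equality assertion (a sketch; in practice small discrepancies can come from untracked buffers):

assert n_params_flax == n_params_torch == n_params_tensorflow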

One thing I noted is that some of the research papers do not provide open-source code or trained weights for reference, so for those few cases it may be difficult to demonstrate numerical equivalence. These papers give an in-depth explanation of the model architecture but do not provide any way to actually verify the authors' results. For the papers that do provide open-source code and weights, I will further update the repository to show equivalence to the original implementation.
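
As a sketch of what that equivalence check could look like (all names are illustrative: model_flax and params hold weights converted from the reference checkpoint, and model_torch is the reference PyTorch implementation):

import numpy as np
import torch

# Fixed seed so the comparison is reproducible
x = np.random.RandomState(0).randn(1, 224, 224, 3).astype(np.float32)
out_flax = np.asarray(model_flax.apply(params, x))

model_torch.eval()
with torch.no_grad():
    # PyTorch vision models expect NCHW, so transpose the NHWC input
    out_torch = model_torch(torch.from_numpy(x).permute(0, 3, 1, 2)).numpy()

np.testing.assert_allclose(out_flax, out_torch, atol=1e-4)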

Almost all of the research papers were trained on ImageNet-1k. I am willing to train each of the models on ImageNet-1k and open-source the trained weights, although this will likely take several months to complete all 18 models due to budget and time constraints. I can continue to update the repository with links to the trained models after verifying their accuracy.

Please let me know if there is anything specific you want me to do or include in the example repository.

Thank you for your time and consideration.

Enrico Shippole

conceptofmind · Aug 05 '22