keras-cv
630/add efficientnet lite
What does this PR do?
Adds EfficientNet Lite variants to keras_cv models.
Fixes #630
This is a port of a PR from the Keras repository, as per [this comment](https://github.com/keras-team/keras/pull/16905#issuecomment-1262811641).
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
- [x] Did you write any new necessary tests?
- [ ] If this adds a new model, can you run a few training steps on TPU in Colab to ensure that no XLA incompatible OP are used?
Who can review?
@LukeWood
Let me know if I should tag anyone else :)
@LukeWood I do have weights converted from the original tpu repository, but they expect different preprocessing: `Normalization(mean=127.0, variance=128.0**2)` instead of Keras-CV's `Rescaling(1.0 / 255.0)`. Using the `Rescaling` layer with these weights will probably result in lower ImageNet accuracy.
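For reference, a minimal sketch of the two preprocessing conventions (layer arguments taken from the values above; the model wiring itself is not shown):

```python
import numpy as np
import tensorflow as tf

# Preprocessing the converted TPU-repo weights expect:
# (x - 127) / 128, mapping [0, 255] to roughly [-1, 1].
tpu_style = tf.keras.layers.Normalization(mean=127.0, variance=128.0**2)

# Preprocessing used by the Keras-CV training setup:
# x / 255, mapping [0, 255] to [0, 1].
kcv_style = tf.keras.layers.Rescaling(1.0 / 255.0)

pixels = np.array([[0.0, 127.0, 255.0]], dtype="float32")
print(tpu_style(pixels).numpy())  # ~[-0.992, 0.0, 1.0]
print(kcv_style(pixels).numpy())  # [0.0, ~0.498, 1.0]
```

Since the two conventions produce inputs on different scales, weights trained under one will be mismatched under the other, which is why reusing the converted weights with `Rescaling` is expected to hurt accuracy.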
If not, I will run the script, but I currently have a backlog of training experiments, so this would be at least a week from now.
I don't know if you could discuss this with the team, but it would be very nice to be able to launch a training job from a PR via a GitHub Action (after a manual trigger).
It would help us follow the training together, and we could automate uploading the logs to https://tensorboard.dev/.
Otherwise I think this will be a little hard to scale at some point.
This is on our radar. It's something we'd like to include eventually, but haven't yet prioritized.
Good, when you will be ready I hope that we find a space to discuss this with the community before you finalize some details.
@ianstenbit
Are you able to run our training script to verify that these models (just one is fine for now) converge on ImageNet / potentially provide weights alongside this PR?
Sorry, I do not have the compute resources to run an ImageNet training job :(
No worries at all -- we need a solution to provide GCP resources for training in situations like these. For now, I will try to train one of these in the next week.
Thank you! Once I can produce some benchmark training results / weights for one of these I will merge.
@ianstenbit Thanks! Looking forward to the results, as I am curious as well.
I've just started a training run on EfficientNetLiteB0. I'm using this as the benchmark score, so we're looking for close to 74.83% top-1 accuracy on ImageNet. (Our baseline for now is that we need to match 95% of that result to merge this and add weights, so that would be 71.09%.)
@ianstenbit Thank you for the update.
> I'm using this as benchmark scores
It seems metrics from timm + papers with code are slightly higher than the ones from tensorflow/tpu repository.
Looking forward to the results!
First training run had ~70.1% top-1 accuracy for EfficientNetLiteB0. I am starting a new run with a higher batch size and a warmed-up cosine decay LR schedule, and I'll see how that goes. May need to add weight decay as well -- we shall see.
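For context, a warmed-up cosine decay schedule can be sketched like this (an illustrative implementation only, not the actual code in the training script; the class name and hyperparameters here are made up):

```python
import math
import tensorflow as tf

class WarmupCosineDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Linear warmup to base_lr, then cosine decay to zero."""

    def __init__(self, base_lr, warmup_steps, total_steps):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.total_steps = total_steps

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        # Ramp linearly from 0 to base_lr over the warmup period.
        warmup_lr = self.base_lr * step / self.warmup_steps
        # Then follow a half cosine from base_lr down to 0.
        progress = (step - self.warmup_steps) / (self.total_steps - self.warmup_steps)
        cosine_lr = 0.5 * self.base_lr * (1.0 + tf.cos(math.pi * progress))
        return tf.where(step < self.warmup_steps, warmup_lr, cosine_lr)

    def get_config(self):
        return {
            "base_lr": self.base_lr,
            "warmup_steps": self.warmup_steps,
            "total_steps": self.total_steps,
        }

# Example: peak LR 0.1 after 5 warmup steps, decaying to 0 by step 100.
schedule = WarmupCosineDecay(base_lr=0.1, warmup_steps=5, total_steps=100)
```

The warmup avoids large, noisy updates at the start (which matters more at higher batch sizes), while the cosine tail anneals the LR smoothly instead of in abrupt steps.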
After a few more training runs, I'm not able to get SOTA weights with this model just yet.
That said, the implementation looks correct to me. I think weight regularization in our basic_training script is probably necessary to get good scores here. @LukeWood are you fine with merging this? I am still actively working on getting good weights.
/gcbrun
@ianstenbit Sorry to hear about the results. If I have more time I will probably look into it too.
Thanks for the merge!