Incredible stuff and some comments
@MarkDaoust @fchollet this is amazing stuff! Thank you so much to the team.
Just reading the README was enough to get me hooked.
A few things that immediately came to mind after studying the README:
- The data samplers look very promising. We might want to add a bit more context (just links would work IMO) on why separate data samplers are needed for similarity learning models, so that users are better aware.
- Providing brief descriptions of the supported losses and the usage scenarios for each of them.
- Providing timing information for building the index and for retrieval queries.
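As background for the first bullet: most similarity losses need positive pairs (several examples of the same class) inside every batch, which a plain shuffled sampler cannot guarantee. The sketch below is my own illustrative class-balanced sampler in plain NumPy, not TensorFlow Similarity's actual API; the function name and parameters are assumptions.

```python
import numpy as np

def sample_balanced_batch(labels, classes_per_batch, examples_per_class, rng=None):
    """Return indices for one batch containing `examples_per_class` items
    from each of `classes_per_batch` distinct classes, so every batch is
    guaranteed to hold positive pairs for a similarity loss."""
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    chosen = rng.choice(np.unique(labels), classes_per_batch, replace=False)
    idx = []
    for c in chosen:
        pool = np.flatnonzero(labels == c)
        # Sample with replacement only if the class has too few examples.
        idx.extend(rng.choice(pool, examples_per_class,
                              replace=len(pool) < examples_per_class))
    return np.array(idx)
```

This "P classes times K examples" pattern is the standard batching scheme behind triplet and contrastive losses, which is why similarity libraries ship dedicated samplers instead of relying on `Dataset.shuffle`.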
Lastly, I wanted to ask if you folks plan to include self-supervised algorithms in the trainer APIs, since works like DINO yield tremendous retrieval performance.
Thanks for the feedback
- Adding a samplers Colab in 0.14
- Losses will be discussed in the paper (ETA TBD)
@sayakpaul what does providing timing for query and indexing mean? Do you want a benchmark graph?
> what does providing timing for query and indexing mean? Do you want a benchmark graph?
Yes.
Removing the 0.14 milestone, as benchmarking will be done in an upcoming release.
Also, self-supervised algorithms will be included in 0.15 and are available on the development branch. We currently support SimCLR, SimSiam, and Barlow Twins, and we are working on an unsupervised example notebook.
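For readers unfamiliar with the methods mentioned above, here is a minimal NumPy sketch of the NT-Xent contrastive loss that SimCLR trains with; it is my own illustrative implementation for intuition, not the library's code, and the function name and signature are assumptions.

```python
import numpy as np

def nt_xent(z_a, z_b, tau=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy), the
    contrastive loss used by SimCLR.

    z_a, z_b: (N, D) embeddings of two augmented views of the same N
    images; row i of z_a and row i of z_b form a positive pair."""
    z = np.concatenate([z_a, z_b])                      # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # L2-normalize
    sim = z @ z.T / tau                                 # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                      # exclude self-similarity
    n = len(z_a)
    # The positive for row i is its other augmented view.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - sim[np.arange(2 * n), pos]))
```

The loss is minimized when each pair of views maps to nearby embeddings while all other images in the batch are pushed apart, which is exactly the property that makes these encoders strong for retrieval.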
I'd love to contribute. I actively contribute to keras.io, with a special focus on semi-supervised and self-supervised learning.