
fastai parity

Open lorenzoh opened this issue 5 years ago • 13 comments

This issue tracks the progress on fastai parity.

Last updated 2021/08/11

Datasets

Data pipelines

  • [x] create data pipelines from data block information (BlockMethod)
    • [x] visualizations based on blocks
  • [x] fast, parallelized data loading (DataLoaders.jl)
  • [x] fast, composable affine augmentations for images, masks and keypoints (DataAugmentation.jl)
    • [ ] on GPU: GPU support is still WIP, see https://github.com/lorenzoh/DataAugmentation.jl/issues/48
  • advanced data augmentation
    • [ ] MixUp
    • [ ] CutMix
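For reference, MixUp (one of the advanced augmentations still on the list) boils down to convexly combining random pairs of inputs and their one-hot targets. A minimal sketch in plain Julia, assuming Distributions.jl for the Beta draw; this is not FastAI.jl's eventual implementation, and the `mixup` name is hypothetical:

```julia
using Distributions: Beta
using Random: randperm

# MixUp: blend each sample in a batch with a randomly chosen partner,
# using a single mixing coefficient λ ~ Beta(α, α) per batch.
# `xs` and `ys` are batched arrays with the batch as the last dimension;
# `ys` is assumed to be one-hot encoded so targets can be blended too.
function mixup(xs::AbstractArray, ys::AbstractArray; α = 0.4)
    λ = rand(Beta(α, α))
    perm = randperm(size(xs, ndims(xs)))            # random pairing of samples
    xs2 = selectdim(xs, ndims(xs), perm)            # partner inputs (a view)
    ys2 = selectdim(ys, ndims(ys), perm)            # partner targets
    return λ .* xs .+ (1 - λ) .* xs2,
           λ .* ys .+ (1 - λ) .* ys2
end
```

CutMix follows the same pairing idea but splices a rectangular patch from the partner image instead of blending pixel values.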

Models

Training

  • training schedules

  • [ ] mixed precision training

    AFAIK this is currently being worked on by the Flux.jl team

  • [ ] distributed data parallel training

    Lots of progress has been made in DaggerFlux.jl; it still needs to be integrated with FastAI.jl

  • callbacks

    • [x] early stopping (EarlyStopping)

    • [x] checkpointing (Checkpointer)

      Works, but could use some improvements such as conditional checkpointing

    • [x] stopping on NaN loss (StopOnNaNLoss)

    • [x] metrics and history tracking (Metrics, Recorder)

    • logging modalities

      • [x] hyperparameters
      • [x] metrics
    • logging backends

    • [ ] gradient accumulation

  • [ ] metrics and loss functions

    Many metrics and loss functions are still missing; see the discussion below
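The checked-off early stopping item amounts to very little code. A hedged sketch of the kind of logic an EarlyStopping callback encapsulates (this is illustrative only, not FluxTraining.jl's actual API; the `EarlyStopper` and `shouldstop!` names are made up):

```julia
# Track the best validation loss seen so far and stop after `patience`
# epochs without improvement.
mutable struct EarlyStopper
    patience::Int
    best::Float64
    waited::Int
end
EarlyStopper(patience::Int) = EarlyStopper(patience, Inf, 0)

function shouldstop!(es::EarlyStopper, valloss::Real)
    if valloss < es.best
        es.best = valloss      # improvement: reset the counter
        es.waited = 0
    else
        es.waited += 1         # no improvement this epoch
    end
    return es.waited >= es.patience
end
```

In a training loop you would call `shouldstop!(es, valloss)` once per epoch and break out of the loop when it returns `true`.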

Applications

  • computer vision
    • image classification
      • [x] single-label
      • [x] multi-label
    • [x] image segmentation
    • [x] image keypoint regression
  • tabular
    • [x] classification
    • [x] regression
  • [ ] recommender systems
  • [ ] natural language processing

lorenzoh avatar Nov 05 '20 09:11 lorenzoh

Updated the list with pointers to existing implementations

lorenzoh avatar Dec 18 '20 12:12 lorenzoh

I vote for reviving MLMetrics instead of rolling them into FluxTraining. I'm sure there would be many base Flux, Knet and other library users who could make use of them. Currently everybody has an incomplete subset of metrics with incompatible APIs (e.g. Avalon.jl). I would hope that we could get rid of those for a "NNlib of metrics" instead.

ToucheSir avatar Dec 18 '20 16:12 ToucheSir

I vote for reviving MLMetrics instead of rolling them into FluxTraining. I'm sure there would be many base Flux, Knet and other library users who could make use of them. Currently everybody has an incomplete subset of metrics with incompatible APIs (e.g. Avalon.jl). I would hope that we could get rid of those for a "NNlib of metrics" instead.

Since most of these metrics are really distances in one form or another wouldn't it be natural to use the existing implementation in Distances.jl ? But maybe I'm missing something.

DoktorMike avatar Dec 19 '20 10:12 DoktorMike

Yeah I think implementing the distance part of it in Distances.jl makes a lot of sense.

darsnack avatar Dec 19 '20 14:12 darsnack

I think there's still a need for a metrics package because most of the classification metrics don't make sense in Distances.jl. There's also the case of domain-specific metrics like ~~Dice score~~(edit: potentially general enough to go in something like Distances) and BLEU.

ToucheSir avatar Dec 19 '20 16:12 ToucheSir

there should be some distinction between losses and metrics?

CarloLucibello avatar Dec 19 '20 16:12 CarloLucibello

At the very least, the metrics package/namespace could reexport losses that are also metrics.

ToucheSir avatar Dec 19 '20 16:12 ToucheSir

Yeah I think the hierarchy would be Distances -> Metrics -> Losses. There will be losses that are not metrics (i.e. defined completely in a loss package), and losses that just reexport a metric. Similarly, there will be metrics that are completely defined in the metrics package, but many will reexport or build upon standard distances.

To that end, I agree that we should make use of Distances.jl as much as possible. And if there is a metric that generalizes to a distance, then we can submit a PR to Distances.jl.

darsnack avatar Dec 19 '20 17:12 darsnack
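To make the proposed Distances -> Metrics layering concrete, here is a tiny sketch of a regression metric built on top of a Distances.jl distance. It assumes only the exported `euclidean` function; the `rmse` wrapper itself is hypothetical:

```julia
using Distances: euclidean

# RMSE expressed via the Euclidean distance:
# euclidean(ŷ, y) = sqrt(sum((ŷ - y).^2)), so dividing by sqrt(n)
# turns the sum of squares into a mean of squares.
rmse(ŷ, y) = euclidean(ŷ, y) / sqrt(length(y))
```

A metrics package would collect many such thin wrappers (plus classification metrics that have no distance analogue), while a loss package could in turn reexport the ones that are differentiable.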

I agree with @darsnack! Every loss can be a metric (not in a strict mathematical sense but as a measure of model performance), but not the reverse.

lorenzoh avatar Dec 19 '20 17:12 lorenzoh

Also, if MLMetrics.jl is revived, FluxTraining.jl should depend on it.

lorenzoh avatar Dec 19 '20 17:12 lorenzoh

Updated the list

lorenzoh avatar Aug 11 '21 10:08 lorenzoh

@lorenzoh do you want to transfer stuff from here to FastAI.jl issues?

darsnack avatar Aug 09 '22 16:08 darsnack

I went ahead and transferred the issue to the FastAI.jl repo which should make it easier to track 👍

lorenzoh avatar Sep 03 '22 04:09 lorenzoh