Ability to link plots to an experiment
Description
This is based on the first high priority issue resulting from the experiment tracking user research, which is:
Ability to save and link images of plots/model artefacts to an experiment. This would provide users with more insight (images and metrics together) to track/compare the evolution of runs across a timeline
What is the problem?
- User wants the ability to save images of model artefacts (such as a ROC curve or a confusion matrix) alongside the metrics of a run
- "For example, you go into the UI, say okay, this is the run that that's important to me. I can get certain objects that I store". "90% of the cases would be CSVs and images"
Who are the users of this functionality?
- Data Scientist
Why do our users currently have this problem?
- Existing Solution 1: Use MLflow - "MLflow allows us to save images and not just metrics"
- Existing Solution 2: Kedro - "I am saving those as PNG files (in the Azure blob storage) and using some parameters to set the sub-folder names so that I can compare to previous runs … not perfect but works". "I’d like to be able to flag some PNGs to be included in the experiment tracking so I have a record (with timeline) of how they’ve changed" (a sketch of this workaround follows below)
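A minimal sketch of that workaround in `catalog.yml`, assuming Azure access via `fsspec`/`adlfs` and a `${run_name}` value injected through something like Kedro's `TemplatedConfigLoader` (the dataset name, container and parameter are all illustrative):

```yaml
# Illustrative workaround: write the plot as a PNG to blob storage, with a
# per-run subfolder so earlier runs can be compared manually.
confusion_matrix_plot:
  type: matplotlib.MatplotlibWriter
  filepath: "abfs://plots/${run_name}/confusion_matrix.png"
```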
What is the impact of solving this problem?
- User can keep track of specific artefacts alongside the experiment results
- "If I run a model I want to save the columns that were created next to it, I might want to create a model saved next to it (artefacts below the model) - something I am used to that I didn't have. There is a lot of artefacts I would want to save with an experiment"
What could we possibly do?
- Enable the ability to save model artefacts such as images and CSVs, which make up 90% of use cases
This makes sense for users; in a previous iteration of this functionality we used to allow users to:
- Track PNGs, PDFs, CSVs and Excel spreadsheets as part of an experiment and see them in the UI
- Compare the artifacts to each other
Copying this here so we don't lose it:
I spoke to Lim about this a long time ago and made some notes on his thoughts. He thinks we should have a dataset called something like `tracking.ArtifactDataSet`, which is basically for everything that's not a metric or JSON. kedro-viz would then work out how to render the dataset depending on the file type (e.g. PNG).
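As a hypothetical `catalog.yml` entry (this dataset type does not exist; it is just the proposal above, with an illustrative name and path):

```yaml
# Hypothetical: a catch-all "tracked artifact" type; Viz would pick a renderer
# based on the file extension.
confusion_matrix:
  type: tracking.ArtifactDataSet
  filepath: data/09_tracking/confusion_matrix.png
```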
I am not sure how this fits in with our existing matplotlib and plotly datasets. Especially because the plotly dataset saves to JSON, how would kedro-viz know to render that as a plot? Do we need another type, `tracking.PlotlyDataSet`, to handle this case? Should we just be using the existing matplotlib/plotly datasets for this?
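For reference, the existing datasets in question look roughly like this in the catalog (names and paths are illustrative):

```yaml
roc_curve:
  type: matplotlib.MatplotlibWriter   # saves the figure as a PNG
  filepath: data/08_reporting/roc_curve.png

feature_importance:
  type: plotly.JSONDataSet            # saves the figure as plotly JSON
  filepath: data/08_reporting/feature_importance.json
```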
@idanov thought that `tracking.JSONDataSet` was not the right approach (vs. the pre-existing `json.JSONDataSet`), so I am guessing he would also not like this `tracking.ArtifactDataSet`. We need to figure out exactly what datasets we want to use here and what the significance of a "tracked" dataset is (i.e. is it the same as a versioned one? Is it a separate dataset altogether?).
Notes from Technical Design session:
The team discussed possible solutions to enable users to track plots and other artifacts.
Possible solutions:
- The `tracking.ArtifactDataSet` as proposed by Lim (see comment above). This dataset would allow users to store any type of data that can be considered an artifact, e.g. images, plots etc. Viz would then figure out how to render whatever data is stored under this dataset type. The general consensus about this approach is that special `tracking` datasets shouldn't be the way to log more data as part of a run; it raises the question of how many "tracking" datasets we'd end up adding. The discussion led to the option of not having tracking datasets at all.
- No tracking datasets at all:
  - Tracking datasets are really just versioned datasets with some extra logic in the case of `tracking.MetricsDataSet`; the `tracking.JSONDataSet` is just the same as the regular `JSONDataSet` with versioning on by default (see the sketch after this list).
  - Originally, one of the main reasons why we decided we needed them was as a way to tell Viz what data to show as part of the experiment tracking panel.
  - All existing datasets in Kedro now allow users to log artifacts (plots, images, etc.), so it's silly to add special `tracking` datasets that would pretty much do the same thing.
  - Arguably, versioning isn't exactly the same as tracking. As in, a user might want to version a dataset but not make it part of the experiment tracking data. Letting the user decide what data to show in the experiment tracking panel could happen on the UI side (needs design).
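In practice, option 2 would mean a tracked plot is just an ordinary dataset with versioning switched on, e.g. (illustrative entry):

```yaml
roc_curve:
  type: matplotlib.MatplotlibWriter
  filepath: data/08_reporting/roc_curve.png
  versioned: true   # versioned (and visualisable) -> shown in experiment tracking
```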
Follow up actions:
The decision was made to go for option 2: move away from special `tracking` datasets and instead show all versioned and visualisable datasets on the experiment tracking panel. This leads to the following actions:
- [ ] Kedro currently throws an error when versioning is turned on for a dataset later on in the process. We need to fix that workflow, as showing versioned datasets in experiment tracking might be an incentive for users to turn on versioning later, when they find they need this data to be displayed.
- [ ] We will not immediately remove or deprecate the existing `tracking` datasets, but we need to decide on their future, keeping in mind the use case for showing the metric timeline.
- [ ] Add functionality to render all versioned datasets on the Viz side. This links to: https://github.com/kedro-org/kedro-viz/issues/907
Just to record this in writing also: while I agree with the "tracked plot = versioned dataset" approach, it does feel like an inconsistent and confusing UX given the already-existing `tracking` datasets (illustrated in the sketch after this list):
- Want to track JSON data? Change your dataset type to `tracking.JSONDataSet`.
- Want to track a plot? Keep the same dataset type but set `versioned: true`.
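Side by side in `catalog.yml`, the inconsistency looks like this (entries illustrative):

```yaml
model_stats:
  type: tracking.JSONDataSet          # tracked JSON: swap in a special type
  filepath: data/09_tracking/model_stats.json

roc_curve:
  type: matplotlib.MatplotlibWriter   # tracked plot: keep the type, flip a flag
  filepath: data/08_reporting/roc_curve.png
  versioned: true
```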
Hence I think we do need to work out what happens with `tracking.JSONDataSet` and `tracking.MetricsDataSet` sooner rather than later. `tracking.JSONDataSet` could be easily deprecated in favour of `json.JSONDataSet` with `versioned: true`, but `tracking.MetricsDataSet` is trickier. To me this is directly coupled to questions like "how do I search runs by metric" and "why not just do a `log_metric` call" (which we decided against before). Overall, adding plots to experiment tracking sounds straightforward and I'm very happy to do it via `versioned: true`, but we need to work out a more holistic and complete solution here or experiment tracking becomes a bit of a mish-mash of different approaches.
Now on the question of showing plots in experiment tracking:
- not for the MVP, but a killer feature here would be if there were a good way to compare plots between runs. This is something data scientists do A LOT and it's always done manually just by putting the plots next to each other on your screen and flicking your eyes between them for several minutes. This is also how the current compare screen on experiment tracking would work. Can we do something like this instead? https://github.blog/2011-03-21-behold-image-view-modes/
- bear in mind that the matplotlib dataset (which I think will cater for a very high percentage of artifacts) supports multiple plot PNGs (a sketch below shows how this arises). This is the same complication I mentioned in https://github.com/kedro-org/kedro-viz/issues/783. Again, fine for the MVP to ignore this, but I do think it will come up as a user requirement. There's actually a separate question here of whether/how the matplotlib dataset should allow this in the first place (https://github.com/kedro-org/kedro-plugins/issues/529), but either way the case of multiple output plots remains.
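For context on the multiple-PNG case: my understanding is that when a node returns a dict (or list) of figures, `MatplotlibWriter` treats `filepath` as a directory and writes one image per entry, e.g. (illustrative entry):

```yaml
per_feature_histograms:
  type: matplotlib.MatplotlibWriter
  filepath: data/08_reporting/histograms   # becomes a folder of PNGs
  versioned: true
```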
Notes from Follow up Design/Engineering session:
The team discussed a way for users to be able to visualise and compare the dataset plots during experiment tracking.
Follow up actions:
- [x] Design (@GabrielComymQB and @Mackay031) to start exploratory designs: low-fi mockups, then provide feedback to the team
- [x] Once completed, engineering (@tynandebold) to scope and commence development
Timeline:
To be completed by the end of the next sprint: 15/07/22
This issue was completed in https://github.com/kedro-org/kedro-viz/issues/953