
[BUG] Job metrics are appended into a list instead of overwritten

Open · shazraz opened this issue May 21 '20 · 13 comments

When logging metrics using foundations.log_metric(), metrics for the same job across epochs are appended into a list instead of being overwritten.

This happens in JobMetricsConsumer, which pushes every logged metric onto a Redis list:

    job_id = message['job_id']
    key = 'jobs:{}:metrics'.format(job_id)
    metric_key = message['key']
    metric_value = message['value']
    value = (timestamp, metric_key, metric_value)
    self._redis.rpush(key, serialize(value))  # rpush appends to the list; nothing is overwritten

This makes the metrics very hard to read on the UI.
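To make the contrast concrete, here is a minimal in-memory sketch, with plain Python dicts standing in for Redis and hypothetical helper names (not the actual SDK functions), of the current append behaviour versus an overwrite behaviour keyed by metric name:

```python
# Hypothetical sketch: dicts stand in for Redis; key names mirror the
# consumer's 'jobs:{job_id}:metrics' scheme.

def log_metric_append(store, job_id, key, value):
    """Current behaviour: every call appends, so repeated epochs pile up."""
    store.setdefault("jobs:{}:metrics".format(job_id), []).append((key, value))

def log_metric_overwrite(store, job_id, key, value):
    """Overwrite behaviour: a hash keyed by metric name keeps only the latest."""
    store.setdefault("jobs:{}:metrics".format(job_id), {})[key] = value

appended, latest = {}, {}
for loss in [0.9, 0.5, 0.3]:          # three epochs logging the same key
    log_metric_append(appended, "job-1", "loss", loss)
    log_metric_overwrite(latest, "job-1", "loss", loss)

print(appended["jobs:job-1:metrics"])  # [('loss', 0.9), ('loss', 0.5), ('loss', 0.3)]
print(latest["jobs:job-1:metrics"])    # {'loss': 0.3}
```

In Redis terms the overwrite variant would be an `hset` into a hash instead of an `rpush` onto a list.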


The user should see only the latest metric in the UI. @rayhane-mamah can you please comment?

shazraz avatar May 21 '20 21:05 shazraz

Proposal: add a defaulted parameter that controls whether the function appends to the existing value, i.e. change the signature from log_metric(key, value) to log_metric(key, value, overwrite=False).

Pro: backwards compatible; the user can choose to have the existing value overwritten.
Con: appending is potentially not the default behaviour users want (do we have other user feedback?)

Alternative considerations:

  1. use an alternative parameter name, e.g. append=True or replace=False
  2. make overwriting the default behaviour, with appending as the opt-in (requires a minor version bump / breaks backwards compatibility)
  3. keep the signature unchanged and offer no control to the user: always overwrite
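For illustration, a sketch of the proposed log_metric(key, value, overwrite=False) signature, with a plain dict standing in for the Redis-backed store (the body is illustrative only, not the actual SDK implementation):

```python
# Illustrative only: _metrics stands in for the Redis-backed store.
_metrics = {}

def log_metric(key, value, overwrite=False):
    if overwrite:
        _metrics[key] = [value]                      # replace any history
    else:
        _metrics.setdefault(key, []).append(value)   # current (default) behaviour

log_metric("accuracy", 0.71)
log_metric("accuracy", 0.84)
print(_metrics["accuracy"])  # [0.71, 0.84]  -- existing callers unaffected

log_metric("accuracy", 0.90, overwrite=True)
print(_metrics["accuracy"])  # [0.9]
```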

ekhl avatar May 21 '20 21:05 ekhl

I'm in favour of keeping backward compatibility, as we haven't yet figured out a way to communicate changes between versions to users. Keeping compatibility also avoids the need to communicate a breaking change to end users.

Two considerations:

  • I like overwrite=False
  • I also like reverse ordering, i.e. array.insert(0, values), so the newest value sits at the front
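For what it's worth, a tiny sketch of the reverse-ordering idea: inserting at index 0 (the list analogue of Redis LPUSH rather than RPUSH) keeps the newest value at the head, so a UI that renders only the first element gets "latest" for free:

```python
# Sketch of newest-first ordering: insert(0, ...) instead of append.
metrics = []
for value in [0.9, 0.5, 0.3]:   # values logged over three epochs
    metrics.insert(0, value)

print(metrics)      # [0.3, 0.5, 0.9]  -- newest first
print(metrics[0])   # 0.3              -- the "latest" value the UI would show
```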

kyledef avatar May 22 '20 13:05 kyledef

I am also in favor of keeping backwards compatibility. In certain situations, having the ability to see all historical values would be valuable; so removing the functionality may just force us to bring it back later after a user complaint.

As a thought, would it be possible to store all values in the backend but serve them to the user based on a frontend setting? Storing all values on the backend would also allow visualization of training and validation curves across models later on.

Dessa-ColeClifford avatar May 22 '20 18:05 Dessa-ColeClifford

@Dessa-ColeClifford I like that idea; we don't have to modify the SDK in that case. Were you thinking of a toggle of sorts on a per-column basis for metrics?

ekhl avatar May 22 '20 21:05 ekhl

@ekhl A per-column toggle would be the quickest implementation that delivers all required functionality to close this bug. However, the toggle only hides, and does not fix, the complaint that "this makes the metrics very hard to read on the UI": in the toggle state that shows the full list, we should still restyle the list popup.

I can imagine a more complicated scenario where a cell with "complex data" (which would still need to be clearly defined) has an on-hover and/or on-click component that allows for visualizations (e.g. value over time) or further data analysis. But that shouldn't be treated as part of this bug, given the size of the work and the design required.

Dessa-ColeClifford avatar May 25 '20 14:05 Dessa-ColeClifford

> As a thought, would it be possible to store all values in the backend but serve them to the user based on a frontend setting? Storing all values on the backend would also allow visualization of training and validation curves across models later on.

How about if we continue to store all metrics in the backend but only render the most recent metric in the UI? I think this would be the least-effort path that maintains backwards compatibility.

If the user wants access to all of the metrics for plotting purposes, a safe assumption is that they would want programmatic access. In this case, they can retrieve those metrics using get_metrics_for_all_jobs or get_metric. Was there a use-case defined for rendering all values in the UI when we originally went down this path?
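As a sketch of "store everything, render only the latest": given the full append-only history of (timestamp, metric_key, metric_value) tuples the consumer already writes, the UI layer could collapse it to one value per metric. Plain Python below; the data shape is assumed from the consumer snippet above:

```python
# Assumed data shape: the append-only history the consumer writes to Redis,
# as (timestamp, metric_key, metric_value) tuples.
history = [
    (1, "loss", 0.9),
    (2, "accuracy", 0.71),
    (3, "loss", 0.5),
    (4, "accuracy", 0.84),
]

latest = {}
for timestamp, key, value in sorted(history):  # tuples sort by timestamp first
    latest[key] = value                        # later timestamps win

print(latest)  # {'loss': 0.5, 'accuracy': 0.84}
```

The backend keeps the full history for programmatic access; only this collapsed view reaches the UI.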

shazraz avatar May 25 '20 17:05 shazraz

> How about if we continue to store all metrics in the backend but only render the most recent metric in the UI? I think this would be the least-effort path that maintains backwards compatibility.

I agree with this. A toggle can be added in later.

The use-case is as simple as knowing the history of how the model did over time. The debate over what to display went back and forth between "it's too much information" and "but we need to know historic values".

Dessa-ColeClifford avatar May 25 '20 17:05 Dessa-ColeClifford

I'm actually opposed to adding a toggle on a per-column basis because it may clutter the UI and add complexity. If a user wants to view historical values, does it make sense to limit this to retrieval via the SDK, as I described above?

My thought is that providing a list of metrics in the UI adds little value other than being able to quickly eyeball the metrics. Would love some user feedback on this from the ML team @rayhane-mamah

shazraz avatar May 25 '20 17:05 shazraz

The requirement to easily view this information was something that was user validated and discussed at length before implementation.

If we want to limit this to the SDK, I am fine with that, as long as we consult more than those immediately involved (e.g. the MLEs on the Risk project), so as not to create a localized regression in functionality.

Sorry for pushing on this; it just feels that we are overfitting to the immediate users at the moment, and I want to make sure the general sentiment has truly changed before updating something.

Dessa-ColeClifford avatar May 25 '20 18:05 Dessa-ColeClifford

This makes sense. Let me ping the ML team with this ticket and see if we can get some feedback.

shazraz avatar May 25 '20 18:05 shazraz

Excuse the awful drawing, but is this what you were thinking @Dessa-ColeClifford ?

[drawing omitted]

If so, I agree with Cole: this was something we discussed heavily before, and it was always favoured. If we give the ability to expand the column view, which epoch's run shows up in the graph above?

mohammedri avatar May 25 '20 18:05 mohammedri

I think a simple solution would be a (single) checkbox in the GUI: "show only latest value".

If the values/metrics are also tracked with TensorBoard, the user can visualize the evolution of a metric there; in Atlas, however, there is no option to plot this evolution of a metric yet, am I right? (The plot in Atlas lets me compare model parameters and final, i.e. single-valued, metric results, but does not visualize all appended values of a metric.) If I were not using TensorFlow (TensorBoard), I would miss such a plot of a metric's evolution.

Thinking about it further, I would also consider an additional tab on the "Details For Job" page (currently: Logs, Artifacts): a "Metric plots" tab where the lists of metric values are plotted. If a user wants a metric to be plotted, that could be indicated by a flag in the foundations.log_metric call.

If I understand correctly, this is what @Dessa-ColeClifford and @mohammedri already indicated here.

SebastianGergen avatar May 26 '20 14:05 SebastianGergen

With #165 we made the decision to show only the latest metric in the UI column, based on @pippinlee's research. I am curious whether we can easily show all of the metrics recorded so far in the detailed results page (where the artifacts and logs also live).

mohammedri avatar Jun 01 '20 19:06 mohammedri