vscode-dvc Only show relevant columns in experiments table

Similar to dvc exp show --only-changed, the experiments table should be able to show (either by default or through some option) only the columns where there are differences between experiments.
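For illustration only, here is a minimal sketch of what "columns where there are differences between experiments" could mean in code. The row shape, the column keys, and the `changedColumns` helper are all hypothetical and are not taken from the vscode-dvc codebase.

```typescript
// Hypothetical row shape: one record per experiment, keyed by column id.
type CellValue = string | number | boolean | null;
type ExperimentRow = Record<string, CellValue>;

// Return the columns whose values are not identical across all rows.
// A column missing from a row reads as `undefined`, so a param that
// exists in only some experiments also counts as changed.
function changedColumns(rows: ExperimentRow[], columns: string[]): string[] {
  return columns.filter((column) => {
    const distinct = new Set(rows.map((row) => row[column]));
    return distinct.size > 1;
  });
}

// Made-up data: only min_split and auc differ between experiments.
const exampleRows: ExperimentRow[] = [
  { "params.yaml:min_split": 2, "params.yaml:n_est": 100, "metrics.json:auc": 0.91 },
  { "params.yaml:min_split": 8, "params.yaml:n_est": 100, "metrics.json:auc": 0.93 },
  { "params.yaml:min_split": 16, "params.yaml:n_est": 100, "metrics.json:auc": 0.92 },
];
const exampleColumns = ["params.yaml:n_est", "params.yaml:min_split", "metrics.json:auc"];

console.log(changedColumns(exampleRows, exampleColumns));
```

The comparison uses `Set` equality, so it only works as written for primitive cell values; nested values would need a deep comparison.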

Jul 06 '22 16:07 dberenbaum

Thank you @dberenbaum for bringing this in, I am working on a few new concepts and this may help me to have a better story and more convenient workflow.

I am sharing some ideas. Let's discuss them in a friendly way, please :)

Only show relevant columns in experiments table

+ and collapse rows without changes

or, if it's not important to show hidden exps in between

Jul 07 '22 03:07 maxagin

Thanks @maxagin! I worry those are a bit too specific and inflexible regarding hiding/unhiding only one of rows/columns. Comparing to Studio, they hide rows by default and provide an option to expand to show all, right? I think this could be a sensible default (and probably one we should consider in the CLI also).

The default table could hide redundant columns and rows and provide options to:

Show all columns (whether hidden by default or by manual filtering).
Show all rows (whether hidden by default or manual filtering).

What is the way to unhide columns now? I can't seem to find it.

Jul 07 '22 14:07 dberenbaum

Comparing to Studio, they hide rows by default and provide an option to expand to show all, right?

and

I think this could be a sensible default

@dberenbaum, you will need to run some exps before having information in the table and plots. No? Run exps -> See result -> Adjust interface. Your needs change with the number of exps you have run: the more exps you have, the better filtering options you will need.

What is the way to unhide columns now? I can't seem to find it.

At the GUI level, it is in the sidebar, under Columns.

Show all columns (whether hidden by default or by manual filtering).

I do not think we have “unhide all”

Thanks max! I worry those are a bit too specific and inflexible regarding hiding/unhiding only one of rows/columns.

We are already providing tools, filters, and sorts. What you mentioned seems like a very specific option to my mind. Not sure, please correct me if I am wrong.

regarding hiding/unhiding only one of rows/columns.

The concept describes hiding (or unhiding) all the rows and columns that were not changed. Conceptually, this is a toggle.

Jul 07 '22 16:07 maxagin

Sorry @maxagin, I misunderstood your diagrams.

Your needs change with the number of exps you have run: the more exps you have, the better filtering options you will need.

Good point, what is "redundant" will change over time since the table is dynamically updated, so it won't make sense to have this option always "on" by default.

regarding hiding/unhiding only one of rows/columns.

The concept describes hiding (or unhiding) all the rows and columns that were not changed. Conceptually, this is a toggle.

Yup, makes sense.

Thanks max! I worry those are a bit too specific and inflexible regarding hiding/unhiding only one of rows/columns.

We are already providing tools, filters, and sorts. What you mentioned seems like a very specific option to my mind. Not sure, please correct me if I am wrong.

It sounds specific, but it's actually common and a bit different from what you have above and what I see in Studio. In Studio, it is a long-term view, where often many rows are the same or have slight differences, and those changes may appear in different columns over time.

In VS Code, there is active experimentation, so each row is very likely to contain some changes, often on the same subset of columns. For example:

In this simplified example, all the blue params columns are the same except for min_split, which is the current parameter of interest (in reality, there are often a handful of columns likely to change each time). There are often way too many columns to fit on the screen so this column may be hidden out of view. Users can drag the relevant columns over, so it might not be critical, but finding them if there are many columns could be annoying.

So I guess it's less about showing, hiding, or highlighting, and more about the column order. For example, maybe it would be useful to have an option to reorder columns to show those where any row has a change first, but maybe you can come up with better ideas to solve the problem.

What is the way to unhide columns now? I can't seem to find it.

At the GUI level, it is in the sidebar, under Columns.

By the way, I still don't understand how to do this 😅 . I'm sure it's my own mental block, but just a note to consider how easy it is to find this option.

Jul 07 '22 17:07 dberenbaum

Good point, what is "redundant" will change over time since the table is dynamically updated, so it won't make sense to have this option always "on" by default.

You're right. By default, we show everything (more correctly: we will have nothing at the start). What you proposed, to my mind, does not affect the main flow. It is only another way to simplify the table for comparison. I can see the following scenario: run exps (have at least a few) -> toggle --only-changed -> at this point I would mark and label (add descriptions to) the most interesting ones -> toggle show all to see the entire flow.

Interesting conclusion: I may want to just collapse, or completely delete, the rows that have not changed, to have a more compact environment in the “toggle --only-changed” step of the above flow.

Updated flow (for v1): run exps -> toggle --only-changed -> analyze, mark and add descriptions to the most interesting, collapse and delete rows -> toggle show all

!! Another option would be to always highlight differences in the table without any extra UI controls.

So I guess it's less about showing, hiding, or highlighting, and more about the column order.

You have all the freedom to hide columns and change their position in the table right now. In our example, I would: toggle --only-changed (to simplify the view; considering the amount of data we have, this can be helpful) -> move the “min_split” column next to the other two columns that are already beside the exps column, so all the changes are in one place, close to each other -> finally compare and make some decisions + [analyze, mark and add descriptions to the most interesting, collapse and delete rows] -> mark all as phase [number] -> continue experimenting.

WDYT?

By the way, I still don't understand how to do this 😅 . I'm sure it's my own mental block, but just a note to consider how easy it is to find this option.

We are working on redefining this and I will use your comment as another good argument :) Thank you @dberenbaum !

Jul 07 '22 19:07 maxagin

@dberenbaum

By the way, I still don't understand how to do this 😅 . I'm sure it's my own mental block, but just a note to consider how easy it is to find this option.

Just in case you are still stuck => See the view container in the sidebar:

https://user-images.githubusercontent.com/37993418/178634176-50806fd2-550b-468a-a1e8-3c18543fe48a.mov

Jul 13 '22 01:07 mattseddon

@dberenbaum @maxagin Something like this, maybe?

https://user-images.githubusercontent.com/1231848/179824782-b794c219-9124-4c7b-803b-423948dc863a.mov

Jul 19 '22 18:07 wolmir

Looks good @wolmir! After speaking with @maxagin, I'm not sure we even need to actually drop any columns though. At most, we might need a way to bring the changed columns to the left.

Jul 20 '22 14:07 dberenbaum

Great! I think we are now ready for the UI solution iterations. I will update you when it's ready. Thank you, folks!

Jul 20 '22 19:07 maxagin

Hey folks! I think the sketch below may be a good solution. Let me know how you feel about it. Thank you!

Related to: Inform the user about hidden (plots, sidebar) or applied actions with table https://github.com/iterative/vscode-dvc/issues/2075

Concept: We just highlight changes. The user can still see the same table and continue to work with the information.

@dberenbaum, I would especially like to ask for your opinion here :) Thanks!

Before

After

Same two examples, but in context

Jul 21 '22 03:07 maxagin

Thanks @maxagin!

What is considered "changed" here?

This seems potentially useful to quickly pick out meaningful values from the table. However, my immediate concern was that those values are often hidden in columns to the right off the screen and potentially spread out, so highlighting won't do much good.

Regarding the problem of the columns being off the screen, if I'm doing hyperparameter tuning, I may be trying a different value every experiment for a few columns and all other columns stay unchanged for every row in the table. I was hoping a solution might allow me in a single step to "snap" or reorder columns and bring them to the left of the screen if any values in that column are unequal (and if more experiments come, I can perform the reordering again based on the updated table values).

Regarding the problem of quickly seeing meaningful values from the table, highlighting seems nice, but it's unclear to me how we identify which values to highlight. It's not always important whether the value has changed from the previous row since often many experiments will be queued and run together and the order won't matter. At my previous company, their internal experiment tracking tool used color gradients to highlight the extreme values in each column (like conditional formatting in a spreadsheet). I know those are noisy and not that clean looking, but something like that can be a helpful visual aid more than a binary choice of whether the value is interesting or not.
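As a rough sketch of the conditional-formatting idea, each numeric column could be min-max normalised and the result mapped to a background intensity. Everything here is hypothetical: the function names, the colour, and the alpha cap are arbitrary choices for illustration, not how that internal tool or vscode-dvc works.

```typescript
// Min-max normalise one column of values to [0, 1].
// Cells that are missing or not finite numbers get no intensity.
function columnIntensities(values: Array<number | undefined>): Array<number | undefined> {
  const isUsable = (v: number | undefined): v is number =>
    typeof v === "number" && Number.isFinite(v);
  const numbers = values.filter(isUsable);
  if (numbers.length === 0) {
    return values.map(() => undefined);
  }
  const min = Math.min(...numbers);
  const max = Math.max(...numbers);
  const range = max - min;
  return values.map((v) => {
    if (!isUsable(v)) {
      return undefined;
    }
    // A constant column has no extremes, so nothing is highlighted.
    return range === 0 ? 0 : (v - min) / range;
  });
}

// Map an intensity to a CSS background colour.
// The base colour and the 0.6 alpha cap are arbitrary.
function toBackground(intensity: number | undefined): string {
  if (intensity === undefined) {
    return "transparent";
  }
  return `rgba(0, 120, 215, ${(intensity * 0.6).toFixed(2)})`;
}

// Made-up metric column with one missing cell.
const aucColumn = [0.5, 1.0, undefined, 0.75];
console.log(columnIntensities(aucColumn).map(toBackground));
```

A gradient like this does not depend on row order, which sidesteps the question of whether a value "changed from the previous row". Whether high or low values should be the highlighted extreme would still need to be decided per metric.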

Jul 21 '22 16:07 dberenbaum

Hi @dberenbaum ! Please see my response below.

What is considered "changed" here?

High-fidelity UI solution + actions ribbon concept. The logic we discussed is the same. See below for more details.

This seems potentially useful to quickly pick out meaningful values from the table. However, my immediate concern was that those values are often hidden in columns to the right off the screen and potentially spread out, so highlighting won't do much good.

You can already hide columns and change their position in the table.

Change position

https://user-images.githubusercontent.com/98249521/180369357-707f3a4c-a9a9-46df-a0f2-c7816046b158.mov

Hide

https://user-images.githubusercontent.com/98249521/180369369-086db1cb-8583-489e-be79-83e816667376.mov

Show Only Changed will help you adjust the table the way you would like. The workflow I can imagine:

a. Toggle --only-changed (simplify the view).
b. Move and hide columns. Put them near each other beside the exps column, so all the columns with changes are in one place, close to each other.
c. Compare and mark the most interesting with stars, rename, and eventually add some descriptions.
d. Eventually collapse uninteresting rows (with a delete option if the user wants to remove them completely).
e. Continue experimenting in the table and comparing runs in the Plots view.

Regarding the problem of the columns being off the screen, if I'm doing hyperparameter tuning, I may be trying a different value every experiment for a few columns and all other columns stay unchanged for every row in the table. I was hoping a solution might allow me in a single step to "snap" or reorder columns and bring them to the left of the screen if any values in that column are unequal

Good. I thought it would be good for users to see all the information while having the option to reorder or hide columns manually. But if you think that is not important, we could hide unchanged columns automatically and show them again if changes happen in the hidden columns. See below:

Before

After

(and if more experiments come, I can perform the reordering again based on the updated table values).

This will also happen automatically, based on the above, unless you toggle Show Only Changed off. Does that make sense?
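One way to read the behaviour described here is that the visible columns are derived from the current rows every time the table updates, rather than stored. The sketch below assumes that reading; `ToggleState` and `visibleColumns` are hypothetical names, not part of the extension.

```typescript
type ToggleCell = string | number | boolean | null;
type ToggleRow = Record<string, ToggleCell>;

// Hypothetical view state: the full column list plus the toggle.
interface ToggleState {
  allColumns: string[];
  showOnlyChanged: boolean;
}

// Derive the visible columns from the current rows on every update.
// Nothing is stored as "hidden", so a column comes back by itself as
// soon as a new experiment introduces a different value in it.
function visibleColumns(state: ToggleState, rows: ToggleRow[]): string[] {
  if (!state.showOnlyChanged) {
    return [...state.allColumns];
  }
  return state.allColumns.filter(
    (column) => new Set(rows.map((row) => row[column])).size > 1
  );
}

const toggleState: ToggleState = {
  allColumns: ["lr", "epochs", "seed"],
  showOnlyChanged: true,
};
const firstRuns: ToggleRow[] = [
  { lr: 0.1, epochs: 10, seed: 1 },
  { lr: 0.01, epochs: 10, seed: 1 },
];
console.log(visibleColumns(toggleState, firstRuns));

// A later experiment changes epochs, so that column reappears.
const moreRuns: ToggleRow[] = [...firstRuns, { lr: 0.01, epochs: 20, seed: 1 }];
console.log(visibleColumns(toggleState, moreRuns));
```

Because the result is recomputed from the data, turning the toggle off simply returns the full column list again.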

Regarding the problem of quickly seeing meaningful values from the table, highlighting seems nice, but it's unclear to me how we identify which values to highlight. It's not always important whether the value has changed from the previous row since often many experiments will be queued and run together and the order won't matter.

Yeah. That is why I proposed keeping all the values, so if the “system” is wrong, the user can still see all the info and make the decision. However, we display the exps in order, so my guess is that the oldest (cell value) will be highlighted.

At my previous company, their internal experiment tracking tool used color gradients to highlight the extreme values in each column (like conditional formatting in a spreadsheet). I know those are noisy and not that clean looking, but something like that can be a helpful visual aid more than a binary choice of whether the value is interesting or not.

This is great, but may be very complex to implement. @mattseddon what do you think about this?

Jul 22 '22 05:07 maxagin

FWIW we have already implemented color highlighting for deps when they are changed with respect to the previous commit. See https://github.com/iterative/vscode-dvc/pull/2029. This was done as part of https://github.com/iterative/vscode-dvc/issues/1657.

Jul 22 '22 06:07 mattseddon

At my previous company, their internal experiment tracking tool used color gradients to highlight the extreme values in each column (like conditional formatting in a spreadsheet). I know those are noisy and not that clean looking, but something like that can be a helpful visual aid more than a binary choice of whether the value is interesting or not.

This is great, but may be very complex to implement. @mattseddon what do you think about this?

Not impossible but would be painful. We would need a very good reason (or be very brave) to start on this without a lot of data/signal to back it up.

Jul 22 '22 06:07 mattseddon

The workflow I can imagine:

a. Toggle --only-changed (simplify the view).
b. Move and hide columns. Put them near each other beside the exps column, so all the columns with changes are in one place, close to each other.
c. Compare and mark the most interesting with stars, rename, and eventually add some descriptions.
d. Eventually collapse uninteresting rows (with a delete option if the user wants to remove them completely).
e. Continue experimenting in the table and comparing runs in the Plots view.

Sorry, I should have been explicit that I'm aware that I can hide and move columns, but I would like it to be easy and quick to get to the relevant info. Those columns of interest may be way off to the right and spread out from each other. This workflow feels like a lot more work for the user than auto-reordering (which is what I meant by "snapping").

Good. I thought it would be good for users to see all the information while having the option to reorder or hide columns manually. But if you think that is not important, we could hide unchanged columns automatically and show them again if changes happen in the hidden columns.

Yes, this is what I would like to see. I agree that it may be aggressive to hide the columns, so I would prefer to reorder them. No strong opinion on whether we do that automatically or via a toggle.

(and if more experiments come, I can perform the reordering again based on the updated table values).

This will also happen automatically, based on the above, unless you toggle Show Only Changed off. Does that make sense?

I thought from the above that "Show Only Changed" highlights changes but doesn't reorder anything?

If we go with the option to reorder columns, then yes, it makes sense.

Jul 22 '22 18:07 dberenbaum

Yes, this is what I would like to see. I agree that it may be aggressive to hide the columns, so I would prefer to reorder them. No strong opinion on whether we do that automatically or via a toggle.

Reordering without the user's permission would not be good. Hiding unchanged columns with the “--only-changed” toggle seems more correct to my mind, since the user requests the change by activating the toggle.

I thought from the above that "Show Only Changed" highlights changes but doesn't reorder anything?

It hides unchanged columns and highlights changed cell values.

So the mockup below satisfies our requirements, to my mind:

Jul 22 '22 23:07 maxagin

@dberenbaum I have put together a document including four possible options with a detailed analysis of each one. Please review the document and let me know what you think would be the best solution for us to follow. Have a great weekend!

Figma

@shcheklein in case you have nothing better to do than browsing GH this beautiful Friday evening :) You are most welcome to share your comments on this issue.

Jul 23 '22 03:07 maxagin

Hi @maxagin, sounds good, thanks! It looks like I don't have permission to see the figma doc. Can you check?

Jul 25 '22 20:07 dberenbaum

Thanks @maxagin!

Thoughts on option 1

Option 1 does NOT reorder or hide any columns, right?

If that's true, is a toggle needed? Is it ever helpful to toggle highlighting off?

Option 1 seems like a good start no matter what else we decide to do.

Option 5 🤣

Let me try to better explain what I have in mind, and you can let me know if it makes sense.

The idea is to reorder changed/unchanged columns using something like an action button. This is not a toggle, nor is it automatic. Instead there is some button to "move changed columns to the left." It reorders the columns and then returns control to the user and does not "stay on" like a toggle.

The workflow would look like:

Click the "move changed columns" button to create a useful default view.
Manually move and hide columns to further tweak the view.
Compare rows and star, label, hide, etc.
Continue experimenting.
Click the "move changed columns" button again. If there are newly changed columns, they will be moved to the left of any unchanged columns.
Continue comparing.

Advantages I see to this approach:

It allows users to get to a reasonable view quickly.
It doesn't interfere with users' choices of columns to hide.
Since it's not a toggle, users can manually reorder columns after without confusion.
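To make the "action button, not a toggle" idea concrete, here is a minimal sketch under assumed data shapes. `ColumnLayout` and `moveChangedColumnsLeft` are hypothetical names. The function is a one-shot stable partition: changed columns move to the front in their existing relative order, the hidden set is left alone, and nothing "stays on" afterwards.

```typescript
type LayoutCell = string | number | boolean | null;
type LayoutRow = Record<string, LayoutCell>;

// Hypothetical layout state owned by the user.
interface ColumnLayout {
  order: string[]; // left-to-right order, including hidden columns
  hidden: Set<string>; // columns the user chose to hide
}

// One-shot action: put columns with differing values first, keeping the
// existing relative order inside each group. Hidden columns stay hidden.
function moveChangedColumnsLeft(layout: ColumnLayout, rows: LayoutRow[]): ColumnLayout {
  const isChanged = (column: string): boolean =>
    new Set(rows.map((row) => row[column])).size > 1;
  const changed = layout.order.filter((column) => isChanged(column));
  const unchanged = layout.order.filter((column) => !isChanged(column));
  return { order: [...changed, ...unchanged], hidden: new Set(layout.hidden) };
}

// First click: only min_split differs, so it moves to the front.
let layout: ColumnLayout = {
  order: ["seed", "lr", "depth", "min_split"],
  hidden: new Set(["seed"]),
};
const runs: LayoutRow[] = [
  { seed: 1, lr: 0.1, depth: 3, min_split: 2 },
  { seed: 1, lr: 0.1, depth: 3, min_split: 8 },
];
layout = moveChangedColumnsLeft(layout, runs);
console.log(layout.order);

// After more experimenting, depth changes too. Clicking again moves it
// to the left of the unchanged columns.
runs.push({ seed: 1, lr: 0.1, depth: 5, min_split: 8 });
layout = moveChangedColumnsLeft(layout, runs);
console.log(layout.order);
```

Since the function returns a new layout and keeps no state of its own, any manual reordering the user does between clicks is simply the input to the next click.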

Final thoughts

Without any reordering, I don't think we address the original request. The step to "manually move and hide columns" to get to a reasonable view still seems painful. It doesn't need to be top priority, but I would like to at least keep the issue open until we have some way to avoid that step.

Jul 27 '22 12:07 dberenbaum