orange3 icon indicating copy to clipboard operation
orange3 copied to clipboard

Create Class: settings seem to change randomly

Open bluewin4 opened this issue 3 years ago • 9 comments

  • [ ] What's wrong?

I'm trying to establish a class that pulls out the temperatures from my data set, but every time I try to delete a previous class I'm not using for this data set it overwrites my edits and it's very frustrating. I've had this happen in a couple of places, but this is the one that's making things harder right now. bug_class.zip

  • [ ] How can we reproduce the problem?

Just open it up and delete the one that's labeled "delete" and watch the one I care about turn into the one that got deleted.

I figured out that if I keep the window for the widget I care about open then when I delete the other one it doesn't overwrite anymore. nvm, when it gets reconneted to the multifile it overwrites...

  • [ ] What's your environment?
  • Operating system: windows
  • Orange version: 3.30.0.dev
  • How you installed Orange: command line

bluewin4 avatar Jun 22 '21 13:06 bluewin4

This also happens when I try to connect classes I've made elsewhere in a workflow to a new data set I have, it's very frustrating having to manually rewrite everything and then having my data overwritten in a seemingly arbitrary fashion.

bluewin4 avatar Jul 01 '21 16:07 bluewin4

Could you explain what you mean with a very small example? How could we replicate the problem you are facing? What is rewritten?

markotoplak avatar Jul 01 '21 18:07 markotoplak

ckass.zip Here's another workflow with this problem, I removed all unessicary widgets. The two "Create Class" widgets with the same name were copy and pasted, the only difference being that when I connect it to the new one a completely irrelevant set of class designations overwrites the ones I just finished making. If you delete the class that says "pointless but don't delete" then reconnect it will overwrite the new class widgets. It's very frustrating.

bluewin4 avatar Jul 01 '21 18:07 bluewin4

Ah, now I see what probably causes the problem: context settings. So, Orange tries to be smart and remember your widget setting connected to the similar data (similar = similar column names), so that it could be easier to recreate workflows for that particular data set. So here it seems to match the wrong context. We will take a look at this, thanks!

For other Orange devs: this seems really strange, maybe we use contexts wrongly here? I did not check the workflow, but why would a new context be applied if the old one matches? Anyway, need there be any other context settings except the one for the source feature? We could make the rest schema only. Users from spectroscopy, at least, often create classes from the Filename column, so all contexts match perfectly.

markotoplak avatar Jul 01 '21 19:07 markotoplak

Oh, that makes sense. I've noticed that when I'm making workflows and it can be really helpful, I just wish I had some control over when it was using context settings and which set to choose. Because as it is right now I'm never quite certain what I'm going to get when I try to connect things together, beyond just some vague hope that it'll be the same as one of my previous iterations.

If there was a little menu or set of contexts I could manually select it would be amazingly helpful, especially as it would help prevent overwriting my work.

bluewin4 avatar Jul 01 '21 20:07 bluewin4

I've run into similar problems with Feature Constructor.

If there was a little menu or set of contexts I could manually select it would be amazingly helpful, especially as it would help prevent overwriting my work.

+1, a way to explicitly manage contexts would be perfect. We should empower users, not impose magic on them.

The way it currently works is, when you delete a widget, it saves its Setting values for that domain into a context. However, a user will typically delete something because they want to get rid of it. The unwanted context comes back every single time the user connects the data set to a widget of the same type. Unless this mechanism is explained to the user explicitly, they don't understand why they keep getting something they deleted ages ago. Even then, having to explicitly think "now I'm going to delete this widget to reset my context" is bad.

A couple of ideas:

  • explicit contexts
  • contexts should refresh whenever a Setting change is made, not when the widget is deleted (perhaps this will be possible after https://github.com/biolab/orange-widget-base/pull/151?)
  • a widget's Settings shouldn't be overwritten with a context unless the widget is clean, in the sense that it's a new widget that hasn't had a data set connected yet

Would the third idea disrupt how some people use Orange right now, wherein they make a chain of widgets and switch what data set is connected to the first one?

irgolic avatar Jul 01 '21 20:07 irgolic

@irgolic, we go through this discussion every few months. (Thanks God for this one-year break during Covid.) So let's do it again. :) Have you thought about how to describe a context in a menu entry? Even domain contexts may be tricky to summarize. Naively, a domain context is simply a domain. It's not. First, it's about settings, not domains. Non-perfect domain context handlers actually wouldn't need to have the domain encoded at all. Maybe we could indeed discard it after the changes I mention below in a paragraph marked with (*).

So, a context is essentially a list of widget's ContextSetting attributes and their corresponding values, encoded in some appropriate for. These values are not necessarily encoded instances of Variable; in general, they are some information that is needed to retrieve the widget's state. Can be a dict of lists of named tuples. Whatever. They may not have any representation in UX. There may be nothing there that the user can understand without reading the widget's code.

While some widget that you may have been thinking about, such as scatter plot, could expose its settings because they are just encoded Variable names and types, the widgets in which the users are most likely to get an unwanted context are exactly those where invididual settings are not directly related to UX elements and are stored in complex data structures that are restored into widget's state. See what the settings of Edit Domain look like. Or see https://github.com/biolab/orange-widget-base/pull/62/files#diff-f06d738cc6f1df19600dbdb79559386ae3a27394647889ee6ca91922d2241fbeR801 for an impression of what a setting can be. Or unpack_value and pack_value that follow.

Or see settings annotations in https://github.com/biolab/orange3/pull/4721/files. For example, settings of OWSelectColumns are domain_role_hints: Dict[Tuple[str, int], Tuple[str, int]]. How do you present this to the user?

Also, don't forget that contexts are not necessarily matched by domains. What is the context if the signal is Evaluation Results? Or Distance Matrix?

(Now I have to pin this answer, so I can find it the next time we start this discussion. :) )

explicit contexts

If by that you mean that user can select a context: see above.

contexts should refresh whenever a Setting change is made

This functionality was eliminated because we at some point decided that it's bad and confuses users. I won't go into details here, but this is not coming back; the last remnants have been removed in https://github.com/biolab/orange-widget-base/pull/62.

(*) You may have meant something else: the current context should be put at the top, so it is the first match in case there are multiple. This has already been done (I won't dig out the PR number). However, there were cases in which we'd appreciate the previous behaviour -- which we realised immediatelly after merging. If I remember correctly, though, this functionality was reinforced in https://github.com/biolab/orange-widget-base/pull/62.

a widget's Settings shouldn't be overwritten with a context unless the widget is clean, in the sense that it's a new widget that hasn't had a data set connected yet

Say that Scatter plot shows Iris, so it's context are x and y variables, say petal length and width. Then we give it heart disease. You say that the scatter plot will still show petal length and width because it's not the case that "the widget is clean, in the sense that it's a new widget that hasn't had a data set connected yet"?!

Here's some history:

  • I implemented the first version of context settings; it took me a month, full-time, and a lot of discussion with Gregor Leban.
  • This first version became a pilot study that I threw away and wrote everything again from scratch, with some functionality removed. It still took a month.
  • For Orange 3, Anže said that contexts were too messy and programmed them again. He removed a lot of functionality but most of it came back eventually. His implementation, however, is amazingly well designed. A real jewel, which ages beautifully.
  • In May 2020 I started reorganizing settings again to add type annotations, which allow changing the way settings are (re)stored -- this will eventually let us store them as Python literals. I'm on and off on this project for more than a year. At some point we even merged my changes but immediately reverted them because everything was crashing.
  • Now you say that you have studied how settings work and would like to propose a few fresh ideas? :)

janezd avatar Jul 01 '21 23:07 janezd

Thank you for the detailed writeup! :D Honestly, I still don't fully understand it. I looked at the documentation, but it's too complicated for me to understand. Where does it explain that contexts are saved upon widget destruction? What's a perfect context match? https://orange-widget-base.readthedocs.io/en/latest/tutorial-settings.html?highlight=context#context-dependent-settings

Honestly, thank you for your work on this issue. It seems very intricate, forgive my superficial understanding. I don't claim to know what the optimal way to implement this is, I'm just throwing ideas out because I see the UX fail.

Here's another idea: the same thing that triggers upon widget destruction (context refresh/putting to top/whatever) also triggers when you save your workflow.

Yes, I remember our last conversation. It ended on me saying I'd prepare some examples/scenarios. I wasn't able to, which is why I'm thankful for this issue tbh – it's an example! Another example that bugs me: I once opened File and set one of the columns in the domain to metas. Then I deleted the widget because it was wrong. Since that day, every time I add a File, it will always set that one column to metas. Users have no idea that to fix that, they have to change the column role and DELETE the widget.

This functionality was eliminated because we at some point decided that it's bad and confuses users. I won't go into details here, but this is not coming back; the last remnants have been removed in biolab/orange-widget-base#62.

Could you please go into the details? What this needs is user testing. You don't need more than three Orange users with no development experience and a task.

Have you thought about how to describe a context in a menu entry?

For lack of a better word, 'Untitled context'?

I tried making a mockup, and realized my proposition is pretty much report.

Screen Shot 2021-07-02 at 2 02 29 AM

(this is very thrown together to illustrate it, I'd use hover-over buttons that Report uses for Delete/Open scheme)

Each domain (context match/whatever) matches several configurations of domain-specific Settings, right? Let the user create these, name these, and manage these from within the widget. When one of the rows is chosen, the context-dependent Settings in the widget change (automatically or after pressing Apply in the autocommit).

I've not used report extensively, but could it be extended with this functionality?

irgolic avatar Jul 02 '21 01:07 irgolic

For other Orange devs: this seems really strange, maybe we use contexts wrongly here? I did not check the workflow, but why would a new context be applied if the old one matches?

Context are reused only on perfect match (see here: https://github.com/biolab/orange-widget-base/blob/master/orangewidget/settings.py#L851). If the context for scatter plot has a match for the (mandatory) x and y but not the (ContextSetting.OPTIONAL), color, it is copied because the match is imperfect.

Another thing that can go wrong: widgets have individual settings stores, which are merged into global ones when the widget is deleted. (Or earlier? I'm not sure). Context handlers cannot detect duplicated (or "overlapping") contexts, so this can be a source of multiple matching contexts with different settings.

Two possible general changes to discuss:

  • I forgot the reason for not reusing contexts on partial matches. Maybe you're right and we should always reuse.
  • We can make contexts local to a widget, without the global store. When the widget dies, its context settings die with it. However, non-context settings would remain because they represent a kind of "global configuration" for this widget -- some users may prefer to see the regression line by default.

Both seem like a good idea, but we'd have to think about any undesired side effects. I suppose you'll particularly like the second because it essentially makes context settings schema-only. And I'd like it because it simplifies the code. :)

A potential change in this widget's settings:

  • We can add the substrings to contexts; the context would not match if it doesn't contain any (all?) substrings.

Anyway, need there be any other context settings except the one for the source feature? We could make the rest schema only.

Checkboxes are typically non-context, though they can also be in the context for as long as there exists some setting that is useful for matching. https://github.com/biolab/orange3/pull/4721/commits/9e81cf71c34a0fa7326e014e40614b744371152d#diff-e772854563aeab8cc16da1052453f52c3462cbd2c69d3da2b008d81d943baeccR129 will add a warning because I found some widgets with domain context handler but without any setting of type Variable :).

I believe that in this widget the two checkboxes should remain in the context. Options like "Match only at the beginning" does not sound like a general setting. But if you think they should be schema-only, they can be schema-only context settings. If I recall correctly, we (I think it was actually me) fixed them so that this combination is possible.

janezd avatar Jul 03 '21 19:07 janezd