dash icon indicating copy to clipboard operation
dash copied to clipboard

[BUG] Duplicate callback outputs with identical inputs (and potentially different states) fail

Open RenaudLN opened this issue 1 year ago • 12 comments

Environment

dash                 2.9.2

Describe the bug

When 2 callbacks are created with a duplicate output (and allow_duplicate=True) and the same Inputs (States can be different), Dash gives a Duplicate callback outputs error.

from dash import Input, Output, Patch, State, callback

@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    State("state1", "value),
    prevent_initial_call=True,
)
def add_layer1(filter1, vilter2, state1):
    layer1 = load_layer1(filter1, vilter2, state1)
    update = Patch()
    update.layers[0] = layer1
    return update


@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    State("state2", "value),
    prevent_initial_call=True,
)
def add_layer2(filter1, vilter2, state2):
    layer2 = load_layer2(filter1, vilter2, state2)
    update = Patch()
    update.layers[1] = layer2
    return update 

Expected behavior

This should either:

  • work, the use case is to load the data in several chunks / layers and return it with Patch so that things start displaying on the screen sooner
  • or give a descriptive error message (something like "Several callbacks with duplicate Outputs have identical Inputs")

I assume that this is due to a hashing that is done based on the Inputs only, maybe it could include the decorated function name and the States in the hash as well?

Current workaround: Adding a dummy Input in the callback saves the day.

Screenshots

image

RenaudLN avatar Mar 31 '23 07:03 RenaudLN

This is the same callback two times, only the state differs and they will fire both at the same time. They should be combined for performance reason so there is only one request.

Duplicate output now create the different callbacks based on the Inputs, so each callback should have a different set of inputs. I am not really sure we should be fixing this and as I think this create an undesirable effect, if the returned object is not a patch, there is no guarantee which one will update first without a priority system. Maybe add a more comprehensive error instead.

T4rk1n avatar Mar 31 '23 14:03 T4rk1n

I'm fine with a different error message, the current one is just really unhelpful to solve the problem :)

RenaudLN avatar Mar 31 '23 15:03 RenaudLN

An interesting point from @tlauli, merging this in from #2546:

I think that two callbacks with matching signatures and allow_duplicate=True on all Outputs should be allowed. My use case would be using Patch to update individual traces of a dcc.Graph. Each callback would update a different trace, so there is no ambiguity in the final state. It also allows for the callbacks to run in parallel, improving responsibility.

Being able to parallelize a callback this way is a really interesting use case. Unfortunately it's also one that potentially breaks many of the things we could do to disambiguate these callbacks in our internal tracking - which needs to be deterministic ESPECIALLY when we're talking about parallelizing since that implies multiple processes. Consider this pattern, which might look natural for this case:

def create_trace_callback(trace_num):
    @callback(Output("graph", "figure", allow_duplicate=True), Input("control", "value"))
    def update_one_trace(value):
        update = Patch().data[trace_num]
        update.x, update.y = get_new_data(value, index=trace_num)
        return update

for i in range(n_traces):
    create_trace_callback(i)

(edit: example updated, @T4rk1n points out my original syntax of creating the callback directly in a for loop would leave i as the final value in all cases when the callback is executed)

These would be all exactly the same function, just with a different value of the variable i from the enclosing scope. So what do we use to create the unique ID for tracking each instance of this callback separately? The only solution that's apparent to me is to let you to specify a disambiguation key in the callback definition, then creating a nice error message that says something like "You have duplicate callbacks with identical inputs and outputs. If this is really what you want, please add a non-random key to distinguish these" which would look something like:

@callback(Output("graph", "figure", allow_duplicate=True), Input("control", "value"), key=i)

alexcjohnson avatar May 31 '23 15:05 alexcjohnson

I intended allow_duplicate to really allow any duplicate, if it returns a Patch object there should be no problem with response order but it would trigger 2x render.

Initially it created a random uuid for the duplicate output but with multi workers and server restarts the uuid would not be the same and the callback would not be found. Thus the id for the outputs needs to be deterministic and the id is now created with a hash of the callback inputs.

A workaround is to change the order of the inputs or add dummy inputs that are different across the callbacks.

I think the states could be added to the created hash for that case can be a different callback that would be common as reported in this issue.

@callback(Output("graph", "figure", allow_duplicate=True), Input("control", "value"), key=i)

That could be a good solution it would take the key instead of the hash.

T4rk1n avatar May 31 '23 15:05 T4rk1n

Or include the key in the hash, in case you have other callbacks with the same outputs but different inputs.

but it would trigger 2x render

Does it? In principle we should be able to get around that, detecting that another callback is in progress for the same output and waiting to rerender until that's done.

alexcjohnson avatar May 31 '23 16:05 alexcjohnson

Kinda, react do handle batch updates, it changed to be more automatic in react 18. We have in our code something that handle this I think for same callbacks only.

T4rk1n avatar May 31 '23 16:05 T4rk1n

Triggering render multiple times might be desirable. Back to the example with updating multiple traces of a dcc.Graph, triggering the render only once would mean waiting until all traces are ready to update to draw them, right? And if one trace takes a significant amount of time to get (slow db call, post-processing...), then all the other traces would would be delayed by the one slow trace.

tlauli avatar Jun 05 '23 07:06 tlauli

Note that this happens if the callbacks have different outputs in addition to the common duplicated output as well.

I guess I see the argument that they might as well be combined, but this can make the structure less organized IMO. One example would be when an interval is the input and multiple mostly independent things should happen regularly. The logic inside these callback could be such that the same input rarely will trigger writing to the same output at the same time. (and when they do the only expense is some compute)

A better error message at the very least would be helpful.

And if I understand correctly this does not trigger if the inputs is not exactly the same? The exact same issues can easily happen for such callbacks, so not sure I see why there should be an exception for the case where the input set is exactly the same.

(adding the error message in plaintext for searchability: "Output 1 (...) is already in use. To resolve this, set allow_duplicate=True on duplicate outputs")

(Also note that it's ok to have two such callbacks if one of them use Output and the other OutputDuplicate)

olejorgenb avatar Nov 07 '23 12:11 olejorgenb

Thank you @olejorgenb and @tlauli for your comments. Making note of possible ways for us to move forward:

  • Include State in the hash
  • Improve error message
  • Add an arbitrary extra key (folks that prefer having identical inputs running the callbacks in parallel won't have to add extra inputs)

Coding-with-Adam avatar Nov 07 '23 16:11 Coding-with-Adam

I have an input (a button in this case) and 2 callbacks. The first callback is fast and inserts the "loading..." indicator into the UI. The second, slow one clears the loading indicator and displays results.

I believe this is a valid use-case for having 2 callbacks with exactly the same inputs and a duplicate output.

At the moment this seems to be the simplest way to show a loading indicator for a callback that runs in the same process as the main app.

franekp avatar Feb 07 '24 10:02 franekp

@franekp We have loading_state for that: https://dash.plotly.com/loading-states and you can achieve the same behavior with the running argument of background callbacks, the first argument is Output, then the value when running and finally the value after running.

Example:

running=[(Output("status", "children"), "Running", "Finished")],

T4rk1n avatar Feb 07 '24 14:02 T4rk1n

Oh, this is nice, thanks!

franekp avatar Feb 07 '24 23:02 franekp

@T4rk1n is this one still relevant?

gvwilson avatar Jul 25 '24 13:07 gvwilson