
Optimize table updates

sailfish009 opened this issue on Nov 24 '23 · 4 comments

Is your feature request related to a problem? Please describe.
Updating a table with more than 64 columns and 1000 rows of data is time-consuming: it currently takes about 20-25 seconds. I improved it a bit by converting the pandas dataframe to a numpy array before populating the table. If the 2D array data could be processed in parallel, the loading time might drop to a tenth of its current level.

Describe the solution you'd like
It would be nice to be able to separate the data update from the way each cell is generated. I've used commercial components in my work that take this approach, and I think it works well:

data = [...]                  # from an optimized I/O routine
table.data_source = data      # desired API: the table renders its own cells
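DPG doesn't expose a data_source property on tables today, but a thin helper can approximate the same decoupling. A hypothetical sketch (set_table_data is an illustrative name, not DPG API):

import dearpygui.dearpygui as dpg

def set_table_data(table, data):
    # Drop the old rows (rows live in child slot 1, columns in slot 0)
    # and rebuild them from a 2D sequence in one pass.
    dpg.delete_item(table, children_only=True, slot=1)
    for row in data:
        with dpg.table_row(parent=table):
            for cell in row:
                dpg.add_text(str(cell))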

Alternatively, I'd like to see examples using parallel libraries such as multiprocessing, joblib, dask, ray, etc. I've tried a few things; for example, I put the following in a function passed to joblib's Parallel(), but only the rows were generated and nothing was displayed inside the actual cells:

with dpg.table_row():
    [dpg.add_text(cell) for cell in row]

Describe alternatives you've considered
None

Additional context
None

sailfish009 · Nov 24 '23

64 columns by 1000 rows is at least 64,000 widgets. DPG simply wasn't designed for such volumes of data.

While I agree that things start getting pretty slow in DPG once you get into thousands of widgets, and there's room for optimization, I'd also like to ask you a question: do your users really, really need to see 64,000 cells in the table? Won't they want to filter it somehow, or to search? Would it make sense to show only a part of that data? Maybe even load it dynamically as the user scrolls, if you really need to display 1,000 rows... see the paging sketch below.
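A minimal paging sketch along those lines, button-driven rather than scroll-driven (show_page, turn_page, and the "big_table" tag are illustrative names; the data is a stand-in):

import dearpygui.dearpygui as dpg

PAGE_SIZE = 50
offset = 0
data = [[f"r{r}c{c}" for c in range(64)] for r in range(1000)]  # stand-in data

def show_page(table, start):
    # Rows live in child slot 1 (columns are in slot 0), so this
    # clears the rows without touching the column definitions.
    dpg.delete_item(table, children_only=True, slot=1)
    for row in data[start:start + PAGE_SIZE]:
        with dpg.table_row(parent=table):
            for cell in row:
                dpg.add_text(cell)

def turn_page(sender, app_data, step):
    global offset
    offset = max(0, min(len(data) - PAGE_SIZE, offset + step))
    show_page("big_table", offset)

dpg.create_context()
with dpg.window(label="Paged table"):
    with dpg.group(horizontal=True):
        dpg.add_button(label="Prev", callback=turn_page, user_data=-PAGE_SIZE)
        dpg.add_button(label="Next", callback=turn_page, user_data=PAGE_SIZE)
    with dpg.table(header_row=False, tag="big_table"):
        for _ in range(64):
            dpg.add_table_column()
show_page("big_table", 0)
dpg.create_viewport(title="Demo", width=900, height=600)
dpg.setup_dearpygui()
dpg.show_viewport()
dpg.start_dearpygui()
dpg.destroy_context()

Only PAGE_SIZE x 64 = 3,200 widgets exist at any moment instead of 64,000; a scroll-driven variant could call show_page from a mouse-wheel handler instead of buttons.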

v-ein · Nov 24 '23

BTW you can't use multiprocessing (either directly or via joblib) because DPG in other processes won't have its internal structures. Moreover, DPG is designed so that only one thread at a time can work with the widget tree.
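What does parallelize safely is the data preparation; only the widget creation has to stay in the main process. A sketch of that split (format_row and populate are illustrative names):

from multiprocessing import Pool

import dearpygui.dearpygui as dpg

def format_row(row):
    # Runs in a worker process: pure data work only, no DPG calls here.
    return [str(cell) for cell in row]

def populate(table, raw_rows):
    # Parallel part: convert/format the raw data in worker processes.
    # (On spawn-based platforms, call this under if __name__ == "__main__".)
    with Pool() as pool:
        formatted = pool.map(format_row, raw_rows)
    # Serial part: widget creation stays in the main process and thread.
    for row in formatted:
        with dpg.table_row(parent=table):
            for cell in row:
                dpg.add_text(cell)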

v-ein · Nov 24 '23

It would be nice to have an example (demo.py) that shows only a portion of the data in the table, and another that dynamically loads the earlier and later data as you scroll up and down.

I knew it wasn't thread-safe because I randomly got a "no container to pop" error while testing the joblib parallel code. Regarding the table, it looks like something structural needs to be fixed. Pipelined data processing on top of a Python-based GUI seems to have great commercial potential.

sailfish009 · Nov 24 '23

Regarding that "no container to pop" error, let me quote my own explanation I gave on Discord:

IMPORTANT: When calling DPG from multiple threads, keep in mind that certain parts of DPG keep internal state - and this is critical to the stability of your code! When you use something like this:

with dpg.child_window() as container:
    with dpg.table(header_row=False) as table:
        dpg.add_table_column()

there's an internal container stack that the with blocks affect. If another thread takes over somewhere in the middle and starts adding widgets, it may well add them to your current container, which will break everything. Always, always enclose such pieces in with dpg.mutex(): (or another mutex if you prefer). Another caveat is related to dpg.last_item(), dpg.last_container(), etc.: if you don't use a mutex, they may give you an item added from a neighbouring thread.
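To make that concrete, a minimal sketch of a worker thread guarding its widget creation with dpg.mutex() (it assumes a table tagged "big_table" already exists):

import threading

import dearpygui.dearpygui as dpg

def add_rows(table, rows):
    # The mutex keeps DPG's container stack consistent while this
    # thread pushes and pops containers via the with-blocks.
    with dpg.mutex():
        for row in rows:
            with dpg.table_row(parent=table):
                for cell in row:
                    dpg.add_text(cell)

threading.Thread(target=add_rows, args=("big_table", [["a", "b"]]), daemon=True).start()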

v-ein · Nov 25 '23