
DataView gets slow when a lot of elements are selected

Open xamcost opened this issue 3 years ago • 8 comments

Hi there! When DataView is used with a large amount of data, the app becomes quite slow if many items are selected in the tree: scrolling through the data, resizing the window, and other actions involving the DataView widget are all slowed down. You can check this with the code snippet below. It is this example in pyface, modified to select random elements before showing the UI. You can also check that selecting only one item restores smooth behaviour, for instance when scrolling or resizing the window.

Tested with pyface 7.2 and Traits 6.1 on MacOS X 10.15.7.

import logging
from random import randint

from traits.api import Bool, Dict, HasStrictTraits, Instance, Int, Str, List

from pyface.api import ApplicationWindow, GUI, Image, ImageResource
from pyface.ui_traits import PyfaceColor
from pyface.data_view.data_models.api import (
    AttributeDataAccessor, RowTableDataModel
)
from pyface.data_view.api import DataViewWidget, IDataViewWidget
from pyface.data_view.value_types.api import (
    BoolValue, ColorValue, IntValue, TextValue
)

from example_data import (
    any_name, family_name, favorite_color, age, street, city, country
)


logger = logging.getLogger(__name__)


flags = {
    'Canada': ImageResource('ca.png'),
    'UK': ImageResource('gb.png'),
    'USA': ImageResource('us.png'),
}


# The data model

class Address(HasStrictTraits):

    street = Str()

    city = Str()

    country = Str()


class Person(HasStrictTraits):

    name = Str()

    age = Int()

    favorite_color = PyfaceColor()

    contacted = Bool()

    address = Instance(Address, ())


class CountryValue(TextValue):

    flags = Dict(Str, Image, update_value_type=True)

    def has_image(self, model, row, column):
        value = model.get_value(row, column)
        return value in self.flags

    def get_image(self, model, row, column):
        value = model.get_value(row, column)
        return self.flags[value]


row_header_data = AttributeDataAccessor(
    title='People',
    attr='name',
    value_type=TextValue(),
)

column_data = [
    AttributeDataAccessor(
        attr="age",
        value_type=IntValue(minimum=0),
    ),
    AttributeDataAccessor(
        attr="favorite_color",
        value_type=ColorValue(),
    ),
    AttributeDataAccessor(
        attr="contacted",
        value_type=BoolValue(),
    ),
    AttributeDataAccessor(
        attr="address.street",
        value_type=TextValue(),
    ),
    AttributeDataAccessor(
        attr="address.city",
        value_type=TextValue(),
    ),
    AttributeDataAccessor(
        attr="address.country",
        value_type=CountryValue(flags=flags),
    ),
]


class MainWindow(ApplicationWindow):
    """ The main application window. """

    #: A collection of People.
    data = List(Instance(Person))

    #: The data view widget.
    data_view = Instance(IDataViewWidget)

    def _create_contents(self, parent):
        """ Creates the left hand side or top depending on the style. """

        self.data_view = DataViewWidget(
            parent=parent,
            data_model=RowTableDataModel(
                data=self.data,
                row_header_data=row_header_data,
                column_data=column_data
            ),
        )
        self.data_view._create()

        logger.info("Starting selection")
        selection = [
            ((randint(0, 9999),), ()) for _ in range(1000)
        ]
        self.data_view.control._widget.selection = selection
        logger.info("Selection done")

        return self.data_view.control

    def _data_default(self):
        logger.info("Initializing data")
        people = [
            Person(
                name='%s %s' % (any_name(), family_name()),
                age=age(),
                favorite_color=favorite_color(),
                address=Address(
                    street=street(),
                    city=city(),
                    country=country(),
                ),
            )
            for _ in range(10000)
        ]
        logger.info("Data initialized")
        return people

    def destroy(self):
        self.data_view.destroy()
        super().destroy()


if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)

    # Create the GUI (this does NOT start the GUI event loop).
    gui = GUI()

    # Create and open the main window.
    window = MainWindow()
    window.open()

    # Start the GUI event loop!
    gui.start_event_loop()
    logger.info("Shutting down")

xamcost avatar Feb 03 '21 18:02 xamcost

Personally, I wouldn't call this a bug; I'd call it an enhancement. The UI is definitely slow. It improves if we go down to 5000 elements with 500 selected, and it improves further, though still with some jitter, when we go down to 1000 elements with 100 selected.

In your realistic use case, how many rows (order of magnitude) do you expect to be working with? Is it 10^3 or 10^4?

Here's how slow the example looks for me on Windows + Python 3.6 + PySide2:

[Image: data-view-slow]

rahulporuri avatar Feb 04 '21 04:02 rahulporuri

I think some profiling might be in order to find out what the culprit is in terms of where time is being spent. It is possible that this is a Qt bug, but more likely it is something dumb.

Personally, I am suspicious of the color objects, as we know that they are slow (that's the reason it takes so long to initialize).

corranwebster avatar Feb 04 '21 08:02 corranwebster

In your realistic use case, how many rows (order of magnitude) do you expect to be working with? Is it 10^3 or 10^4?

@rahulporuri It is hard to say for now, but 10^3 is not unrealistic: we have a use case where large tables would be used, and where users could use a "select all" action or select whole columns.

Personally, I am suspicious of the color objects, as we know that they are slow (that's the reason it takes so long to initialize).

@corranwebster We also encountered this in a situation where we did not reimplement AbstractValueType.get_color(), but I suppose that doesn't change your suspicion, since the base implementation returns a pyface.api.Color object anyway?

xamcost avatar Feb 04 '21 11:02 xamcost

@corranwebster A quick update following your suspicion: to check it, I quickly tried commenting out the image and color handling in DataViewItemModel.data() (this block of code), and it doesn't make the app faster. I admit that's not an elegant way of checking this. I'll keep you posted if I investigate further (and better!).

xamcost avatar Feb 04 '21 11:02 xamcost

I investigated this issue a bit more, and although I do not have a solution, it seems that the problem comes from Qt. When a lot of elements are selected, there is an abnormally large number of calls to QAbstractItemModel.parent(). To observe this, you can run this modified version of pyface.examples.data_view.array_example, which performs profiling while the GUI event loop is running.

import logging
import cProfile
import io
import pstats
from random import randint

from traits.api import Array, Instance

from pyface.api import ApplicationWindow, GUI
from pyface.data_view.data_models.array_data_model import ArrayDataModel
from pyface.data_view.i_data_view_widget import IDataViewWidget
from pyface.data_view.data_view_widget import DataViewWidget
from pyface.data_view.value_types.api import FloatValue


logger = logging.getLogger(__name__)


class MainWindow(ApplicationWindow):
    """ The main application window. """

    data = Array

    data_view = Instance(IDataViewWidget)

    def _create_contents(self, parent):
        """ Creates the left hand side or top depending on the style. """

        self.data_view = DataViewWidget(
            parent=parent,
            data_model=ArrayDataModel(
                data=self.data,
                value_type=FloatValue(),
            ),
        )
        self.data_view._create()

        logger.info("Starting selection")
        selection = [
            ((randint(0, 9999),), ()) for _ in range(1000)
        ]
        self.data_view.control._widget.selection = selection
        logger.info("Selection done")

        return self.data_view.control

    def _data_default(self):
        import numpy
        return numpy.random.uniform(size=(10000, 10))*1000000
        # return numpy.random.uniform(size=(2, 3, 5, 2))*10

    def destroy(self):
        self.data_view.destroy()
        super().destroy()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    # Create the GUI (this does NOT start the GUI event loop).
    gui = GUI()

    # Create and open the main window.
    window = MainWindow()
    window.open()
    window.size = (1000, 1000)

    # Set up the profiler so that it only covers the event loop.
    pr = cProfile.Profile()
    pr.enable()

    # Start the GUI event loop!
    gui.start_event_loop()
    logger.info("Shutting down")

    # Stop profiling
    pr.disable()

    # Print profiler info
    s = io.StringIO()
    ps = pstats.Stats(pr, stream=s).sort_stats('tottime')
    ps.print_stats(20)
    print(s.getvalue())

Running it on my machine, I get this output:

INFO:__main__:Starting selection
INFO:__main__:Selection done
INFO:pyface.ui.qt4.data_view.data_view_widget:selectionChanged already disconnected
INFO:__main__:Shutting down
         16098349 function calls (16098266 primitive calls) in 18.901 seconds

   Ordered by: internal time
   List reduced from 194 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   12.839   12.839   18.900   18.900 {built-in method exec_}
  5189135    3.977    0.000    5.724    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:97(parent)
  5259934    1.101    0.000    1.101    0.000 {built-in method isValid}
  5223197    0.670    0.000    0.670    0.000 {built-in method internalPointer}
    18291    0.049    0.000    0.158    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:175(data)
     5771    0.046    0.000    0.088    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:141(flags)
        1    0.029    0.029    0.070    0.070 {built-in method setModel}
    34112    0.027    0.000    0.050    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:346(_to_row_index)
     5294    0.020    0.000    0.020    0.000 {method 'format' of 'str' objects}
    24302    0.018    0.000    0.026    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/data_view/data_models/array_data_model.py:218(get_value_type)
    24062    0.013    0.000    0.024    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:358(_to_column_index)
    12625    0.011    0.000    0.011    0.000 {built-in method createIndex}
   101230    0.010    0.000    0.010    0.000 {built-in method builtins.len}
    61691    0.010    0.000    0.010    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:53(model)
    10048    0.010    0.000    0.028    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:118(rowCount)
    12625    0.009    0.000    0.025    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/ui/qt4/data_view/data_view_item_model.py:107(index)
     5294    0.009    0.000    0.010    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/data_view/data_models/array_data_model.py:149(get_value)
    34062    0.008    0.000    0.008    0.000 {built-in method row}
     5771    0.007    0.000    0.007    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/data_view/data_models/array_data_model.py:174(can_set_value)
     5294    0.006    0.000    0.040    0.000 /Users/mcostalonga/Documents/Python_mod/pyface/pyface/data_view/value_types/numeric_value.py:91(get_text)
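
To put these figures in proportion, here is a small arithmetic sketch using only numbers copied from the output above. The variable names are mine. Note that the tottime of exec_ is time spent inside the Qt event loop that is not attributed to any Python callback, so it includes both Qt's own C++ work and idle waiting.

```python
# Figures copied from the profiler output above.
total_calls = 16_098_349
total_seconds = 18.901

parent_calls = 5_189_135
is_valid_calls = 5_259_934
internal_pointer_calls = 5_223_197
parent_cumtime = 5.724  # parent() including the functions it calls
exec_tottime = 12.839   # exec_ itself: Qt-side work plus idle time

# Share of all recorded calls taken by parent() and the two built-ins
# that dominate alongside it (presumably called from parent()).
hot_calls = parent_calls + is_valid_calls + internal_pointer_calls
call_share = hot_calls / total_calls

# Share of wall-clock time spent in parent() and in exec_ itself.
parent_time_share = parent_cumtime / total_seconds
exec_time_share = exec_tottime / total_seconds

print(f"parent/isValid/internalPointer: {call_share:.1%} of all calls")
print(f"parent() cumulative time:       {parent_time_share:.1%} of run time")
print(f"exec_ own time:                 {exec_time_share:.1%} of run time")
```

So nearly all Python-level calls are parent() and its helpers, and they account for roughly a third of the run time, with the rest spent on the Qt side of exec_.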

You'll also notice that the selection of elements is pretty slow. If you enable the profiler just before the MainWindow instantiation and disable it just after the call to MainWindow.open(), you'll see that this slowness seems to have the same cause: a huge number of calls to parent().
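
A minimal sketch of how profiling can be limited to one region like this, using only the standard library. The profile_block helper and the busy_work stand-in are hypothetical names of mine, not part of pyface; in the script above, the body of the with block would be the MainWindow() instantiation and the window.open() call.

```python
import cProfile
import io
import pstats
from contextlib import contextmanager


@contextmanager
def profile_block(reports, limit=20, sort_key="tottime"):
    """Profile the enclosed block and append a text report to ``reports``."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        # Always stop profiling, even if the block raises.
        profiler.disable()
        stream = io.StringIO()
        stats = pstats.Stats(profiler, stream=stream).sort_stats(sort_key)
        stats.print_stats(limit)
        reports.append(stream.getvalue())


def busy_work(n):
    # Stand-in for the code under test (window creation and opening).
    return sum(i * i for i in range(n))


reports = []
with profile_block(reports):
    result = busy_work(10_000)

print(reports[0])
```

Collecting the report as a string keeps it separate from the logging output, so it can be printed after the event loop exits.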

I'm sorry I don't know why parent() is called this often, but this suggests that the problem lies within Qt rather than within pyface or how DataView is implemented.

xamcost avatar Apr 16 '21 14:04 xamcost

As noted on an internal repo, if this is QTreeView's fault and your use case doesn't need a tree structure, then an alternative widget that uses a QTableView seems like the solution.

corranwebster avatar Apr 23 '21 10:04 corranwebster

That would be great indeed. The AbstractDataModel framework is just lovely, to be honest. It would be fantastic to use it either for a flat table, with a DataView widget wrapping a QTableView, or for a tree structure, with a DataView widget wrapping a QTreeView.

xamcost avatar Apr 23 '21 10:04 xamcost

I attempted to create an alternative data view widget using QTableView, but ultimately found it still very slow. I also saw a comparably large number of calls to parent() when running the profiling scripts posted above with it instead of the QTreeView-based widget.

One thing I noticed while doing this is that the DataViewWidget.control trait is defined as Instance(QAbstractItemView), but the class uses methods on control that are not defined on all QAbstractItemView subclasses, i.e. methods specific to QTreeView, for example isHeaderHidden, setUniformRowHeights and setAnimated. In my dumb first-pass implementation I basically ignored these for the QTableView version, either by commenting those lines out or by having the method that calls them do something else that may or may not have been the correct thing (not ideal, but the examples given above were working mostly as expected). Nonetheless, perhaps we should either change the trait to Instance(QTreeView), fully tying the class to QTreeView, or stop using those methods to keep the class more general.
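
To illustrate the second option, here is a rough sketch of guarding a view-specific call so that it is skipped on controls that lack it. This is not how pyface does it: call_if_supported is a hypothetical helper, and FakeTreeView and FakeTableView are plain-Python stand-ins for QTreeView and QTableView so that the example runs without Qt.

```python
def call_if_supported(control, method_name, *args):
    """Call ``control.<method_name>(*args)`` only if the control has it.

    Returns ``(True, result)`` when the call was made, ``(False, None)``
    when the control does not provide the method.
    """
    method = getattr(control, method_name, None)
    if not callable(method):
        return False, None
    return True, method(*args)


class FakeTreeView:
    """Stand-in for QTreeView, which has header-hiding methods."""

    def __init__(self):
        self.header_hidden = False

    def setHeaderHidden(self, hidden):
        self.header_hidden = hidden

    def isHeaderHidden(self):
        return self.header_hidden


class FakeTableView:
    """Stand-in for QTableView, which lacks those tree-specific methods."""


tree = FakeTreeView()
table = FakeTableView()

tree_result = call_if_supported(tree, "setHeaderHidden", True)
table_result = call_if_supported(table, "setHeaderHidden", True)

print(tree_result, tree.isHeaderHidden())
print(table_result)
```

Silently skipping a call hides the fact that the table version needs its own way of hiding headers, so a guard like this is at best a stopgap compared with narrowing the trait or splitting the widget classes.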

In any case, I am not certain an alternative using QTableView will resolve the speed issues (see second link below). I will clean up and look further into what I have started, but we may want to keep our eyes open for other potential solutions as well.
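
One thing that might be worth checking, as an assumption on my part rather than anything measured in this thread: Qt represents a selection as a list of rectangular ranges (QItemSelectionRange), so 1000 randomly scattered rows need close to 1000 ranges, while a "select all" is a single range. The pure-Python sketch below (contiguous_runs is a hypothetical helper, not a pyface or Qt function) counts how many contiguous runs a list of row indices contains, which is a way to see how fragmented the test selection is.

```python
from random import Random


def contiguous_runs(rows):
    """Collapse row indices into sorted, inclusive (first, last) runs."""
    runs = []
    for row in sorted(set(rows)):
        if runs and row == runs[-1][1] + 1:
            # Extend the current run by one row.
            runs[-1] = (runs[-1][0], row)
        else:
            # Start a new run.
            runs.append((row, row))
    return runs


# A small hand-checked example.
example_runs = contiguous_runs([3, 1, 2, 7, 8, 10])
print(example_runs)

# A selection like the one in the snippets: 1000 random rows out of 10000.
rng = Random(0)  # fixed seed so repeated runs give the same rows
scattered = [rng.randint(0, 9999) for _ in range(1000)]
print(len(contiguous_runs(scattered)), "runs for the scattered selection")

# A "select all" over the same table is a single run.
select_all_runs = contiguous_runs(range(10000))
print(len(select_all_runs), "run for select all")
```

If the slowdown scales with the number of runs rather than the number of selected rows, the "select all" use case could behave much better than the random-selection test suggests, but that would need to be profiled.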

ref:
https://stackoverflow.com/questions/841096/slow-selection-in-qtreeview-why
https://bugreports.qt.io/browse/QTBUG-59478

aaronayres35 avatar Jun 17 '21 21:06 aaronayres35