
FR: Support for Bulk Row Removal in DataTable

Open charles-001 opened this issue 1 year ago • 3 comments

Currently, the DataTable widget provides a remove_row method that only allows removing a single row at a time. I’d like to request the addition of a new method, remove_rows, which would accept an array of RowKey values and efficiently remove multiple rows in a single operation.

I think this new method would improve performance and usability in scenarios that require removing a large number of rows, especially in a tool like Dolphie that refreshes its DataTables every 1 to 2 seconds.
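Until such a method exists, a thin wrapper illustrates the requested shape. This is a sketch only: the `remove_rows` name and its loop-based body are assumptions about the proposal, not part of Textual's current API, and a real implementation would avoid paying the per-call refresh cost.

```python
from typing import Iterable


def remove_rows(table, row_keys: Iterable) -> None:
    """Hypothetical bulk removal: today this can only loop over
    remove_row, triggering the refresh machinery on every call."""
    for key in row_keys:
        table.remove_row(key)
```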

Thank you!

charles-001 avatar Nov 22 '24 20:11 charles-001

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

github-actions[bot] avatar Nov 22 '24 20:11 github-actions[bot]

I guess the bigger issue is that remove_row triggers refresh(layout=True), which iterates over every row, so removal is O(n) while adding rows is O(1). Since refreshes are coalesced, that layout pass only actually runs once anyway, so removing multiple rows in one call would perform the same as calling remove_row repeatedly.

So I think it needs a layout cache more than a bulk remove method.
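A toy model (not Textual's internals; all names here are made up for illustration) shows why deferring the O(n) layout pass until a batch completes beats running it per removal:

```python
class TableModel:
    """Toy stand-in for a table widget whose relayout is O(n)."""

    def __init__(self, rows):
        self.rows = dict(rows)   # key -> row data
        self.layout_runs = 0     # counts expensive relayout passes
        self._relayout()

    def _relayout(self):
        # Stand-in for refresh(layout=True): walks every remaining row.
        self.layout_runs += 1
        self.height = len(self.rows)

    def remove_row(self, key):
        del self.rows[key]
        self._relayout()         # one O(n) pass per removal

    def remove_rows(self, keys):
        for key in keys:
            del self.rows[key]
        self._relayout()         # single O(n) pass for the whole batch
```

Removing k rows one at a time costs k relayout passes; the batched path (or equivalently, a layout cache that coalesces the passes) pays for exactly one.
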

#!/usr/bin/env python3
import time
import asyncio
from textual.app import App
from textual.widgets import DataTable


class PerfTestApp(App):
    def __init__(self, row_count):
        super().__init__()
        self.row_count = row_count
        self.add_time = 0
        self.remove_time = 0
        
    def compose(self):
        table = DataTable()
        table.add_column("ID")
        table.add_column("Data")
        yield table
    
    async def on_mount(self):
        """Run on app start"""
        table = self.query_one(DataTable)

        #
        # Test add
        #
        start_time = time.perf_counter()
        
        row_keys = []
        for i in range(self.row_count):
            key = table.add_row(str(i), f"Data {i}")
            row_keys.append(key)
            
        self.add_time = time.perf_counter() - start_time
        
        # Brief pause to let any pending refreshes complete
        await asyncio.sleep(0.1)
        
        #
        # Test remove
        #
        start_time = time.perf_counter()
        
        for key in reversed(row_keys):
            table.remove_row(key)
            
        self.remove_time = time.perf_counter() - start_time
        
        self.exit()


def test_row_count(row_count):
    """Test a specific row count and return results"""
    app = PerfTestApp(row_count)
    app.run()
    
    add_per_sec = row_count / app.add_time if app.add_time > 0 else 0
    remove_per_sec = row_count / app.remove_time if app.remove_time > 0 else 0
    
    return add_per_sec, remove_per_sec


if __name__ == "__main__":
    # Test different row counts
    row_counts = range(100, 5000, 500)
    
    print("rows     add/sec    remove/sec")
    print("----     -------    ----------")
    
    for row_count in row_counts:
        add_per_sec, remove_per_sec = test_row_count(row_count)
        print(f"{row_count:<8} {add_per_sec:<10.0f} {remove_per_sec:<10.0f}")
        
        # Don't test larger sizes if remove performance is getting too slow
        if remove_per_sec < 50:  # Less than 50 rows/sec
            print("(stopping due to poor remove performance)")
            break

[image attachment]

bitplane avatar Jun 04 '25 17:06 bitplane

Hi @willmcgugan, is this issue still worth taking on? If so, could you clarify the scope of the work to be done? I am willing to take this one, and if you can assign it to me that would be great. Thanks

dev-KingMaster avatar Nov 05 '25 13:11 dev-KingMaster