Search engine for tables - Research
Problem description
Let's say you have a large table and you want to check if a specific entity is listed there. Right now there is no way to search for it: the user has to scroll down and read every element in the table.
Steps to reproduce it
- Open the ODE.
- Go to ADD and select a file from your laptop.
- In the upper right corner you will see there is no search engine.
Suggested solution
Add a search engine.
More context on this: the search bar should be located at the top of the table (upper right corner).
See Material UI option here. Section: App bar with search field
Let me know if you need further info.
Here is an implementation example of Search: https://reactdatagrid.io/docs/miscellaneous#csv-export-+-custom-search-box
So I have been playing around with searching.
Here are some conclusions:
- Our datagrid uses remote pagination, which makes sense if we want to be able to work with files with millions of rows.
- Given point number one, we are forced to use a backend search rather than a simpler front-end solution like the one I mentioned in the previous comment.
- ReactDataGrid re-renders every time its dataSource changes.
- Our dataSource is tightly coupled with the history and save features, so our search feature should not affect it. (Otherwise, a user who hits Save while a search is applied will save only the filtered data.)
Some implementation notes:
- The dataSource logic is implemented in the loader function of the Table's store.
- The call to client.tableRead is the one responsible for fetching data from the backend and handling the offset and limit parameters for pagination.
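
To make the notes above more concrete, here is a minimal sketch of what a backend search combined with offset/limit pagination could look like. It assumes the table data already sits in a SQLite table; the read_table function, its query parameter, and the cities table are hypothetical names used only for illustration and are not part of the existing client.tableRead API.

```python
import sqlite3


def read_table(conn, table, columns, query=None, limit=10, offset=0):
    """Return one page of rows, optionally filtered by a search term.

    Hypothetical helper: table and column names are interpolated into the
    SQL, so they must come from a trusted schema, never from user input.
    """
    sql = f'SELECT * FROM "{table}"'
    params = []
    if query:
        # Naive case-insensitive substring match against every column.
        conditions = [
            f'instr(lower(CAST("{c}" AS TEXT)), lower(?)) > 0' for c in columns
        ]
        sql += " WHERE " + " OR ".join(conditions)
        params += [query] * len(columns)
    # Pagination is applied after filtering, so pages stay consistent.
    sql += " ORDER BY rowid LIMIT ? OFFSET ?"
    params += [limit, offset]
    return conn.execute(sql, params).fetchall()


# Small in-memory table standing in for an uploaded file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, country TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [
        ("Cordoba", "Argentina", 1500000),
        ("Cordoba", "Spain", 325000),
        ("Lisbon", "Portugal", 545000),
        ("Buenos Aires", "Argentina", 3000000),
    ],
)

columns = ["name", "country", "population"]
# Second page (one row per page) of the rows matching "argentina".
page = read_table(conn, "cities", columns, query="argentina", limit=1, offset=1)
print(page)
```

Because the search term only changes what is read, the stored table stays untouched, which is one way to keep search independent from history and save.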
Possible solution:
- I'm leaning towards the full-text search feature of SQLite: https://www.sqlitetutorial.net/sqlite-full-text-search/. @roll the indexer seems to be at the core of frictionless-py; have you ever explored this option?
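
As a rough illustration of the full-text search idea, the sketch below builds an FTS5 virtual table with Python's sqlite3 module and pages through the matches. It assumes the SQLite build ships with the FTS5 extension; the cities_fts table, its columns, and the search helper are hypothetical and say nothing about how frictionless-py stores data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table (assumption: SQLite was compiled with FTS5 enabled).
conn.execute("CREATE VIRTUAL TABLE cities_fts USING fts5(name, country)")
conn.executemany(
    "INSERT INTO cities_fts (name, country) VALUES (?, ?)",
    [
        ("Cordoba", "Argentina"),
        ("Cordoba", "Spain"),
        ("Lisbon", "Portugal"),
        ("Buenos Aires", "Argentina"),
    ],
)


def search(conn, term, limit=10, offset=0):
    """Return one page of rows whose indexed text matches the term."""
    # Wrap the term in double quotes so it is treated as a plain phrase
    # instead of being parsed as FTS5 query syntax.
    phrase = '"' + term.replace('"', '""') + '"'
    return conn.execute(
        "SELECT rowid, name, country FROM cities_fts "
        "WHERE cities_fts MATCH ? ORDER BY rowid LIMIT ? OFFSET ?",
        (phrase, limit, offset),
    ).fetchall()


matches = search(conn, "argentina")
print(matches)
```

Note that FTS5 matches whole tokens by default, so a search for "argen" would not find "Argentina" here; whether that behaviour is acceptable for the search bar is an open design question.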
@pdelboca
I agree that FTS would make sense, but in frictionless-py indexing means e.g. CSV -> database with validation etc., so it's not related to full-text search.
@pdelboca I want to add something here that I discussed with @guergana and @roll this week, in case it is relevant for the implementation of the search engine:
Currently, when opening a tabular file, the datagrid shows a certain number of rows (5,10,20,25,40,50,100) and the user has to click on the icon at the bottom (pagination) of the screen to keep checking the rest of the table:
This way of exploring data is problematic, especially if you have a tabular file with a large number of rows. I checked how Flourish and Datawrapper show tables to the user, and both tools let you scroll through the data while keeping the column headings fixed. I'll create a separate issue to remove pagination, but I wanted to mention this in case it is something you need to consider when implementing the search engine.