
Diff and Extract View Filtering Options

Open jetchirag opened this issue 2 years ago • 67 comments

Description

WIP. Adds filtering option(s) to Vorta diff and extract view.

  • [x] Add a Search button to avoid resource issue with large tree
  • [x] Add syntax search
  • [x] View specific search options
  • [ ] Tests

Related Issue

Fixes #1674

Screenshots (if appropriate):

image

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [x] I have read the CONTRIBUTING guide.
  • [x] My code follows the code style of this project.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.

I provide my contribution under the terms of the license of this repository and I affirm the Developer Certificate of Origin.


Follow ups

  • [ ] Document search syntax in Vorta docs
  • [ ] Link documentation somewhere near the search bar
  • [ ] Implement helper widget for inserting the search syntax
  • [ ] Show error message when search syntax is violated

jetchirag avatar Jul 11 '23 03:07 jetchirag

Any suggestions how I can expand parent from QSortFilterProxyModel?

jetchirag avatar Jul 11 '23 03:07 jetchirag

Any suggestions how I can expand parent from QSortFilterProxyModel?

What do you mean by 'expand' and 'parent'?

real-yfprojects avatar Jul 11 '23 19:07 real-yfprojects

@real-yfprojects When a match is found in an item, I want the parent items of that item to expand to reveal the matched child.

I've also spent an unnecessarily long time thinking about the search syntax. There are a lot of good suggestions in the linked issue. This is what I'm currently thinking:

The search input must start with either a plain string or re:, which determines whether the name is matched using in or re.search. The rest of the input will then be split into a dictionary that can contain path: and other syntax as needed. All of these conditions will be combined with AND when filtering.

It will also be easier to create a dialog box using this pattern as suggested in the issue and include regex search. Just wanted to run this by you before I start working on this.
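
A minimal sketch of how such an input could be parsed, assuming whitespace-separated tokens (so paths containing spaces are not handled). The names parse_query and name_matches are hypothetical and not part of Vorta:

```python
import re


def parse_query(query: str) -> dict:
    """Split a query into a name pattern plus key:value filters (hypothetical helper)."""
    tokens = query.split()
    if not tokens:
        return {"name": "", "regex": False, "filters": {}}
    first, rest = tokens[0], tokens[1:]
    # A leading "re:" switches the name match from substring to regex.
    use_regex = first.startswith("re:")
    name = first[3:] if use_regex else first
    filters = {}
    for token in rest:
        key, sep, value = token.partition(":")
        if not sep:
            # Anything after the name must look like key:value.
            raise ValueError(f"expected key:value, got {token!r}")
        filters[key] = value
    return {"name": name, "regex": use_regex, "filters": filters}


def name_matches(parsed: dict, name: str) -> bool:
    """Apply the name part of a parsed query to a single file name."""
    if parsed["regex"]:
        return re.search(parsed["name"], name) is not None
    return parsed["name"] in name
```

The remaining filters (path: and so on) would each get their own check, and an item would be shown only if all of them pass.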

jetchirag avatar Jul 11 '23 22:07 jetchirag

When a match is found in an item, I want the parent items of that item to expand to reveal the matched child.

You could use QTreeView::scrollTo. I like the idea but what about multiple matches?

real-yfprojects avatar Jul 12 '23 13:07 real-yfprojects

Just wanted to run this by you before I start working on this.

Sounds good! Can you describe the syntax in detail and present examples before implementing it?

real-yfprojects avatar Jul 12 '23 15:07 real-yfprojects

It should give you an idea what I'm thinking

Valid:
important.txt path:/home/chirag/    <--- if search_pattern in name
re:^doc_.*.pdf$ path:/home/chirag/work size<:1MB    <--- if re.search(search_pattern, name)

Invalid:
something re:^doc_.*.pdf$
path:/home/chirag/ important.txt

I haven't looked at the implementation of size yet, so I'll have to figure that out when I start implementing this.

jetchirag avatar Jul 12 '23 15:07 jetchirag

It should give you an idea what I'm thinking

Valid:
important.txt path:/home/chirag/    <--- if search_pattern in name
re:^doc_.*.pdf$ path:/home/chirag/work size<:1MB    <--- if re.search(search_pattern, name)

Invalid:
something re:^doc_.*.pdf$
path:/home/chirag/ important.txt

Why is the last one invalid? And isn't path: the same as re:^?

I haven't looked at the implementation of size yet, so I'll have to figure that out when I start implementing this.

Yes, start with the path-based filters. Then you could proceed with implementing the same for the extract view. In the process it would be great if you moved the common GUI and code into a superclass implementation to reduce redundancy. Afterwards you can think about implementing filtering by other fields.

real-yfprojects avatar Jul 12 '23 16:07 real-yfprojects

Why is the last one invalid?

I was thinking allowing path: to come only after the name would give flexibility and not require us to use name: syntax. Otherwise it would be difficult to identify what part of search is path and what is name.

And isn't path: the same as re:^?

Was going to use path: to search the path either using in operator or using glob.glob to allow more flexibility. While re: would just search the name. Reason being regex would require escaping special characters in path so knowledge of regex would be needed for just filtering by path.

I think we will have to construct the path by going up the tree, getting the name from each parent and then joining the names to form a path. I'm trying to find out if there's a more efficient way. Do you have any clue?
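
For reference, walking up the parents could look like the sketch below. Node is a hypothetical stand-in for the tree item class, assumed to expose only name and parent; the root is assumed to have an empty name so that the result starts with a slash.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree item: just a name and a link to its parent."""
    name: str
    parent: Optional["Node"] = None


def full_path(node: Node) -> str:
    """Collect names from the node up to the root, then join them top-down."""
    parts = []
    current = node
    while current is not None:
        parts.append(current.name)
        current = current.parent
    return "/".join(reversed(parts))


root = Node("")
home = Node("home", root)
chirag = Node("chirag", home)
print(full_path(chirag))
```

The cost is proportional to the depth of the item for every row that is checked. One alternative, if it turns out to matter, would be to store the full path on each item when the tree is built.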

jetchirag avatar Jul 14 '23 09:07 jetchirag

I see, so by default only the subpath is matched. Does that mean when entering a, path/a will be shown but not path/a/b ?

real-yfprojects avatar Jul 14 '23 10:07 real-yfprojects

Are you familiar with the concept of user stories?

real-yfprojects avatar Jul 14 '23 10:07 real-yfprojects

I see, so by default only the subpath is matched. Does that mean when entering a, path/a will be shown but not path/a/b ?

No, if something is matched, its parents (path/) should be revealed but that item should be expandable to reveal child /b as well
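
The intended visibility rule can be sketched in plain Python, independent of Qt. Item and visible_paths are hypothetical names; the rule assumed here is that an item stays visible if it matches, if one of its ancestors matches, or if one of its descendants matches.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical tree item with a name and child items."""
    name: str
    children: list = field(default_factory=list)


def visible_paths(item, matches, prefix="", ancestor_matched=False):
    """Return the paths that remain visible under the assumed rule."""
    path = f"{prefix}/{item.name}" if prefix else item.name
    self_matched = matches(item.name)
    shown = []
    for child in item.children:
        # Children of a matched item stay available so it can be expanded.
        shown.extend(
            visible_paths(child, matches, path, ancestor_matched or self_matched)
        )
    # Keep this item if it matched, sits below a match, or leads to a match.
    if self_matched or ancestor_matched or shown:
        return [path] + shown
    return []


tree = Item("path", [Item("a", [Item("b")]), Item("c")])
print(visible_paths(tree, lambda name: name == "a"))
```

With the search term a, this keeps path, path/a and path/a/b, and drops path/c.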

Are you familiar with the concept of user stories?

I'm not but I did a quick lookup.

jetchirag avatar Jul 14 '23 10:07 jetchirag

While finished user stories compose a full description of a (sub)feature from a user perspective, I like to work with their basis, that is, describing a situation in which our feature might help. This allows us to better understand possible needs for our software and to design our software to better serve those needs. That's why I propose we now think of some user (need) stories first before we continue thinking about the details of the search syntax. In the meantime you can work on this step (see previous comment):

In the process it would be great if you moved the common GUI and code into a superclass implementation to reduce redundancy.

I will start with a story:

Story 1: exact path

As someone that uses Vorta for accessing private backups I want to extract the old version of a specific file which I just modified erroneously. While I know the exact file path I find expanding each parent item in the path by clicking on it, very tedious. Expanding all items doesn't improve the situation since now there are so many elements that I need a lot of patience for finding the right path. I wish Vorta could help me with that.

This user need could be solved in several ways. This new filtering option might not be the best option to address this user story. However, we should still describe this story, although in the end we might decide that it isn't relevant enough to be addressed or needs a different feature entirely. Let's first write down these stories and think about how to solve them afterwards. Now it's your turn!

real-yfprojects avatar Jul 16 '23 18:07 real-yfprojects

Pushed commit to move methods I could find similar to base class.

A story based on my last experience:

I was working on an app some time ago where I needed to restore a .configuration file from backups. I provided the path as app/.configuration to borgmatic, which was able to fetch the correct file for me even though I didn't know or provide the absolute path. I don't want to provide just the file name either, as there may be several files with the same name in this project, like __init__.py. So the path option should be flexible.

Another story: Size based filters

I have a cronjob which deletes all files from the Downloads folder which haven't been modified for more than 14 days. A month later I realised I had some important documents there that I downloaded but never moved. I now want to recover them from my Vorta backup, but there is a lot of unnecessary stuff in the Downloads folder, like large application files, which I don't need.

They were simple documents and spreadsheet files which I am sure were small, surely under 50MB. I'd want a simple filter in Vorta where I can enter the size range and set path to /home/chirag/Downloads. I also haven't read the documentation and don't know how to use correct size format. Is it MB or M or does it only support bytes? It would be great if there was a dropdown to select the size format and perhaps a path selector.

Some small points

  • It would be even better if it supported Unix-like pathnames (/home/*/Documents), but for personal use, which is what Vorta is for, I personally don't need it.
  • I am not sure how resource intensive / fast adding these features would be as they are essentially run on every row. As the backup grows it would get slower and slower.
  • This made me realise that recovering a file requires knowing which archive has it, or at least the dates, so you can filter through them. Can we have a feature to search all archives? Though this would be a bigger and separate feature.

jetchirag avatar Jul 17 '23 09:07 jetchirag

  • I am not sure how resource intensive / fast adding these features would be as they are essentially run on every row. As the backup grows it would get slower and slower.

There is nothing we can do about that. That's part of the nature of filtering/searching. However Qt does implement some optimizations afaik. Those should include only calculating the filter for expanded items.

Story 5: Name

As a user who has recently moved a file between backups, I want to find out where it was located in earlier backups. Borg and Vorta do not detect the renaming or moving of a file (afaik). So filtering diff changes to only show the ones with a specific file name would be useful.

Story 6: Removed files

As a normal user I want to get an overview of all the files I deleted from my hard drive (between two backups). However, sorting by change type still clutters the list (in tree view) with added or modified files. This is because items are only sorted within the sublist of each parent.

Story 7: Search for parent

As a user who wants to extract a document regarding GSoC that I deleted previously, I know it is located in a folder somewhere that contains GSoC in its name. Now I want to enter something simple like GSoC in the search bar and have the folder appear. Then I want to be able to expand the correct folder and look for the document in its children.

real-yfprojects avatar Jul 18 '23 15:07 real-yfprojects

Can you now lay out the current iteration of your syntax and describe in which way it satisfies the needs of each story? Please also think about how conveniently it solves each problem and whether the resulting order of convenience is justified.

real-yfprojects avatar Jul 18 '23 15:07 real-yfprojects

Here's my final draft of the syntax. I chose a command-line-like syntax as it feels easier to read to my eyes and better overall. But let me know.

[ string ] [ -m ] [ -i ] [ -p ] [ -c ] [ -s ]
Note: Search String and other options are all optional

Options/Flags:

	--match -m <type>
	Type of match query
	Supported values:
		in (default): Filename should contain search string
		exact: Filename must match the search string exactly
		re: Use regex to match filename                        <-- Do we need this?

	--ignore-case -i    <-- inspired by grep
	Ignore Case. Supported by `in` and `exact` match type only.	

	--path -p <path>
	Specify path to match. Supports wildcard. Parsed using glob (?)
	Example:
		--path /home/vorta/Documents
		--path /var/www/public/*
		--path */src/*
	
	--change -c <type>
	Only available in Diff View. Supported values:
		A: Match only Added files
		R: Match only Removed files
		...
	
	--size -s <relation[<|>]number[B|KB|MB|GB]>
	Match by size. Can be combined once to create a range. Range must be bounded.
	Example:
		--size <10MB
		--size >1GB
		--size >1GB --size <10GB

I am planning on using argparse for this. I was looking into implementing a custom parser with startsWith, in and so on, but that would just be reinventing the wheel. I also think more users will be familiar with this type of syntax than with the one proposed earlier.
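
A rough sketch of what the argparse side could look like, covering only the flags listed in this draft. SearchParser, build_parser and parse_search are hypothetical names; the error override is an assumption made so that a malformed query raises an exception instead of argparse's default behaviour of exiting the process.

```python
import argparse
import shlex


class SearchParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting on invalid input."""

    def error(self, message):
        raise ValueError(message)


def build_parser() -> SearchParser:
    parser = SearchParser(prog="search", add_help=False)
    parser.add_argument("string", nargs="?", default=None)
    parser.add_argument("-m", "--match", choices=["in", "exact", "re"], default="in")
    parser.add_argument("-i", "--ignore-case", action="store_true")
    parser.add_argument("-p", "--path")
    parser.add_argument("-c", "--change")
    # --size may be given more than once to form a range.
    parser.add_argument("-s", "--size", action="append")
    return parser


def parse_search(text: str) -> argparse.Namespace:
    # shlex.split handles quoting, e.g. for a search string containing spaces.
    return build_parser().parse_args(shlex.split(text))


print(parse_search("gsoc -i"))
print(parse_search("--size >1GB --size <10GB"))
```

Catching the ValueError in the GUI would give a place to show a message when the search syntax is violated.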

Matching with the stories

1. exact path

importantfile.txt --path /home/chirag/work/

Or

--path /home/chirag/work/importantfile.txt

Satisfies the need with a clear syntax

2. A story based on my last experience

.configuration --path */app
# Or
--path */app/.configuration

Allows wildcards hence I can match relative path.

3. Size based filters

They were simple documents and spreadsheet files which I am sure were small, surely under 50MB. I'd want a simple filter in Vorta where I can enter the size range and set path to /home/chirag/Downloads.

--size <50MB --path /home/chirag/Downloads

4. Name

mylogins.txt --match exact

5. Removed files

--change R

6. Search for parent

I know it is located in a folder somewhere that contains GSoC in its name

GSoC

# was it GSoC, GSOC, gsoc, Gsoc?
gsoc -i

Overall I feel this fits every use case, is easier to read and organised. Took some time to write this, let me know what you think.

Some other options for special use cases that we "may" add (from grep): -v/--invert-match, --exclude-dir

jetchirag avatar Jul 20 '23 00:07 jetchirag

let me know what you think

I've pushed some code to match filenames and paths to stay on the timeline. It's working well but can be easily modified should we decide to change the syntax.

jetchirag avatar Jul 21 '23 08:07 jetchirag

I will have a look at this tomorrow.

I've pushed some code to match filenames and paths to stay on the timeline. It's working well but can be easily modified should we decide to change the syntax.

You can implement the search button until we have settled on a syntax.

real-yfprojects avatar Jul 21 '23 14:07 real-yfprojects

Can we use an icon for the search button?

real-yfprojects avatar Jul 22 '23 09:07 real-yfprojects

Can we use this icon?

https://fonts.google.com/icons?selected=Material+Symbols+Outlined:search:FILL@0;wght@400;GRAD@0;opsz@48&icon.query=search

jetchirag avatar Jul 22 '23 10:07 jetchirag

Yes, that works! @m3nu how do we have to handle licensing?

real-yfprojects avatar Jul 22 '23 10:07 real-yfprojects

I like the syntax you proposed since it can be (partly) implemented by using argparse.

--size >1GB --size <10GB

Why not combine those to a single argument: --size >1GB,<10GB and allow --size <=10GB?
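
A sketch of how such a combined size expression could be parsed. parse_size_filter is a hypothetical name, and the unit table assumes decimal multiples (1 KB = 1000 B), which has not been decided in this thread:

```python
import operator
import re

# Assumption: decimal units. Swap in 1024-based factors if that is preferred.
UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3}
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}
TERM = re.compile(r"^(<=|>=|<|>)(\d+(?:\.\d+)?)(B|KB|MB|GB)$")


def parse_size_filter(expr: str):
    """Turn e.g. '>1GB,<10GB' into a predicate over a size in bytes."""
    checks = []
    for term in expr.split(","):
        match = TERM.match(term.strip())
        if match is None:
            raise ValueError(f"invalid size term: {term!r}")
        op, number, unit = match.groups()
        checks.append((OPS[op], float(number) * UNITS[unit]))
    # All comma-separated terms must hold, which gives a bounded range.
    return lambda size: all(compare(size, limit) for compare, limit in checks)


in_range = parse_size_filter(">1GB,<10GB")
print(in_range(5 * 1000**3))
```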

Specify path to match. Supports wildcard. Parsed using glob (?)

Glob support is great for not so technical users but one should also be able to use a regex to match the path.

re: Use regex to match filename <-- Do we need this?

I don't think we need this if one can use a regex for path.

.configuration --path */app
# Or
--path */app/.configuration

Why have two ways of doing the same thing? If we make name and path exclusive we could add a flag to toggle between them and use match for choosing the pattern format for both options. Like so:

[ <search_pattern> [ ( -m | --match ) ( ex | in | re | fm ) ] [ -p | --path ] [ -i | --ignore-case ] ] [ ( -c | --change ) <type> ] ...

The naming of the pattern types is inspired by borg exclude patterns since the user has to learn less when using the same design patterns as borg. The user would need to write less when using the borg pattern prefixes like pf:, re:, ... directly.
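
To make the four pattern types concrete, here is a small sketch. It assumes ex means exact match, in means substring, re means regular expression and fm means fnmatch-style wildcards; the function name matches is hypothetical, and value would be either the file name or the full path depending on the -p flag.

```python
import fnmatch
import re


def matches(pattern: str, value: str, style: str = "in", ignore_case: bool = False) -> bool:
    """Check a name or path against a pattern using one of the proposed styles."""
    if style == "re":
        flags = re.IGNORECASE if ignore_case else 0
        return re.search(pattern, value, flags) is not None
    if ignore_case:
        pattern, value = pattern.lower(), value.lower()
    if style == "ex":
        return pattern == value
    if style == "in":
        return pattern in value
    if style == "fm":
        # fnmatchcase compares case-sensitively on every platform.
        return fnmatch.fnmatchcase(value, pattern)
    raise ValueError(f"unknown match style: {style!r}")


print(matches("gsoc", "GSoC-2023", ignore_case=True))
print(matches("*/app/.configuration", "/srv/project/app/.configuration", style="fm"))
```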

For the diff view there also should be an option to filter by balance. For extract view there should be options to filter by health, modification time and size.

I feel like when the path is matched with a regex, Vorta shouldn't show all children of a matched file if they don't match as well. And I wondered whether we only want to support joining the filters like path or size with AND, or with OR also?

real-yfprojects avatar Jul 22 '23 11:07 real-yfprojects

And I wondered whether we only want to support joining the filters like path or size with AND or with OR also?

I actually pondered over this for quite a while but settled on AND because that seems more intuitive and I couldn't find personal use case where I would need to use OR.
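
Joining with AND could be as simple as the sketch below. combine_and and the dictionary-based items are hypothetical and only serve as an illustration:

```python
def combine_and(*predicates):
    """Return a predicate that accepts an item only if every filter accepts it."""
    def accept(item):
        return all(predicate(item) for predicate in predicates)
    return accept


def is_small(item):
    return item["size"] < 50_000_000


def in_downloads(item):
    return item["path"].startswith("/home/chirag/Downloads/")


accept = combine_and(is_small, in_downloads)
print(accept({"path": "/home/chirag/Downloads/notes.ods", "size": 12_000}))
```

Supporting OR later would only mean swapping all for any in a second combinator.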

Why have two ways of doing the same thing? If we make name and path exclusive we could add a flag to toggle between them and use match for choosing the pattern format for both options.

I like it, but there's a use case I thought of. If I want to find a file in a given path, I thought we would be required to use regex, but the wildcard here seems to support nested directories so it should be fine.

import fnmatch

print(fnmatch.fnmatch("/home/chirag/docs/latest/test", "/home/chirag/*/test"))
# True

What do you think about this?

jetchirag avatar Jul 22 '23 12:07 jetchirag

If I want to find a file in a given path, I thought we would require to use regex but the wildcard here seems to support nested directories so it should be fine.

Yes, exactly. This is the reasoning for my suggestion to make path and name mutually exclusive.

real-yfprojects avatar Jul 22 '23 12:07 real-yfprojects

and allow --size <=10GB?

Since the underlying size comparison is in bytes, I thought this match would be a rare case.

I feel like when the path is matched with a regex, Vorta shouldn't show all children of a matched file if they don't match as well.

Any reason for that? By behavior, the filter function will stop querying children if the parent matches and just return them too.

I added path function as you suggested and also implemented size filter. Pushing the code now.

jetchirag avatar Jul 25 '23 23:07 jetchirag

Since the underlying size comparison is in bytes, I thought this match would be a rare case.

Oh, I didn't think of that. But it doesn't hurt either.

Any reason for that? By behavior, the filter function will stop querying children if the parent matches and just return them too.

I guess we can leave it like that and change it in the future when users bother.

When a match is found in an item, I want the parent items of that item to expand to reveal the matched child.

You could use QTreeView::scrollTo. I like the idea but what about multiple matches?

Something like this would be very useful for Story 1 and also for 4 and 6.


I noticed our filters behave very differently in the different view modes of the tree view. However I haven't figured out yet whether this is an issue (or a feature).

real-yfprojects avatar Jul 28 '23 08:07 real-yfprojects

I noticed our filters behave very differently in the different view modes of the tree view. However I haven't figured out yet whether this is an issue (or a feature).

It's a feature of course 😃 Can you share an example? I found them to be similar, but my test data isn't very extensive.

jetchirag avatar Jul 29 '23 00:07 jetchirag

I found that changed_size=73746 corresponds to Size Column, and size=57358 to Balance. So changed_size should be the overall new size of file, so should we be comparing size filter with it instead of size?

Do you know if changed_size is indeed overall new size? Because it's different from the size in extract view.


I also just found out that a pending review needs to be submitted. I was wondering if you saw my comment on model reset usage, but I hadn't even submitted it. Ah!

jetchirag avatar Jul 29 '23 00:07 jetchirag

For the diff view there also should be an option to filter by balance. For extract view there should be options to filter by health, modification time and size.

Only this should be pending now.

jetchirag avatar Jul 29 '23 01:07 jetchirag

I found that changed_size=73746 corresponds to Size Column, and size=57358 to Balance. So changed_size should be the overall new size of file, so should we be comparing size filter with it instead of size?

Do you know if changed_size is indeed overall new size? Because it's different from the size in extract view.

Changed size tells you whether the file size grew or shrank, since it is calculated by subtracting the number of bytes removed from the number of bytes added to the file.
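
As a tiny illustration of that calculation (the function name, parameter names and numbers are made up for this example and are not taken from Vorta's code):

```python
def size_balance(added_bytes: int, removed_bytes: int) -> int:
    """Positive result: the file grew. Negative result: the file shrank."""
    return added_bytes - removed_bytes


# Hypothetical change: 1200 bytes added and 200 bytes removed.
print(size_balance(added_bytes=1200, removed_bytes=200))
# Hypothetical change: 100 bytes added and 900 bytes removed.
print(size_balance(added_bytes=100, removed_bytes=900))
```

So this value describes the change between the two archives, not the total size of the file in either of them.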

real-yfprojects avatar Jul 29 '23 17:07 real-yfprojects