TMSU icon indicating copy to clipboard operation
TMSU copied to clipboard

Support sorting by recency in 'tmsu files'

Open 0ion9 opened this issue 11 years ago • 10 comments

Getting an 'unsorted' list (which is effectively, actually sorted by how recently a tag was applied on the file) is much more useful for some uses. I've run some tests and it looks like you can get sorting-by-recency (oldest -> newest tagging) just by leaving out the 'order by' clause from the SQL.

--recency would be a decent option name, I guess.

0ion9 avatar Dec 20 '14 14:12 0ion9

Hi,

Everything should be sorted by name: if it's not then it's a bug. An option for sort order is a good idea though, I'll look to add this, probably using how 'ls' handles this as a guide.

Thanks, Paul On 20 Dec 2014 14:03, "0ion9" [email protected] wrote:

For some things, getting an 'unsorted' list (which is effectively, actually sorted by how recently a tag was applied on that. I've run some tests and it looks like you can get sorting-by-recency (oldest -> newest tagging) just by leaving out the 'order by' clause from the SQL.

--recency would be a decent option name, I guess.

— Reply to this email directly or view it on GitHub https://github.com/oniony/TMSU/issues/12.

oniony avatar Dec 20 '14 19:12 oniony

Yes, if I was unclear, I meant that sorting by recency is much better than alphabetic for some uses. I manually did this kind of query recently, because I needed it:

select F.name from file_tag as FT join file as F on F.id=file_id where tag_id = (select id from tag where name = 'have_drawn')

And suggested this because it is clearly possible with no changes to the database necessary. At no point did TMSU provide output that wasn't alphabetically sorted.

0ion9 avatar Dec 20 '14 23:12 0ion9

Cool, no problem. I'll look to add a sort option. I've already a todo item for numerical sorting do it would make sense to do this at the same time.

Thanks, Paul On 20 Dec 2014 23:53, "0ion9" [email protected] wrote:

Yes, if I was unclear, I meant that sorting by recency is much better than alphabetic for some uses. I manually did this kind of query recently, because I needed it:

select F.name from file_tag as FT join file as F on F.id=file_id where tag_id = (select id from tag where name = 'have_drawn')

And suggested this because it is clearly possible with no changes to the database necessary. At no point did TMSU provide output that wasn't alphabetically sorted.

— Reply to this email directly or view it on GitHub https://github.com/oniony/TMSU/issues/12#issuecomment-67754493.

oniony avatar Dec 21 '14 08:12 oniony

I don't like the idea of functionality that relies on undefined behaviour of the underlying database: https://www.sqlite.org/lang_select.html#orderby. However, that is easily addressed by ordering by the (normally hidden) 'rowid' column which is, effectively, the insert order.

Not sure how the recency thing would work though: it could be implement this using the rowid column on the file table, which would result in files by ordered by when they were first tagged. To sort them by when they were last tagged would be considerably more difficult. The easiest way would be to first retrieve the set of files that match the query and then perform a second query to work out the last update order. For example, if the files are identified as numbers 1, 5, 8 and 12 then a query such as the following could get the order:

SELECT file_id
FROM file_tag
WHERE file_id IN (1, 5, 8, 12)
GROUP BY file_id
ORDER by max(rowid) DESC

However this would be the last time these files had a tag applied, not necessarily when the tags that are in the query were applied. Would this suffice?

oniony avatar Dec 22 '14 21:12 oniony

The two-pass approach seems to obtain the correct results, with, I guess, reasonable time? As long as its no worse than say 2x the time of the default alphabetical sort order, I'm not personally fussed.

Sorting by time of first tagging.. isn't something I would personally make much use of. There are probably applications for it though.

Thanks for the rowid hint; I used it to construct a single-pass query structure that works for AND queries only:

SELECT DISTINCT F.directory, F.name FROM file_tag AS FT 
 JOIN file AS F ON F.id=FT.file_id
 JOIN file_tag AS FT2 ON F.id=FT2.file_id
 WHERE FT.tag_id = (SELECT id FROM tag WHERE name = "foo")
  AND FT2.tag_id = (SELECT id FROM tag WHERE name = "bar")
 ORDER BY max(FT.rowid, FT2.rowid) DESC LIMIT 10;

(I don't know enough about SQL to handle NOT properly in this context, but I usually use just AND anyway, so this is fine for a temporary workaround.)

EDIT: It just occurred to me -- don't you have the file_tag rowid available for selection when you are doing the main query anyway? Could you avoid the second query by selecting this and doing the sorting outside of SQL, or not?

0ion9 avatar Dec 23 '14 06:12 0ion9

The file_tag rowid is not directly available as I'm doing an 'in' check, rather than getting values from file_tag. Leave it with me and I'll see what I can do. There might be something I can do with the SQL otherwise I'll just have to run the results through the aforementioned sort.

oniony avatar Dec 23 '14 10:12 oniony

The --sort option on 'files' should now cover this. Please let me know if there's anything I've missed otherwise I'll close this.

oniony avatar Jul 31 '15 19:07 oniony

Changes went in first in b88a6fc.

oniony avatar Jul 31 '15 19:07 oniony

Ah, realised the requirement for 'recency' is not update, not tagging so the --sort=id is probably not cutting it for you.

oniony avatar Jul 31 '15 19:07 oniony

Yeah, I tried all --sort options awhile ago, none of them achieve this effect so I'm still using an sqlite3 command to perform this.

Ah, realised the requirement for 'recency' is not update, not tagging

I guess that there is a mistake in this sentence, so I'll just restate what I'm talking about for clarity's sake: "Sort by on-disk order within the file_tag table (not the file table). Functionally, sorts by 'how recently was applied to a given file'"

And in SQL + bash:

sqlite3 $DB 'SELECT F.directory || "/" || F.name FROM file_tag AS T JOIN file AS F ON T.file_id=F.id WHERE tag_id = (SELECT id FROM tag WHERE name = "have_drawn") ORDER BY T.rowid;' | tail -30

(to get the 30 files that were most recently tagged have_drawn, with the absolute most recent at the end. The 'ORDER BY T.rowid' is just for explicitness' sake -- omitting it achieves the same result set in the same order.)

0ion9 avatar Aug 01 '15 04:08 0ion9