Version Tracking: various improvements

Open sad-dev opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.

I played around with the Version Tracking (thanks for the BSIM correlator!) and, based on my experience, have a few pain points. I can split this into multiple issues if needed.

i.)

Per the sample workflow in Ghidra Help, a typical version tracking session might proceed as thus:

1.) Run the exact correlators and accept.
2.) Run other (typically duplicate first, and then the others) correlators and manually review/accept matches. The use of the Version Tracking Matches (VTM) table in the source/destination tool is explicitly suggested. To avoid ambiguity, I will refer to the VTM table in the main tool as MVTM.
3.) Repeat (2) until success.

Clearly the lion's share of the time is spent on manual review. If I click an entry in the MVTM table, I can see the diff views for manual review. I would like, however, to see at a glance what each correlator reports for the selected entry, i.e. for a given source/destination pair, the confidences from all correlators that matched it.

ii)

I found myself craving custom filters on the MVTM table, for example, to hide entries with conflicting results (also, would be nice to distinguish between conflicts with accepted matches versus conflicts with all matches).

iii) The VT session becomes clogged over time with entries from old correlator runs. I found it normal to rerun correlators after getting better information, and found myself wanting a way to easily remove non-accepted matches from these stale correlators. The correlators cannot be cleared without undoing all accepted matches after running them. I thought the "Clear" action might be it, but it doesn't appear to do anything?

iv) VT is extremely greedy when it comes to screen real estate. The Source/Destination tools (which can't be closed) are used primarily for their VTM tables (as scripting from them isn't well explored yet). AFAIK, individual windows like the all-important VTM cannot be docked out while minimizing the source/dest tools, which aggravates the problem.

In the Version Tracking Markup table, the table cannot be removed and I hardly use it (granted, this could just be me not knowing the way to use it effectively): [image]

At best I can drag it up: [image]

Describe the solution you'd like

Some ideas are (in likely order of difficulty/effectiveness):

1.) Allow running correlators on a selection from the MVTM table.
2.) Custom table that shows all correlators involving source or destination (so conflicts can be reviewed as well).
3.) Port the Version Tracking Matches table from the source/destination tool into the main session tool.
4.) Action on the MVTM table that sets a filter on source/destination (as well as another to quickly remove it). This also requires allowing custom filters on the table, which might be weird to juggle in conjunction with the existing score/conf/length filters.

Describe alternatives you've considered

Try to script it and pop my own tables? This might entail modifications to the code to make the VT model/correlators accessible from the source/destination tools.

sad-dev avatar Dec 29 '23 02:12 sad-dev

AFAIK, individual windows like the all-important VTM cannot be docked out while minimizing the source/dest tools,

Correct

in the Version Tracking Markup table, the table cannot be removed and I hardly use it (granted, this could just be me not knowing the way to use it effectively):

Correct. The issue you are running into here is one that has been mentioned before. The diff view is meant to work directly with the Markup Items view. If you do not use that view, then you can close it. But, in doing so, you lose the function diff view. We are in talks now about creating a stand-alone diff view, which would address your use case.

The correlators cannot be cleared without undoing all accepted matches after running them. I thought the "Clear" action might be it, but it doesn't appear to do anything?

So, briefly, currently you can only Remove 'Implied Matches' and 'Manual Matches'. All other correlator matches cannot be removed. It has been long enough now that I don't recall the rationale behind this. We will look into changing this restriction. Also, in some cases, to remove a removable match, you have to first Clear that match.

The VT session becomes clogged over time with entries from old correlator runs.

Oh boy. TL;DR: much of the VT clunkiness you are encountering relates to original design.

The tool was meant to be small in scope, basically to be used for migrating markup from one version of a Ghidra program to a new version of that same program. We do not enforce that restriction, but the workflows we support are derived from that thinking.

VT sessions are considered transient creations the lifetime of which is very short-lived. Once the items of interest have been pulled-over, the session is to be recycled into the ether.

Admittedly, many users like to keep the sessions around for later memory stimulation and perhaps even continued work. Our support for this is non-existent.


Changes I think we can make:

  1. Show confidence values for a given match in the table
  2. Add filters to the Match Table for hiding conflicts
  3. Allow any match to be removed
  4. Run a correlator against the selected matches in the Match Table

dragonmacher avatar Feb 09 '24 17:02 dragonmacher

A few comments/questions to get a bit more clarity or to make suggestions to hopefully make your life a bit easier with the VT in its current form.

i. Have you tried the auto version tracking? It tries to automate the creation of matches when it can determine good ones.

ii. Regarding hiding conflicting matches: I assume you want this so that you can see only matches that have no conflicts in the table, to make it easier to match the ones with no conflicts first? Currently, if there are conflicting matches and you accept one of them, the rest of the possible matches that have either the same source or the same destination as the Accepted match are marked Blocked. You can use the filter button on the bottom right side of the MVTM (the big button with a lightbulb on it) to filter blocked matches out. There is currently no way to filter matches that are still unaccepted but have conflicting matches. That does sound like it would be a useful filter.

iii. Do you have a reason for completely removing them instead of filtering out the unaccepted matches? The correlators that rerun using known info only use the Accepted matches to do so, so you don't have to worry about the others changing those results. If you don't want to see them, you can use a combination of filters using that same filter button I mentioned. You can uncheck "Available" and whatever correlator you used to find them. Or just the former if you don't want to see any available matches.

iv. part one. I agree the other windows take up a lot of room and that it would be nice to not have them tied to the other tools. You could undock the VTM tables from the codebrowsers, then shrink them to a preferred size, then shrink the codebrowsers so they are tiny. If you keep the VTMs from overlapping the MVTM, you can keep them all open at the same time. Not ideal, but it might help for now.

iv. part two. This is something I only learned recently. The markup table is what is driving the compare window, not the match table as I had assumed. The match table drives the markup table, and the compare window is a part of that window.
The markup compare window is there to show users where the markup items are and, if you select one, to highlight the selected one in a darker color. There is another compare window attached to the Functions window. It highlights the differences in the listing/decompiler of the selected functions. I am hoping we can change the behavior so that there is a function compare window (like the one I just described for the Functions window) attached to the MVTM table. Then if users don't need to see the markup at all, they could at least see the side-by-side differences between the selected matches in the MVTM table.
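
The blocking rule described in point ii can be sketched in a few lines. This is only an illustration of the rule as stated in this thread, written in plain Python; it does not use the Ghidra API, and the `Match` class, the `accept` and `without_blocked` helpers, and the status strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical stand-in for one row of the matches table."""
    source: str
    destination: str
    status: str = "Available"


def accept(matches, chosen):
    """Accept one match and block the available matches that compete with it."""
    chosen.status = "Accepted"
    for m in matches:
        if m is chosen or m.status != "Available":
            continue
        # A competitor shares the source or the destination of the accepted match.
        if m.source == chosen.source or m.destination == chosen.destination:
            m.status = "Blocked"


def without_blocked(matches):
    """Mimic a 'hide Blocked matches' filter."""
    return [m for m in matches if m.status != "Blocked"]


matches = [
    Match("0x1000", "0x2000"),
    Match("0x1000", "0x2400"),  # same source as the first row
    Match("0x1800", "0x2000"),  # same destination as the first row
    Match("0x3000", "0x4000"),  # unrelated
]
accept(matches, matches[0])
```

After the call, the two competing rows are marked Blocked and `without_blocked(matches)` returns only the accepted row and the unrelated one. The filter that does not exist yet, per the paragraph above, would be the one hiding rows that still have unaccepted competitors.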

Regarding your suggested solutions:

  1. If you are asking to run on address selections corresponding to selected rows, you can do this now by selecting a row or rows in the table, then right-clicking and choosing "Make Selections". This will make the corresponding selections in the two code browser listings. For data, it will select the whole item. For functions, it will select the top addresses of the functions, but you can go to each code browser and choose Select-Functions, and it will use only the selected addresses to select their entire function bodies. Then, if you run a correlator, it will automatically run only on the selected address ranges. If, however, you are asking to run the correlators that depend on other known matches to get better results, we cannot do that at this time. If this is what you want to do, can you describe why you do not want the correlators to use all the known match info?
  2. I assume you want to see only one set of related conflicting matches at a time instead of all of them at once. I think an action for this that would populate a function compare window with pull-down lists of these related conflicts would be the best way to do this. Or possibly a filter in the current Functions Window that would limit those lists to just related conflicts. That table does drive the function compare window, so that would be useful to determine the "good" matches; then, once you accept the match, the rest would become blocked and you could use the filter button to hide the rest of the blocked ones for that family. Right now you can get half the story in the match table if you type the address of the source or destination into the text filter. At least you can see all matches for that one address like you would in the VTM.
  3. I agree this would be useful but have no idea how feasible it would be to do. It is worth triaging at least.
  4. See end of 2: if you type the address (or copy/paste it from the table row to save typing) into the text filter, you will get this now. Not as quick or easy as just clicking a filter button as you suggested. It is a busy filter area right now. Perhaps an action that can be set to a keybinding.

Thanks for all the suggestions. Please let us know if our interpretations of your issues/suggestions are incorrect.

ghidra007 avatar Feb 16 '24 00:02 ghidra007

@ghidra007 any updates on this? A simple filter to allow filtering out results from the votes/conflicting/multiple source labels columns is almost critical to my workflow.

E.g. if I sort by votes, I can still get conflicts, but I can get 20 votes and 5 conflicts, or 2 votes and 0 conflicts (which is a better match).

Also a filter for length delta?

rollsch avatar Mar 06 '24 01:03 rollsch

@ghidra007 any updates on this? A simple filter to allow filtering out results from the votes/conflicting/multiple source labels columns is almost critical to my workflow.

E.g. if I sort by votes, I can still get conflicts, but I can get 20 votes and 5 conflicts, or 2 votes and 0 conflicts (which is a better match).

Also a filter for length delta?

I was waiting for feedback from the original requester to see if my suggestions fixed any of their issues and to make sure I understood what they were asking for. I was hoping they were unaware of some of the filters that can be set besides the ones visible at the bottom of the table. There is a button on the right of those filters that brings up a dialog with more filters.

For your use case, have you tried multi-sorting on both the votes and conflicts columns? You can do that by first sorting on the votes column (or whichever column you want as a primary sort); then you can do a secondary sort on any other column by holding the Ctrl key while clicking on that column's header. You should see a number 1 on the primary and 2 on the secondary. You can keep going and add a third-level sort and so on if you choose to, which may or may not help with the multi-label issue.

Just to make sure you understand what it means, the multiple source column just means that there is more than one label at the source address, not that there are multiple matches for the source function/data. Is there a reason you don't want to see these? If it is because you don't want to apply the labels of both, you can probably use the apply filters to get the behavior you want when applying the results.

For your length issue it might be worth adding a toggle option that allows showing only those with same length.

For the original poster's multiple match issue, perhaps a toggle showing only unique matches could be added. Are you seeing your multiple matches mainly from the duplicate match correlators, or mostly from different correlators showing different results? If the former, you can use the filter button to hide the duplicate match correlator results.

ghidra007 avatar Mar 06 '24 15:03 ghidra007

I've tried a few things with the sorting and if you want to show highest votes, no conflicts and no multiple labels at the top, you can do so by sorting on the multiple labels column first in descending order, then do a secondary sort on the votes in descending order, then do a third level sort on the conflicts in ascending order.
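
For anyone post-processing match data outside the table, the same three-level ordering can be sketched as a composite sort key. This is a minimal plain-Python illustration, not Ghidra code: the `Row` class and its fields are hypothetical, and the key encodes the stated goal (no multiple labels first, then most votes, then fewest conflicts) rather than the ascending/descending direction of any particular table column.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for one row of the matches table."""
    name: str
    votes: int
    conflicts: int
    multiple_labels: bool


def review_order(rows):
    """Sort by: no multiple labels first, then most votes, then fewest conflicts."""
    # False sorts before True, so rows without multiple labels come first.
    return sorted(rows, key=lambda r: (r.multiple_labels, -r.votes, r.conflicts))


rows = [
    Row("A", votes=20, conflicts=5, multiple_labels=False),
    Row("B", votes=2, conflicts=0, multiple_labels=False),
    Row("C", votes=20, conflicts=0, multiple_labels=False),
    Row("D", votes=30, conflicts=0, multiple_labels=True),
]
ordered = [r.name for r in review_order(rows)]
```

Here `ordered` is `["C", "A", "B", "D"]`. Note that a sort only ranks rows; the 20-votes/5-conflicts row still appears above the 2-votes/0-conflicts row unless the conflicts field is moved ahead of votes in the key.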

ghidra007 avatar Mar 06 '24 15:03 ghidra007

Apologies for the late reply. I have been playing around a bit more (including with the VT api) to get a better understanding of how I can/would like to use VT.

My preference when porting symbols (via VT or other conventional approaches like bindiff/diaphora) is to perform a very conservative, iterative process of correlators with the analyst in the loop. I can see how VT was originally built for quick, transient ports, meaning some of the UI doesn't mesh well with my workflow.

In terms of window layouts, your interpretations of my initial post are pretty on point. The feature request for deletion of elements is mostly to keep the window tidy (and potentially benefits from more sophisticated filters down the line); I think this could be replaced with (sorry if it already exists) a feature to export all accepted matches to a new VT session.

I know about the additional custom filters (lightbulb button) on matches, but often the things I want to filter on go beyond that. Typically, this involves combining positive (correlators agreeing) and negative (conflicting matches) information in a way that goes beyond even what standard TableChooserDialog compound filters are capable of or suitable for.

I found the VT scripting API (applied from the CodeBrowser tool, which was an annoyance) pretty useful in this regard, as scripts can examine the results of previous correlators and combine them however I please. Perhaps it would be better to keep modifications to the UI light, while allowing us to directly run scripts from VT? Custom-filter-wise, I think a display for unaccepted matches with no conflicting matches would be useful to many VT users, but users hoping to invent their own diaphora are probably better off using scripts.
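
To make the kind of filter I mean concrete, here is a rough sketch of combining positive and negative information. It is plain Python over hypothetical `(source, destination, correlator, status)` tuples, not the VT API; the function name, the status strings, and the `min_agreeing` parameter are all made up for illustration.

```python
from collections import defaultdict


def conservative_candidates(records, min_agreeing=2):
    """Return unaccepted pairs with no conflicts and enough agreeing correlators.

    records: iterable of (source, destination, correlator, status) tuples.
    """
    correlators = defaultdict(set)  # (source, destination) -> correlator names
    accepted = set()
    for src, dst, correlator, status in records:
        correlators[(src, dst)].add(correlator)
        if status == "Accepted":
            accepted.add((src, dst))

    # Index every proposed pairing so competing pairs can be detected.
    by_source = defaultdict(set)
    by_destination = defaultdict(set)
    for src, dst in correlators:
        by_source[src].add(dst)
        by_destination[dst].add(src)

    result = []
    for (src, dst), names in correlators.items():
        if (src, dst) in accepted:
            continue
        # Negative information: another pair claims the same source or destination.
        has_conflict = len(by_source[src]) > 1 or len(by_destination[dst]) > 1
        # Positive information: number of distinct correlators proposing this pair.
        if not has_conflict and len(names) >= min_agreeing:
            result.append((src, dst, sorted(names)))
    return sorted(result)


records = [
    ("f1", "g1", "A", "Available"),
    ("f1", "g1", "B", "Available"),
    ("f2", "g2", "A", "Available"),
    ("f2", "g3", "B", "Available"),  # conflicts with f2 -> g2
    ("f4", "g4", "A", "Accepted"),
    ("f5", "g5", "A", "Available"),  # only one correlator
]
candidates = conservative_candidates(records)
```

With this sample data only the `f1`/`g1` pair survives. Lowering `min_agreeing` to 1 gives the plain "unaccepted and no conflicts" view, which is the part I think would be worth having in the UI.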

sad-dev avatar Mar 07 '24 05:03 sad-dev

Perhaps it would be better to keep modifications to the UI light, while allowing us to directly run scripts from VT?

Being able to run scripts from the VT main UI is an interesting idea. It is not clear how specialized that setup would need to be, as the normal scripting API makes assumptions about the relationship between the tool and its programs.

What may be easier is to provide a simple interface for allowing users to write code snippets specifically for filtering on the matches table.

Otherwise, adding more filters to the 'More Filters' dialog (More More Filters?) is pretty easy, assuming we can figure out what is useful to users.

dragonmacher avatar Mar 07 '24 14:03 dragonmacher

In case you were not aware, with regard to being able to delete or remove matches from view, you can get the same effect in VT now by using tags. You can select matches you no longer wish to see, tag them with an arbitrary tag, and then hide them from view using the 'More Filters' dialog.
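
As a rough model of that tag-and-hide workflow, here is a short sketch in plain Python. It is not the Ghidra API; the dictionary keys, the `tag_stale` and `hide_tagged` helpers, and the `"stale"` tag name are hypothetical.

```python
def tag_stale(matches, correlator, tag="stale"):
    """Tag every non-accepted match produced by the given correlator run."""
    for m in matches:
        if m["correlator"] == correlator and m["status"] != "Accepted":
            m["tag"] = tag


def hide_tagged(matches, hidden_tags):
    """Mimic a tag filter: drop matches whose tag is in hidden_tags."""
    return [m for m in matches if m.get("tag") not in hidden_tags]


matches = [
    {"pair": ("f1", "g1"), "correlator": "old run", "status": "Accepted"},
    {"pair": ("f2", "g2"), "correlator": "old run", "status": "Available"},
    {"pair": ("f3", "g3"), "correlator": "new run", "status": "Available"},
]
tag_stale(matches, "old run")
shown = [m["pair"] for m in hide_tagged(matches, {"stale"})]
```

The accepted match from the old run stays visible, and nothing is deleted: clearing the hidden tag set brings every row back.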

dragonmacher avatar Jul 02 '24 23:07 dragonmacher

You can now delete matches in the matches table. We suggest you generally keep matches you have accepted, as they help improve further correlator runs.

The work done for #6281 now allows you to perform more complex filtering on the matches table columns.

dragonmacher avatar Jul 22 '24 15:07 dragonmacher