ghidra icon indicating copy to clipboard operation
ghidra copied to clipboard

Request: Get component by field name

Open mumbel opened this issue 2 years ago • 7 comments

Is your feature request related to a problem? Please describe. Working with multiple versions of a binary struct offsets change and then often the number of fields in a struct for a binary may be more or less flushed out than others. The getComponent/getComponentAt APIs can become cumbersome to use when these differences occur since the index or the ordinal cannot be the same in some instances across two binaries.

Field name would ideally be available in these higher level APIs without having to include that logic each user end java/python file

Describe the solution you'd like getComponentByFieldName in the Structure class (not sure what others could use this)

Describe alternatives you've considered Loop through all getNumComponents with getComponent{At} and check DataTypeComponent.getFieldName

mumbel avatar Apr 21 '23 16:04 mumbel

I had been wanting to add this capability for some time but am faced with a dilema. The API has issues in that it can currently allow for duplicate field names to be specified. The API is inconsistent in throwning DuplicateNameException when specifying a component field name. So the question than occurs how to handle a duplicate name situation. I certainly do not want to encourage field name duplication. I am considering having a method such as getComponentByName(String name) return null if either a component was not found or more than one was found, alternatively it could throw an exception. I had also looked into tightening the API and throwing DuplicateNameException everywhere a component name could be established but there were quite a few methods affected which would impact user code. I would certainly like to hear user opinions on these points.

ghidra1 avatar Apr 24 '23 16:04 ghidra1

I would say for my use case: I do not expect a duplicate name, but I also do not want to guarantee that struct will contain the field, so the NullPointerException/null check would probably boil up to some other handler, so if you threw DuplicateNameException into the mix it wouldn't change how I plan to use it. Your null cases seem reasonable, but I do like the idea this new function would raise that exception so you're at least going down the path of unique field names (which approach is desirable to your current use cases and customers overall?).

mumbel avatar Apr 25 '23 01:04 mumbel

I had been wanting to add this capability for some time but am faced with a dilema. The API has issues in that it can currently allow for duplicate field names to be specified. The API is inconsistent in throwning DuplicateNameException when specifying a component field name. So the question than occurs how to handle a duplicate name situation. I certainly do not want to encourage field name duplication. I am considering having a method such as getComponentByName(String name) return null if either a component was not found or more than one was found, alternatively it could throw an exception. I had also looked into tightening the API and throwing DuplicateNameException everywhere a component name could be established but there were quite a few methods affected which would impact user code. I would certainly like to hear user opinions on these points.

I am guilty of having both intentionally and unintentionally duplicate field names using the API. There is inconsistency in this area of Ghidra in general. For example, if you attempt to rename a structure field using one of the actions and provide no field name it will throw an error saying it's not allowed. However, as everyone knows a field not having a name is most certainly something ghidra allows.

The only time I've really done it intentionally is when naming the destructor functions and overloaded functions for a vftable. There are times where I really would like to do this but it would more commonly be a mistake. Note: I hate having to "handle" a checked exception that I think shouldn't occur. It would be really cumbersome to have to handle a DuplicateNameException all over the composite API when adding new fields, renaming, etc.

astrelsky avatar Apr 25 '23 11:04 astrelsky

Could it be an array return type or were you hoping avoid that?

mumbel avatar Apr 25 '23 12:04 mumbel

Moving forward would it make sense when adding/changing components to force name uniqueness (e.g., adding a one-up suffix) and eliminate throwing DuplicateNameException where currently done. In addition, a new getComponentByName could return null and log error if a duplicate field name is detected (i.e., treat like it was not found). This would hopefully draw attention to it and allow it to be rectified.

ghidra1 avatar Apr 27 '23 19:04 ghidra1

My feeling is returning an array complicates matters and supporting duplicate names seems unneccessary. It seems in general if it occurs it would be rare and/or unintentional. Certainly, C/C++ source would not allow duplicate field names.

ghidra1 avatar Apr 27 '23 19:04 ghidra1

getComponentByName could return null and log error if a duplicate field name is detected

Revision: leaning towards returning first component found with given name. Error would only be logged when a duplicate detected for a component which is requested by name.

ghidra1 avatar Jun 21 '23 13:06 ghidra1