ghidra
ghidra copied to clipboard
Request: Get component by field name
Is your feature request related to a problem? Please describe.
Working with multiple versions of a binary struct offsets change and then often the number of fields in a struct for a binary may be more or less flushed out than others. The getComponent/getComponentAt APIs can become cumbersome to use when these differences occur since the index or the ordinal cannot be the same in some instances across two binaries.
Field name would ideally be available in these higher level APIs without having to include that logic each user end java/python file
Describe the solution you'd like
getComponentByFieldName in the Structure class (not sure what others could use this)
Describe alternatives you've considered
Loop through all getNumComponents with getComponent{At} and check DataTypeComponent.getFieldName
I had been wanting to add this capability for some time but am faced with a dilema. The API has issues in that it can currently allow for duplicate field names to be specified. The API is inconsistent in throwning DuplicateNameException when specifying a component field name. So the question than occurs how to handle a duplicate name situation. I certainly do not want to encourage field name duplication. I am considering having a method such as getComponentByName(String name) return null if either a component was not found or more than one was found, alternatively it could throw an exception. I had also looked into tightening the API and throwing DuplicateNameException everywhere a component name could be established but there were quite a few methods affected which would impact user code. I would certainly like to hear user opinions on these points.
I would say for my use case: I do not expect a duplicate name, but I also do not want to guarantee that struct will contain the field, so the NullPointerException/null check would probably boil up to some other handler, so if you threw DuplicateNameException into the mix it wouldn't change how I plan to use it. Your null cases seem reasonable, but I do like the idea this new function would raise that exception so you're at least going down the path of unique field names (which approach is desirable to your current use cases and customers overall?).
I had been wanting to add this capability for some time but am faced with a dilema. The API has issues in that it can currently allow for duplicate field names to be specified. The API is inconsistent in throwning
DuplicateNameExceptionwhen specifying a component field name. So the question than occurs how to handle a duplicate name situation. I certainly do not want to encourage field name duplication. I am considering having a method such asgetComponentByName(String name)return null if either a component was not found or more than one was found, alternatively it could throw an exception. I had also looked into tightening the API and throwingDuplicateNameExceptioneverywhere a component name could be established but there were quite a few methods affected which would impact user code. I would certainly like to hear user opinions on these points.
I am guilty of having both intentionally and unintentionally duplicate field names using the API. There is inconsistency in this area of Ghidra in general. For example, if you attempt to rename a structure field using one of the actions and provide no field name it will throw an error saying it's not allowed. However, as everyone knows a field not having a name is most certainly something ghidra allows.
The only time I've really done it intentionally is when naming the destructor functions and overloaded functions for a vftable. There are times where I really would like to do this but it would more commonly be a mistake. Note: I hate having to "handle" a checked exception that I think shouldn't occur. It would be really cumbersome to have to handle a DuplicateNameException all over the composite API when adding new fields, renaming, etc.
Could it be an array return type or were you hoping avoid that?
Moving forward would it make sense when adding/changing components to force name uniqueness (e.g., adding a one-up suffix) and eliminate throwing DuplicateNameException where currently done. In addition, a new getComponentByName could return null and log error if a duplicate field name is detected (i.e., treat like it was not found). This would hopefully draw attention to it and allow it to be rectified.
My feeling is returning an array complicates matters and supporting duplicate names seems unneccessary. It seems in general if it occurs it would be rare and/or unintentional. Certainly, C/C++ source would not allow duplicate field names.
getComponentByName could return null and log error if a duplicate field name is detected
Revision: leaning towards returning first component found with given name. Error would only be logged when a duplicate detected for a component which is requested by name.