Abstract InstructionForm
Have a class representing an InstructionForm, which can be created either from an assembly line or by specifying a mnemonic and its parameters.
This class should be able to generate a canonical string representation and handle (or at least reference) performance-relevant information.
After applying the semantics to the kernel with add_semantics (and, optionally, the optimal throughput with assign_optimal_throughput), each element of the kernel list is a dict with the following keys:
>>> kernel[0].keys()
dict_keys(['instruction', 'operands', 'directive', 'comment', 'label', 'line',
'line_number', 'semantic_operands', 'port_pressure', 'port_uops', 'flags',
'throughput', 'latency', 'latency_wo_load', 'latency_cp', 'latency_lcd'])
This object includes a canonical number (line_number), the original string representation (line), and performance-relevant information (basically all other attributes).
Is that sufficient for you?
It's rather tedious to work with; some OO programming could be helpful here. E.g., a class InstructionForm that captures the parsed information found in an assembly code line. This object can then be passed a machine model, which allows it to relate itself to a mnemonic and offer performance-relevant information. And so on.
Not urgent and requires some rewriting, but it would make things easier to work with from a programmer's point of view.
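To make the idea more concrete, here is a minimal sketch of what such a class could look like. Everything in it is an assumption for illustration, not existing code: the method names from_line, canonical and bind are hypothetical, the parser is deliberately naive (it splits operands on commas, so memory operands such as 8(%rax,%rbx,4) would need the real parser), and the machine model is stood in for by a plain dict keyed by mnemonic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class InstructionForm:
    """Hypothetical sketch of the proposed class; not the project's actual API."""

    mnemonic: str
    operands: Tuple[str, ...] = ()
    line: Optional[str] = None          # original string representation
    line_number: Optional[int] = None   # canonical number within the kernel
    # Performance-relevant data; stays empty until bind() is called.
    performance: Dict[str, object] = field(default_factory=dict)

    @classmethod
    def from_line(cls, line: str, line_number: Optional[int] = None) -> "InstructionForm":
        """Naively parse '<mnemonic> <op>, <op>, ...' with an optional '#' comment."""
        code = line.split("#", 1)[0].strip()
        if not code:
            raise ValueError("line contains no instruction")
        parts = code.split(None, 1)  # split mnemonic from the rest on any whitespace
        rest = parts[1] if len(parts) > 1 else ""
        operands = tuple(op.strip() for op in rest.split(",") if op.strip())
        return cls(parts[0], operands, line=line, line_number=line_number)

    def canonical(self) -> str:
        """Return a normalized string: mnemonic, then comma-separated operands."""
        return f"{self.mnemonic} {', '.join(self.operands)}".strip()

    def bind(self, machine_model: Dict[str, Dict[str, object]]) -> "InstructionForm":
        """Copy the entry for this mnemonic from the (stand-in) machine model."""
        self.performance = dict(machine_model.get(self.mnemonic, {}))
        return self


if __name__ == "__main__":
    # The numbers below are made up purely to show the call pattern.
    model = {"vaddpd": {"throughput": 0.5, "latency": 4.0}}
    form = InstructionForm.from_line("vaddpd %ymm0, %ymm1, %ymm2  # add", line_number=3)
    print(form.canonical())
    print(form.bind(model).performance)
```

With something along these lines, callers would use attributes and methods (form.operands, form.canonical()) instead of remembering dict keys, and the lookup against the machine model would live in one place.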