datagrid-gtk3
datagrid-gtk3 copied to clipboard
Transform columns config into a dict
The way we pass the columns config to SQLiteSource if very error-prune. Here are the reasons:
- For example, taking a look at the code. Note that if the table have an id column that matches
ID_COLUMN
,counter
will not be incremented.
So, unless I know the inside of the code, just by looking at the api, I wouldn't know if I should add the id column in the config dict or not (answer: I should if none of the columns match ID_COLUMN
).
After some modifications I made on this PR, where I changed some assumptions that the id would be the index 1 (see this and this for examples), since the example data themselves didn't match ID_COLUMN
, some logic was added to get it based on the first primary key available.
So, after that code is merged, at least there's only one answer for the question above: I should not.
-
I have to be totally sure that the config order is the same as the database, considering the problem mentioned on 1.
-
If I want to add a config for column A but still let B and C be configured automatically, I can't with the actual code. We actually have an example here.
We just wanted to add datetime transformation on BirthDate
, but had to do the same for all other columns.
Also, the same applies for the config itself, that is a tuple containing (data_type, transform).
Not only it is hard to understand, but it falls in the same problems considered above (e.g. I can't change the transform without setting the data_type too). Not to mention that, tuples are hard to add new new options (as you can see here)
My suggestion for this issue is: Use a dict of dicts for config. Something like:
config = {
'column_a': {'transform': 'datetime', 'data_type': long},
'column_b': {'data_type': long},
'column_e': {'expand': True},
}
IMO these are all good insights and objections that need to be addressed.
Configuration store should definitely be a dict, but what do you think about making individual column configurations class instances? Something like this:
config = {
'column_a': ColumnConfig(transform='datetime', data_type=long},
'column_b': ColumnConfig(data_type=long),
'column_e': ColumnConfig(expand=True},
}
That would essentially be like having a schema for the pure dictionaries you suggested, and it would make value access a bit nicer. Also, this would enable us to easily encapsulate logic in column configurations if we ever need to.
@mdorn as you have been working on this a bit lately; what do you think of the suggestions in this issue? @bosonogi suggestion looks good to me
After working on viaforensics/viaextract-appdecode#38 I realize that the order of column in the column config can be significant -- for example for an implicit decoder query config. So this should be kept in mind when looking to replace the list with a dict.
EDIT: What this most likely means is that if we wish to go with a dict in the data source there will have to be a configuration conversion between the decoder and the data source; as opposed to using them directly like with viaforensics/datagrid-gtk3#19
I agree that @hackedbellini 's objections are legitimate, but I'm not sure the best way to deal with them in the long term. We're using lists of objects (dicts in JS lingo) because as Toni points out, order is significant. One alternative would be to add an "order" key to the dict, but I think that would be more difficult to maintain.
ID_COLUMN
should only ever be used for an application's internal purposes, e.g. for tracking which items have been selected, for deleting an item from "analyst findings", etc. Maybe we need a way to enforce its use/existence?
I'm hoping to get to the point where these configurations are generally not edited by hand, but by the app decoder application, which will reduce the error-proneness that Thiago describes.