OpenRefine
Add comments in a recipe
Hello, I think it would be a great feature to be able to comment a recipe directly in the script. This would give collaborators a better view of how the recipe works and make collaborative work easier.
Thank you!
I support Herve's proposal. Preparing data is like telling a story, as there is never just one way of doing something.
A data preparation process can quickly grow to hundreds of steps, and understanding the logic afterwards can be complex. Just as developers comment their code to explain their logic, we should be able to do the same in Refine.
This will help with collaboration and reproducibility of the code along with improving auditing.
I have been using comments for a while now, by adding column transformations of the form:
jython:return value # comment
(which do nothing) and then using an overloaded renderer of the HistoryPanel class in a custom plugin:
// labels in history panel
HistoryPanel.prototype._renderOrig = HistoryPanel.prototype._render;
HistoryPanel.prototype._render = function() {
  this._renderOrig();
  var elmts = DOM.bind(this._div);
  elmts.bodyDiv.find(".history-entry").each(function() {
    // the entry description is the first child of the entry's second node
    var text = this.childNodes[1].firstChild.nodeValue;
    // a no-op transform of the form "jython:return value # comment"
    var mtch = text.match(/Text transform on 0 cells in column [^:]+: jython: *return +value *#(.*)/);
    if (mtch) {
      // show only the comment text and style it as a comment
      this.childNodes[1].firstChild.nodeValue = mtch[1];
      $(this.childNodes[1]).addClass("history-comment");
    }
  });
};
together with some css for the history:
.history-past a.history-entry .history-comment {
color: #00f
}
.history-now a.history-entry .history-comment {
color: #ff4
}
.history-future a.history-entry .history-comment {
color: #88f
}
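The matching step of this approach can be tried outside OpenRefine. The following is a minimal sketch, assuming the history entry text has the form the renderer matches; the helper name `extractComment` and the sample entries are made up for illustration, and the `trim()` call is an addition that is not part of the plugin code.

```javascript
// Pattern for a no-op "comment" transform as it appears in the history panel.
// Assumption: entries read "Text transform on 0 cells in column <name>: jython:return value # <comment>".
const COMMENT_ENTRY = /Text transform on 0 cells in column [^:]+: jython: *return +value *#(.*)/;

// Returns the comment text if the history entry is a comment step, else null.
function extractComment(entryText) {
  const match = entryText.match(COMMENT_ENTRY);
  return match ? match[1].trim() : null;
}

// A comment step: prints the comment text.
console.log(extractComment(
  "Text transform on 0 cells in column name: jython:return value # normalise casing first"
));

// An ordinary transform: prints null.
console.log(extractComment(
  "Text transform on 12 cells in column name: grel:value.trim()"
));
```

Because the pattern requires "0 cells", a transform that actually changed data is never mistaken for a comment.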
Cheers,
Herwig
Herwig Van Marck, PhD. Senior Expert Research Informatics
Rijvisschestraat 126 3/R, 9052 Zwijnaarde Tel: +32 (0)9 248.16.01 Bio Informatics Training and Service Facility (BITS http://www.bits.vib.be/)
Herwig
Thank you for sharing this! Is it currently available as a standalone extension? Is it in a public repository? I'd love to see how we can integrate this into OpenRefine.
Martin.
Hey Martin,
It is part of an internal plugin (not public) because I see this as a temporary solution/hack. It would be good if comments could be implemented properly in OpenRefine!
Cheers,
Herwig
@hvmarck That's an interesting hack. In your experience, what's the proper way to handle comments? I think it should be a standalone plugin that lets you attach a comment to any step selected in the history panel.
Just out of curiosity, if you don't mind: what are the other functionalities of your internal plugin?
The other stuff that is in the internal plugin:
- project locking support (to run OpenRefine on a server accessed by multiple people), making sure only one person can open a project at a time
- project ownership (keeps track of who owns a project, defaulting to whoever created it) in conjunction with the previous locking support
- an extra export button in the 'Open Project' page (between 'delete' and 'rename')
- error handling in ProcessPanel.update to avoid process update errors (automatically restarting when an error occurs)
- a visualizer that treats columns that start with the string 'diff ' as special columns in which diff-like annotations (created with a diff tool) are visualized with red strikethrough and green highlights
- a function that compares a project to another project using the Coopy highlighter diff library http://dataprotocols.org/tabular-diff-format/
- a function that compares the history of a project to the history of another project
- a 'GoTo-record' addition to pagingControls that allows you to click on the page control numbers and select a row number at which to start the display of the table (it does not need to be a multiple of the chosen range, e.g. 50)
- Wrangler http://vis.stanford.edu/wrangler/ type bars in the headers to show how many non-blanks a column has (can be switched on and off)
So all in all a nice collection. Some of these could probably be put in a public plugin, but one reason I have not done that yet is that they rely on overloading the OpenRefine code, which must not change too much (after an update).
The recommended way to do this is to use JavaScript-style comments and run things through JSMin before doing the JSON decoding.
The one reservation that I have about doing this is that it encourages people to consider the JSON editable, and we haven't thought through the implications of that. There are a bunch of things, like the client/server wire protocol and various internal data formats, which we've left undocumented on purpose so that we have the freedom to change them.
Has there been any further movement on this?
No movement, but someone could just add jsmin (perhaps from jawr-core https://github.com/j-a-w-r/jawr-main-repo/tree/master/jawr/jawr-core or https://github.com/collegeman/htmlcompressor) to where it would strip out the comments prior to where we apply operation history here: https://github.com/OpenRefine/OpenRefine/blob/master/main/webapp/modules/core/scripts/project/history-panel.js#L292
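To illustrate the idea of stripping comments before the JSON is decoded, here is a minimal sketch. It is a small hand-written stripper standing in for JSMin, not JSMin itself and not OpenRefine code; the function name `stripComments` and the sample recipe are made up for illustration.

```javascript
// Removes // line comments and /* block */ comments from JSON-like text,
// leaving anything inside double-quoted strings untouched.
function stripComments(text) {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        // Copy the escaped character verbatim so \" does not end the string.
        out += next;
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to the end of the line (the newline itself is kept).
      while (i < text.length && text[i] !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing */ (or to the end if unterminated).
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

// Hypothetical commented recipe; the operation fields are illustrative only.
const commentedRecipe = `[
  // Trim whitespace before reconciling
  {
    "op": "core/text-transform",
    "columnName": "name", /* the URL below must survive stripping */
    "expression": "grel:value.trim()",
    "description": "see http://example.org/notes"
  }
]`;

const operations = JSON.parse(stripComments(commentedRecipe));
console.log(operations[0].description);
```

Tracking whether the scanner is inside a string is what keeps the `//` in the URL from being treated as a comment, which a plain regular-expression replacement would get wrong.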
@cameronstewart Comments are now supported via the VIB-BITS plugin and its History Tools that support comments in Undo/Redo history. Read their manual. https://www.bits.vib.be/software-overview/openrefine
@hvmarck Thanks for finally adding support for comments in Undo/Redo history and making it a part of your public plugin!
Closing this issue, since the feature is now handled by a publicly available plugin.
If this is useful functionality, which it would appear to be, it should be in the core product, not an extension (plus the VIB extension appears to be gone).
My reservation (https://github.com/OpenRefine/OpenRefine/issues/1057#issuecomment-148820539) still holds, but a better suggestion for a way to support comments, assuming we don't switch to an entirely new DSL, would be to support YAML and use YAML's comment facility. YAML readers can typically read JSON as well, since YAML 1.2 is designed as a superset of JSON.
@tfmorris sounds reasonable.
Hi,
Is there something new about this issue?
Thanks in advance, Best,
Aude
Couldn’t you also support transformations in a dummy comment language and just ignore those transformations? They should end up in the recipe looking like comment:here are my comments. Also, I’m not sure how the suggestion to use YAML improves things. Don’t the same concerns exist there too?
Another option is to add a checksum to the serialized format to ensure integrity. So that if people edit the serialization without updating the checksum you get an error when you try to use it. This would probably catch the more naive programmers. You could even encrypt with a secret key if you want better assurances. This could all be done after stripping comments to make editing comments safe.
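A minimal sketch of how the checksum idea could combine with the dummy comment language from the previous comment, run under Node.js. Everything here is an assumption for illustration: the `comment:` prefix as the marker for comment steps, the envelope with `checksum` and `operations` fields, and the function names. None of it is an existing OpenRefine format.

```javascript
const { createHash } = require("node:crypto");

// Assumption: a comment step is an operation whose expression starts with "comment:".
function isCommentStep(operation) {
  return typeof operation.expression === "string" &&
    operation.expression.startsWith("comment:");
}

// SHA-256 over the recipe with comment steps removed, so comments can be edited freely.
function recipeChecksum(operations) {
  const functional = operations.filter((op) => !isCommentStep(op));
  return createHash("sha256").update(JSON.stringify(functional)).digest("hex");
}

// Wraps the recipe with its checksum (hypothetical envelope format).
function sealRecipe(operations) {
  return { checksum: recipeChecksum(operations), operations };
}

// Returns the steps to apply; throws if anything other than comment steps changed.
function openRecipe(sealed) {
  if (recipeChecksum(sealed.operations) !== sealed.checksum) {
    throw new Error("Recipe was modified after export: checksum mismatch");
  }
  return sealed.operations.filter((op) => !isCommentStep(op));
}

const sealed = sealRecipe([
  { op: "core/text-transform", columnName: "name", expression: "comment:trim names first" },
  { op: "core/text-transform", columnName: "name", expression: "grel:value.trim()" },
]);

// Rewording a comment leaves the checksum valid.
sealed.operations[0].expression = "comment:strip stray whitespace";
console.log(openRecipe(sealed).length);

// Changing a real step is detected.
sealed.operations[1].expression = "grel:value.toUppercase()";
try {
  openRecipe(sealed);
} catch (err) {
  console.log(err.message);
}
```

The hash is taken over `JSON.stringify` output, so reordering the keys inside an operation also counts as a change. A plain hash only catches accidental or naive edits, since anyone can recompute it; the secret-key variant mentioned above would use a keyed construction such as an HMAC instead.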
@audealexandre This issue contains the complete history and status. It will be updated when there is a change.
@gitonthescene Thanks for the implementation suggestions. The advantage of YAML is that it has a native comment syntax, whereas JSON doesn't.
Ah, JSON, not Javascript. Now I get it. Thanks.