[WIP] Implement automatic tool generation based on the source code of the tool
Hello, A few months ago I had to implement a bunch of tool wrappers for an open source toolkit and realised how much time could be saved if there was an option of automatic generation of parts of the wrapper in tool_init. That lead me to develop PyGalGen, my take on the functionality of tool_init. I have shown the tool to @martenson, and he has agreed that migration of the part of the functionality that generates the wrapper file is worthwhile.
This pull request contains the prototype of the functionality. The generator is capable of extracting inputs and generating a command section for tools written in Python3 that use the ArgumentParser module. There is still some work to be done, mainly in testing and documentation.
Current functionality: Parser information can be extracted if the following conditions are met:
- tool uses ArgumentParser
- the parser init is in a single function
TODO:
- [x] add documentation
Example input: url Example output: https://pastebin.com/v9z4ZMAD
Suggestions are welcome!
Hi @Kulivox are you aware of https://github.com/hexylena/argparse2tool
Hi @Kulivox are you aware of https://github.com/hexylena/argparse2tool
Hi, no, I didn't know such project already exists Good to know I guess :D
@bernt-matthias I met with @Kulivox @biomonika and discussed a bit how to proceed, what to do with argparse2tool. They discussed their methods, it seems a bit safer/cleaner than my fake argparse method and I'm happy to support this being in planemo, instead of a2t, and instead of attempting to merge the two as I think the two of us are the primary users. So we'll let them co-exist rather than spending the time to try and merge.
It would be nice to have a2t's functionality with both cwl + galaxy export, but I think that's not a priority, we can always direct people to use a2t instead if they need that functionality, just means I'll probably do less maintenance of it. @Kulivox's implement is more complete it seems, a2t is missing argparse command groups currently.
And TIL planemo has a text tool template inside, maybe that's a use case for our galaxyxml (but maybe that's also not worth the return on investment?)
@Kulivox would you be open to handling ('output', type=argparse.FileType('w')) as a tool output? I think that's the only major difference I see, and I'm pretty sure I'm the only person alive who uses it, but, I'd still like it. We also have some existing tool test cases, if you wanted to use those as a basis for the tests here.
@hexylena Sure, I'll add the type mapping in, and thanks; I think I'll also use the tool examples you provided to create tests
I'll be doing a final pass over the functionality of autpygen during the next few days, but I don't think I'll be changing much, maybe adding support for some very specific argument types and some cleanup of the output. The PR is IMHO ready for review, I have updated automatically generated docs, but I won't be adding custom docs until the functionality is finalised. Would it be possible for you @bernt-matthias to take a look at this PR? Or should I contact someone else?
@Kulivox will you be at the EGD in Freiburg next week?
@bernt-matthias No, unfortunately
Hey @Kulivox would you be available for a VC on Thursday or Friday? I guess the easiest for a review would be if you could walk us through your code.
Maybe @hexylena and @nsoranzo would like to join?
sounds good.
On 04-10-2022 11:21, M Bernt wrote:
Hey @Kulivox https://github.com/Kulivox would you be available for a VC on Thursday or Friday? I guess the easiest for a review would be if you could walk us through your code.
Maybe @hexylena https://github.com/hexylena and @nsoranzo https://github.com/nsoranzo would like to join?
— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1266653402, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADP7O5EPPB7HU5RY55TZ7LWBPZKRANCNFSM57IE7DKQ. You are receiving this because you were mentioned.Message ID: @.***>
Hi, I should be available on Friday after 4 pm CET. Does that work for you too? @bernt-matthias @hexylena
Hi @Kulivox I will be on the train from approx 2pm. So unfortunately, 4pm won't work for me.
Ah, that is unfortunate. I'll be at work from 7-15 on Friday, so I can't do the call earlier. What about some later date? I should be available anytime during the weekend and after 4pm on weekdays @bernt-matthias
Hi @Kulivox Due to vacation I will only have time from Nov 7 :(. But maybe @hexylena and @nsoranzo could help earlier...?
Hi, @hexylena @nsoranzo, would it be possible to arrange a VC where we take a look at the PR as @bernt-matthias suggested? I should be available any day apart from Tuesday.
Hi @bernt-matthias , I hope you have had a great time away. If you're up for it, could we please continue the review? Thanks a lot for your help!
Hi @Kulivox and @hexylena could you suggest a date / dates during next week? Currently I'm available except for Wednesday Morning (Berlin time).
@bernt-matthias I am free on Monday afternoon (16:30 + CET), Tuesday evening (18:00+) and then on any day from 7-9 AM CET and from 16:15-23:59 CET
Monday afternoon (16:30 + CET) would be cool for me.
I can make that as well
On Tue, 8 Nov 2022, 17:18 M Bernt, @.***> wrote:
Monday afternoon (16:30 + CET) would be cool for me.
— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1307474587, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADP7O4UDZMRBXRZJOVE5KDWHJ4IXANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>
@hexylena @bernt-matthias Alright then, I'll set up a meeting starting at 17:00 because I might not be ready at 16:30 (train journey home)
Wonderful.
Hey @Kulivox .. thanks for keeping this alive. I feel bad that I have not tested this so for. Somehow the fell of the table. Please feel free to ping me in order to remind me :)
No worries, I also put the PR on hold because work and school took up most of my time :) Also, I was just setting up my dev environment and updating the branch, I didn't mean to ping you with the push :)
On Thu, 9 Feb 2023, 17:34 M Bernt, @.***> wrote:
Hey @Kulivox https://github.com/Kulivox .. thanks for keeping this alive. I feel bad that I have not tested this so for. Somehow the fell of the table. Please feel free to ping me in order to remind me :)
— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1424477936, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFCX3APMR6IENXDQHLMKCCDWWUMAJANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>
Thank you for the test. I've also done some work on the tool, fixed the build steps and reworked the generation of XML (previously, it was done manually, now it is done using lxml) I wanted to test the tool on some projects before pushing the update to the branch, but I guess you were faster than me :) I'll go through your suggestions over the weekend.
On Fri, 3 Mar 2023, 14:29 M Bernt, @.***> wrote:
@.**** commented on this pull request.
Finally found the time to test the PR. I used calisp https://github.com/kinestetika/Calisp/blob/b32a1c3a4706b1a87ba0b34c9f9a19ec28fdd789/src/calisp/main.py#L33 for the test.
I have a few minor modifications that were needed to get it running. If you like I could commit them to your branch or open a PR to your branch (whatever you like most).
Really liked the outcome and only have a few more comments so far. Have another tool in the pipeline that I could test later.
Few comments:
- default values should go to value="..." (for selects the defualt options should have selected="true")
- in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?
- indentation of #if statements should be 4 spaces. It is 3 spaces at the moment.
- text and data parameters should be single quoted in the command.
- help should be quoted, e.g. & "` ...
- Not sure if we want to keep the comment lines (starting with two dashes ##) in the command
In planemo/tool_builder.py https://github.com/galaxyproject/planemo/pull/1263#discussion_r1124190803 :
REUSING_MACROS_MESSAGE = ( "Macros file macros.xml already exists, assuming " " it has relevant planemo-generated definitions." ) DEFAULT_CWL_VERSION = "v1.0"
TOOL_TEMPLATE = """
Would be cool to have @._VERSION@@._SUFFIX@" and define a macros section
<macros> <token ***@***.***_VERSION@">{{version}}</token> <token ***@***.***_SUFFIX@">0</token> </macros>(or add the two macro tokens to an already existing macros section)
In planemo/autopygen/xml/xml_utils.py https://github.com/galaxyproject/planemo/pull/1263#discussion_r1124196570 :
- return formatted_xml_elem("repeat", attributes, depth, inner)
+def param(param_info: ParamInfo, depth: int, body: Optional[str] = None):
- attributes = {
"argument": param_info.argument,"type": param_info.type,"format": param_info.format,- }
- if param_info.param_type.is_flag:
attributes["truevalue"] = param_info.argumentattributes["falsevalue"] = ""attributes["checked"] = "false"- attributes["optional"] = str(param_info.optional).lower()
optional can be omitted from booleans.
— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#pullrequestreview-1323344465, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFCX3AJR7T6LXFKVAVVW4YLW2HW3ZANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>
Hi @bernt-matthias, thanks for the fixes you pushed yesterday. I went through most of your suggestions:
default values should go to value="..." (for selects the default options should have selected="true")
Added in one of the commits for basic parameters and selects. Repeats will ignore default values because I'm not sure if there is a way to define default values for repeats.
in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?
I wasn't able to replicate that; this is my output. The required parameters have optional set to false https://pastebin.com/ANqbvMkw
indentation of #if statements should be 4 spaces. It is 3 spaces at the moment. text and data parameters should be single quoted in the command. Not sure if we want to keep the comment lines (starting with two dashes ##) in the command
Implemented and updated tests
help should be quoted, e.g. & "` ...
I'm not sure what you mean by this. Do you mean the help strings in the inputs?
Btw, thanks for reminding me generator is also able to use the formatted help from the argparse. I've added the help generation in the last commit.
Hi @bernt-matthias, thanks for the fixes you pushed yesterday. I went through most of your suggestions:
default values should go to value="..." (for selects the default options should have selected="true")
Added in one of the commits for basic parameters and selects. Repeats will ignore default values because I'm not sure if there is a way to define default values for repeats.
I think that is not possible. I noted this already here: https://github.com/galaxyproject/galaxy/issues/6806#issuecomment-559955872
in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?
I wasn't able to replicate that; this is my output. The required parameters have optional set to false https://pastebin.com/ANqbvMkw
Ok. Might well be that I missed that. Btw. booleans should either omit the attribute of have optional="true"
help should be quoted, e.g. & "` ...
I'm not sure what you mean by this. Do you mean the help strings in the inputs?
Exactly. Otherwise XML might end up invalid. For instance if " is used.
Btw, thanks for reminding me generator is also able to use the formatted help from the argparse. I've added the help generation in the last commit.
Cool
Ok. Might well be that I missed that. Btw. booleans should either omit the attribute of have optional="true"
I've implemented that a few weeks ago, but I did it after I created the Pastebin. Sorry for the confusion.
help should be quoted, e.g. & "` ...
This is done automatically by lxml
Currently testing another tool. If I find some things that might be improved I will add them here:
If there is a command line boolean flag: --flag it would be great to define it with truevalue="--flag" falsevalue="--flag" checked="true/false" (checked depending on the default) .. the value attribute should be omitted.
Use it in the command just as $flag.
Or wouldn't this work for:
parser.add_argument('--flag', metavar='BOOL', type=bool, default=true)
?
I guess we should merge this soon. It's really good and usable in my opinion. We can still improve it and fix bugs later on... or what do you think @mvdbeek @nsoranzo and @hexylena ?
This is good to go in my opinion. Not sure about docs: Obviously a good idea, but I don't know where it would fit best and what the extent/form should be.