planemo icon indicating copy to clipboard operation
planemo copied to clipboard

[WIP] Implement automatic tool generation based on the source code of the tool

Open Kulivox opened this issue 3 years ago • 14 comments

Hello, A few months ago I had to implement a bunch of tool wrappers for an open source toolkit and realised how much time could be saved if there was an option of automatic generation of parts of the wrapper in tool_init. That lead me to develop PyGalGen, my take on the functionality of tool_init. I have shown the tool to @martenson, and he has agreed that migration of the part of the functionality that generates the wrapper file is worthwhile.

This pull request contains the prototype of the functionality. The generator is capable of extracting inputs and generating a command section for tools written in Python3 that use the ArgumentParser module. There is still some work to be done, mainly in testing and documentation.

Current functionality: Parser information can be extracted if the following conditions are met:

  • tool uses ArgumentParser
  • the parser init is in a single function

TODO:

  • [x] add documentation

Example input: url Example output: https://pastebin.com/v9z4ZMAD

Suggestions are welcome!

Kulivox avatar Aug 22 '22 15:08 Kulivox

Hi @Kulivox are you aware of https://github.com/hexylena/argparse2tool

bernt-matthias avatar Aug 23 '22 11:08 bernt-matthias

Hi @Kulivox are you aware of https://github.com/hexylena/argparse2tool

Hi, no, I didn't know such project already exists Good to know I guess :D

Kulivox avatar Aug 23 '22 18:08 Kulivox

@bernt-matthias I met with @Kulivox @biomonika and discussed a bit how to proceed, what to do with argparse2tool. They discussed their methods, it seems a bit safer/cleaner than my fake argparse method and I'm happy to support this being in planemo, instead of a2t, and instead of attempting to merge the two as I think the two of us are the primary users. So we'll let them co-exist rather than spending the time to try and merge.

It would be nice to have a2t's functionality with both cwl + galaxy export, but I think that's not a priority, we can always direct people to use a2t instead if they need that functionality, just means I'll probably do less maintenance of it. @Kulivox's implement is more complete it seems, a2t is missing argparse command groups currently.

And TIL planemo has a text tool template inside, maybe that's a use case for our galaxyxml (but maybe that's also not worth the return on investment?)

@Kulivox would you be open to handling ('output', type=argparse.FileType('w')) as a tool output? I think that's the only major difference I see, and I'm pretty sure I'm the only person alive who uses it, but, I'd still like it. We also have some existing tool test cases, if you wanted to use those as a basis for the tests here.

hexylena avatar Sep 08 '22 15:09 hexylena

@hexylena Sure, I'll add the type mapping in, and thanks; I think I'll also use the tool examples you provided to create tests

Kulivox avatar Sep 11 '22 21:09 Kulivox

I'll be doing a final pass over the functionality of autpygen during the next few days, but I don't think I'll be changing much, maybe adding support for some very specific argument types and some cleanup of the output. The PR is IMHO ready for review, I have updated automatically generated docs, but I won't be adding custom docs until the functionality is finalised. Would it be possible for you @bernt-matthias to take a look at this PR? Or should I contact someone else?

Kulivox avatar Sep 30 '22 14:09 Kulivox

@Kulivox will you be at the EGD in Freiburg next week?

bernt-matthias avatar Sep 30 '22 14:09 bernt-matthias

@bernt-matthias No, unfortunately

Kulivox avatar Sep 30 '22 14:09 Kulivox

Hey @Kulivox would you be available for a VC on Thursday or Friday? I guess the easiest for a review would be if you could walk us through your code.

Maybe @hexylena and @nsoranzo would like to join?

bernt-matthias avatar Oct 04 '22 09:10 bernt-matthias

sounds good.

On 04-10-2022 11:21, M Bernt wrote:

Hey @Kulivox https://github.com/Kulivox would you be available for a VC on Thursday or Friday? I guess the easiest for a review would be if you could walk us through your code.

Maybe @hexylena https://github.com/hexylena and @nsoranzo https://github.com/nsoranzo would like to join?

— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1266653402, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADP7O5EPPB7HU5RY55TZ7LWBPZKRANCNFSM57IE7DKQ. You are receiving this because you were mentioned.Message ID: @.***>

hexylena avatar Oct 04 '22 09:10 hexylena

Hi, I should be available on Friday after 4 pm CET. Does that work for you too? @bernt-matthias @hexylena

Kulivox avatar Oct 04 '22 20:10 Kulivox

Hi @Kulivox I will be on the train from approx 2pm. So unfortunately, 4pm won't work for me.

bernt-matthias avatar Oct 05 '22 09:10 bernt-matthias

Ah, that is unfortunate. I'll be at work from 7-15 on Friday, so I can't do the call earlier. What about some later date? I should be available anytime during the weekend and after 4pm on weekdays @bernt-matthias

Kulivox avatar Oct 05 '22 18:10 Kulivox

Hi @Kulivox Due to vacation I will only have time from Nov 7 :(. But maybe @hexylena and @nsoranzo could help earlier...?

bernt-matthias avatar Oct 06 '22 10:10 bernt-matthias

Hi, @hexylena @nsoranzo, would it be possible to arrange a VC where we take a look at the PR as @bernt-matthias suggested? I should be available any day apart from Tuesday.

Kulivox avatar Oct 07 '22 06:10 Kulivox

Hi @bernt-matthias , I hope you have had a great time away. If you're up for it, could we please continue the review? Thanks a lot for your help!

biomonika avatar Nov 07 '22 20:11 biomonika

Hi @Kulivox and @hexylena could you suggest a date / dates during next week? Currently I'm available except for Wednesday Morning (Berlin time).

bernt-matthias avatar Nov 08 '22 10:11 bernt-matthias

@bernt-matthias I am free on Monday afternoon (16:30 + CET), Tuesday evening (18:00+) and then on any day from 7-9 AM CET and from 16:15-23:59 CET

Kulivox avatar Nov 08 '22 15:11 Kulivox

Monday afternoon (16:30 + CET) would be cool for me.

bernt-matthias avatar Nov 08 '22 16:11 bernt-matthias

I can make that as well

On Tue, 8 Nov 2022, 17:18 M Bernt, @.***> wrote:

Monday afternoon (16:30 + CET) would be cool for me.

— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1307474587, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADP7O4UDZMRBXRZJOVE5KDWHJ4IXANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>

hexylena avatar Nov 09 '22 07:11 hexylena

@hexylena @bernt-matthias Alright then, I'll set up a meeting starting at 17:00 because I might not be ready at 16:30 (train journey home)

Kulivox avatar Nov 12 '22 10:11 Kulivox

Wonderful.

bernt-matthias avatar Nov 13 '22 09:11 bernt-matthias

Hey @Kulivox .. thanks for keeping this alive. I feel bad that I have not tested this so for. Somehow the fell of the table. Please feel free to ping me in order to remind me :)

bernt-matthias avatar Feb 09 '23 16:02 bernt-matthias

No worries, I also put the PR on hold because work and school took up most of my time :) Also, I was just setting up my dev environment and updating the branch, I didn't mean to ping you with the push :)

On Thu, 9 Feb 2023, 17:34 M Bernt, @.***> wrote:

Hey @Kulivox https://github.com/Kulivox .. thanks for keeping this alive. I feel bad that I have not tested this so for. Somehow the fell of the table. Please feel free to ping me in order to remind me :)

— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#issuecomment-1424477936, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFCX3APMR6IENXDQHLMKCCDWWUMAJANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>

Kulivox avatar Feb 09 '23 16:02 Kulivox

Thank you for the test. I've also done some work on the tool, fixed the build steps and reworked the generation of XML (previously, it was done manually, now it is done using lxml) I wanted to test the tool on some projects before pushing the update to the branch, but I guess you were faster than me :) I'll go through your suggestions over the weekend.

On Fri, 3 Mar 2023, 14:29 M Bernt, @.***> wrote:

@.**** commented on this pull request.

Finally found the time to test the PR. I used calisp https://github.com/kinestetika/Calisp/blob/b32a1c3a4706b1a87ba0b34c9f9a19ec28fdd789/src/calisp/main.py#L33 for the test.

I have a few minor modifications that were needed to get it running. If you like I could commit them to your branch or open a PR to your branch (whatever you like most).

Really liked the outcome and only have a few more comments so far. Have another tool in the pipeline that I could test later.

Few comments:

  • default values should go to value="..." (for selects the defualt options should have selected="true")
  • in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?
  • indentation of #if statements should be 4 spaces. It is 3 spaces at the moment.
  • text and data parameters should be single quoted in the command.
  • help should be quoted, e.g. & &quot` ...
  • Not sure if we want to keep the comment lines (starting with two dashes ##) in the command

In planemo/tool_builder.py https://github.com/galaxyproject/planemo/pull/1263#discussion_r1124190803 :

REUSING_MACROS_MESSAGE = ( "Macros file macros.xml already exists, assuming " " it has relevant planemo-generated definitions." ) DEFAULT_CWL_VERSION = "v1.0"

TOOL_TEMPLATE = """

Would be cool to have @._VERSION@@._SUFFIX@" and define a macros section

<macros>
    <token ***@***.***_VERSION@">{{version}}</token>
    <token ***@***.***_SUFFIX@">0</token>
</macros>

(or add the two macro tokens to an already existing macros section)

In planemo/autopygen/xml/xml_utils.py https://github.com/galaxyproject/planemo/pull/1263#discussion_r1124196570 :

  • return formatted_xml_elem("repeat", attributes, depth, inner)

+def param(param_info: ParamInfo, depth: int, body: Optional[str] = None):

  • attributes = {
  •    "argument": param_info.argument,
    
  •    "type": param_info.type,
    
  •    "format": param_info.format,
    
  • }
  • if param_info.param_type.is_flag:
  •    attributes["truevalue"] = param_info.argument
    
  •    attributes["falsevalue"] = ""
    
  •    attributes["checked"] = "false"
    
  • attributes["optional"] = str(param_info.optional).lower()

optional can be omitted from booleans.

— Reply to this email directly, view it on GitHub https://github.com/galaxyproject/planemo/pull/1263#pullrequestreview-1323344465, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFCX3AJR7T6LXFKVAVVW4YLW2HW3ZANCNFSM57IE7DKQ . You are receiving this because you were mentioned.Message ID: @.***>

Kulivox avatar Mar 03 '23 13:03 Kulivox

Hi @bernt-matthias, thanks for the fixes you pushed yesterday. I went through most of your suggestions:

default values should go to value="..." (for selects the default options should have selected="true")

Added in one of the commits for basic parameters and selects. Repeats will ignore default values because I'm not sure if there is a way to define default values for repeats.

in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?

I wasn't able to replicate that; this is my output. The required parameters have optional set to false https://pastebin.com/ANqbvMkw

indentation of #if statements should be 4 spaces. It is 3 spaces at the moment. text and data parameters should be single quoted in the command. Not sure if we want to keep the comment lines (starting with two dashes ##) in the command

Implemented and updated tests

help should be quoted, e.g. & &quot` ...

I'm not sure what you mean by this. Do you mean the help strings in the inputs?

Btw, thanks for reminding me generator is also able to use the formatted help from the argparse. I've added the help generation in the last commit.

Kulivox avatar Mar 05 '23 13:03 Kulivox

Hi @bernt-matthias, thanks for the fixes you pushed yesterday. I went through most of your suggestions:

default values should go to value="..." (for selects the default options should have selected="true")

Added in one of the commits for basic parameters and selects. Repeats will ignore default values because I'm not sure if there is a way to define default values for repeats.

I think that is not possible. I noted this already here: https://github.com/galaxyproject/galaxy/issues/6806#issuecomment-559955872

in my experiment with calisp all arguments were optional .. but this might be a problem of the argparse specification of the tool?

I wasn't able to replicate that; this is my output. The required parameters have optional set to false https://pastebin.com/ANqbvMkw

Ok. Might well be that I missed that. Btw. booleans should either omit the attribute of have optional="true"

help should be quoted, e.g. & &quot` ...

I'm not sure what you mean by this. Do you mean the help strings in the inputs?

Exactly. Otherwise XML might end up invalid. For instance if " is used.

Btw, thanks for reminding me generator is also able to use the formatted help from the argparse. I've added the help generation in the last commit.

Cool

bernt-matthias avatar Mar 06 '23 09:03 bernt-matthias

Ok. Might well be that I missed that. Btw. booleans should either omit the attribute of have optional="true"

I've implemented that a few weeks ago, but I did it after I created the Pastebin. Sorry for the confusion.

help should be quoted, e.g. & &quot` ...

This is done automatically by lxml

Kulivox avatar Mar 20 '23 10:03 Kulivox

Currently testing another tool. If I find some things that might be improved I will add them here:

If there is a command line boolean flag: --flag it would be great to define it with truevalue="--flag" falsevalue="--flag" checked="true/false" (checked depending on the default) .. the value attribute should be omitted.

Use it in the command just as $flag.

Or wouldn't this work for:

    parser.add_argument('--flag', metavar='BOOL', type=bool, default=true)

?

bernt-matthias avatar Jun 05 '23 18:06 bernt-matthias

I guess we should merge this soon. It's really good and usable in my opinion. We can still improve it and fix bugs later on... or what do you think @mvdbeek @nsoranzo and @hexylena ?

bernt-matthias avatar Jun 05 '23 18:06 bernt-matthias

This is good to go in my opinion. Not sure about docs: Obviously a good idea, but I don't know where it would fit best and what the extent/form should be.

bernt-matthias avatar Sep 01 '23 09:09 bernt-matthias