Automatic registration of pipelines at bio.tools

Open apeltzer opened this issue 6 years ago • 25 comments

Apparently it would be possible to fetch our metadata and use it to register new pipelines, or update our existing ones, so that they are listed at bio.tools:

https://biotools.readthedocs.io/en/latest/api_reference.html#register-a-tool
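As a rough illustration of what such a registration call might look like, here is a minimal Python sketch that builds the request without sending it. The endpoint URL, the `Token` authorization scheme and the helper name `build_registration_request` are assumptions based on a reading of the API reference linked above, so check them against the docs before relying on them.

```python
import json
import urllib.request

# Assumed endpoint; verify against the bio.tools API reference.
BIOTOOLS_API = "https://bio.tools/api/tool/"


def build_registration_request(entry: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would register `entry`.

    `entry` is the tool description as a plain dict; `token` is the API
    token of the account that should own the entry.
    """
    body = json.dumps(entry).encode("utf-8")
    return urllib.request.Request(
        BIOTOOLS_API,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header format: "Token <api token>".
            "Authorization": f"Token {token}",
        },
    )


if __name__ == "__main__":
    request = build_registration_request(
        {
            "name": "nf-core-example",  # hypothetical pipeline entry
            "description": "Example pipeline entry.",
            "homepage": "https://nf-co.re",
        },
        token="PLACEHOLDER",
    )
    # Inspect the request; sending it would be urllib.request.urlopen(request).
    print(request.get_method(), request.full_url)
```

Keeping the request construction separate from the network call makes it easy to inspect or test the payload before anything is submitted.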

Not sure it's a tools thing to do, or whether we could rely on the metadata already used for the webpage and submit pipelines as part of the frequent updates that the webpage already gets (so we might need to transfer this issue then...)

@ewels @sven1103 might be interested in this :-)

apeltzer avatar Jan 11 '19 15:01 apeltzer

I like this idea

maxulysse avatar Jan 11 '19 15:01 maxulysse

I think we should transfer this to the web page repo, yes. We should only ping the API when we detect a new release. It shouldn’t be too tricky to do, just need to parse the local JSON first and then look for relevant changes.
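A minimal sketch of that "look for relevant changes" step, in Python for illustration. It assumes the local JSON can be reduced to a mapping from pipeline name to a list of release tags; that structure and the function name `new_releases` are hypothetical, not the website's actual format.

```python
from typing import Dict, List


def new_releases(
    previous: Dict[str, List[str]], current: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Return, per pipeline, the release tags present now but not before."""
    changes: Dict[str, List[str]] = {}
    for pipeline, tags in current.items():
        seen = set(previous.get(pipeline, []))
        fresh = [tag for tag in tags if tag not in seen]
        if fresh:
            changes[pipeline] = fresh
    return changes


if __name__ == "__main__":
    before = {"rnaseq": ["1.0"]}
    after = {"rnaseq": ["1.0", "1.1"], "hic": ["1.0"]}
    # Only pipelines with at least one unseen tag would trigger an API call.
    print(new_releases(before, after))
```

The same result could drive both the bio.tools update and the release tweet, since both only need to fire when a pipeline gains a new tag.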

Note that the same is needed for automated tweets announcing pipeline releases, so the two issues could tie together nicely.

ewels avatar Jan 11 '19 15:01 ewels

Transferred to the webpage. I might look into this in the next couple of days - combining this with the Twitter API :)

apeltzer avatar Jan 11 '19 15:01 apeltzer

x-ref nf-core/nf-co.re#24

ewels avatar Jan 11 '19 17:01 ewels

Do you like PHP..? 😉

ewels avatar Jan 11 '19 17:01 ewels

Aaah no 🙈

apeltzer avatar Jan 11 '19 18:01 apeltzer

Sarek is already there and has a tonne of annotation - could be good to use as a model: https://bio.tools/sarek

ewels avatar Jul 22 '20 13:07 ewels

Looking at this again quickly, I guess that we will need to make a bot account for automated management of pipelines. There doesn't seem to be a concept of organisations / teams for multiple tools. We can then use this bot account to do everything with the API (and perhaps add all the core team members to each pipeline for easy manual management).

ewels avatar Jan 11 '21 13:01 ewels

Ok, we now also have a collection, a subdomain and a consortium for nf-core:

  • Collection: https://bio.tools/t?collectionID=%22nf-core%22
  • Subdomain: https://nf-core.bio.tools
  • Consortium: https://bio.tools/t?credit=%22nf-core%22

The collection tag shows up on a tool right at the bottom. The consortium is shown under credits. The subdomain doesn't seem to be shown anywhere.

Consortium and collection are both managed through the tool administration. The subdomain seems trickier - currently tied to just my account and I can't share?

I guess it doesn't hurt to have all three? So we can aim to keep all nf-core pipelines in both of these as we go forwards. Should hopefully not be a problem with the API updates.
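To keep every pipeline in the collection and the consortium once entries are managed via the API, something like the following Python sketch could be applied to each entry before it is submitted. The `collectionID` and `credit` keys mirror the query parameters in the URLs above; the `typeEntity` value and the helper name `with_nf_core_tags` are assumptions about the bio.tools JSON format and should be checked against an existing entry such as Sarek's.

```python
def with_nf_core_tags(entry: dict) -> dict:
    """Return a copy of `entry` tagged with the nf-core collection and consortium.

    The input dict is not modified, and calling this twice adds nothing new.
    """
    tagged = dict(entry)

    collections = list(tagged.get("collectionID", []))
    if "nf-core" not in collections:
        collections.append("nf-core")
    tagged["collectionID"] = collections

    credits = list(tagged.get("credit", []))
    if not any(credit.get("name") == "nf-core" for credit in credits):
        # Assumed shape of a consortium credit; verify against a real entry.
        credits.append({"name": "nf-core", "typeEntity": "Consortium"})
    tagged["credit"] = credits

    return tagged


if __name__ == "__main__":
    print(with_nf_core_tags({"name": "nf-core-example"}))
```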

ewels avatar Jan 11 '21 13:01 ewels

NB: All of the above just trialled with Sarek so far, with the help of @MaxUlysse. @JoseEspinosa will now also try to add some more workflows manually. The aim will be to manage these via the API in the future as well as adding all remaining pipelines via that method.

ewels avatar Jan 11 '21 13:01 ewels

Hi @ewels, I have a JSON ready for the nf-core/rnaseq pipeline but I have some comments:

  • If I upload the file myself it will belong to me, so I think it might be worth creating an nf-core bio.tools account instead. Let me know what you think.
  • The name field does not allow /, meaning that the pipeline cannot be named nf-core/rnaseq. For now I have named it nf-core-rnaseq, but you may have other suggestions.
  • The same applies to biotoolsID.
  • I gave edit permissions to @ewels, @MaxUlysse and @JoseEspinosa. If someone else wants to or should be added, let me know.

I can share the JSON with any of you so you can see what it looks like before I upload it.
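The naming workaround described in the list above can be sketched in a few lines of Python. The function name `biotools_id` is hypothetical; it only encodes the nf-core-rnaseq convention used here, not any rule enforced by bio.tools beyond the ban on /.

```python
def biotools_id(pipeline: str) -> str:
    """Map 'nf-core/rnaseq' (or just 'rnaseq') to 'nf-core-rnaseq'."""
    short_name = pipeline.split("/")[-1]
    if short_name.startswith("nf-core-"):
        return short_name  # already follows the convention
    return f"nf-core-{short_name}"


if __name__ == "__main__":
    for name in ("nf-core/rnaseq", "sarek", "nf-core-hic"):
        print(name, "->", biotools_id(name))
```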

JoseEspinosa avatar Jan 12 '21 09:01 JoseEspinosa

Nice - thanks @JoseEspinosa! I just made an account with username @nf-core - please also give that permissions.

ewels avatar Jan 12 '21 11:01 ewels

I think nf-core-rnaseq is good - most pipeline names will be far too generic without the nf-core- prefix.

@MaxUlysse - what do you think about renaming Sarek to be nf-core-sarek? I imagine varying from this convention will make automation difficult.

ewels avatar Jan 12 '21 11:01 ewels

I think we do need something like that indeed. Otherwise, I assume a pipeline named rnaseq will be difficult to find. I'll see what I can do now

maxulysse avatar Jan 12 '21 11:01 maxulysse

I think that once the id is provided it cannot be changed. @MaxUlysse, I can generate a new entry for Sarek named nf-core-sarek belonging to the nf-core account. I will give write permissions to @ewels and @MaxUlysse so you can modify it. Actually, I downloaded the JSON file from the sarek entry and modified it for nf-core-rnaseq, so it will be very easy to upload. By the way, here is the nf-core-rnaseq entry

JoseEspinosa avatar Jan 12 '21 11:01 JoseEspinosa

I can try and ask the team to change the id to nf-core-sarek, or we can keep this one ('sarek'), and make it as the old version of nf-core-sarek, as I already did for caw: https://bio.tools/caw

maxulysse avatar Jan 12 '21 11:01 maxulysse

I think creating a new version, as was done for caw, would be best

JoseEspinosa avatar Jan 12 '21 11:01 JoseEspinosa

I have added the following pipelines to the nf-core bio.tools collection:

  • nf-core/sarek
  • nf-core/rnaseq
  • nf-core/chipseq
  • nf-core/hic
  • nf-core/atacseq
  • nf-core/cageseq
  • nf-core/methylseq
  • nf-core/viralrecon
  • nf-core/smrnaseq

In addition, in this repository I have gathered the JSON files I used to register the pipelines, along with some API commands that might be useful for automating the registration. @ewels, @MaxUlysse, @JoseEspinosa and the nf-core bio.tools user currently have permission to modify the entries.

JoseEspinosa avatar Jan 22 '21 06:01 JoseEspinosa

Awesome work @JoseEspinosa - thanks for this! When we get back to the automation work I'm sure that this will be super helpful! (Especially how you've written down the API commands 🙏🏻 )

ewels avatar Jan 25 '21 09:01 ewels

@JoseEspinosa Shall I manually update Sarek until we have the automatic registration?

maxulysse avatar Jan 26 '21 16:01 maxulysse

@MaxUlysse You can do it manually, or alternatively update the corresponding JSON file through the bio.tools API. I collected some commands here. If you need anything, let me know.

JoseEspinosa avatar Jan 26 '21 21:01 JoseEspinosa

@ewels I might help with the automation, but to be fair, right now I don't have any clue on how/where to start... 😓

JoseEspinosa avatar Jan 26 '21 21:01 JoseEspinosa

I'll do it manually for this time, but I'll try to look into the API and help both of you out

maxulysse avatar Jan 26 '21 21:01 maxulysse

Bumping back to the nf-co.re repo again as this is not nf-core/tools. Tools is a CLI program that does not know about the current state of pipelines or releases. The website does respond to GitHub releases though, and can trigger automated events such as this.

ewels avatar Jun 02 '22 21:06 ewels

For example, this is the code that automatically tweets when there is a new pipeline release:

https://github.com/nf-core/nf-co.re/blob/f39ca67bc7652c6605b5877678cdf547342dc9b7/update_pipeline_details.php#L221-L222

ewels avatar Jun 02 '22 21:06 ewels