curator-driven 1-time bulk loads

Open krchristie opened this issue 2 years ago • 4 comments

Hi,

This morning at MGI's weekly GO curator meeting, we were discussing the need for a mechanism to allow curators to trigger a bulk load for one-off cases that arise during curation.

The example we discussed this morning was this paper (citation below) about the salivary proteome in the mouse, which identifies over 500 proteins present in mouse saliva. From this paper, we would like to load over 500 annotations. In MGI, we had a system where a curator could put a file in the appropriate format into a specific directory and it would get loaded without the curator needing to manually enter so many repetitive annotations. A similar system was in place when I was at SGD. We feel that many groups will have use for such a system.

  • format: @ukemi suggested that the file to be loaded should be in GPAD format, which is simple enough.

  • need for model ID info?: I was also wondering if the curator would need to specify a model for the annotations to go into. In this particular case, I think it would be simplest if all these annotations went into one model; I would be fine with either everything going into a new model OR with making one annotation manually so I can specify an existing model ID. I would strongly prefer that annotations from this paper not be dispersed into the existing gene-centric models, for a couple reasons:
      -- It is not practical for a curator to have to track down model IDs for multiple different gene-centric models.
      -- Some of MGI's gene-centric models are too large to edit in a timely fashion so MGI GO curators have agreed that it is not required to add all new annotations into the existing gene-centric models.

Stopka P, et al. On the saliva proteome of the Eastern European house mouse (Mus musculus musculus) focusing on sexual signalling and immunity. Sci Rep. 2016 Aug 31; 6:32481. PMID:27577013 https://pubmed.ncbi.nlm.nih.gov/27577013/

As it is no longer possible for me to have this kind of file loaded at MGI, this is fairly high priority for me in order to enter these annotations.

-Karen

@kltm @dustine32 - Apologies if I put this in the wrong repository, but I'm sure you'll move it if there's a preferred location.

krchristie avatar May 05 '22 17:05 krchristie

Hi @vanaukenk - any thoughts on when this might make it onto the priority list?

Thanks, Karen

krchristie avatar Feb 09 '23 18:02 krchristie

Sorry, I just stumbled on this while looking for another ticket. This sounds a lot like the bulk imports @dustine32 is doing for SGD, especially:

I think it would be simplest if all these annotations went into one model

@krchristie do you have these in GAF or GPAD format by any chance?

suzialeksander avatar Jul 21 '23 02:07 suzialeksander

@suzialeksander - I haven't spent any time formatting a file yet as I wanted to wait till I knew what format was required for a load. I think I could do either since I can make one annotation in Noctua and export in the appropriate format to use as a template where I'll just need to change the gene ID to generate all the other rows.
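
The template approach described here could be sketched roughly as below, in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the template row holds placeholder values rather than a real exported annotation, and the position of the gene ID column (`id_column`) is assumed and would need to be checked against the actual export.

```python
def expand_template(template_row, gene_ids, id_column):
    """Copy the template row once per gene ID, replacing only the ID column."""
    rows = []
    for gene_id in gene_ids:
        row = list(template_row)  # copy so the template itself is never modified
        row[id_column] = gene_id
        rows.append(row)
    return rows


def to_tsv(rows, header_lines=()):
    """Join rows into tab-separated text, with optional header lines first."""
    lines = list(header_lines)
    lines.extend("\t".join(row) for row in rows)
    return "\n".join(lines) + "\n"


# Placeholder template standing in for the single annotation exported from Noctua.
# Only the reference (PMID) comes from this thread; the other fields are dummies.
template = ["TEMPLATE_GENE_ID", "PLACEHOLDER_TERM", "PMID:27577013", "PLACEHOLDER_EVIDENCE"]

# Hypothetical gene IDs; the real list would come from the curated protein list.
gene_ids = ["EXAMPLE:0001", "EXAMPLE:0002", "EXAMPLE:0003"]

# Assumption: the gene ID sits in column 0 of this toy template.
rows = expand_template(template, gene_ids, id_column=0)
tsv_text = to_tsv(rows)
```

Since only one column changes per row, every other field stays identical to the exported template, which is what keeps the generated file consistent with whatever format the load ends up requiring.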

krchristie avatar Jul 28 '23 16:07 krchristie

Noting that bulk imports into Noctua indeed need to be GPAD, and this can be done. Will be similar to MGI, SGD, WB one-off imports but simpler, especially if these new adds can be put into the same new model without the need to slip them into existing models.

suzialeksander avatar Feb 12 '24 22:02 suzialeksander