kg-covid-19

NCBI GEO

Open pnrobinson opened this issue 4 years ago • 5 comments

There are lots of relevant datasets. Many of them appear to be good old-fashioned microarrays. These can be processed semiautomatically, so we could try to download "lots" of datasets. This will need some further investigation.

https://www.ncbi.nlm.nih.gov/gds?term=coronavirus&cmd=correctspelling
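As an illustration of how the search above could be scripted, here is a minimal Python sketch that runs the same query through the NCBI E-utilities ESearch endpoint against the `gds` database. The function names and the `retmax` default are assumptions made for this example and are not part of any existing script in this repository.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gds_search_url(term, retmax=100):
    """Build an ESearch URL for the GEO DataSets (gds) database."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_gds_ids(response_text):
    """Extract the list of GEO DataSets UIDs from an ESearch JSON response."""
    payload = json.loads(response_text)
    return payload["esearchresult"]["idlist"]


def search_gds(term, retmax=100):
    """Query NCBI (requires network access) and return the matching UIDs."""
    with urlopen(build_gds_search_url(term, retmax)) as response:
        return parse_gds_ids(response.read().decode("utf-8"))
```

Calling `search_gds("coronavirus")` would return a list of UID strings that a later step could resolve to GEO accessions. Anything beyond a handful of requests should respect NCBI's rate limits.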

pnrobinson, Mar 19 '20 19:03

I'm going to sign up for this one, assuming no other takers or higher priority data. Along the way I can also generate biclusters. I also have a biclustering collaborator at UPenn who is willing to donate time to help with dataset creation and applying methods. More on Monday!

realmarcin, Mar 22 '20 00:03

Hi Marcin, we need this data for another project as well. Please contact me about this. We did this about 10 years ago for 10K datasets in GEO, and things went reasonably well with some Bioconductor scripts, although I wouldn't be who I am if I could still find those scripts today....

pnrobinson, Mar 22 '20 00:03

@pnrobinson @realmarcin -

Since we will likely download several GEO datasets, I'm wondering if we want to create (or reuse @pnrobinson's existing scripts for) a primary script that handles the bulk of the analyses that will be the same for these resources. Then we can add specific transformation scripts as needed for each source. This seems more reproducible and robust in the long run. Thoughts? If you agree, I'm happy to help set up a quick call so we can formulate a plan for moving forward.
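To make the proposal concrete, here is a rough Python sketch of a primary script with per-source transformation hooks. Everything in it is hypothetical: the row layout, the `common_steps` filter, the `GSE_EXAMPLE` placeholder accession, and the log2 transform are stand-ins chosen for illustration, not an agreed design.

```python
import math
from typing import Callable, Dict, List

# Hypothetical row layout: one record per probe with a numeric value.
Table = List[Dict[str, object]]

# Registry mapping a dataset accession to its source-specific transform.
TRANSFORMS: Dict[str, Callable[[Table], Table]] = {}


def register_transform(accession):
    """Decorator that registers a source-specific transform for one dataset."""
    def decorator(func):
        TRANSFORMS[accession] = func
        return func
    return decorator


def common_steps(table):
    """Shared processing applied to every dataset: drop rows with no value."""
    return [row for row in table if row.get("value") is not None]


def process_dataset(accession, table):
    """Run the shared steps, then the source-specific transform if one exists."""
    cleaned = common_steps(table)
    transform = TRANSFORMS.get(accession, lambda t: t)
    return transform(cleaned)


@register_transform("GSE_EXAMPLE")  # placeholder accession, not a real dataset
def log2_values(table):
    """Example source-specific step: log2-transform the values."""
    return [{**row, "value": math.log2(row["value"])} for row in table]
```

With this layout, adding a new source means writing one decorated function, while datasets without a registered transform pass through the shared steps unchanged.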

Linking other relevant issues: #11, #12, #19

callahantiff, Mar 23 '20 17:03

@pnrobinson, @realmarcin - did you guys by chance talk about this today? Just curious about your thoughts.

callahantiff, Mar 26 '20 20:03

@pnrobinson - I discussed this ticket a bit with @realmarcin today, and we had some questions. We could possibly discuss it on the n2v call (Thursday 12 pm your time), if you are going to be around.

justaddcoffee, Apr 08 '20 00:04