
Joining forces with GridMap

Open dan-blanchard opened this issue 10 years ago • 1 comments

In October 2012, long before pythongrid was pip-installable, I took it upon myself to fork it (as GridMap) because I wanted to restructure many things and it wasn't on GitHub yet. In the course of doing so, I added a number of improvements over the then-current version of pythongrid:

  • Made everything Python 3 compatible (and require 2.7+)
  • Configuration is done via environment variables instead of modifying a Python file in site-packages
  • More documentation.
  • PEP8 clean up
  • Made it pip installable
  • Set up Travis CI and Coveralls for automated testing (and added more tests)
  • Removed a bunch of deprecated things
  • Split libpythongrid.py code up into separate modules for better readability
  • Process cpu/memory usage stats are now gathered using psutil
  • All jobs get killed if you kill the submitting Python process.
  • Other bug fixes.

One of the main reasons I forked it was that I experimented for a while with using Redis for managing the job inputs and outputs instead of sending everything with ØMQ, but I later saw the light and switched that back to how you guys were doing it previously.

Anyway, I feel like since I've returned to using ØMQ, our two projects aren't that different anymore. Really, I've just modernized things, added documentation, made configuration simpler, and fixed some bugs. At this point, I feel like it just makes sense to join forces, since we are both trying to do the same thing.

Trying to make a pull request with my changes would be very difficult (and essentially pointless for you to review), because we've both renamed files and moved things around since I initially grabbed things from your old Google Code repository in August of 2012.

As I've already got a number of issues/milestones/etc. we're tracking over on the GridMap repository, I'd like to invite you to check out our code and consider becoming a contributor to GridMap (and redirecting people there) if you like what you see.

Also, I should probably mention that I'm the current primary maintainer of DRMAA Python, which both our projects rely on, and I was planning on adding a link on the DRMAA Python page to GridMap. That's actually one of the reasons I am writing you this in the first place. I didn't want to confuse people by linking to both projects, since they're so similar, and since GridMap is easier to use and Python 3 compatible, I was just going to link to it. That made me feel guilty about never trying to give back to you guys after forking your project.

dan-blanchard avatar Aug 04 '14 17:08 dan-blanchard

Following.

gcuster1991 avatar Jan 22 '21 22:01 gcuster1991

You could use the dplyr filter function to filter the db with a vector of your taxa, e.g.:

db %>% filter(., Genus %in% taxa$genus)

where db is the object holding the funfun db and taxa is the object with your taxonomies.

matevzl533 avatar Jan 23 '21 21:01 matevzl533

Thanks for the reply, but sorry @matevzl533, I couldn't understand it. Right now I have a count file (counts.tsv) and a taxonomy file (taxa.tsv). How should I proceed if I want to use FungalTraits with my data? Can I assign functions to each ASV? Could you write the code in a bit more detail? I am not an expert in R, so it's difficult for me to understand, and I apologize for that.

Appreciate your help.

Sandipan

sandipansamaddar avatar Jan 26 '21 18:01 sandipansamaddar

ok.

Funfun db: db is the object into which you have loaded the fungal traits file with:

db <- fungal_traits()

Genus is the column in db that contains the fungal genera. You can check it with:

db$Genus

Your taxonomy file: taxa is the object into which you load the taxa.tsv file. I guess it has columns for different taxonomic levels (e.g. family, genus, species). With taxa$genus you select the column in the taxa object that contains the fungal genera.

Filtering method: then you just run the code from above:

db %>%
 filter(., Genus %in% taxa$genus)

This will list the fungal traits db, but only the rows whose genera also occur in your taxa object.

Possible additional problem: If your taxonomic names are clean, e.g. Glomus, then it works as described above. If they carry a prefix denoting the taxonomic level, like f__Glomeraceae or g__Glomus (Greengenes format), then you have to remove the prefix with something like:

taxa <- taxa %>%
 mutate(genus = str_split_fixed(genus, "__", 2)[, 2])

str_split_fixed is a function from the stringr library; filter and mutate are from the dplyr library.
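For anyone who wants the whole thing in one place, here is a minimal end-to-end sketch on toy data. It is only an illustration under some assumptions: the two small tables stand in for the output of fungal_traits() and for taxa.tsv, the column names asv and trait are hypothetical, and the prefix is stripped with stringr's str_remove rather than str_split_fixed so that names without a prefix pass through unchanged. The last step, which attaches a genus-level trait to each ASV, is one possible way to do it, not necessarily what FungalTraits recommends.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the fungal traits table (really: db <- fungal_traits()).
# "trait" is a hypothetical column name used only for this example.
db <- tibble(
  Genus = c("Glomus", "Glomus", "Amanita", "Tuber"),
  trait = c("arbuscular mycorrhizal", "arbuscular mycorrhizal",
            "ectomycorrhizal", "ectomycorrhizal")
)

# Toy stand-in for taxa.tsv (really something like
# taxa <- read.delim("taxa.tsv", stringsAsFactors = FALSE)).
# "asv" is a hypothetical column name.
taxa <- tibble(
  asv   = c("ASV1", "ASV2", "ASV3"),
  genus = c("g__Glomus", "g__Amanita", "g__Unknownia")
)

# Strip a rank prefix such as "g__" if present; clean names are left as they are.
taxa <- taxa %>%
  mutate(genus = str_remove(genus, "^[a-z]__"))

# Rows of db whose genus also occurs in the taxonomy table.
matched <- db %>%
  filter(Genus %in% taxa$genus)

# One genus-level trait per ASV; ASVs without a match get NA.
genus_traits <- db %>% distinct(Genus, trait)
asv_traits <- taxa %>%
  left_join(genus_traits, by = c("genus" = "Genus"))
```

Note that if a genus has more than one distinct trait value in db, the join returns several rows for that ASV, so the real data may need to be summarised per genus first.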

matevzl533 avatar Jan 26 '21 20:01 matevzl533

WOW @matevzl533. That worked. Thank you so much. I appreciate your help.

Sandipan

sandipansamaddar avatar Jan 28 '21 20:01 sandipansamaddar