Some questions about general wrapper for datasets

Open HelloWorldLTY opened this issue 2 years ago • 8 comments

Hi, I intend to apply this model to datasets other than the competition datasets, and I wonder whether you have a general data-loading structure for public datasets. Moreover, is it possible to use a lighter structure than the jointembedding structure if I have already processed the dataset? Thanks.

HelloWorldLTY avatar Oct 16 '22 19:10 HelloWorldLTY

Hi @HelloWorldLTY, thanks for your interest in the dance package! We are currently doing some heavy refactoring to clean up several interfaces, including the dataset interface, and make them more user-friendly, e.g., so that users can apply methods to their own datasets and benchmark their own methods on datasets provided by the package. For now, there isn't an easy way to work with custom datasets. We expect to fix this within the next month or so.

Is your primary interest in using your custom dataset on joint-embedding tasks? If so, I can make that a priority so that you can play with the models soon.

RemyLau avatar Oct 16 '22 21:10 RemyLau

Thanks a lot. I am currently working with JAE, and since my dataset is very large, this tool is not very efficient on it.

HelloWorldLTY avatar Oct 16 '22 22:10 HelloWorldLTY

@HelloWorldLTY Currently, most datasets are loaded from an AnnData object, which is one of the standard data objects for single-cell omics data. So long as your processed data structure can be interfaced with AnnData easily, it shouldn't be a big deal.

Could you briefly describe the type of data structure you are working with and what libraries you currently use to process them? We can also consider adding interfaces for this particular type of data structure if it is somewhat standard as well.

RemyLau avatar Oct 16 '22 22:10 RemyLau

Hi, I prefer the AnnData object as used with scanpy, and that is the type of data I am currently using.

HelloWorldLTY avatar Oct 16 '22 22:10 HelloWorldLTY

Ok, sounds good! This should be supported natively soon. I'll keep you posted on that.

RemyLau avatar Oct 16 '22 23:10 RemyLau

This is related to an ongoing refactoring task #49

RemyLau avatar Dec 31 '22 18:12 RemyLau

Yeah, I also ran into trouble when trying to apply the jointembedding scmogcn model to my own GEX+ATAC data. My data is stored as AnnData. Is there any tutorial that I can learn from?

gabumon0 avatar Mar 14 '23 07:03 gabumon0

Where can the dataset be downloaded?

htumlc avatar Dec 26 '23 06:12 htumlc