scikit-bio icon indicating copy to clipboard operation
scikit-bio copied to clipboard

add _netembed utils

Open jianshu93 opened this issue 11 months ago • 6 comments

Dear skbio team (to Qiyun and Mat specifically),

I followed the _plotting.py in skbio/util folder to add this _netembed.py utility class, so that graphembed_rs can be imported and used. Not sure whether I need to also add some tests, but this add is to import the grapuembed library only when asked. By default it does not change anything. I am not a python person per se, so let me know your thoughts.

Thanks, Jianshu

jianshu93 avatar May 11 '25 03:05 jianshu93

Hi Qiyun and Mat, how about this one. I do not understand why it failed only in Ubuntu but not macOS. -Jianshu

jianshu93 avatar May 19 '25 00:05 jianshu93

@jianshu93 This looks great to me. The only thing I'd like to see before merging is some simple unit tests. We don't need to test the actual algorithms, just that the correct errors are raised if graphembed-rs isn't available and simple things like that. Similar to how we do the tests for the plotting module.

I also think it makes sense to keep it in util/_netembed.py for now. We're not certain which classes in scikit-bio will inherit this mixin in the future and we can always move it if necessary.

@qiyunzhu @mortonjt What do you think?

mataton avatar Aug 01 '25 18:08 mataton

Hi @mataton, Thanks I will add some basic tests. However the data has to be in a CSV format from a file. In that case, the test will need to read an additional file, where do you think I can add the files (a very small CSV file).

jianshu93 avatar Aug 01 '25 19:08 jianshu93

Data for a test can be added to a new folder at skbio/util/tests/data/some_file.csv. You can see an example of the typical setup at skbio/stats/tests/data.

mataton avatar Aug 01 '25 20:08 mataton

Hi both, do we have necessary code in scikit-bio to interface with graph embedding code? Like input / output data structures?

qiyunzhu avatar Aug 02 '25 14:08 qiyunzhu

For input we do not because I streamed the edge list. For output yes, output is just numpy vectors.

jianshu93 avatar Aug 02 '25 17:08 jianshu93