cugraph
[ENH] Re-org python test directory for consistency and ease-of-use
Our C++ tests are organized under a nice directory hierarchy, and our Python tests could benefit from a similar organization to make particular tests easier to find and run. It might also be nice if the test subdirs were organized similarly to the source code.
The default method of running pytest to run all tests shouldn't change, since pytest auto-discovers tests under a directory hierarchy.
Organizing tests into separate folders also lets users run a subset easily by specifying a particular folder, for example: pytest cugraph/tests/algos/community (in addition to the currently supported ways using markers, -k expressions, etc.). It also encourages working with several smaller files instead of a single larger one.
Below is a proposed test dir hierarchy. Is this something worth doing, or do we prefer the relatively flat layout we have today, or maybe some combination?
python
|
|- cugraph
| |
| |- tests
| | |
| | |- algos (this includes both SG & MG)
| | | |
| | | |- centrality
| | | | |- test_betweenness_centrality
| | | | |- test_edge_betweenness_centrality
| | | | |- test_katz_centrality
| | | | |- test_mg_katz_centrality
| | | | |- test_mg_batch_betweenness_centrality
| | | | |- test_mg_batch_edge_betweenness_centrality
| | | |
| | | |- community
| | | | |- test_egonet
| | | | |- test_ecg
| | | | |- test_leiden
| | | | |- test_louvain
| | | | |- test_mg_louvain
| | | | |- test_modularity (rename to spectral clustering related)
| | | | |- test_subgraph_extraction (rename to test_subgraph?)
| | | | |- test_triangle_count
| | | | |- test_balanced_cut
| | | |
| | | |- components
| | | | |- test_connectivity
| | | |
| | | |- cores
| | | | |- test_core_number
| | | |
| | | |- layout
| | | | |- test_force_atlas2
| | | |
| | | |- link_analysis
| | | | |- test_hits
| | | | |- test_pagerank
| | | | |- test_mg_pagerank
| | | |
| | | |- link_prediction
| | | | |- test_jaccard
| | | | |- test_wjaccard
| | | | |- test_overlap
| | | | |- test_woverlap
| | | |
| | | |- traversal
| | | | |- test_bfs
| | | | |- test_mg_bfs
| | | | |- test_sssp
| | | | |- test_mg_sssp
| | | | |- test_paths (rename to test_shortest_path_length?)
| | | | |- test_traveling_salesperson
| | | | |- test_filter_unreachable
| | | |
| | | |- tree
| | | | |- test_maximum_spanning_tree
| | | | |- test_minimum_spanning_tree
| | | |
| | | |- linear_assignment
| | | | |- test_hungarian
| | |
| | |- types (or structure?)
| | | |- test_graph
| | | |- test_hypergraph
| | | |- test_multigraph
| | | |- test_convert_matrix
| | | |- test_symmetrize
| | | |- test_mg_degree
| | |
| | |- internal (or infra?)
| | | |- test_nx_convert
| | | |- test_raft
| | | |- test_renumber
| | | |- test_utils
| | | |- test_mg_utility
| | | |- test_mg_renumber
| | | |- test_mg_replication
| | | |- test_mg_comms (need to verify we need this since it seems identical to test_mg_pagerank)
| | | |- [future] unit tests for graph infra classes
| | | |- [future] unit tests for cython layer calls?
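For anyone who wants to review the moves before touching the repo, here is a minimal dry-run sketch. It only prints candidate `git mv` commands and moves nothing. The `PROPOSED_LAYOUT` mapping and the `plan_moves` helper are hypothetical names made up for illustration, the mapping covers only a few entries from the tree above, and it assumes those modules currently sit directly under the tests root (which may not be true for every file).

```python
from pathlib import PurePosixPath

# Hypothetical excerpt of the proposed layout: test module -> new subdirectory.
# Assumption: each module currently lives directly under the tests root.
PROPOSED_LAYOUT = {
    "test_katz_centrality.py": "algos/centrality",
    "test_louvain.py": "algos/community",
    "test_pagerank.py": "algos/link_analysis",
    "test_graph.py": "types",
    "test_renumber.py": "internal",
}


def plan_moves(tests_root, layout):
    """Return sorted (old_path, new_path) pairs for review; nothing is moved."""
    root = PurePosixPath(tests_root)
    return sorted(
        (str(root / name), str(root / subdir / name))
        for name, subdir in layout.items()
    )


if __name__ == "__main__":
    # Print the commands instead of running them so the plan can be reviewed.
    for old, new in plan_moves("python/cugraph/tests", PROPOSED_LAYOUT):
        print(f"git mv {old} {new}")
```

Keeping the mapping as plain data makes it easy to adjust while the folder names (for example types vs. structure) are still under discussion.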
Looks like all the MG tests are thrown into a single folder. Would it be worth doing something like:
python
|
|- cugraph
| |
| |- tests
| | |- SG ....
| | |- MG ....
I put the MG and SG tests side-by-side with the other tests for each algo (so they're actually distributed across several folders, but not in a dedicated MG folder anymore). I thought it would be better that way to increase visibility - especially during refactoring work or when new tests are added - and since they're automatically excluded in SG environments anyway when run. I'm also hoping that in a later revision, we'll be able to have a single set of tests that can be used for both SG and MG environments, where MG algos are run simply by providing a dask dataframe and additional setup, instead of requiring an entirely different test, and this would push us in that direction.
If our CI environment is only SG, is there any easy way to skip those tests? Likewise, on the MNMG system, how are SG tests skipped?
I do like having everything together, just want to make sure that doesn't add complexity elsewhere
Right now our MG tests are automatically skipped if fewer than 2 GPUs are detected, using this decorator, so our current CI scripts and SG dev workflows wouldn't need to change. And today on MG systems, both SG and MG tests are run by default unless we manually exclude the SG tests (which we do). One advantage of separating them by folder is that you can specify just the MG ("dask") folder to exclude SG tests, but it's only a minor convenience IMO since you can do the same thing today using -k mg. Using the proposed folder layout above would require users to use -k mg if they want to skip SG tests, so that would be a minor change to our MG automation script, which I'm happy to make for this.
I think in the long run, having SG and MG side-by-side will promote more MG testing, test maintenance, and get us to a single-source solution for both environments sooner.
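As a rough illustration of the automation-script change mentioned above, here is a minimal sketch of how a script could pick SG-only, MG-only, or all tests under the side-by-side layout. The `build_pytest_args` helper and its `mode` values are hypothetical, not part of cugraph or pytest. It assumes MG test modules keep "mg" in their names; since -k does substring matching, any other test whose name happens to contain "mg" would be selected too.

```python
import shlex


def build_pytest_args(test_dir, mode="all"):
    """Build a pytest argument list for a test directory.

    mode="mg"  -> only tests whose names match "mg"
    mode="sg"  -> everything that does not match "mg"
    mode="all" -> no -k filter
    Assumption: MG tests are identifiable by "mg" in their names.
    """
    args = [test_dir]
    if mode == "mg":
        args += ["-k", "mg"]
    elif mode == "sg":
        args += ["-k", "not mg"]
    elif mode != "all":
        raise ValueError(f"unknown mode: {mode!r}")
    return args


if __name__ == "__main__":
    # Show the equivalent command lines; the list could also be passed
    # to pytest.main(...) from inside a Python script.
    for mode in ("all", "sg", "mg"):
        args = build_pytest_args("cugraph/tests/algos/community", mode)
        print(shlex.join(["pytest", *args]))
```

On SG-only machines the existing skip decorator would still apply, so the "sg" filter is mainly useful on MG systems where both kinds of tests would otherwise run.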
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.