First Implementation of a Simplex Trie
This implements a simplex trie, as presented in [1], as the backend data structure for the SimplicialComplex class. The same structure is used in gudhi's simplicial complex implementation. However, gudhi does not expose all the functionality we need, and its data structure is implemented in native code, so we cannot interact with it directly either.
Using a simplex tree should bring performance improvements over the previous approach and fix some bugs along the way as well. I will add some comparisons later.
[1] Jean-Daniel Boissonnat and Clément Maria. The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes. Algorithmica, pages 1–22, 2014
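For readers unfamiliar with the structure, here is a minimal sketch of the idea behind a simplex trie: each simplex is stored as the sorted sequence of its vertices along a root-to-node path, and inserting a simplex also inserts all of its faces. The names `Node` and `SimplexTrie` and their methods are illustrative assumptions only; they are not the classes added in this pull request and omit everything beyond insertion, membership, and iteration.

```python
from itertools import combinations


class Node:
    """One trie node; the path from the root down to it spells a simplex."""

    def __init__(self, label=None, parent=None):
        self.label = label
        self.parent = parent
        self.children = {}  # vertex label -> Node


class SimplexTrie:
    """Illustrative simplex trie (hypothetical, not the PR's implementation)."""

    def __init__(self):
        self.root = Node()

    def insert(self, simplex):
        """Insert a simplex together with all of its non-empty faces."""
        vertices = sorted(set(simplex))
        for size in range(1, len(vertices) + 1):
            for face in combinations(vertices, size):
                node = self.root
                for vertex in face:
                    if vertex not in node.children:
                        node.children[vertex] = Node(vertex, node)
                    node = node.children[vertex]

    def __contains__(self, simplex):
        """Walk the sorted vertices from the root; fail on a missing child."""
        node = self.root
        for vertex in sorted(set(simplex)):
            node = node.children.get(vertex)
            if node is None:
                return False
        return node is not self.root  # the empty simplex is not stored

    def simplices(self):
        """Yield every stored simplex as a sorted tuple of vertices."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for label, child in node.children.items():
                simplex = prefix + (label,)
                yield simplex
                stack.append((simplex, child))


trie = SimplexTrie()
trie.insert([2, 1, 3])
print((1, 3) in trie)   # membership is independent of vertex order
print(sorted(trie.simplices(), key=lambda s: (len(s), s)))
```

Membership tests cost one dictionary lookup per vertex of the queried simplex, which is the property that makes the structure attractive as a backend.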
@mhajij The tests fail because coseg loads a pickled state of SimplicialComplex, including its internal properties. This is a bad idea independent of this pull request, as any change to the data structure may lead to errors or, worse, undetected inconsistencies.
@ffl096 I am not sure we should merge this pull request now, because the ICML challenge participants might have used that dataset. I think we need to merge their pull requests first before we merge this one. What do you think?
This is a draft pull request, it is not to be merged right now regardless :)
However, just to clarify: I do not propose to remove the coseg dataset. We have to think about a reasonable data format to deliver the dataset that does not rely on pickle. Ideally, the return value of the coseg function should stay exactly the same.
SimplicialComplex objects in this PR are compatible with the previous implementation as long as the user does not access internal state. The ICML submissions should all be fine.
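To make the suggestion concrete, here is a minimal sketch of a pickle-free format: only the simplices and their attributes are written to JSON, and the complex is rebuilt from that list on load, so no internal state ever reaches disk. The file layout, the `format_version` field, and the function names `dump_simplices` and `load_simplices` are assumptions for illustration; none of them exist in the repository.

```python
import json
import os
import tempfile


def dump_simplices(simplices, path):
    """Write (simplex, attributes) pairs to a JSON file (hypothetical layout)."""
    records = [
        {"simplex": sorted(simplex), "attributes": dict(attributes)}
        for simplex, attributes in simplices
    ]
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"format_version": 1, "simplices": records}, handle)


def load_simplices(path):
    """Read the pairs back; the caller rebuilds the complex from them."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    if payload.get("format_version") != 1:
        raise ValueError("unsupported dataset format version")
    return [
        (tuple(record["simplex"]), record["attributes"])
        for record in payload["simplices"]
    ]


# Round trip through a temporary file.
data = [((0, 1, 2), {"label": "face"}), ((2, 3), {"weight": 0.5})]
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "mesh.json")
    dump_simplices(data, target)
    restored = load_simplices(target)
print(restored == data)
```

Because the loader only hands back simplices and attributes, the complex would be reconstructed through its public interface, and changes to the backend data structure would not invalidate stored datasets.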
We need to create a Data object to be used in the higher-order context. I think the one available in torch is good enough.
This is an example of how it can be used in a higher-order DL model: https://github.com/pyt-team/TopoModelX/blob/569bd193f81d47e04891376676c034e90cc07554/tutorials/combinatorial/hmc_train.ipynb
@ffl096 I think we can merge this now. However, the tests are failing; can you please take care of that, and of the lint errors, so we can merge?
The dataset issue still stands and is outside of the scope to be fixed here. We cannot reliably use pickled objects as data objects.
I cannot merge without passing the tests. What do you think we should do? Should we fix the dataset issues first?
According to git blame, the coseg dataset downloaded from here was preprocessed by you, right? This repo does not contain the preprocessing script; can you provide it to me? Same for shrec_16.
@ffl096 What do you want to do with this PR? I think we need SC to be faster and implemented correctly, but a lot of code relies on the datasets. What do you suggest?
As outlined above, the dataset structure has to be overhauled completely. This is outside of the scope of this pull request though, and needs to be done regardless. The current system is highly unstable. Once that is done, this pull request is good to be merged.
Codecov Report
Attention: Patch coverage is 99.63100% with 1 line in your changes missing coverage. Please review.
Project coverage is 97.89%. Comparing base (5b2284b) to head (d6dad04). Report is 2 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| toponetx/classes/simplicial_complex.py | 98.68% | 1 Missing :warning: |
Additional details and impacted files
Coverage Diff
| Metric | main | #220 | +/- |
|---|---|---|---|
| Coverage | 97.83% | 97.89% | +0.06% |
| Files | 38 | 40 | +2 |
| Lines | 3558 | 3663 | +105 |
| Hits | 3481 | 3586 | +105 |
| Misses | 77 | 77 | |