esda icon indicating copy to clipboard operation
esda copied to clipboard

Added permutations_array to Moran_Local class

Open martirenom opened this issue 1 year ago • 9 comments

This allows users to define a custom permutations array for the Moran_Local class to be used in crand. This is useful for custom applications. This feature was commented in issue #344.

We have decided to include a new argument to the function Moran_Local called "permutations_array", which is passed later to crand(). This was necessary instead of using "permutations" itself to avoid dealing with replacing entirely permutations as integer, which is used several other places in the code. If alternative, more elegant, solutions to our proposal exist, please feel free to propose them.

martirenom avatar Jul 31 '24 14:07 martirenom

@martirenom thanks for this. Would you be able to add a test suite and documentation so that users have an understanding of the scope of this option and what the intended use is?

sjsrey avatar Aug 01 '24 16:08 sjsrey

I have added a test for the new argument in Moran_Local and crand function, which allows now the use of a user-defirned permutations_array.

As for generating a documentation, I would appreciate if this is done by ESDA developers as have no clue how to do it. In a nutshell, what the new parameter allows is the input of a permutation array that is not totally random (as by default it is in Moran_Local). In our case, we needed to maintain certain linear correlations existing between points in the "map" as we work on genomics and point one is linked by polymer connection to point 2 and so on. Thus, we wanted to maintain this linearity when generating a null model to calculate the empirical p-values of the Moran_Local.

Thanks!

martirenom avatar Aug 02 '24 14:08 martirenom

This is great functionality to have!

I would prefer overloading the permutations argument at the user level, since a permutation array specifies the number of permutations. So, we would just have a check the type of permutations in crand(), construct permutations_array (or assign), and set permutations to permuted_ids.shape[0].

I think this would be better because it allows any crand() powered class to use the functionality immediately, rather than having to add an explicit permutation_array argument to each statistic.

If I'm understanding the statistical theory right, I would add the following to document the method:

permutations should be specified as an array of indices if the user needs to specify additional structure onto the conditional permutation null hypothesis. Common reasons to do this include exchangeability violations, which might then require us to shuffle observations within (but not between) groups, or linearity constraints, which may require certain sequences of observation relationships to be preserved.

Am I understanding the theory correctly? I think the linearity constraint is a special type of an exchangeability violation... but an example of when this might be used would be super helpful. I can construct one for the group exchangeability case.

ljwolf avatar Aug 15 '24 09:08 ljwolf

Hmm... Because of the design of the simulator, we can't necessarily use this for block-wise shuffling, since the block constraint either (a) changes by observation or (b) remains the same for all observations, but refers to positions within each chunk. We could vary the y so that only one group is passed to each compute_chunk() call, but but that would be substantively different than this. So let's park that for now, and move forward only with allowing permutation arrays to be specified as here.

I'll keep pushing to clean this up for merging, but an example notebook where the linear constraints on the permutation array would be great!

ljwolf avatar Aug 15 '24 17:08 ljwolf

The reason why we decided to pass an array with the shuffling was to make it user-defined. That is, the array is build by the user externally from esda... This makes it very flexible. We thought that was simpler and allowed versatile implementations.

martirenom avatar Aug 19 '24 08:08 martirenom

@ljwolf Thanks for approving these changes. Let us know if there is anything necessary from our side in order for it to be merged into the main branch.

martirenom avatar Nov 13 '24 09:11 martirenom

The CI needs to pass before we can merge this. It seems that the changes here are failing at the moment, so it will need to be fixed.

martinfleis avatar Nov 13 '24 10:11 martinfleis

I'm updating this to current main and re-running tests.

ljwolf avatar Sep 10 '25 18:09 ljwolf

Current failure on ubuntu-latest, ci/314-latest.yaml:

FAILED esda/tests/test_moran.py::TestMoranLocal::test_Moran_Local_custom_perms[W] - ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
FAILED esda/tests/test_moran.py::TestMoranLocal::test_Moran_Local_custom_perms[Graph] - ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_plot[Graph] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_plot_combination[W] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_plot_combination[Graph] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_defaults[W] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_defaults[Graph] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_labels[W] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_labels[Graph] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_explore[W] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_explore[Graph] - TypeError: crand() got multiple values for argument 'n_jobs'
FAILED esda/tests/test_moran.py::TestMoranLocalBV::test_plot[W] - TypeError: crand() got multiple values for argument 'n_jobs'

jGaboardi avatar Dec 14 '25 21:12 jGaboardi