pointpats icon indicating copy to clipboard operation
pointpats copied to clipboard

Add Strauss process for anti-clustering and refactor rng handling in random

Open sjsrey opened this issue 6 months ago • 7 comments

sjsrey avatar May 19 '25 16:05 sjsrey

Codecov Report

Attention: Patch coverage is 95.94595% with 3 lines in your changes missing coverage. Please review.

Project coverage is 80.5%. Comparing base (c4203b6) to head (93afa24). Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
pointpats/random.py 95.7% 3 Missing :warning:
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main    #160     +/-   ##
=======================================
+ Coverage   74.5%   80.5%   +6.0%     
=======================================
  Files         12      12             
  Lines       1904    1999     +95     
=======================================
+ Hits        1419    1609    +190     
+ Misses       485     390     -95     
Files with missing lines Coverage Δ
pointpats/distance_statistics.py 83.2% <100.0%> (+0.2%) :arrow_up:
pointpats/random.py 88.4% <95.7%> (+61.7%) :arrow_up:

... and 3 files with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 19 '25 16:05 codecov[bot]

Can't say much about the Strauss process but we should revisit the seed/rng logic to follow SPEC7.

@sjsrey – regarding Martin's comment here, from what I can tell all the seed keywords are new. So if that the case, shall simply remove them from this PR?

(this will also close #114)

jGaboardi avatar Jul 22 '25 00:07 jGaboardi

Can't say much about the Strauss process but we should revisit the seed/rng logic to follow SPEC7.

@sjsrey – regarding Martin's comment here, from what I can tell all the seed keywords are new. So if that the case, shall simply remove them from this PR?

(this will also close #114)

Simply removing seed won't work here. Instead we need to refactor this a bit to get the correct behavior.

I can do this for pointpats, but I'm wondering if we want to revisit moving random tooling up into libpysal to ensure consistency throughout the federation?

sjsrey avatar Jul 22 '25 12:07 sjsrey

I can do this for pointpats, but I'm wondering if we want to revisit moving random tooling up into libpysal to ensure consistency throughout the federation?

where would that be consumed and what would it contain?

martinfleis avatar Jul 22 '25 12:07 martinfleis

I can do this for pointpats, but I'm wondering if we want to revisit moving random tooling up into libpysal to ensure consistency throughout the federation?

where would that be consumed and what would it contain?

where:

  • esda
  • inequality
  • giddy
  • segregation
  • pointpats
  • spopt (maybe others)

in terms of what it would contain in libpysal, something like:

import numpy as np

def get_rng(random_state=None):
    """
    Return a NumPy Generator instance using Scientific Python SPEC 7 conventions.

    Parameters
    ----------
    random_state : int, numpy.random.Generator, or None
        - If int: used as seed to create a Generator
        - If Generator: returned as-is
        - If None: a new Generator is created

    Returns
    -------
    numpy.random.Generator
    """
    return np.random.default_rng(random_state)

Then to use it in the downstream packages:

from libpysal import get_rng

rng = get_rng(random_state)
rng.permutation(...)

Might be overkill to centralize this, but the key thing is to adopt a consistent pattern across the ecosystem for spec7.

sjsrey avatar Jul 22 '25 18:07 sjsrey

Might be overkill to centralize this, but the key thing is to adopt a consistent pattern across the ecosystem for spec7.

Overkill? Maybe.

Ideal for centralization & consistency? Absolutely yes IMHO

Getting everyone on board to use the centralized functionality? Good luck LOL

jGaboardi avatar Jul 22 '25 18:07 jGaboardi

It is a one-liner wrapper in another one-liner. I'd rather vote for transparency of using directly np.random.default_rng than hiding it behind get_rng which does not do anything else than passing the arg and returning.

martinfleis avatar Jul 22 '25 19:07 martinfleis