islpy icon indicating copy to clipboard operation
islpy copied to clipboard

Must not use Pybind's implicit object-identity `__hash__`

Open kaushikcfd opened this issue 2 years ago • 8 comments

To Reproduce

  1. Create a file: islpy_hash.py as:
import islpy as isl


a = isl.BasicSet("{[i,j]: 0<=i<=j<10}")

print(hash("hashing of islpy objects isn't Python compliant"))
print(hash(a))
print(hash(a))
  1. Call it as PYTHONHASHSEED=3 python islpy_hash.py.

Observed behavior

(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8778101896947
8778101896947
(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8727734557427
8727734557427
(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8787476382451
8787476382451

Expected behavior

The message printed to stdout must be the same across interpreter runs.

kaushikcfd avatar Dec 28 '22 04:12 kaushikcfd

Does __hash__ have documented cross-run semantics?

inducer avatar Dec 28 '22 05:12 inducer

Yes. https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED

isuruf avatar Dec 28 '22 05:12 isuruf

FWIW, isl_set_get_hash returns deterministic results as its seed is a constant.

kaushikcfd avatar Dec 28 '22 05:12 kaushikcfd

But then something's weird. set_get_hash is a direct wrapper of isl_set_get_hash:

        uint32_t  set_get_hash(set const &arg_self)
        {
          isl_ctx *islpy_ctx = nullptr;

                    if (!arg_self.is_valid())
                      throw isl::error(
                        "passed invalid arg to isl_set_get_hash for self");
                    

                        islpy_ctx = isl_set_get_ctx(arg_self.m_data);
                        
if (islpy_ctx) isl_ctx_reset_error(islpy_ctx);
uint32_t result = isl_set_get_hash(arg_self.m_data);
return result;
        }

and

wrap_set.def("__hash__", isl::set_get_hash, "get_hash(self)\n\n:param self: :class:`Set`\n:return: uint32_t \n\n.. warning::\n\n    This function is not part of the officially public isl API. Use at your own risk.");

inducer avatar Dec 28 '22 05:12 inducer

Aah I see. Set.__hash__ is consistent across runs (irrespective of PYTHONHASHSEED). However, BasicSet.__hash__ isn't. This is because ISL does not define isl_basic_set_get_hash and so isl.BasicSet.__hash__ is pybind11_builtins.pybind11_object.__hash__

kaushikcfd avatar Dec 28 '22 11:12 kaushikcfd

Using pybind11_object.__hash__ is even more problematic as:

(py311_env) [line@line temp]$ cat understand_bset_hash.py 
import islpy as isl

a1 = isl.BasicSet("{[i]: 0<=i<512}")
a2 = isl.BasicSet("{[i]: 0<=i<512}")
print(hash(a1))
print(hash(a2))
print(a1 == a2)
print({a1: "hello", a2: "world"})

(py311_env) [line@line temp]$ python understand_bset_hash.py 
8766500567951
8766500487043
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'hello', BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}

I wonder if we should use the upcasting logic for such types for which ISL does not define isl_xxx_get_hash.

kaushikcfd avatar Dec 28 '22 11:12 kaushikcfd

See #103 for a first stab. I'm excited for all the ways in which this breaks loopy. :slightly_smiling_face:

inducer avatar Dec 30 '22 20:12 inducer

Not sure if this is useful, but while looking at this issue, I noticed that isl does offer isl_basic_set_get_hash (see https://repo.or.cz/isl.git/blob/HEAD:/isl_map.c#l11315), but does not expose it in its headers, which is why islpy doesn't pick it up.

Manually adding it to the header and rebuilding islpy seems to make the two examples in this PR run as expected:

$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788
$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788
$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788

$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}
$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}
$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}

matthiasdiener avatar Jul 10 '23 13:07 matthiasdiener