cgal
cgal copied to clipboard
Use sizeof(void*) divisor in Void_handle_hash_function
Summary of Changes
This replaces #6501 . Since no two addresses can be closer than sizeof(void*) apart the address can be divided by at least sizeof(void*), A Handle class that wraps a pointer in the Generic_handle_map is usually this size, but if it's bigger then dividing by a smaller size is ok. It was also suggested to go further and divide the address of the handle by the size of the pointee, which would in theory provide a better distribution in the hashmap, however this doesn't work in practice, because of the way the Unique_hash_map handles collisions.
Release Management
- Affected package(s): Nef_3, Nef_S2
- Issue(s) solved (if any): performance
- License and copyright ownership: Returned to CGAL authors.
Your explanation is a bit confusing. You seem to rely on the fact that the handles all point to objects of size at least sizeof(void*)
. I think this deserves a comment in the code explaining this assumption (as is, the reader may wonder if you are confusing the size of a pointer/handle with the size of what it points to). A runtime assertion that you are only shifting away zeros would be nice, and possibly a static assertion about the size of *h
in Generic_handle_map::operator[]
if it works.
@mglisse It's just because the code takes the addresses. So thay can be /4
on 32bit and /8
on 64bit. Maybe there is a better way to express this intent?
0x6DFED8 = 7208664 / 8 = 901083
0x6DFEE0 = 7208672 / 8 = 901084
0x6DFEE8 = 7208680 / 8 = 901085
The difference between the resulting values is 1. (so no gaps in the hash table)
I think what we missed here is that the "generic" feature in the Generic_handle_map
is that the pointee can be different handle types.
This still looks confusing. A void**
(after casting to an integer type) will be a multiple of sizeof(void*)
on most platforms, but there is no reason for this to apply to some arbitrary void*
. In Generic_handle_map, I don't see what would prevent me from using operator[]
on a char*
, and then I can get 2 of those that differ by only 1.
This still looks confusing. A void** (after casting to an integer type) will be a multiple of sizeof(void*) on most platforms,
I don't think it needs to be comprehensible to a wider audience, it's an internal class, and the changes are intended to improve performance.
I don't see what would prevent me from using
operator[]
on achar*
, and then I can get 2 of those that differ by only 1.
I think within the scope of Nef_S2/Nef_3 is intended to be used with Handle
or iterator classes. While a char*
is a possible pointer that could be used, consider, sizeof(std::string::iterator)
is equal to 8 on 64bit.
I think we are talking at cross purposes. My point is that if you want to know how many zeros you can find at the end of a T*
, what matters is sizeof(T)
, not sizeof(T*)
. std::string::iterator
(which may very well be an alias for char*
) may have size 8, but iterators to 2 consecutive characters differ by 1 (even after casting them to size_t), they are definitely not both multiples of 8. The assumption you are making is not about the fact that h
is a handle, it is about the size of what h
points to.
@mglisse I don't know how to update the code/comment to clarify the point you are making.
Successfully tested in CGAL-5.5-Ic-95
@mglisse I don't know how to update the code/comment to clarify the point you are making.
What your code is doing is assuming a minimal universal alignment requirement for these pointers in a flat memory space.
Unfortunately C/C++ don't assume such a space or such universal alignment requirements, and some compilers support pointers at unusual alignments.
Your assumptions are probably going to work out almost all the time, but they should probably be made explicit.
e.g., by supplying a #define HANDLE_ALIGNMENT or something which can be documented and set to 1 for those odd cases where greater alignment can't be guaranteed.
What your code is doing is assuming a minimal universal alignment requirement for these pointers in a flat memory space.
I think that is already an assumption of Handle_hash_function
:
https://github.com/CGAL/cgal/blob/master/Hash_map/include/CGAL/Handle_hash_function.h#L47
What your code is doing is assuming a minimal universal alignment requirement for these pointers in a flat memory space.
I think that is already an assumption of
Handle_hash_function
:https://github.com/CGAL/cgal/blob/master/Hash_map/include/CGAL/Handle_hash_function.h#L47
That's assuming that because it points at H, it's aligned to multiples of sizeof H, which is slightly risky, but it's coherent, and it makes sense.
But void * can point at anything -- which makes this much less obvious.
My suggestion remains to make your assumptions here obvious for future readers.
But void * can point at anything -- which makes this much less obvious.
It can't point to anything it can only be a handle, the reason to use sizeof(void*)
is because that is the size of the greatest common divisor, which is what I've already put in the comment.
Do I just need to add to the comment, "this assumes a flat memory space and default memory alignment"?
@afabri do you agree that the other PR is a simpler solution with the same result?