openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

Add default hashcode for value types

Open ehrenjulzert opened this issue 2 years ago • 14 comments

ObjectHash.hpp: Change convertObjectToHash to hash value types according to their value rather than their identity

ObjectModel.hpp: Hash objects differently depending on if they are forwarded pointers or not (since, due to ObjectHash changes, convertObjectToHash will now crash if passed a forwarded pointer)

For issue #15768

Signed-off-by: Ehren Julien-Neitzert [email protected]

ehrenjulzert avatar Sep 13 '22 17:09 ehrenjulzert

@hangshao0 @dmitripivkine My solution is currently working as far as I can tell, but I do have a couple of concerns that may come up later:

1.) Can objects referenced by a valid value type be forwarded pointers? If so, I will have to somehow call hashPotentiallyForwardedObject inside of convertValueObjectAtOffsetToHash instead of calling inlineObjectHashCode as I am doing now. I believe the only way to do this would be to expose some kind of function in the GC that can be called from the VM, since the VM is not supposed to deal with forwarded pointers.

2.) Currently I am treating array types the same as L types, is this correct? I assume that if they are flattened then they will be a Q type like flattened value types, but I am not totally sure.

ehrenjulzert avatar Sep 13 '22 18:09 ehrenjulzert

Can objects referenced by a valid value type be forwarded pointers?

I guess it is only possible if it is inside the GC code. If it is in the VM code, it shouldn't be a forwarded pointer.

Currently I am treating array types the same as L types, is this correct?

Correct. Array is always L type.

hangshao0 avatar Sep 13 '22 18:09 hangshao0

@ehrenjulzert

Can objects referenced by a valid value type be forwarded pointers?

Yes, it can. The problem is FP is only one of methods GC uses for object tracing. The particular GC context is required to do so.

@amicic @tajila As we expected there is recursive call inside convertValueObjectAtOffsetToHash() when VT object points to another VT object. Except GC related complications (how to discover moved objects) I see potential problem of limited stack size for recursive call. I don't think we can guarantee bounded stack usage if it used as-is. Are there alternatives except pre-calculate hash for each VT object (and use it for any VT object pointed at VT object)? We need answer before we decide how we are going to split code between VM and GC

dmitripivkine avatar Sep 13 '22 18:09 dmitripivkine

...pre-calculate hash for each VT object (and use it for any VT object pointed at VT object)

I think there is no need to pre-calculate hash for each VT object but for VT objects have references to another VT object.

dmitripivkine avatar Sep 13 '22 18:09 dmitripivkine

I believe the recursive call issue also applies to VM_ValueTypeHelpers::isSubstitutable(): https://github.com/eclipse-openj9/openj9/blob/df2f3a3524957ff9e59c6861992f8195beb24aa6/runtime/vm/ValueTypeHelpers.hpp#L69

@ehrenjulzert What is the behaviour of RI if there is a circularity in the value type ? i.e. A value type has a field that references itself directly or indirectly. Note that circularity is not allowed in primitive value types, but it is allowed in non-primitive value classes. (i.e. classes defined via value class)

hangshao0 avatar Sep 13 '22 21:09 hangshao0

What is the behaviour of RI if there is a circularity in the value type ?

I was under the impression that VTs cannot have circular references since they are immutable. For example, the following code:

value class Point {
	public Point point;

	public Point(Point point) {
		this.point = point;
	}

	public static void main(String[] args) {
		Point p1 = new Point(null);
		p1.point = p1;
	}
}

generates this error on RI:

Point.java:10: error: cannot assign a value to final variable point
		p1.point = p1;
		  ^
1 error
error: compilation failed

ehrenjulzert avatar Sep 14 '22 13:09 ehrenjulzert

Yes, as @ehrenjulzert said, given that VTs are immutable there cannot be a cycle in the link w.r.t to hascode computation as there would need to be an identity object (IO) in between for this to occur, and the IO would end the recursion.

The only scenario that would be costly computationally would be list like structures created as VTs. These would require a full walk of the list to compute the hashcode which would potentially stack overflow. If this is done on a java vmthread we can insert OS stack guard checks (perhaps the same can be introduced for GC threads). Worst case we can put a hard limit on recursion depth and just throw stackoverflow.

I believe these types are less likely to be commonly used so I wouldn't optimize for that use case at this time.

@ehrenjulzert could you do a test with a VT list in the RI implementation to confirm that they would throw a stackoverflow exception with list like VT structures?

The alternative here is to use a iterative algorithm which is more complex to write but doesnt have the stack overflow problem

tajila avatar Sep 15 '22 13:09 tajila

could you do a test with a VT list in the RI implementation

I ran the following code on the RI to hash a list like structure and it caused a stackoverflow:

value class Point {
	public Point point;

	public Point(Point point) {
		this.point = point;
	}

	public static void main(String[] args) {
		Point p = new Point(null);
		for (int i = 0; i < 1000; i++) {
			p = new Point(p);
		}
		System.out.println(p.hashCode());
	}
}

ehrenjulzert avatar Sep 15 '22 14:09 ehrenjulzert

I agree that we don't have to worry about the possible stackoverflow issue for the time being. We have the same behaviour as RI and users may not really hit this issue in real world. If there is really someone complaining about stackoverflow in the future, we can always replace the recursion with iterative DFS or BFS.

hangshao0 avatar Sep 15 '22 15:09 hangshao0

I agree that we don't have to worry about the possible stackoverflow issue for the time being. We have the same behaviour as RI and users may not really hit this issue in real world. If there is really someone complaining about stackoverflow in the future, we can always replace the recursion with DFS or BFS.

I don't think this is acceptable for real implementation. We can postpone decision right now but we can not release this code to customers

dmitripivkine avatar Sep 15 '22 17:09 dmitripivkine

I don't think this is acceptable for real implementation. We can postpone decision right now but we can not release this code to customers

Thats a fair assessment. I would add that while the spec is still being finalized, there is a possibility that hashcode may be disallowed for VTs due to its characteristics. We have seen the spec evolve quite a bit in recent years. This is why I dont want to invest too much into it right now.

tajila avatar Sep 15 '22 19:09 tajila

After talking with Dmitri today it sounds like to make this code work for all GC cases will be pretty difficult and will probably require making changes to the GC. Since we don't want to invest too much into this right now it might be best to pause the work here, and come back to it once we confirm that VTs will need to have hashcodes calculated this way. If we still want to have suboptimal but working VT hashcodes right now I could instead just use reflection to call the existing PrimitiveObjectMethods.primitiveObjectHashCode method (as we discussed a while back in #15741).

ehrenjulzert avatar Sep 16 '22 20:09 ehrenjulzert

If we still want to have suboptimal but working VT hashcodes right now I could instead just use reflection to call the existing PrimitiveObjectMethods.primitiveObjectHashCode method (as we discussed a while back in https://github.com/eclipse-openj9/openj9/pull/15741).

OK. I am fine with this. Please make a comment in the issue https://github.com/eclipse-openj9/openj9/issues/15529 or https://github.com/eclipse-openj9/openj9/issues/15768 explaining this. We normally use issues to track the status of work. It may not be easy for people to find the discussions in this draft PR which might be eventually closed.

hangshao0 avatar Sep 16 '22 20:09 hangshao0

If we still want to have suboptimal but working VT hashcodes right now I could instead just use reflection to call the existing PrimitiveObjectMethods.primitiveObjectHashCode method (as we discussed a while back in #15741).

Just for record I don't think it will work in the context of calling hash code calculation function from GC at object copy time. This method can not be called mid STW GC and GC threads are not Java threads any way

dmitripivkine avatar Sep 16 '22 20:09 dmitripivkine