LvArray Move on resize/insert/emplace...etc

@corbett5 I started to include the call to move for the cases we discussed (resize, reserve), but this is actually a larger issue which we should have a unified approach for. It is really a situation where upon every type of array size/capacity modification that must be done on host, we should move back to host prior to performing the operation. This should be done across the board for all types (Array, ArrayOfArrays, CRSMatrix, SortedArray...).

Pros/Cons:

Pro. The developer can be assured that they will not perform even if operations are not called in a host policy raja loop. There is a lot of benefit to this.
Con. There are a lot of places that we will call move. It seems like we may call move multiple times with the way that the manipulation functions are setup. I don't know if this is a real issue, but I guess move is essentially a no-op if the buffer is already current on host? I think it is fine, but perhaps it isn't.
Is this always what the developer would want? Are there cases where this would be undesirable?

Jun 21 '20 19:06 rrsettgast

Thinking about this more I'm not sure this is desired behavior. For example if you have an Array which is only used on device as scratch space (doesn't store any state between iterations, never accessed on host) then there would be no need to move it back to host after calling resize.

With the current behavior this what happens is calling resize would free the device allocation and reallocate the host allocation then next time the Array is captured/moved to device the "junk" host data would get copied over to device but its fine because the values in the array aren't important between steps.

With the new behavior the Array would get copied to host before the resize and then moved back to device again. Now neither behavior is optimal since in this case there is no need to do any memory movement. I think the proper solution to this problem is to actually support resize and reallocate in different memory spaces. For example

Array1d< int > x( 55 );
x.move( MemorySpace::GPU ):

// Creates a new allocation of size 66 on the device and launches a kernel which
// captures `x.toViewConst()` and populates the new allocation default initializing
// the new values. It then deallocates the old device and host allocations.
x.resize< devicePolicy >( 66 );

We could also do something special for resizeWithoutInitializationOrDestruction which would just call Umpire::reallocate and would not launch a kernel to populate the new values, this would provide the optimal solution for the example I gave above.

Also if we add in these move calls users will come to rely on them being present and it will break their code if we choose to remove them in the future.

Jun 22 '20 20:06 corbett5

After working with it a bit, I am pretty sure this is not the desired behavior. It seems a bit arbitrary where to move the data, and the fact that we are always moving the data back to host when it is not required on host seems wasteful as you point out.

The solution of adding move seems like a rather messy band aid for the problem of performing host calculations outside of a raja::forall....but it is really making the overall problem worse.

I kind of like the resize and reallocate on different memory spaces...but I kind of don't...I don't have a good alternative to suggest right now.

Jun 22 '20 21:06 rrsettgast

@corbett5 @klevzoff After debugging the topology change routines in GEOSX, I am certain that something needs to be done to assist the user of LvArray with this. Let us not worry about the case where someone modifies data outside of a RAJA launch/capture. The model we support is capture by value triggers the move....end of story.

I think that your suggestion about having policy based reallocation functions is a good start. So to lay it all out...when we have registered an LvArray type in the dataRepository, we have a couple of ways to perform reallocation depending on the situation. Lets lay out 2 situations:

registerWrapper( "data1", m_data1 )->setSizedFromParent(1);

registerWrapper( "data2", m_data2 )->setSizedFromParent(0);

So the first is the standard mesh data case, and the second is something like a set of indices, etc.

In the case of m_data0 we ideally would only resize the data through a call to Group::resize. For this we can just use Group::resize or Wrapper::resize to call a move and touch. However, I think there is a question about behavior. If there is a policy specified as you suggested above, what do we do when the data is current on one space, but the resize targets the other space?

Do we move it to the target space, then resize?
Would it be better to just not specify the target space, and resize in the space where the data is current? This sounds appealing to me.
If the data is previously allocated on both spaces, do we want to do the reallocation on both spaces to make future transfers a copy instead of an allocation and a copy?

The second case is harder to deal with. Some of the difficulty is in the multiple ways to get to m_data2.

There can be a reallocation on the member itself.

m_data2.resize(newSize);

There can be a call to a getter and then call resize on the reference.

Array<> & arrayRef = group.getReference<Array<>>( "data2" );
arrayRef.resize(newSize);

So where does the move go in this case? The implication of 1. would be that we should be embedding the move in Array::resize()...or simply not have a member disallow this case. For 2., we could call move() in getReference(), which would take care of the problem, but potentially result in unneeded move and touch.

Jul 05 '20 17:07 rrsettgast

How about adding an open() and close() method, with 2 bool members m_closed, and m_didRealloc.

Prior to any reallocation, we would have to call open(). This would:

make sure m_closed==true, then set m_closed = false,
set m_didRealloc=false.
call move( targetSpace, false ) (i.e. move without a touch)

Then you can call resize() which would:

check if m_closed==false,
perform the reallocation and set m_didRealloc=true if there is a reallocation.

Then you call close(), which would:

call registerTouch() if m_didRealloc==true.
set m_didRealloc=false, and m_closed=true

Jul 06 '20 18:07 rrsettgast

Do we move it to the target space, then resize?

Yes.

Would it be better to just not specify the target space, and resize in the space where the data is current? This sounds appealing to me.

I don't think that LvArray should handle this. Instead I suggest we add a method MemorySpace lastTouchedSpace() and then Wrapper::resize can use that to pick the appropriate policy to resize with.

If the data is previously allocated on both spaces, do we want to do the reallocation on both spaces to make future transfers a copy instead of an allocation and a copy?

This isn't really important as it doesn't effect the usage. I would suggest that it only does not do the second allocation upon resize. That way you could have an Array that only lives on the device.

The implication of 1. would be that we should be embedding the move in Array::resize()...or simply not have a member disallow this case

There will always be operations that can only be done on the host Array::emplace_back, ArrayOfArrays::insertArray and likewise Array::resize. As we discussed above I don't think we want to put calls to move into these methods. Unless you can guarantee that arrayRef is touched on the host calling emplace_back or resize or any of those other methods is invalid. To help catch these errors we could add another check that is enabled with ARRAY_BOUNDS_CHECK that makes sure that the array is touched on the host before performing any of these host only operations.

we could call move() in getReference()

I'm pretty sure we don't want getReference to do anything other than return a reference to the wrapped object. For instance if you have

void bar( Group & group )
{
  ArrayView<> x = group.getReference< Array<> >( "x" );
  foo( x );
}

If getReference performs a move then bar would have to know what space foo wants x to be in. It shouldn't lead to incorrect behavior but it would lead to a lot of extraneous movement.

How about adding an open() and close() method

I'm not sure what this accomplishes that a policied resize doesn't. For non resize methods like emplace_back that can only be done on the host it seems equivalent to and much more complicated than calling move( MemorySpace::CPU ).

Jul 06 '20 20:07 corbett5

I don't think that LvArray should handle this. Instead I suggest we add a method MemorySpace lastTouchedSpace() and then Wrapper::resize can use that to pick the appropriate policy to resize with.

In the case of a class member m_data1, we would not be going through Wrapper::resize(), so either we have to embed the move into LvArray or we have to require a call to move the data prior to calling m_data1.resize(). The open(), close() proposal tries to put some infrastructure around that process instead of leaving the user to develop the logic....and it leaves LvArray::resize() intact in this regard.

Is the lastTouchedSpace() may not be inclusive of all cases. For instance if the memory was allocated on host, but then transferred to device as a const...where would the last touch be? I would expect it would be last touched on host. If we had a policied resize, and we specified a device policy, would that move it again to device, then realloc there?

If the data is previously allocated on both spaces, do we want to do the reallocation on both spaces to make future transfers a copy instead of an allocation and a copy?

This isn't really important as it doesn't effect the usage. I would suggest that it only does not do the second allocation upon resize. That way you could have an Array that only lives on the device.

If the Array only lives on device, then we wouldn't allocate on host. If the Array lives on both the host and device, then we could allocate on both. No it doesn't affect usage, but it does restrict the allocations to the location of the call to resize().

The implication of 1. would be that we should be embedding the move in Array::resize()...or simply not have a member disallow this case

There will always be operations that can only be done on the host Array::emplace_back, ArrayOfArrays::insertArray and likewise Array::resize. As we discussed above I don't think we want to put calls to move into these methods. Unless you can guarantee that arrayRef is touched on the host calling emplace_back or resize or any of those other methods is invalid. To help catch these errors we could add another check that is enabled with ARRAY_BOUNDS_CHECK that makes sure that the array is touched on the host before performing any of these host only operations.

It would be nice to have some infrastructure support to help the user with the intricacies of these operations in LvArray. The open(), close() proposal gives us a place to put the move, and we can enforce their use in a heterogeneous memory system, and make them no-ops (along with the checks) in a single memory space system.

we could call move() in getReference()

I'm pretty sure we don't want getReference to do anything other than return a reference to the wrapped object. For instance if you have

I am pretty sure too.

How about adding an open() and close() method

I'm not sure what this accomplishes that a policied resize doesn't. For non resize methods like emplace_back that can only be done on the host it seems equivalent to and much more complicated than calling move( MemorySpace::CPU ).

Whatever works. I just don't want to have to find the places where i need to call move without a reasonable error message helping me out.

Jul 06 '20 20:07 rrsettgast

In the case of a class member m_data1, we would not be going through Wrapper::resize(), so either we have to embed the move into LvArray or we have to require a call to move the data prior to calling m_data1.resize(). The open(), close() proposal tries to put some infrastructure around that process instead of leaving the user to develop the logic....and it leaves LvArray::resize() intact in this regard.

You mean m_data2 that is not sized from parent right? So one thing we could do is that any time you get an Array, SortedArray, ArrayOfArrays, etc. from the data repository it gets moved and touched on the host. The only reason to get one of these owner objects is if you're about to resize / insert / whatever. Otherwise you call the const getReference() or referenceAsView which return a view object and doesn't move it.

If the Array only lives on device, then we wouldn't allocate on host. If the Array lives on both the host and device, then we could allocate on both. No it doesn't affect usage, but it does restrict the allocations to the location of the call to resize().

It would be nice for timing purposes to include the second allocation in the resize but currently the only way to get an Array on device is to allocate it on the host and then move it to the device. So any array that has a device allocation will have a host allocation as well. Only allocating in the space specified by the policy would allow device only allocations.

Jul 06 '20 23:07 corbett5

In the case of a class member m_data1, we would not be going through Wrapper::resize(), so either we have to embed the move into LvArray or we have to require a call to move the data prior to calling m_data1.resize(). The open(), close() proposal tries to put some infrastructure around that process instead of leaving the user to develop the logic....and it leaves LvArray::resize() intact in this regard.

You mean m_data2 that is not sized from parent right? So one thing we could do is that any time you get an Array, SortedArray, ArrayOfArrays, etc. from the data repository it gets moved and touched on the host. The only reason to get one of these owner objects is if you're about to resize / insert / whatever. Otherwise you call the const getReference() or referenceAsView which return a view object and doesn't move it.

@corbett5 Yes. sorry. m_data2, but we are not getting it from the repository since it is a member of the class calling the resize. This is the "should we embed the move/touch in the getReference() call, which I think isn't the best idea since you don't actually know if it will be touched. It is possible that the member can be used directly without the data repository being involved at all. For instance, if we had:

class Foo
{
public:
void bar()
{
  ... a bunch of stuff
  m_data.resize( newSize);
}
private:
array1d<real64> m_data;
}

So the data repository is not involved in this call to resize.

If the Array only lives on device, then we wouldn't allocate on host. If the Array lives on both the host and device, then we could allocate on both. No it doesn't affect usage, but it does restrict the allocations to the location of the call to resize().

It would be nice for timing purposes to include the second allocation in the resize but currently the only way to get an Array on device is to allocate it on the host and then move it to the device. So any array that has a device allocation will have a host allocation as well. Only allocating in the space specified by the policy would allow device only allocations.

We should definitely have an option for allocation in a target memory space. We will need this when device memory is larger than host memory.

Jul 07 '20 00:07 rrsettgast