nntrainer
Lifespan seems not needed for the tensor_pool
Tensor_pool stores each tensor's lifespan, but there seems to be no need to store the lifespan directly.
- There is no public method to query the lifespan, so any use of it must be internal.
- Internally, it is only used to check whether a tensor is a long-term tensor.
- Beyond that check, the individual values `FORWARD_FUNC_LIFESPAN`, `BACKWARD_FUNC_LIFESPAN`, and `ITERATION_LIFESPAN` do not convey any meaning.
- Also, a lifespan easily loses its actual meaning when a tensor is extended or viewed.
What the tensor_pool really needs is a way to query whether a certain tensor is `persistent`, `managed`, or `unmanaged` inside the tensor pool.
Option 1. We can have a dedicated enum value for this (UNMANAGED, PERSISTENT, MANAGED).

Option 2. We can just use exec_order to query the state.
1. exec_order is empty == UNMANAGED
2. exec_order has 0 ~ uint_max == PERSISTENT
3. none of the above == MANAGED
Looking at the current code, PERSISTENT and MANAGED are actually the same state, so with option 2 we don't strictly need to distinguish those two.
I prefer option 2, if there is no caveat I haven't thought of.
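The rules of option 2 can be sketched roughly as follows. This is only an illustration, not nntrainer code: `PoolState` and `classify` are hypothetical names, and reading "has 0 ~ uint_max" as "the list contains both 0 and the largest unsigned value" is an assumption.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical names; nothing here is taken from nntrainer's actual API.
enum class PoolState { UNMANAGED, PERSISTENT, MANAGED };

// Classify a tensor purely from its execution orders (option 2).
// Assumption: "exec_order has 0 ~ uint_max" means the list contains both
// 0 and the largest unsigned value, i.e. the tensor is needed at all times.
inline PoolState classify(const std::vector<unsigned int> &exec_order) {
  if (exec_order.empty())
    return PoolState::UNMANAGED; // rule 1: no execution order at all

  constexpr unsigned int kMax = std::numeric_limits<unsigned int>::max();
  const bool has_min =
    std::find(exec_order.begin(), exec_order.end(), 0u) != exec_order.end();
  const bool has_max =
    std::find(exec_order.begin(), exec_order.end(), kMax) != exec_order.end();

  // rule 2: covers the whole range; rule 3: everything else
  return (has_min && has_max) ? PoolState::PERSISTENT : PoolState::MANAGED;
}
```

With this shape, no separate lifespan field is needed to answer the three-way question; if PERSISTENT and MANAGED are handled identically, the check collapses to `exec_order.empty()`.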
Please leave some opinions or correct me if I am missing something :)
:octocat: cibot: Thank you for posting issue #1696. The person in charge will reply soon.
AFAIK, we will decide when the tensors are off-loaded based on the tensor lifespan and execution order.
I think using the lifespan outside of the tensor is not possible, because we simply can't trust the value inside it.
For example, suppose a tensor has lifespan = lifespan::FORWARD in layer A, and later we extend the lifespan with lifespan::BACKWARD in layer B. The value of the tensor's lifespan becomes lifespan::FORWARD | lifespan::BACKWARD, but it no longer conveys anything meaningful.
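A small sketch of why the merged value is hard to interpret. The enum below is hypothetical (the flag values are assumptions chosen for illustration, not nntrainer's definitions); it only shows that once requests are OR-ed together, the result no longer says which layer asked for which phase.

```cpp
#include <cstdint>

// Hypothetical bit-flag lifespan, mirroring only the names used in this thread.
enum class LifespanFlag : std::uint8_t { NONE = 0, FORWARD = 1, BACKWARD = 2 };

constexpr LifespanFlag operator|(LifespanFlag a, LifespanFlag b) {
  return static_cast<LifespanFlag>(static_cast<std::uint8_t>(a) |
                                   static_cast<std::uint8_t>(b));
}

// Extending a lifespan simply merges the new request into the stored value.
constexpr LifespanFlag extend(LifespanFlag current, LifespanFlag requested) {
  return current | requested;
}

// Layer A requested FORWARD and layer B later requested BACKWARD ...
constexpr LifespanFlag a_then_b =
  extend(LifespanFlag::FORWARD, LifespanFlag::BACKWARD);
// ... which is indistinguishable from the opposite history.
constexpr LifespanFlag b_then_a =
  extend(LifespanFlag::BACKWARD, LifespanFlag::FORWARD);

static_assert(a_then_b == b_then_a,
              "the merged value does not record who requested what");
```

The stored value only records the union of requests, so code outside the tensor cannot recover the per-layer intent from it.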
Consider two tensors, which have the same execution order usage.
- Tensor 1 : exec_order - x1, x2 (x2 > x1)
- Tensor 2 : exec_order - x1, x2
Now, let's assign Tensor 1 with lifespan ITERATION_LIFESPAN and Tensor 2 with EPOCH_LIFESPAN.
With these lifespans, Tensor 1 memory can be overwritten after x2, but Tensor 2 memory cannot be overwritten after x2.
The information of lifespan and execution order is combined to create validity information, which is passed to the memory pool.
So, exec_order itself is not sufficient to determine the validity of the tensor, and lifespan is needed to evaluate the exact validity.
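The two-tensor example above can be sketched as follows. This is a hedged illustration, not the actual nntrainer logic: `Lifespan`, `Validity`, and `validity` are hypothetical names, and extending only the end of the interval to the maximum value for an epoch-long tensor is an assumption based on the statement that its memory cannot be overwritten after x2.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical subset of lifespans, named after the ones in this example.
enum class Lifespan { ITERATION, EPOCH };

// [first, last] execution orders during which the memory must not be reused.
using Validity = std::pair<unsigned int, unsigned int>;

// Combine execution orders and lifespan into a validity interval.
inline Validity validity(const std::vector<unsigned int> &exec_order,
                         Lifespan span) {
  assert(!exec_order.empty()); // an unmanaged tensor has no validity

  const auto mm = std::minmax_element(exec_order.begin(), exec_order.end());
  unsigned int first = *mm.first;
  unsigned int last = *mm.second;

  // Assumption: a tensor that must survive the iteration stays valid
  // until the end, regardless of its last recorded use.
  if (span == Lifespan::EPOCH)
    last = std::numeric_limits<unsigned int>::max();

  return {first, last};
}
```

For the same `exec_order` of `{x1, x2}`, the ITERATION tensor gets `[x1, x2]` while the EPOCH tensor gets `[x1, max]`, which is the difference the memory pool has to see.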
Yes, for this example we need three states: 1. UNMANAGED, 2. PERSISTENT, 3. MANAGED. Those can be inferred from the exec order if we want to, but the others (like the forward and backward lifespans) convey no meaning.
We cannot define the validity of a forward lifespan because we simply cannot trust its lifespan.