BTAS icon indicating copy to clipboard operation
BTAS copied to clipboard

Tensor::rank and contiguous at compile time

Open shiozaki opened this issue 11 years ago • 18 comments

I think it is very useful for optimization of contract if one could obtain information on rank() and contiguous() at compile time. Otherwise there will be a bunch of if statements in a single function, which may make the code slower (especially if it is used for small tensors). Any ideas?

shiozaki avatar Jun 23 '14 12:06 shiozaki

Do you mean you want Tensor/TensorView to be able to report this at compile time? rank() is easy, you need to use RangeNd whose Index is std::array (or similar), it's rank() is a static constant then (in principle ... the code right now does not use constexpr). Contiguous() will be hard.

evaleev avatar Jun 23 '14 12:06 evaleev

Yes. Each of them would allow us to have one less if statement... It would be nice if the standard says something about it - otherwise I cannot assume that range can report rank, for instance.

shiozaki avatar Jun 23 '14 12:06 shiozaki

(I am implementing contract today)

shiozaki avatar Jun 23 '14 12:06 shiozaki

Do you mean to make TWG spec say that it's possible for rank() to be constexpr?

On Mon, Jun 23, 2014 at 8:54 AM, Toru Shiozaki [email protected] wrote:

Yes. Each of them would allow us to have one less if statement... It would be nice if the standard says something about it - otherwise I cannot assume that range can report rank, for instance.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46840870.

web: valeyev.net

evaleev avatar Jun 23 '14 13:06 evaleev

Yes. And contiguous() if possible. I suspect that there is some problem in doing that, but wanted to discuss. Otherwise I just introduce a bunch of if statements in the main contract function (for me it's fine, but not sure if it is the right way)

shiozaki avatar Jun 23 '14 13:06 shiozaki

It appears to me that we should compute the rank using the type of lobound_ instead of lobound_ itself? I am not sure at all if this is standard compliant. @evaleev Do you know if this is a problem with a compiler or not?

 449       //constexpr auto rank() const -> decltype(btas::rank(this->lobound())) {
 450       constexpr size_t rank() const {
 451         using btas::rank;
 452         return rank(lobound_);
 453       }

The error I get (with gcc 4.8) is probably the same as what you get with gcc:

{shiozaki@pro test}$ g++ -I.. -I/usr/local/boost/include test.C
test.C: In function 'int main()':
test.C:120:5: error: non-constant condition for static assertion
     static_assert(x.rank() == 3, "default Range rank");
     ^
test.C:120:26: error: call to non-constexpr function 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]'
     static_assert(x.rank() == 3, "default Range rank");
                          ^
In file included from test.C:6:0:
../btas/range.h:450:24: note: 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]' is not usable as a constexpr function because:
       constexpr size_t rank() const {
                        ^
../btas/range.h:450:24: error: enclosing class of constexpr non-static member function 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]' is not a literal type
../btas/range.h:242:11: note: 'btas::BaseRangeNd<btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> > >' is not literal because:
     class BaseRangeNd {
           ^
../btas/range.h:242:11: note:   'btas::BaseRangeNd<btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> > >' is not an aggregate, does not have a trivial default constructor, and has no constexpr constructor that is not a copy or move constructor

shiozaki avatar Jun 23 '14 14:06 shiozaki

Yes, we are using 4.8 gcc, same errors here. To me this looks like a different interpretation of how constexpr works by clang and gcc. Not sure which is correct, I'm a constexpr newbie. That I could just throw constexpr on a function even if it not constexpr for some template params is pretty strange to me.

On Mon, Jun 23, 2014 at 10:58 AM, Toru Shiozaki [email protected] wrote:

It appears to me that we should compute the rank using the type of lobound_ instead of lobound_ itself? I am not sure at all if this is standard compliant. @evaleev https://github.com/evaleev Do you know if this is a problem with a compiler or not?

449 //constexpr auto rank() const -> decltype(btas::rank(this->lobound())) { 450 constexpr size_t rank() const { 451 using btas::rank; 452 return rank(lobound_); 453 }

The error I get (with gcc 4.8) is probably the same as what you get with gcc:

{shiozaki@pro test}$ g++ -I.. -I/usr/local/boost/include test.C test.C: In function 'int main()': test.C:120:5: error: non-constant condition for static assertion static_assert(x.rank() == 3, "default Range rank"); ^ test.C:120:26: error: call to non-constexpr function 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]' static_assert(x.rank() == 3, "default Range rank"); ^ In file included from test.C:6:0: ../btas/range.h:450:24: note: 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]' is not usable as a constexpr function because: constexpr size_t rank() const { ^ ../btas/range.h:450:24: error: enclosing class of constexpr non-static member function 'constexpr size_t btas::BaseRangeNd<_Derived>::rank() const [with _Derived = btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> >; size_t = long unsigned int]' is not a literal type ../btas/range.h:242:11: note: 'btas::BaseRangeNd<btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> > >' is not literal because: class BaseRangeNd { ^ ../btas/range.h:242:11: note: 'btas::BaseRangeNd<btas::RangeNd<(CBLAS_ORDER)0u, std::array<long unsigned int, 3ul> > >' is not an aggregate, does not have a trivial default constructor, and has no constexpr constructor that is not a copy or move constructor

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46856198.

web: valeyev.net

evaleev avatar Jun 23 '14 15:06 evaleev

Whenever I used constexpr I used it with static functions. I tried a bit to convert rank() to static but have not been successful yet.

shiozaki avatar Jun 23 '14 15:06 shiozaki

I need to read the standard, unfortunately, to understand which compiler is correct. It is possible to declare nonstatic members as constexpr, however ... and rank() must be a nonstatic member, otherwise it will not work for dynamically-sized Index types.

On Mon, Jun 23, 2014 at 11:16 AM, Toru Shiozaki [email protected] wrote:

Whenever I used constexpr I used it with static functions. I tried a bit to convert rank() to static but have not been successful yet.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46858645.

web: valeyev.net

evaleev avatar Jun 23 '14 16:06 evaleev

Re: contract.

In our discussion the other day, my understanding is that TensorBase refers to a general slice (previously a TensorView, and not necessarily contiguous) and Tensor is derived from TensorBase and IS contiguous. So, if you are writing contract for Tensor then it should be able to assume contiguous memory. Contract for TensorBase in principle can only use the iterator interface.

I think contiguous is important enough that it should in addition be a property of Tensor/TensorBase or their corresponding ranges. This would allow contract for TensorBase to have an if statement.

Also, tensors with compile time rank can be supported by the contract interface (since it is templated on tensorA, tensorB, tensorC types) and should be able to conform to the tensor concept, so if you want to use fix-rank Tensors, you can write the specialized contract for them. Naoki's original library was for tensors with compile time rank.

gkc1000 avatar Jun 24 '14 08:06 gkc1000

For TensorBase,

My idea was that "iterator" is templated in TensorBase class, and Tensor is derived from TensorBase w/ contiguous iterator. TensorBase should have the reference of storage and wrapper function to return TensorViewIterator (e.g. begin()), and Tensor class provides allocator for contiguous storage in addition. Then, consecutiveness is naturally determined by TensorBase class, and "generic" tensor functions depend only on TensorBase and its iterator traits.

Maybe this is not the right place...

On Tue, Jun 24, 2014 at 5:10 PM, gkc1000 [email protected] wrote:

Re: contract.

In our discussion the other day, my understanding is that TensorBase refers to a general slice (previously a TensorView, and not necessarily contiguous) and Tensor is derived from TensorBase and IS contiguous. So, if you are writing contract for Tensor then it should be able to assume contiguous memory. Contract for TensorBase in principle can only use the iterator interface.

I think contiguous is important enough that it should in addition be a property of Tensor/TensorBase or their corresponding ranges. This would allow contract for TensorBase to have an if statement.

Also, tensors with compile time rank can be supported by the contract interface (since it is templated on tensorA, tensorB, tensorC types) and should be able to conform to the tensor concept, so if you want to use fix-rank Tensors, you can write the specialized contract for them. Naoki's original library was for tensors with compile time rank.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46943225.

naokin avatar Jun 24 '14 11:06 naokin

Note that slices can also be contiguous ... I'm not sure if Toru or Miles use that in their use cases.

I think to be able to properly detect the contiguity at compile-time we'd have to go the way of Eigen and make dimensions template params, with magic value (-0 ???) for dynamic size.

On Tue, Jun 24, 2014 at 7:02 AM, Naoki Nakatani [email protected] wrote:

For TensorBase,

My idea was that "iterator" is templated in TensorBase class, and Tensor is derived from TensorBase w/ contiguous iterator. TensorBase should have the reference of storage and wrapper function to return TensorViewIterator (e.g. begin()), and Tensor class provides allocator for contiguous storage in addition. Then, consecutiveness is naturally determined by TensorBase class, and "generic" tensor functions depend only on TensorBase and its iterator traits.

Maybe this is not the right place...

On Tue, Jun 24, 2014 at 5:10 PM, gkc1000 [email protected] wrote:

Re: contract.

In our discussion the other day, my understanding is that TensorBase refers to a general slice (previously a TensorView, and not necessarily contiguous) and Tensor is derived from TensorBase and IS contiguous. So, if you are writing contract for Tensor then it should be able to assume contiguous memory. Contract for TensorBase in principle can only use the iterator interface.

I think contiguous is important enough that it should in addition be a property of Tensor/TensorBase or their corresponding ranges. This would allow contract for TensorBase to have an if statement.

Also, tensors with compile time rank can be supported by the contract interface (since it is templated on tensorA, tensorB, tensorC types) and should be able to conform to the tensor concept, so if you want to use fix-rank Tensors, you can write the specialized contract for them. Naoki's original library was for tensors with compile time rank.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46943225.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46957903.

web: valeyev.net

evaleev avatar Jun 24 '14 14:06 evaleev

I do have contiguous slices quite often. E.g., the occupied MO coefficient matrices sliced from the entire MO (and I use column major).

shiozaki avatar Jun 24 '14 14:06 shiozaki

Yes, slices can be contiguous. This is why I suggest that slices (i.e. TensorBases?) should have a runtime function/property that detects contiguousness. This adds overhead during construction but is probably worth it. Then "if" statements can be used to ensure optimal contraction for slices, in the generic contract function that works on TensorBase. However, for Tensors, I think there should be a separate contract function (template specialization for Tensor) since we know that Tensors are always contiguous.

Maybe Tensor is a bad name - it should be DenseTensor.

gkc1000 avatar Jun 24 '14 14:06 gkc1000

At the moment most of the use cases for slicing I foresee in ITensor involve using slices/TensorBases as intermediate objects when get assigned to new dense Tensors. For these cases it would be great if the Tensor constructor automatically checks if TensorBase::contig()==true and if so does a straight std::copy of the memory.

While on the topic of naming, I'd suggest keeping the name of TensorView something like "TensorView", "TensorRef", "TensorSlice" which are descriptive of its non-owning behavior. The name "TensorBase" describes more about its implementation rather than its behavior or purpose.

emstoudenmire avatar Jun 24 '14 15:06 emstoudenmire

I was wondering if there's no way to detect contiguousness of slice at the compile time? probably this is true, and if so, I agree that TensorBase (or Storage should be better) should have a runtime function to check contiguousness.

For the name, my concern is that TensorBase in principle doesn't have constructors (except for copy & move) and TensorView or TensorSlice would also be a derived class from TensorBase, which has various constructors. So, TensorBase cannot be used explicitly by user but can be used as "const TensorBase<T> b = permute(a, {j, k, i})" (a is "const Tensor<T>&"), calling copy constructor of TensorBase which doesn't copy elements at this time.

Toru: Is this a right way? I think it should be "const TensorBase<T>& b = permute(a, {j, k, i})" but this is of course not allowed, or use of "const TensorBase<T>& b = a" makes sense to me.

I prefer to use "Tensor" rather than "DenseTensor" since it's simple. I also think that one way to have sparse tensor is to use Tensor of Tensor, hence Tensor is not really restricted to use as DenseTensor. I feel we will have Tensor and SparseTensor (and possibly BlockSparseTensor).

On Wed, Jun 25, 2014 at 12:14 AM, Miles [email protected] wrote:

At the moment most of the use cases for slicing I foresee in ITensor involve using slices/TensorBases as intermediate objects when get assigned to new dense Tensors. For these cases it would be great if the Tensor constructor automatically checks if TensorBase::contig()==true and if so does a straight std::copy of the memory.

While on the topic of naming, I'd suggest keeping the name of TensorView something like "TensorView", "TensorRef", "TensorSlice" which are descriptive of its non-owning behavior. The name "TensorBase" describes more about its implementation rather than its behavior or purpose.

— Reply to this email directly or view it on GitHub https://github.com/BTAS/BTAS/issues/44#issuecomment-46984431.

naokin avatar Jun 24 '14 16:06 naokin

Oh, ok I thought TensorBase was completely replacing TensorView - if it's going to be derived from TensorBase then the name sounds fine to me.

emstoudenmire avatar Jun 24 '14 17:06 emstoudenmire

@naokin I think what you wrote is fine with me.

shiozaki avatar Jun 25 '14 16:06 shiozaki