Cytnx icon indicating copy to clipboard operation
Cytnx copied to clipboard

Refer to indices by cytnx_int64 or cytnx_uint64

Open manuschneider opened this issue 1 year ago • 6 comments

When referring to positive numbers like index or block numbers, the data structure is either a cytnx_int64 or a cytnx_uint64. For example, UniTensor.permute(std::vector<cytnx_int64>) and UniTensor.at(std::vector<cytnx_uint64>). For UniTensor.get_block() both versions exist and the difference (if any) is not obvious to me.

This is a bit confusing and inconstant. While it should not cause problems on the Python side, the mix leads to difficulties with C++ because the two types are not automatically converted, at least when combined in vectors.

Is there a reason for different conventions? Could everything be changed to one convention? How can this be done without breaking compatibility with existing code, tests, etc?

manuschneider avatar Aug 20 '24 17:08 manuschneider

There is. Originally some of the API are design for large Tensor. Meaning that if you don't need to support -1 for indexing like python, the C++ side can utilize full 64bit address indexing. the API that using int64 is only to make compatible with Python -1 indexing

kaihsin avatar Aug 21 '24 03:08 kaihsin

I see. But perumte does not work with negative indices anyways in the current implementation, at least for BlockUniTensor

manuschneider avatar Aug 21 '24 08:08 manuschneider

There is. Originally some of the API are design for large Tensor. Meaning that if you don't need to support -1 for indexing like python, the C++ side can utilize full 64bit address indexing. the API that using int64 is only to make compatible with Python -1 indexing

Can we do it at Pybind11 level to make C++ code consistent?

yingjerkao avatar Aug 29 '24 03:08 yingjerkao

Here is a crop from the Google C++ style guide that we apply. In short, only use unsigned integers when representing a bit pattern and use size_t and ptrdiff_t when appropriate.

Unsigned integers are good for representing bitfields and modular arithmetic. Because of historical accident, the C++ standard also uses unsigned integers to represent the size of containers - many members of the standards body believe this to be a mistake, but it is effectively impossible to fix at this point. The fact that unsigned arithmetic doesn't model the behavior of a simple integer, but is instead defined by the standard to model modular arithmetic (wrapping around on overflow/underflow), means that a significant class of bugs cannot be diagnosed by the compiler. In other cases, the defined behavior impedes optimization.

That said, mixing signedness of integer types is responsible for an equally large class of problems. The best advice we can provide: try to use iterators and containers rather than pointers and sizes, try not to mix signedness, and try to avoid unsigned types (except for representing bitfields or modular arithmetic). Do not use an unsigned type merely to assert that a variable is non-negative.

IvanaGyro avatar Sep 01 '24 03:09 IvanaGyro

I suggest use int (or some other signed type) for everything. But note that size_t is unsigned, so don't use that! use ssize_t.

C++20 has std::ssize() function for containers, so it is safe to write eg Container<T> v; for (auto i = std::ssize(v)-1; i > 0; --i) { ... }

The MPToolkit is currently C++14 and not ready to use C++20 yet (not enough compiler support), but meanwhile I'm just going to add some ssize() function myself.

In new code, my feeling is always use a signed integer, unless it is actually bit-flipping. Having a container using size_type as a signed type will work fine, and if you mix it with standard containers it will harmlessly convert to an unsigned type. But always use std::ssize() when getting the size of containers when it needs to work with std containers as well, and similarly wrap other existing uses of unsigned types.

See also: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf

Bjarne also advocates for making all .size() functions return a signed int: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1491r0.pdf

ianmccul avatar Sep 01 '24 05:09 ianmccul

related issue: #167

IvanaGyro avatar Nov 09 '24 17:11 IvanaGyro