Indexing rows and columns at the same time

Open · Chrixtar opened this issue 2 years ago · 4 comments

Hello everyone,

First up: I tried both torch.sparse and pytorch_sparse for my project and prefer the latter any day. So keep up the good work!

However, I noticed that indexing a tensor t with two LongTensors r (rows) and c (columns), i.e. t[r, c], deviates from PyTorch's indexing of dense tensors. I expected to obtain the values at the coordinates given by corresponding entries of the two index tensors, i.e., a 1D torch.Tensor of the same length as the index tensors. Instead, the output corresponds to what t[r][:, c] would give on a torch.Tensor. Is there any fast way of indexing as described above (with autograd support)?
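For a concrete illustration of the difference on a small dense tensor (values chosen arbitrarily):

import torch

t = torch.arange(12.).view(3, 4)
r = torch.tensor([0, 2])
c = torch.tensor([1, 3])

t[r, c]     # tensor([ 1., 11.])  -> one value per (r[i], c[i]) pair
t[r][:, c]  # tensor([[ 1.,  3.], [ 9., 11.]])  -> rows r crossed with columns c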

If it helps: In my case, I know for sure that all these coordinates are nonzero in the indexed tensor.

Thank you very much for your time.

Chrixtar avatar Mar 22 '22 18:03 Chrixtar

Thank you for the kind words. Our ultimate long-term goal is to merge the functionality of pytorch_sparse into torch.sparse to make it better :)

You are right that indexing deviates a bit from regular indexing on dense tensors. Indexing a SparseTensor via 2D indices is typically quite inefficient, since one needs to map the dense index to a sparse one. One trivial implementation I can think of is:

  • Get the offsets via rowptr[r]
  • For each offset and each element in c, check whether c exists between rowptr[r] and rowptr[r + 1]
  • Add the indices to the original offsets
  • Index select the original sparse entries

I am not sure whether there exists a pure PyTorch solution to compute this efficiently. I feel we would need our own C++/CUDA implementation (e.g., via a segment_search) to implement this efficiently.
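A rough (and deliberately naive) sketch of these steps in plain PyTorch might look as follows; the helper name index_pairs is made up for illustration, and a real implementation would vectorize the inner search or move it to C++/CUDA:

import torch
from torch_sparse import SparseTensor

def index_pairs(t: SparseTensor, r: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    rowptr, col, val = t.csr()  # column indices are sorted within each row
    out = []
    for ri, ci in zip(r.tolist(), c.tolist()):
        start, end = int(rowptr[ri]), int(rowptr[ri + 1])  # offsets of row ri
        # Look for column ci among the stored columns of row ri:
        pos = (col[start:end] == ci).nonzero(as_tuple=True)[0]
        out.append(val[start + pos])  # index-select the original sparse entries
    return torch.cat(out)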

rusty1s avatar Mar 23 '22 16:03 rusty1s

Thanks for your quick answer. I think aligning the indexing with the regular one on dense tensors is definitely important with regard to your "ultimate long-term goal".

In my case, I noticed that I can compute a mapping very quickly by exploiting the fact that my SparseTensor is a sorted block-diagonal matrix. However, for one special case, I have now settled for the following trivial but inefficient implementation of 2D indexing:

import torch
from torch import Tensor
from torch_sparse import SparseTensor

def edge_index_select(t: SparseTensor, query_row: Tensor, query_col: Tensor) -> Tensor:
    row, col, val = t.coo()
    # Compare every stored (row, col) pair against every query pair.
    row_mask = row == query_row.view(-1, 1)
    col_mask = col == query_col.view(-1, 1)
    mask = torch.max(torch.logical_and(row_mask, col_mask), dim=0).values
    return val[mask]  # values come back in nnz order, not query order
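For illustration, a tiny made-up example of calling it (note that the values come back in the order in which they are stored, which here happens to match the query order):

import torch
from torch_sparse import SparseTensor

row = torch.tensor([0, 0, 1, 2])
col = torch.tensor([1, 3, 2, 0])
val = torch.tensor([10., 20., 30., 40.])
t = SparseTensor(row=row, col=col, value=val, sparse_sizes=(3, 4))

edge_index_select(t, torch.tensor([0, 2]), torch.tensor([3, 0]))  # tensor([20., 40.])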

The second step of your suggestion would probably result in a similar computational effort, right?

Chrixtar avatar Mar 23 '22 19:03 Chrixtar

Yeah, this is somewhat inefficient since you will search the complete set of non-zero indices to find the correct index. I think one could speed this up via torch.searchsorted. WDYT? Happy to take a PR on this one :)
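For reference, one way torch.searchsorted could be applied here is to flatten each (row, col) pair into a single scalar key; since SparseTensor keeps its non-zero entries sorted in row-major order, the keys are already sorted and a single binary search per query suffices. This is only a sketch, relying on the assumption stated above that every queried coordinate is actually a stored entry:

import torch
from torch_sparse import SparseTensor

def edge_index_select_sorted(t: SparseTensor, query_row: torch.Tensor,
                             query_col: torch.Tensor) -> torch.Tensor:
    row, col, val = t.coo()
    num_cols = t.sparse_sizes()[1]
    # Flatten (row, col) pairs into scalar keys; COO order is row-major,
    # so `keys` is already sorted and torch.searchsorted applies directly.
    keys = row * num_cols + col
    query_keys = query_row * num_cols + query_col
    pos = torch.searchsorted(keys, query_keys)
    # Assumes every queried coordinate corresponds to a stored entry;
    # otherwise one would need to check keys[pos] == query_keys first.
    return val[pos]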

rusty1s avatar Mar 24 '22 13:03 rusty1s

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?

github-actions[bot] avatar Sep 21 '22 02:09 github-actions[bot]