
Using `tensor.untyped_storage()` in `InvertibleCheckpoint` class methods.

Open drivanov opened this issue 1 year ago • 27 comments

Description

As suggested in the following warning:

tests/python/pytorch/nn/test_nn.py::test_group_rev_res[idtype0]
  /usr/local/lib/python3.10/dist-packages/dgl/nn/pytorch/conv/grouprevres.py:35: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
    inputs[1].storage().resize_(0)

the use of `tensor.storage()` has been replaced by `tensor.untyped_storage()`.
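
A minimal sketch of how the call could stay compatible with PyTorch builds that do not provide `Tensor.untyped_storage()`, assuming the attribute is simply absent there. `free_tensor_storage` is a hypothetical helper name, not part of DGL or of this PR:

```python
import torch


def free_tensor_storage(tensor: torch.Tensor) -> None:
    """Release the memory backing `tensor` by resizing its storage to zero.

    Hypothetical helper: prefers `untyped_storage()` when the installed
    PyTorch provides it, otherwise falls back to the deprecated `storage()`.
    """
    get_storage = getattr(tensor, "untyped_storage", None)
    if get_storage is None:
        # Assumed fallback for PyTorch builds without `untyped_storage`.
        get_storage = tensor.storage
    get_storage().resize_(0)


# The tensor keeps its shape metadata, but its backing memory is released.
x = torch.randn(5, 32)
free_tensor_storage(x)
```

With such a helper, the line in `InvertibleCheckpoint.forward` would become `free_tensor_storage(inputs[1])`. Whether a fallback is preferable to raising the minimum supported PyTorch version is a project decision.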

Checklist

Please feel free to remove inapplicable items for your PR.

  • [x] I've leveraged the tools to beautify the Python and C++ code.
  • [x] The PR is complete and small. Read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code changes to be small (examples, tests, and documentation can be exempted).
  • [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

drivanov avatar Jan 09 '24 22:01 drivanov

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 09 '24 22:01 dgl-bot

Commit ID: 007ea93239a8572896445f17c0269125de946422

Build ID: 1

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 09 '24 22:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 10 '24 19:01 dgl-bot

Commit ID: ef5d37247990c634256aaae660d60d40cff9f835

Build ID: 2

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 10 '24 19:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 17 '24 01:01 dgl-bot

Commit ID: 4283dcbde21dd764d4d58f14acb91ffdfc767d15

Build ID: 3

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 17 '24 01:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 19 '24 19:01 dgl-bot

Commit ID: 6a74b84749870feb26c29eb53c2135652a705b9f

Build ID: 4

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 19 '24 19:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 22 '24 20:01 dgl-bot

Commit ID: 7cedd70913f465deb306501dbf0c80d53901d4ec

Build ID: 5

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 22 '24 20:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Jan 29 '24 18:01 dgl-bot

Commit ID: 9ea5525d855de7fca4e82eb4439fa4de92988f13

Build ID: 6

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Jan 29 '24 18:01 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Feb 01 '24 23:02 dgl-bot

Commit ID: c1f8acde1a0b50f26704985843cb671b15b30f99

Build ID: 7

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Feb 01 '24 23:02 dgl-bot

@dgl-bot

frozenbugs avatar Feb 09 '24 01:02 frozenbugs

Commit ID: 3bb0556fea621da60fd9498b83e20f63d71e0962

Build ID: 8

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

dgl-bot avatar Feb 09 '24 03:02 dgl-bot

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Feb 09 '24 17:02 dgl-bot

Commit ID: bb9d2da73d7e894c509f3c61f437532ca28c9bbc

Build ID: 9

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Feb 09 '24 17:02 dgl-bot

================================== FAILURES ===================================
_________________________ test_group_rev_res[idtype0] _________________________

idtype = torch.int32

    @parametrize_idtype
    def test_group_rev_res(idtype):
        dev = F.ctx()
    
        num_nodes = 5
        num_edges = 20
        feats = 32
        groups = 2
        g = dgl.rand_graph(num_nodes, num_edges).to(dev)
        h = th.randn(num_nodes, feats).to(dev)
        conv = nn.GraphConv(feats // groups, feats // groups)
        model = nn.GroupRevRes(conv, groups).to(dev)
>       result = model(g, h)

tests\python\pytorch\nn\test_nn.py:2287: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py:1194: in _call_impl
    return forward_call(*input, **kwargs)
python\dgl\nn\pytorch\conv\grouprevres.py:254: in forward
    *(args + tuple([p for p in self.parameters() if p.requires_grad]))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

ctx = <torch.autograd.function.InvertibleCheckpointBackward object at 0x00000052832EAE58>
fn = <bound method GroupRevRes._forward of GroupRevRes(
  (gnn_modules): ModuleList(
    (0): GraphConv(in=16, out=16, normalization=both, activation=None)
    (1): GraphConv(in=16, out=16, normalization=both, activation=None)
  )
)>
fn_inverse = <bound method GroupRevRes._inverse of GroupRevRes(
  (gnn_modules): ModuleList(
    (0): GraphConv(in=16, out=16, normalization=both, activation=None)
    (1): GraphConv(in=16, out=16, normalization=both, activation=None)
  )
)>
num_inputs = 2
inputs_and_weights = (Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[ 0.5982, -1.6816, -0.5572, ...ameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       requires_grad=True))
inputs = (Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[ 0.5982, -1.6816, -0.5572, ... 1.5027,  2.4316, -1.7318,  1.3843,
         -1.2294,  0.1610,  0.0136,  1.2388, -2.0080, -0.7917,  1.5043, -0.5614]]))
x = [Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[ 0.5982, -1.6816, -0.5572, ... 1.5027,  2.4316, -1.7318,  1.3843,
         -1.2294,  0.1610,  0.0136,  1.2388, -2.0080, -0.7917,  1.5043, -0.5614]])]
element = tensor([[ 0.5982, -1.6816, -0.5572, -1.2908, -1.7179, -1.3323,  0.8583, -0.6617,
          1.5634, -2.0408,  0.0528, -...  1.5027,  2.4316, -1.7318,  1.3843,
         -1.2294,  0.1610,  0.0136,  1.2388, -2.0080, -0.7917,  1.5043, -0.5614]])

    @staticmethod
    def forward(ctx, fn, fn_inverse, num_inputs, *inputs_and_weights):
        ctx.fn = fn
        ctx.fn_inverse = fn_inverse
        ctx.weights = inputs_and_weights[num_inputs:]
        inputs = inputs_and_weights[:num_inputs]
        ctx.input_requires_grad = []
    
        with torch.no_grad():
            # Make a detached copy, which shares the storage
            x = []
            for element in inputs:
                if isinstance(element, torch.Tensor):
                    x.append(element.detach())
                    ctx.input_requires_grad.append(element.requires_grad)
                else:
                    x.append(element)
                    ctx.input_requires_grad.append(None)
            # Detach the output, which then allows discarding the intermediary results
            outputs = ctx.fn(*x).detach_()
    
        # clear memory of input node features
>       inputs[1].untyped_storage().resize_(0)
E       AttributeError: 'Tensor' object has no attribute 'untyped_storage'

python\dgl\nn\pytorch\conv\grouprevres.py:35: AttributeError
_________________________ test_group_rev_res[idtype1] _________________________

idtype = torch.int64

    @parametrize_idtype
    def test_group_rev_res(idtype):
        dev = F.ctx()
    
        num_nodes = 5
        num_edges = 20
        feats = 32
        groups = 2
        g = dgl.rand_graph(num_nodes, num_edges).to(dev)
        h = th.randn(num_nodes, feats).to(dev)
        conv = nn.GraphConv(feats // groups, feats // groups)
        model = nn.GroupRevRes(conv, groups).to(dev)
>       result = model(g, h)

tests\python\pytorch\nn\test_nn.py:2287: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py:1194: in _call_impl
    return forward_call(*input, **kwargs)
python\dgl\nn\pytorch\conv\grouprevres.py:254: in forward
    *(args + tuple([p for p in self.parameters() if p.requires_grad]))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

ctx = <torch.autograd.function.InvertibleCheckpointBackward object at 0x000000528B7445E8>
fn = <bound method GroupRevRes._forward of GroupRevRes(
  (gnn_modules): ModuleList(
    (0): GraphConv(in=16, out=16, normalization=both, activation=None)
    (1): GraphConv(in=16, out=16, normalization=both, activation=None)
  )
)>
fn_inverse = <bound method GroupRevRes._inverse of GroupRevRes(
  (gnn_modules): ModuleList(
    (0): GraphConv(in=16, out=16, normalization=both, activation=None)
    (1): GraphConv(in=16, out=16, normalization=both, activation=None)
  )
)>
num_inputs = 2
inputs_and_weights = (Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[-2.0861e-01,  1.5159e+00, -...ameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       requires_grad=True))
inputs = (Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[-2.0861e-01,  1.5159e+00, -...75e-01,
         -1.4968e-02,  1.2952e+00, -1.2937e+00,  9.9673e-01, -1.9580e-01,
          1.3495e+00,  1.5232e+00]]))
x = [Graph(num_nodes=5, num_edges=20,
      ndata_schemes={}
      edata_schemes={}), tensor([[-2.0861e-01,  1.5159e+00, -...75e-01,
         -1.4968e-02,  1.2952e+00, -1.2937e+00,  9.9673e-01, -1.9580e-01,
          1.3495e+00,  1.5232e+00]])]
element = tensor([[-2.0861e-01,  1.5159e+00, -1.7376e+00, -7.2614e-01,  9.7599e-01,
          3.5593e-01,  4.2687e-01, -1.5182e-...275e-01,
         -1.4968e-02,  1.2952e+00, -1.2937e+00,  9.9673e-01, -1.9580e-01,
          1.3495e+00,  1.5232e+00]])

    @staticmethod
    def forward(ctx, fn, fn_inverse, num_inputs, *inputs_and_weights):
        ctx.fn = fn
        ctx.fn_inverse = fn_inverse
        ctx.weights = inputs_and_weights[num_inputs:]
        inputs = inputs_and_weights[:num_inputs]
        ctx.input_requires_grad = []
    
        with torch.no_grad():
            # Make a detached copy, which shares the storage
            x = []
            for element in inputs:
                if isinstance(element, torch.Tensor):
                    x.append(element.detach())
                    ctx.input_requires_grad.append(element.requires_grad)
                else:
                    x.append(element)
                    ctx.input_requires_grad.append(None)
            # Detach the output, which then allows discarding the intermediary results
            outputs = ctx.fn(*x).detach_()
    
        # clear memory of input node features
>       inputs[1].untyped_storage().resize_(0)
E       AttributeError: 'Tensor' object has no attribute 'untyped_storage'

@drivanov

frozenbugs avatar Feb 19 '24 02:02 frozenbugs

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Feb 21 '24 18:02 dgl-bot

Commit ID: 77473a7e386b0b22a2c42f53f9ed5ac9106b1cbc

Build ID: 10

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Feb 21 '24 18:02 dgl-bot

@dgl-bot

frozenbugs avatar Feb 22 '24 05:02 frozenbugs

Commit ID: b2dbdd4effbff8fabf9218588bfe91205309d116

Build ID: 11

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

dgl-bot avatar Feb 22 '24 05:02 dgl-bot

@frozenbugs: Sorry, I have no idea what is causing this problem. This is what I see in the debugger:

(Pdb) l
 31                 # Detach the output, which then allows discarding the intermediary results
 32                 outputs = ctx.fn(*x).detach_()
 33  
 34             # clear memory of input node features
 35             import pdb; pdb.set_trace()
 36  ->         inputs[1].untyped_storage().resize_(0)
 37  
 38             # store for backward pass
 39             ctx.inputs = [inputs]
 40             ctx.outputs = [outputs]
 41  
(Pdb) type(inputs[1])
<class 'torch.Tensor'>
(Pdb) hasattr(inputs[1], "untyped_storage")
True
(Pdb) 

The only explanation I can think of is a difference in Python versions. You are using Python 3.7:

C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py:1194: in _call_impl
    return forward_call(*input, **kwargs)

whereas in our container we are using Python 3.10:

(Pdb) u
> /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py(1511)_wrapped_call_impl()
-> return self._call_impl(*args, **kwargs)

BTW, the versions of `torch` may also be different. This is what we are using:

root@ea1fb332897e:/opt/dgl/qa/L0_python_unittests# pip list | grep torch
pytorch-quantization      2.1.2
torch                     2.3.0a0+ebedce2
torch-tensorrt            2.3.0a0
torchdata                 0.7.1a0
torchmetrics              1.3.1
torchtext                 0.17.0a0
torchvision               0.18.0a0

drivanov avatar Feb 26 '24 23:02 drivanov

I am not sure. If this is not urgent, let's table it.

frozenbugs avatar Feb 27 '24 08:02 frozenbugs

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

dgl-bot avatar Mar 07 '24 17:03 dgl-bot

Commit ID: c259d1ba49c179034a1abddf47083cd82c610d70

Build ID: 12

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot avatar Mar 07 '24 17:03 dgl-bot