ng-video-lecture icon indicating copy to clipboard operation
ng-video-lecture copied to clipboard

bug?: m vs model

Open freestylerick opened this issue 1 year ago • 6 comments

I'm curious if gpt.py is buggy (which is my guess and gpt-4's guess as well https://chat.openai.com/share/b50316a4-0f63-4813-8888-9cb3ca68b7f1) or why it's not.

On line 199 of gpt.py we define m. We then use m and model when I think we should be using only one of them (I think 199 should be model = ... and then we only use model. There might be some spots where we have to move tensors so everything is on the same device).

freestylerick avatar May 31 '23 17:05 freestylerick

I agree with your point, especially in some cases. However, this is an issue of coding style preference. Please close the issue.

InfiniteRandomVariable avatar Jun 27 '23 04:06 InfiniteRandomVariable

To be clear - I am asking if this is a bug, not simply a question of style. Are you sure it's not a bug?

freestylerick avatar Jun 30 '23 01:06 freestylerick

I wondered this also. The documentation for to (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.to) has a note that says "This method modifies the module in-place" so I believe this bug doesn't impact the behavior of the code.

lincolnq avatar Jul 10 '23 17:07 lincolnq

Thanks - I think this would benefit from a unit test.

freestylerick avatar Jul 11 '23 00:07 freestylerick

I am asking if this is a bug, not simply a question of style. Are you sure it's not a bug?

You could verify that it is not really a bug (but a bit sloppy) by something like this:

In [1]: import torch

In [2]: model = torch.nn.Linear(2, 1)

In [3]: model.weight.device
Out[3]: device(type='cpu')

In [4]: m = model.to("cuda")

In [5]: model.weight.device
Out[5]: device(type='cuda', index=0)

In [6]: id(model)
Out[6]: 140022542389776

In [7]: id(m)
Out[7]: 140022542389776

In [8]: model is m
Out[8]: True

bluenote10 avatar Feb 08 '24 19:02 bluenote10

In a perfect world, I'd also like to test that a change on m also propagates to model (and/or vise versa). If someone does this, then I will agree that they are very likely the same.

freestylerick avatar Mar 15 '24 16:03 freestylerick