Person-reid-GAN-pytorch

Hello, please tell me how to use generated samples for training

Open liuxiuxuhaodong opened this issue 6 years ago • 13 comments

I used the Market-1501 dataset to train DCGAN and generated a number of pictures, but I do not know how to train the baseline with the generated samples, for example how to name the generated pictures and which class to assign them. Please tell me, thank you.

liuxiuxuhaodong • Jun 25 '18 14:06

Hi, try reading the source code of train_baseline.py and prepare.py. You do not need to understand every detail of the source code; just find out how the model reads the dataset (the path). It's not difficult. Best wishes!

qiaoguan • Jun 25 '18 14:06

Well, I did that, but when I run train_baseline.py I get the following error:
Traceback (most recent call last):
  File "/home/dl/Person-reid-GAN/train_baseline.py", line 339, in <module>
    os.mkdir(dir_name)
FileNotFoundError: [Errno 2] No such file or directory: './model/ft_DesNet121'

liuxiuxuhaodong • Jun 26 '18 07:06

Just create a new folder named model in the working directory; os.mkdir does not create missing parent directories.
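A minimal sketch of doing the same thing from Python, assuming the script may be edited. The helper name `ensure_dir` is hypothetical and not part of the repository; `os.makedirs` with `exist_ok=True` creates any missing parent folders and does not fail on reruns. A temporary base directory stands in for the project root here.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` and any missing parent directories (hypothetical helper)."""
    # exist_ok=True makes the call safe when the folder already exists.
    os.makedirs(path, exist_ok=True)
    return path


# A temporary directory stands in for the project root in this sketch;
# in train_baseline.py the target would be './model/ft_DesNet121'.
base = tempfile.mkdtemp()
dir_name = ensure_dir(os.path.join(base, "model", "ft_DesNet121"))
print(dir_name)
```

Replacing `os.mkdir(dir_name)` with a call like this would remove the need to create `model` by hand, at the cost of a small change to the script.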

qiaoguan • Jun 26 '18 07:06

Thank you very much!

liuxiuxuhaodong • Jun 26 '18 08:06

Excuse me. After creating a new folder named model, I ran the script again and hit a new error:
RuntimeError: cuda runtime error (10) : invalid device ordinal at torch/csrc/cuda/Module.cpp:84
I looked for solutions online and found a blog author who hit a similar problem: https://blog.csdn.net/shincling/article/details/78919282. But I am only just starting to learn PyTorch. Could you help me resolve this problem?

liuxiuxuhaodong • Jun 26 '18 09:06

My computer has only one GPU.

liuxiuxuhaodong • Jun 26 '18 09:06

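Just change it to single-GPU training mode: use torch.cuda.set_device(gpu_ids[0]) and delete the line model=nn.DataParallel(model,device_ids=[0,1,2]) # multi-GPU. The "invalid device ordinal" error appears because that line asks for GPUs 1 and 2, which do not exist on a single-GPU machine. For more details, you can search the internet.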
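As a rough sketch of the same idea, the requested GPU indices can be filtered against the number of devices actually present before anything is passed to PyTorch. The helper `usable_gpu_ids` is a hypothetical name, not part of the repository; `torch.cuda.is_available`, `torch.cuda.device_count`, and `torch.cuda.set_device` are existing PyTorch calls. The PyTorch part is guarded so the snippet also runs on a machine without PyTorch or without a GPU.

```python
def usable_gpu_ids(requested_ids, device_count):
    """Keep only the requested GPU indices that exist on this machine."""
    return [i for i in requested_ids if 0 <= i < device_count]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None  # sketch still runs without PyTorch installed

    if torch is not None and torch.cuda.is_available():
        # [0, 1, 2] mirrors the device_ids in the line the answer says to delete.
        gpu_ids = usable_gpu_ids([0, 1, 2], torch.cuda.device_count())
        torch.cuda.set_device(gpu_ids[0])
        # Wrapping the model in nn.DataParallel only makes sense
        # when more than one index survives the filter.
        print("using GPUs:", gpu_ids)
    else:
        print("no CUDA device available; training would run on CPU")
```

On a one-GPU machine the filter leaves only index 0, so the script would fall through to plain single-GPU training.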

qiaoguan • Jul 05 '18 08:07

Thank you very much. I have solved my problem and the script now runs successfully.

liuxiuxuhaodong • Jul 05 '18 08:07

I am sorry to trouble you again. I ran train_baseline.py successfully a few days ago, but I hit a new problem when I ran it again. I read your demo but could not find a solution. The output is as follows:
train Loss: 291.3983 Acc: 0.0109
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [0,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [2,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [6,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [7,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [10,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [11,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generated/../THCReduceAll.cuh line=339 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "train_baseline.py", line 346, in <module>
    num_epochs=130)
  File "train_baseline.py", line 246, in train_model
    loss = criterion(outputs,labels,flags)
  File "/home/dl/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "train_baseline.py", line 173, in forward
    return loss.mean()
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generated/../THCReduceAll.cuh:339
I did not modify your code, so I am a little confused.

liuxiuxuhaodong • Jul 07 '18 06:07

I have solved it, thank you.

liuxiuxuhaodong • Jul 07 '18 07:07

I ran into it too. How did you solve it?

flychen321 • Jul 07 '18 09:07

Me too. Has anyone solved it?

Vincy-L • May 01 '19 07:05

Can you tell me how to solve the device-side assert error above? Thank you.

Vincy-L • May 01 '19 08:05
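The thread does not record how the device-side assert was resolved. The assertion text (indexValue >= 0 && indexValue < src.sizes[dim]) says that a gather index fell outside the tensor dimension, and one common cause is a class label that is negative or not smaller than the number of model outputs. The sketch below is one way to check for that before computing the loss; `out_of_range_labels` is a hypothetical helper, and the value 751 (the number of training identities in Market-1501) is an assumption about how the classifier was sized.

```python
def out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    return sorted({int(l) for l in labels if not 0 <= int(l) < num_classes})


# Plain lists are used here; for a PyTorch batch, pass labels.tolist().
num_classes = 751  # assumption: classifier sized for Market-1501 training IDs
batch_labels = [0, 5, 750, 751, -1]

bad = out_of_range_labels(batch_labels, num_classes)
if bad:
    print("labels outside [0, %d): %s" % (num_classes, bad))
```

If such labels turn up, the folder names or class count produced during dataset preparation are the first thing to inspect. Running the script with the environment variable CUDA_LAUNCH_BLOCKING=1, or briefly on the CPU, usually makes the traceback point at the exact failing line.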