about the NWPU dataset
Thanks for your excellent work! How did you handle images that contain no person? I did not find the relevant code in either SHA.py or preprocess_label.py. I saw in another issue that you mentioned this situation may have a negative effect on the final result, so how do you deal with this kind of training image? Do you simply delete them from the training set?
Currently, in the compute_density() method I directly set the computed GT density to a clone of the input, and set the distance to 999 just as you do in the other cases, but the training result is very bad: I got an MAE of 406.
Regarding images with no person, there are three ways to process them during training:

- If the number of empty images is relatively small, you may treat these images as normal images and crop patches to train the model. In this case, patches with no person function as negative samples, which helps alleviate over-estimation.
- If the number of empty images is quite large, you may adopt a sampling strategy during training, e.g., sampling empty images with a certain probability (see the sketch below).
- If you do not care about over-estimation, you may simply ignore empty images.
For the NWPU-Crowd dataset, the second way could be a good choice, given that there are many empty images in this dataset.
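As an illustration of the second strategy, here is a minimal sketch (not from the repository) that down-weights empty images with PyTorch's WeightedRandomSampler. It assumes a dataset like the NWPU class shared below (exposing img_list and gt_list), and that each official NWPU .json file stores its annotations under a "points" key; empty_weight is a made-up knob:

```python
import json

from torch.utils.data import DataLoader, WeightedRandomSampler

def build_sampler(dataset, empty_weight=0.2):
    # Give empty images a smaller sampling weight so they are drawn
    # with reduced probability; non-empty images keep weight 1.0.
    weights = []
    for img_path in dataset.img_list:
        with open(dataset.gt_list[img_path], "r") as f:
            n_points = len(json.load(f)["points"])  # assumed annotation key
        weights.append(empty_weight if n_points == 0 else 1.0)
    return WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)

# usage sketch:
# loader = DataLoader(train_set, batch_size=8, sampler=build_sampler(train_set))
```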
Did you correctly process the annotations? An MAE of 406 is abnormal. Additionally, it would be helpful if you could share your training setting.
Here is the training setting I used:
```bash
CUDA_VISIBLE_DEVICES='0' \
python -m torch.distributed.launch \
    --nproc_per_node=1 \
    --master_port=10001 \
    --use_env main.py \
    --lr=0.00001 \
    --backbone="vgg16_bn" \
    --ce_loss_coef=1.0 \
    --point_loss_coef=5.0 \
    --eos_coef=0.5 \
    --dec_layers=2 \
    --hidden_dim=256 \
    --dim_feedforward=512 \
    --nheads=8 \
    --dropout=0.0 \
    --epochs=1500 \
    --dataset_file="NWPU" \
    --eval_freq=5 \
    --batch_size=8 \
    --output_dir='pet_model'
```
I processed the dataset using the code you provided in preprocess_dataset.py, and customized SHA.py to adapt it to the NWPU dataset. Specifically, in the compute_density() method, I added some code to deal with empty images, as follows:
```python
import os
import random

import numpy as np
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

# load_data and random_crop are the helper functions from SHA.py


class NWPU(Dataset):
    def __init__(self, data_root, transform=None, train=False, flip=False):
        self.root_path = data_root

        # read the split file and collect the image names of this split
        prefix = "train" if train else "val"
        file_list = f"{data_root}/{prefix}.txt"
        with open(file_list, "r") as f:
            f_list = f.readlines()
        name_list = [line.split(' ')[0] + '.jpg' for line in f_list]

        self.prefix = prefix
        self.img_list = os.listdir(f"{data_root}/images/")

        # get image and ground-truth list
        self.gt_list = {}
        for img_name in self.img_list:
            if img_name not in name_list:
                continue
            img_path = f"{data_root}/images/{img_name}"
            gt_path = f"{data_root}/jsons/{img_name}"
            self.gt_list[img_path] = gt_path.replace("jpg", "json")
        self.img_list = sorted(list(self.gt_list.keys()))
        self.nSamples = len(self.img_list)

        self.transform = transform
        self.train = train
        self.flip = flip
        self.patch_size = 256

    def compute_density(self, points):
        """
        Compute crowd density:
            - defined as the average nearest distance between ground-truth points
        """
        points_tensor = torch.from_numpy(points.copy())
        # empty image: fall back to a large constant distance
        if points_tensor.shape[0] == 0:
            return torch.tensor(999.0).reshape(-1)
        dist = torch.cdist(points_tensor, points_tensor, p=2)
        if points_tensor.shape[0] > 1:
            density = dist.sort(dim=1)[0][:, 1].mean().reshape(-1)
        else:
            density = torch.tensor(999.0).reshape(-1)
        return density

    def __len__(self):
        return self.nSamples

    def __getitem__(self, index):
        assert index < len(self), 'index range error'

        # load image and gt points
        img_path = self.img_list[index]
        gt_path = self.gt_list[img_path]
        img, points = load_data((img_path, gt_path), self.train)
        points = np.array(points).astype(float)

        # image transform
        if self.transform is not None:
            img = self.transform(img)
        img = torch.Tensor(img)

        # random scale
        if self.train:
            scale_range = [0.8, 1.2]
            min_size = min(img.shape[1:])
            scale = random.uniform(*scale_range)
            # interpolate only if the scaled image is still larger than a patch
            if scale * min_size > self.patch_size:
                img = F.interpolate(img.unsqueeze(0), scale_factor=scale,
                                    mode='bilinear', align_corners=True).squeeze(0)
                points *= scale

        # random crop patch
        if self.train:
            img, points = random_crop(img, points, patch_size=self.patch_size)

        # random horizontal flip
        if random.random() > 0.5 and self.train and self.flip:
            img = torch.flip(img, dims=[2])
            if len(points) != 0:
                points[:, 1] = self.patch_size - points[:, 1]

        # target
        target = {}
        target['points'] = torch.Tensor(points)
        target['labels'] = torch.ones([points.shape[0]]).long()
        if self.train:
            target['density'] = self.compute_density(points)
        else:
            target['image_path'] = img_path

        return img, target
```
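For completeness, the class above would typically be wired up along these lines (a sketch; the transform values, the data root, and the zip-based collate are my assumptions, not the repository's exact code):

```python
import torchvision.transforms as T
from torch.utils.data import DataLoader

transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_set = NWPU("data/NWPU-Crowd", transform=transform, train=True, flip=True)
# Targets hold a variable number of points per image, so batch them as
# tuples instead of stacking into a single tensor.
train_loader = DataLoader(train_set, batch_size=8, shuffle=True,
                          collate_fn=lambda batch: tuple(zip(*batch)))
```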
Also, I found that sometimes the self-attention layer may output results that contain NaN.
The training setting seems fine. Could you confirm that the format of point annotations is (y, x) instead of (x, y)? Wrong annotation format will lead to erroneous model outputs.
I directly load the annotations from the original .json files of the official NWPU dataset without any changes.
The training setting seems fine. Could you confirm that the format of point annotations is (y, x) instead of (x, y)? Wrong annotation format will lead to erroneous model outputs.
I directly load the annotations from original .json files of the official NWPU dataset without any changes
What about the load_data function in the NWPU class? There is a flip operation there to ensure that the data format is (y, x). Perhaps you did not follow this format, which leads to abnormal model outputs.
You can visualize the outputs of your trained model and check whether the predictions are reasonable.
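For reference, a minimal sketch of what such a load_data could look like, assuming the official NWPU .json stores a "points" list in (x, y) order (this is illustrative, not the repository's exact code):

```python
import json

import numpy as np
from PIL import Image

def load_data(img_gt_path, train):
    img_path, gt_path = img_gt_path
    img = Image.open(img_path).convert('RGB')
    with open(gt_path, 'r') as f:
        anno = json.load(f)
    # official annotations are (x, y); reshape also handles empty images
    points = np.array(anno['points'], dtype=float).reshape(-1, 2)
    points = points[:, ::-1]  # flip each point to (y, x), the order the model expects
    return img, points
```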
That was it. Now the MSE reached 88.702 at epoch 224. Thank you very much.
I am glad to see that the issue has been resolved.
May I ask at which epoch you got the best MAE on the NWPU dataset?
I do not recall the precise epoch with the best MAE, but the model should be okay for testing if the validation MAE is around 50.
I got an MSE of around 80; it seems there are still some problems, maybe something wrong with the data loading process. It would be helpful if you could provide your NWPU class.
May I ask what result you got on the Shanghai_A dataset?
MAE = 49.901 for Part A and MAE = 6.639 for Part B.
Did you tune the parameters? My recent runs keep landing around 52 to 53, and I am not sure why.
I used the same training settings as the author's. Maybe you can increase the total number of training epochs from 1500 to 3000.
Thanks. One more question: which Python version, PyTorch version, and GPU did you use for your runs?
One 3090 GPU, Python 3.9.5 with torch 2.6.0+cu124.