
Figures 2 and 3 of the supplementary material

mANDm1412 opened this issue • 2 comments

Hi, I would greatly appreciate it if you could share the code you used for the visualizations in Figures 2 and 3 of the supplementary material (the trajectory-guided attention maps, Grad-CAM, and Eigen-CAM). Thank you very much in advance!

mANDm1412 • Jul 12 '23 14:07

Hi, I can provide some information on how to get those visualizations. For Eigen-CAM, I apply the following function to the 2D feature map from the vision backbone and then visualize the result with the show_cam_on_image function from pytorch-grad-cam:

import numpy as np

def get_2d_projection(activation_batch):
    # Project each activation map onto its first principal component
    # (this is essentially the Eigen-CAM computation).
    # TBD: use pytorch batch svd implementation
    activation_batch[np.isnan(activation_batch)] = 0
    projections = []
    for activations in activation_batch:
        reshaped_activations = (activations).reshape(
            activations.shape[0], -1).transpose()
        # Centering before the SVD seems to be important here,
        # otherwise the image returned is negative
        reshaped_activations = reshaped_activations - \
            reshaped_activations.mean(axis=0)
        U, S, VT = np.linalg.svd(reshaped_activations, full_matrices=True)
        projection = reshaped_activations @ VT[0, :]
        projection = projection.reshape(activations.shape[1:])
        projection = np.abs(projection)
        max_v, min_v = np.max(projection), np.min(projection)
        if max_v != min_v:
            projection = (projection - min_v) / (max_v - min_v)

        projections.append(projection)
    return np.float32(projections)
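As a rough usage sketch (not the exact script; feature_map and rgb_img are placeholder names for the backbone feature map of shape (1, C, H, W) and the input frame normalized to [0, 1]):

import cv2
from pytorch_grad_cam.utils.image import show_cam_on_image

# feature_map: numpy array (1, C, H, W) taken from the vision backbone (placeholder name)
cam_mask = get_2d_projection(feature_map)[0]                       # (H, W), values in [0, 1]
cam_mask = cv2.resize(cam_mask, (rgb_img.shape[1], rgb_img.shape[0]))
overlay = show_cam_on_image(rgb_img, cam_mask, use_rgb=True)       # rgb_img: float image in [0, 1]
cv2.imwrite('eigen_cam.png', cv2.cvtColor(overlay, cv2.COLOR_RGB2BGR))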

For Grad-CAM, I use the GradCAM implementation in pytorch-grad-cam with a customized target. Note that the pytorch-grad-cam repo only supports a single input tensor, so I keep the other inputs (e.g., velocity and conditioning) as attributes of the model so they can be used in the forward pass, and update them before calling Grad-CAM. Something like: model.velocity = velocity, then gradcam(model, image); see the sketch after the Target class below.

import torch
from torch.distributions import Beta

class Target:
    def __init__(self, gt):
        # Ground-truth action distribution (Beta parameters from the dataset)
        self.dist_sup = Beta(gt['action_mu'].cuda(), gt['action_sigma'].cuda())

    def __call__(self, model_output):
        model_output = model_output.unsqueeze(0)
        mu = model_output[:, :2]
        sigma = model_output[:, 2:]
        dist_pred = Beta(mu, sigma)
        kl_div = torch.distributions.kl_divergence(self.dist_sup, dist_pred)
        # Negative KL so that a larger target score means a closer match
        # between the predicted and ground-truth action distributions.
        return -1 * (torch.mean(kl_div[:, 0]) * 0.5 + torch.mean(kl_div[:, 1]) * 0.5)
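A rough sketch of how this can be wired together (gt, velocity, image, rgb_img, and the choice of target_layers are placeholders that depend on the model; only the GradCAM and show_cam_on_image calls come from pytorch-grad-cam):

from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image

# Extra inputs are attached as attributes so the model's forward()
# can read them even though GradCAM only passes the image tensor.
model.velocity = velocity                     # placeholder for the measured-speed input
target_layers = [model.encoder.layer4]        # placeholder: pick the last conv block of your backbone

cam = GradCAM(model=model, target_layers=target_layers)
grayscale_cam = cam(input_tensor=image, targets=[Target(gt)])[0]   # (H, W), scaled to the input size
overlay = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)  # rgb_img: float image in [0, 1]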

For the visualization in Fig. 2, you can just resize the trajectory-guided attention map wp_att produced during model inference and overlay it on the input image.
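A minimal sketch of that step, assuming wp_att is a single-channel attention map tensor of shape (h, w) and rgb_img is the input frame in [0, 1] (both names, and the exact shape/normalization, are assumptions):

import cv2
from pytorch_grad_cam.utils.image import show_cam_on_image

att = wp_att.detach().cpu().numpy()                          # (h, w) attention map from inference
att = (att - att.min()) / (att.max() - att.min() + 1e-8)     # normalize to [0, 1]
att = cv2.resize(att, (rgb_img.shape[1], rgb_img.shape[0]))  # match the input image resolution
overlay = show_cam_on_image(rgb_img, att, use_rgb=True)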

penghao-wu • Jul 20 '23 15:07

Hi @mANDm1412, were you able to reproduce the Fig. 2 results from the supplementary material?

anantagrg • Jun 20 '24 06:06