
Question about evaluation metric (LPIPS) and results.

Open dogyoonlee opened this issue 2 years ago • 3 comments

I appreciate your awesome work!

I have trained the lego scene (blender) with the blender_512.gin file.

I rendered and evaluated following the included scripts, but I have questions about computing the final quantitative evaluation results.

First, there is no LPIPS metric in this code, and the metric was similarly removed from the jaxnerf code as well.

Is there any reason for removing the LPIPS metric from the code, while it is still used to compare the performance of NeRF-like projects in most papers?

If I want to evaluate LPIPS performance, is it okay to use the LPIPS code or library that many people are using?

Second, when I conduct the evaluation using the eval.py file, the PSNR and SSIM are computed per image.

During evaluation, the results are not logged to an additional text file.

If I want to log them, should I just set eval_only_once to False?

Third, the average scores of the metrics are not computed or saved as representative performance results in your code.

If I want to compare the average results of my trained model with the results in your paper, should I just compute the average of the per-image scores?

Thank you for your attention!

dogyoonlee avatar Apr 24 '23 11:04 dogyoonlee

Hello, may I ask if your question has been resolved? Could you please discuss how to calculate the LPIPS metric?

Iyeu avatar Oct 02 '24 11:10 Iyeu

You can easily extend the evaluation code to compute various image quality metrics by modifying the __call__ function defined in the MetricHarness class (in multinerf/internal/image.py). This codebase uses jax, so you need to convert jax arrays to other formats first if you want to use open-source LPIPS libraries. For example, the widely used PyTorch LPIPS package accepts torch.Tensor as input, and the data needs to be in the range [-1, 1].

For logging, I think you can just add a couple of lines in eval.py to write the metric values to a text file.
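
For instance, a minimal sketch of such logging (the names out_dir, image_idx, and metric_dict are placeholders, not actual variables from eval.py; adapt them to whatever eval.py already uses):

# Minimal logging sketch, not part of the repository.
import os

def log_metrics_to_file(out_dir, image_idx, metric_dict):
  """Append one line of metric values for a single test image."""
  path = os.path.join(out_dir, 'metrics.txt')
  with open(path, 'a') as f:
    values = ' '.join(f'{k}={v:.4f}' for k, v in metric_dict.items())
    f.write(f'image {image_idx}: {values}\n')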

bchao1 avatar Dec 14 '24 08:12 bchao1

Please refer to the following code snippet, which extends the MetricHarness class to compute custom metrics. The modifications will be picked up directly when you execute the eval.py script.

import lpips  # https://github.com/richzhang/PerceptualSimilarity
import numpy as np
import torch
# jax, dm_pix, and mse_to_psnr are already available in multinerf/internal/image.py.

class MetricHarness:
  """A helper class for evaluating several error metrics."""
  # Build the LPIPS network once here so it can be reused for every image.
  def __init__(self):
    self.ssim_fn = jax.jit(dm_pix.ssim)
    self.lpips_fn = lpips.LPIPS(net="vgg").cuda()  # VGG backbone on the GPU
  
  def compute_lpips(self, rgb_pred, rgb_gt):
    """Compute LPIPS between a predicted and a ground-truth image in [0, 1]."""
    with torch.no_grad():
      # Convert jax arrays to writeable numpy arrays before handing them to torch.
      rgb_pred = np.array(rgb_pred, copy=True)
      rgb_gt = np.array(rgb_gt, copy=True)
      # [H, W, 3] -> [1, 3, H, W] float tensors on the GPU.
      rgb_pred = torch.tensor(rgb_pred).permute(2, 0, 1).unsqueeze(0).cuda().float()
      rgb_gt = torch.tensor(rgb_gt).permute(2, 0, 1).unsqueeze(0).cuda().float()
      # Normalize [0, 1] to [-1, 1], the range expected by the LPIPS network.
      rgb_pred = (rgb_pred * 2) - 1
      rgb_gt = (rgb_gt * 2) - 1
      return self.lpips_fn(rgb_pred, rgb_gt).item()

  def __call__(self, rgb_pred, rgb_gt, name_fn=lambda s: s):
    """Evaluate the error between a predicted rgb image and the true image."""
    psnr = float(mse_to_psnr(((rgb_pred - rgb_gt)**2).mean()))
    ssim = float(self.ssim_fn(rgb_pred, rgb_gt))
    # Use a distinct name so the imported lpips module is not shadowed.
    lpips_value = float(self.compute_lpips(rgb_pred, rgb_gt))

    return {
        name_fn('psnr'): psnr,
        name_fn('ssim'): ssim,
        name_fn('lpips'): lpips_value,
    }
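
As a quick sanity check, here is a minimal usage sketch (assumed, not part of eval.py; the random arrays are placeholder images in [0, 1], and a GPU is required for the .cuda() calls above):

# Usage sketch with placeholder images, not actual eval.py code.
import numpy as np

metric_harness = MetricHarness()
rgb_pred = np.random.rand(800, 800, 3).astype(np.float32)  # placeholder prediction
rgb_gt = np.random.rand(800, 800, 3).astype(np.float32)    # placeholder ground truth
print(metric_harness(rgb_pred, rgb_gt))  # {'psnr': ..., 'ssim': ..., 'lpips': ...}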

bchao1 avatar Dec 23 '24 00:12 bchao1