energy-based-scene-graph

Script to generate predicted graphs given the image

Open: aleSuglia opened this issue 3 years ago • 1 comment

Hi there,

Thanks for releasing your code to the public. I was wondering what the best way would be to define a predictor that, given an image, returns the predicted scene graph. How would you implement this? Following the test scripts, I can see that the inference method, together with compute_energy_on_dataset, is responsible for generating a predicted graph. However, the targets are always required as input. Is there any way to predict the graph without any supervision?

Also, on a side note, it looks like your code can only be run on a GPU with apex support. Is there any way to run it on a CPU as well?
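For context, the kind of fallback being asked about is usually written as below. This is a generic sketch and not taken from this repository: `has_module` and `choose_runtime` are hypothetical helper names, and whether the rest of the code actually runs without apex is exactly the open question.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def choose_runtime() -> tuple:
    """Return (device_name, use_apex) for the current machine (hypothetical helper)."""
    device_name = "cpu"
    if has_module("torch"):
        import torch  # imported lazily so the sketch also runs without PyTorch

        if torch.cuda.is_available():
            device_name = "cuda"
    # apex mixed precision only makes sense on a GPU, so it is disabled on CPU.
    use_apex = device_name == "cuda" and has_module("apex")
    return device_name, use_apex


device_name, use_apex = choose_runtime()
print(f"device={device_name}, apex={use_apex}")
```

With something like this, the model and batches would be moved with `.to(device_name)` and any apex-specific calls would be skipped when `use_apex` is false.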

Thanks, Alessandro

aleSuglia, Jul 13 '21 09:07

@mods333 Any updates on this? Unfortunately, I'm not able to infer all the elements of a scene graph from the released code. For instance, I'm trying to use the function detection2graph, but it does not seem to work with the current output of the model. I'm calling the model as follows:

images, targets, image_ids = batch
targets = [target.to(device) for target in targets]
output = base_model(images.to(device), targets)

However, output is a tuple with two elements:

  • list of length 1 with a BoxList object
  • torch.Tensor of shape (num detections, 4096)

Can you please explain how this output can be used to call the function above? In general, could you please clarify how to use this model in a real setup where no ground-truth information is provided?
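To make the structure concrete, the tuple described above can be unpacked roughly as follows. This is only a sketch under assumptions: `StubBoxList` is a hypothetical stand-in for the real BoxList (assumed here to support `len()`), nested lists stand in for the torch.Tensor, and `unpack_output` is an illustrative name. It does not show how to feed the result to detection2graph.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class StubBoxList:
    """Hypothetical stand-in for BoxList: holds one (x1, y1, x2, y2) box per detection."""
    bbox: List[Tuple[float, float, float, float]]

    def __len__(self) -> int:
        return len(self.bbox)


def unpack_output(output: Tuple[Sequence, Sequence]):
    """Split the (box_lists, features) tuple and check that the two parts agree."""
    box_lists, features = output
    if len(box_lists) != 1:
        raise ValueError(f"expected one BoxList per image, got {len(box_lists)}")
    detections = box_lists[0]
    # One feature row is expected per detected box.
    if len(features) != len(detections):
        raise ValueError("number of feature rows does not match number of boxes")
    return detections, features


# Dummy data shaped like the output above: 2 detections, 4096 features each.
dummy_boxes = StubBoxList([(0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 20.0, 20.0)])
dummy_features = [[0.0] * 4096 for _ in range(len(dummy_boxes))]
detections, features = unpack_output(([dummy_boxes], dummy_features))
print(len(detections), len(features[0]))
```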

aleSuglia, Jul 14 '21 11:07