autodistill-grounding-dino Format of labels is xyxy


Hi,

I am using autodistill-grounding-dino to annotate images for YOLOv8 training. The YOLOv8 model expects labels in the format (class x_center y_center width height).

Is the format returned by autodistill-grounding-dino the same, or is it (xyxy width height)?
Are the labels normalized? I get an out-of-bounds labels error while training YOLOv8.
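
For anyone hitting the same out-of-bounds error, one quick way to narrow it down is to scan the generated label files for values outside the 0-1 range. The following is a minimal sketch in plain Python, assuming detection labels with five whitespace-separated fields per line (class x_center y_center width height). The function name `find_bad_yolo_lines` is hypothetical and not part of Autodistill or YOLOv8.

```python
def find_bad_yolo_lines(lines):
    """Return (line_number, reason) pairs for lines that are not valid YOLO box labels.

    Assumes detection labels: class x_center y_center width height,
    with the four coordinates normalized to the range [0, 1].
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            problems.append((number, "expected 5 fields"))
            continue
        try:
            coords = [float(value) for value in fields[1:]]
        except ValueError:
            problems.append((number, "non-numeric coordinate"))
            continue
        if any(c < 0.0 or c > 1.0 for c in coords):
            problems.append((number, "coordinate outside [0, 1]"))
    return problems


# Example: the second line holds pixel values instead of normalized ones.
sample = ["0 0.5 0.5 0.2 0.4", "0 320 240 128 192"]
print(find_bad_yolo_lines(sample))
```

Lines reported as "coordinate outside [0, 1]" usually point to pixel coordinates that were written without being divided by the image width and height.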

I found that the boxes are converted to xyxy format in post-processing. Does this affect the final labels that are generated?

def post_process_result(
    source_h: int,
    source_w: int,
    boxes: torch.Tensor,
    logits: torch.Tensor
) -> sv.Detections:
    boxes = boxes * torch.Tensor([source_w, source_h, source_w, source_h])
    xyxy = box_convert(boxes=boxes, in_fmt="cxcywh", out_fmt="xyxy").numpy()
    confidence = logits.numpy()
    return sv.Detections(xyxy=xyxy, confidence=confidence)
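
To make the arithmetic in that function concrete, here is a minimal sketch in plain Python of the same two steps for a single box: scale a normalized (cx, cy, w, h) box to pixels, then convert it to corner (xyxy) form. It is an illustration only, not the library code; the helper name `cxcywh_norm_to_xyxy_pixels` is hypothetical.

```python
def cxcywh_norm_to_xyxy_pixels(box, source_w, source_h):
    """Convert one normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    # Step 1: scale normalized values to pixel units.
    cx, w = cx * source_w, w * source_w
    cy, h = cy * source_h, h * source_h
    # Step 2: convert center/size form to corner form.
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


# A box centered in a 640x480 image, covering 20% of the width and 40% of the height.
print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.2, 0.4), 640, 480))
```

The result is in pixel units, so it is not directly usable as a YOLOv8 label until it is converted back to normalized center/size form.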

Dec 20 '23 15:12 Mars-204

Hello! How are you saving the labels? Are you using the .label() function in Autodistill, or writing your own logic?

Dec 21 '23 15:12 capjamesg

Reference: https://docs.autodistill.com/reference/base-models/detection/#autodistill.detection.detection_base_model.DetectionBaseModel.label

Dec 21 '23 15:12 capjamesg

I am using the .label() function in Autodistill.

Dec 21 '23 17:12 Mars-204

> Is the format returned by autodistill-grounding-dino the same, or is it (xyxy width height)?

The labels should be converted to YOLOv8 format.

> Are the labels normalized? I get an error of out of bound labels while training for yolov8

Base models return pixel coordinates rather than normalized values from 0-1.
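
As an illustration of the gap between pixel coordinates and what YOLOv8 expects, the sketch below converts one pixel-space xyxy box into a normalized YOLO label line. It is plain Python written under stated assumptions, not the code that .label() runs; the function name `xyxy_to_yolo_line` is hypothetical, and clamping to the image bounds is a choice made here to avoid values slightly outside 0-1.

```python
def xyxy_to_yolo_line(class_id, xyxy, image_w, image_h):
    """Format one pixel-space (x_min, y_min, x_max, y_max) box as a YOLO label line."""
    x_min, y_min, x_max, y_max = xyxy
    # Clamp to the image so normalized values cannot leave the [0, 1] range.
    x_min = min(max(x_min, 0.0), image_w)
    x_max = min(max(x_max, 0.0), image_w)
    y_min = min(max(y_min, 0.0), image_h)
    y_max = min(max(y_max, 0.0), image_h)
    # Convert corner form to center/size form and normalize by the image size.
    x_center = (x_min + x_max) / 2.0 / image_w
    y_center = (y_min + y_max) / 2.0 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 128x192 pixel box centered in a 640x480 image, class 0.
print(xyxy_to_yolo_line(0, (256, 144, 384, 336), 640, 480))
```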

Can you share your code so I can replicate your issue?

Dec 28 '23 10:12 capjamesg

I have checked the code, and it returns the (xyxy width height) format. To convert to the YOLOv8 format (x_center y_center width height), I have to modify the code.

-- Background: I am trying to label 'persons' in intensity images in .pgm format. I modified the source code to handle the .pgm images.

-- Main function to annotate folder data

import os
import shutil

from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
from autodistill_grounding_dino import GroundingDINO

def annotator(root_folder, save_dir, sam=False, dino=True):
  # root_folder is expected to be a pathlib.Path
  if sam:
    base_model_sam = GroundedSAM(ontology=CaptionOntology({"all person": "person"}))
    base_model_sam.label(
        # input_folder=r"C:\work\masterarbiet\3d-object-detection-and-tracking-using-dl\data\data_collection\manthan-test",
        input_folder="./images",
        output_folder="./dataset_sam",
        extension=".pgm"
      )

  # copy the intensity images into their own folder next to the originals
  folder_name = root_folder / f"{root_folder.name}_intensity"
  os.makedirs(folder_name, exist_ok=True)
  intensity_images = list(root_folder.glob("*inten.pgm"))

  for im in intensity_images:
    shutil.copy(im, folder_name)

  if dino:
    base_model_dino = GroundingDINO(ontology=CaptionOntology({"all person": "person"}), box_threshold=0.25)
    # label all .pgm images in the intensity folder

    base_model_dino.label(input_folder=str(folder_name),
        output_folder=str(save_dir),
        extension=".pgm")

-- Changes to handle the .pgm images

def predict(self, input: str) -> sv.Detections:
    # image = load_image(input, return_format="cv2")
    image_source, image = load_image(input)  # load_image() from groundingdino.util.inference

Dec 28 '23 13:12 Mars-204

It is a bit hard to interpret your message. Can you use the backtick character to format your code (`)? If you are proposing a complete solution, feel free to submit it as a PR to the package and I'll review!

Jan 16 '24 15:01 capjamesg
