tanzhenyu

Results: 15 issues by tanzhenyu

Transformer models are becoming more and more relevant. Let's support them by adding ViT first, citing https://arxiv.org/pdf/2010.11929.pdf

contribution-welcome

Related to https://github.com/keras-team/keras-cv/issues/668, this is another transformer-based model we'd like to support. For simplicity, focus only on the classification task, keeping in mind that future detection tasks might...

type:feature

Other than RetinaNet, the library should support a heavier model that allows a trade-off between performance and speed. We will start with the core components first, and decide what kind of meta...

type:feature

Completing the `Model.fit` story for FasterRCNN. This is achieved by overriding `train_step` and `test_step`. When tested on the Pascal VOC dataset, the performance and quality are similar to the custom training loop (CTL). Currently...
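The `fit` / `train_step` / `test_step` contract described above can be sketched framework-free. In Keras, FasterRCNN would subclass `keras.Model` and override these two methods so `Model.fit` drives training; the class and method bodies below are illustrative, not keras-cv's code:

```python
class ModelBase:
    """Minimal sketch of the fit/train_step/test_step split (hypothetical)."""

    def fit(self, train_batches, val_batches):
        # `fit` owns the loop; subclasses only define per-batch behavior,
        # mirroring how Keras calls the overridden train_step/test_step.
        train_losses = [self.train_step(b) for b in train_batches]
        val_losses = [self.test_step(b) for b in val_batches]
        return {"loss": sum(train_losses) / len(train_losses),
                "val_loss": sum(val_losses) / len(val_losses)}


class ToyRegressor(ModelBase):
    """Fits y = w * x with plain gradient descent to exercise the contract."""

    def __init__(self, lr=0.1):
        self.w = 0.0
        self.lr = lr

    def train_step(self, batch):
        x, y = batch
        pred = self.w * x
        grad = 2.0 * (pred - y) * x  # d/dw of (w*x - y)^2
        self.w -= self.lr * grad
        return (pred - y) ** 2

    def test_step(self, batch):
        x, y = batch
        return (self.w * x - y) ** 2
```

With this split, swapping a CTL for `fit` only requires implementing the two step methods; the outer loop, logging, and evaluation stay shared.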

Today, in order to `augment_bbox`, users are required to pass `[y_min, x_min, y_max, x_max, class_id]` as the layer input, i.e., concatenate `gt_boxes` and `gt_classes`. This is a little anti-user...
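Concretely, the packing currently required looks like the following (a hypothetical NumPy snippet; the real layers take tensors, but the layout is the same):

```python
import numpy as np

# Boxes in [y_min, x_min, y_max, x_max] and class ids as a column vector.
gt_boxes = np.array([[10., 20., 110., 220.],
                     [30., 40., 130., 240.]])
gt_classes = np.array([[1.], [7.]])

# Users must concatenate the two into [y_min, x_min, y_max, x_max, class_id]
# before feeding the augmentation layer.
layer_input = np.concatenate([gt_boxes, gt_classes], axis=-1)
```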

Running `python examples/models/object_detection/retina_net/basic/pascal_voc/train.py` fails at:

```
Traceback (most recent call last):
  File "/home/overflow/code/keras-cv/examples/models/object_detection/retina_net/basic/pascal_voc/train.py", line 98, in <module>
    visualize_dataset(dataset, bounding_box_format="xywh")
  File "/home/overflow/code/keras-cv/examples/models/object_detection/retina_net/basic/pascal_voc/train.py", line 87, in visualize_dataset
    boxes = keras_cv.bounding_box.convert_format(
  File "/home/overflow/code/keras-cv/keras_cv/bounding_box/converters.py", line...
```

During the development of FasterRCNN, I have noticed reasonable losses and reasonable prediction outcomes, but always got low mAP (~0.18). The evaluation loop takes ~13 minutes, so I instead...

object-detection-landing
high-priority

Here's our implementation: https://github.com/keras-team/keras-cv/blame/master/keras_cv/losses/smooth_l1.py#L39-L42 Here's the definition used by PyTorch: https://pytorch.org/docs/stable/generated/torch.nn.SmoothL1Loss.html A beta denominator is missing from the quadratic branch. It still works fine if the cutoff is 1, but anything else would be...
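For reference, a minimal NumPy sketch of the PyTorch definition, with the beta denominator in place (illustrative only; keras-cv's actual code is linked above):

```python
import numpy as np

def smooth_l1_loss(y_true, y_pred, beta=1.0):
    """Smooth L1 loss matching torch.nn.SmoothL1Loss:
    0.5 * diff**2 / beta if |diff| < beta, else |diff| - 0.5 * beta."""
    diff = np.abs(y_true - y_pred)
    # Quadratic below the cutoff, linear above; dividing by beta keeps
    # the two branches continuous at diff == beta for any beta, not just 1.
    return np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
```

With `beta=1.0` the `/ beta` is a no-op, which is why a missing denominator goes unnoticed until someone changes the cutoff.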

i.e., from `x, y, w, h` to:

- `(x - x_a) / w_a`
- `(y - y_a) / h_a`
- `log(w / w_a)`
- `log(h / h_a)`

Potentially allow box variance as well.

object-detection-landing
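The anchor-relative encoding described above can be sketched as follows (a hypothetical NumPy helper; names and the variance convention are assumptions, not keras-cv's API):

```python
import numpy as np

def encode_box_deltas(boxes, anchors, variances=(1.0, 1.0, 1.0, 1.0)):
    """Encode ground-truth boxes as deltas relative to anchors.

    Both `boxes` and `anchors` are [..., 4] arrays in (x, y, w, h) format.
    Each delta is optionally scaled by a per-coordinate box variance.
    """
    x, y, w, h = np.moveaxis(boxes, -1, 0)
    xa, ya, wa, ha = np.moveaxis(anchors, -1, 0)
    dx = (x - xa) / wa / variances[0]
    dy = (y - ya) / ha / variances[1]
    dw = np.log(w / wa) / variances[2]
    dh = np.log(h / ha) / variances[3]
    return np.stack([dx, dy, dw, dh], axis=-1)
```

Normalizing offsets by anchor size and widths/heights by a log-ratio keeps the regression targets roughly scale-invariant across anchor sizes.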