keras-cv

Add YOLO-V3 model

Open innat opened this issue 1 year ago • 9 comments

Short Description

AKA, You Only Look Once. A strong object detection model, described in the following paper.

Papers

https://arxiv.org/abs/1804.02767?fbclid=IwAR3A2nyK8EPa-JcoGp_N6tNqVkKOmy2J1ip5AYcEki5FzkZ62E3z6tbNSy0

Published on: 2018 Cited by 14236 (until)

Existing Implementations

Here is one well-known and strong reference:

https://github.com/zzh8829/yolov3-tf2

Other Information

https://github.com/pythonlessons/TensorFlow-2.x-YOLOv3

innat avatar Jul 26 '22 06:07 innat

I'm not sure we'll be supporting the earlier YOLOs, such as v3

LukeWood avatar Jul 26 '22 16:07 LukeWood

Is there any reason why v3 might not be supported despite its high citation count? Also, recent YOLO-based models like yolo-v4, yolo-r, yolo-x, and yolo-v7 all reference the v3 model.

innat avatar Jul 26 '22 18:07 innat

is there ever a case where users would want to use v3 over v7?

LukeWood avatar Jul 26 '22 18:07 LukeWood

I'm with LukeWood. There's no need to have v3 support over faster, more accurate models.

DavidLandup0 avatar Jul 28 '22 09:07 DavidLandup0

Is there ever a case where users would want to use v3 over v7? There's no need to have v3 support over faster, more accurate models.

IMHO, that's a bad argument. If v7 is included now and newer YOLO versions become available in the future, that doesn't mean we will remove the older version.

YOLO-V3 is maintained in the tf-model-garden, and I don't see any reasonable cause not to have it in keras-cv.

innat avatar Jul 28 '22 10:07 innat

The garden isn't up to date. Here's an excerpt of the description there:

YOLO v3 and v4 serve as the most up to date and capable versions of the YOLO network group.

v4 isn't "official" (by Redmon et al.) so it's not like they limited themselves to the original authors. v5 should be in the list - it has been around since 2020. v7 is understandably not there, it was released a few weeks ago.

They seem to want to list the latest versions (even though the ones listed no longer are). Likewise, YOLOv7 should be added now, while it's new and up to date. Stuff that's added once tends to get kept for longer than advances in technology would warrant. Yeah, if we add v7 now, we won't remove it in the future, but we'd want to add whichever vN is relevant at that time and discourage use of v7 once it's clearly and objectively performing worse than the alternatives.

The point of KerasCV is to make training SoTA models easier. YOLOv3 isn't SoTA. YOLOv7 is.

To that end, I'd argue that VGGNets shouldn't be part of newer frameworks, since they've long been outdone, and the methodology is old and inefficient enough that it's bad practice to use them in modern settings. The design choices have shifted away from VGGNets enough to (IMO) warrant their avoidance.

Many people looking to apply CV to their domain don't want to spend days researching the best architectures or use cases. IMO, if we want to lower the barrier to entry for most people, we'll want to take the guesswork out of the equation. Way too many papers in medicine and biology use VGGNets in 2022, and most of them could be improved with simple, standard preprocessing and augmentation steps, as well as newer architectures. Supporting outdated models, IMO, makes it harder for CV to be adopted by the wider community (and other fields of science).

DavidLandup0 avatar Jul 28 '22 11:07 DavidLandup0

I think that we are losing sight of the scope of the Keras-cv manifesto a little in this thread.

The main goal here is to share reusable subcomponents to build the next network in the same/proxy family.

So I think it doesn't matter whether we start with V3 or V7; what matters is that tomorrow we have reusable components and the internal API (from V3/V4/V7?) to quickly build V8.

SOTA here is really a quickly moving target, and many papers build on the previous architectures while introducing new elements.

If we have strong internal modularity, we could be on par with competitors in releasing new SOTA models.

So the point is: if we start with V7, are we going to have 50%, 60%, 70%, 80% of reusable modules from the previous YOLO versions?

bhack avatar Jul 28 '22 12:07 bhack

Great points Stefano! I think that's a great take on the situation.

LukeWood avatar Jul 28 '22 16:07 LukeWood

Great pointers by @bhack. To add my two cents to this exact philosophy - @soumik12345 and I are working on implementing YOLOv2 (mostly to learn and share, but also to come up with modular APIs).

  • Most research papers build on top of each other. YOLOv3 was possible because of v2, and this can go on and on. As long as we have the subcomponents to build YOLOv2, with a few more subcomponents we can build v3, and so on.

  • Deep learning today is still about "trying" new techniques, backed primarily by empirical evidence rather than solid theory. By providing subcomponents, users can mix and match them and, who knows, we may get a new SOTA out of it.

ayulockin avatar Aug 07 '22 06:08 ayulockin

Follow-up comment here: we should support YOLOv3, as it is a great demonstration of "reusable components", and YOLOv3 is definitely still heavily used, even compared to more recent models (2022 is the so-called year of YOLO).

tanzhenyu avatar Oct 18 '22 16:10 tanzhenyu

is there ever a case where users would want to use v3 over v7?

Most of the research projects in my lab use YoloV3 for object detection. I'm not sure why, but I'll post here after I ask my supervisor.

EDIT: Because it's popular. There are more reference implementations, most benchmarks/papers use it, and there is more help available if we want to do something extra.

hnanacc avatar Nov 04 '22 19:11 hnanacc

I'm working on a related task. If no one has already started the work, I can take this up.

hnanacc avatar Nov 11 '22 15:11 hnanacc

That's great, please go ahead. We don't assign issues for now, so that anyone with bandwidth can take it. One thing I would like to ask: please make sure you re-use the components that are currently used in FasterRCNN, such as box matchers, anchor generators, _target_gather, etc.
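
As a rough illustration of one of the shared components named above, a box matcher, here is a minimal pure-Python sketch. It is not the KerasCV implementation: `iou`, `match_anchors`, and `positive_threshold` are hypothetical names chosen for this example, and boxes are assumed to be in `(x1, y1, x2, y2)` corner format.

```python
# Hypothetical sketch of an IoU-based box matcher; not the KerasCV API.
# Boxes are assumed to be (x1, y1, x2, y2) tuples in corner format.


def iou(a, b):
    """Intersection-over-union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, gt_boxes, positive_threshold=0.5):
    """For each anchor, return the index of its best-overlapping
    ground-truth box, or -1 if the best IoU is below the threshold."""
    matches = []
    for anchor in anchors:
        best_idx, best_iou = -1, 0.0
        for idx, gt in enumerate(gt_boxes):
            score = iou(anchor, gt)
            if score > best_iou:
                best_idx, best_iou = idx, score
        matches.append(best_idx if best_iou >= positive_threshold else -1)
    return matches


anchors = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 4, 4)]
gt_boxes = [(0, 0, 10, 10), (21, 21, 31, 31)]
matches = match_anchors(anchors, gt_boxes)
```

Because the matcher only looks at box coordinates, nothing in it is tied to one detector family, which is the kind of reuse being asked for here.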

tanzhenyu avatar Nov 11 '22 15:11 tanzhenyu

Great, I will post my approach before starting the implementation, just to make sure I'm following all the pointers.

hnanacc avatar Nov 11 '22 15:11 hnanacc

Hello, I was trying to work with YoloV3 on Keras as well, using https://github.com/zzh8829/yolov3-tf2 by @zzh8829. The main struggle I found is how to compute validation metrics during training.

Is there a way to have some kind of dynamic graph where NMS is not used during training and is applied only when computing metrics?
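
To make the question concrete, here is a minimal pure-Python sketch of the behaviour being asked for: raw predictions pass through untouched in training mode, and greedy NMS runs only otherwise. All names (`box_iou`, `non_max_suppression`, `detect`) are hypothetical and not taken from yolov3-tf2 or KerasCV. In Keras, the `training` argument of a layer's or model's `call` method could play the role of the flag used here, but that wiring is an assumption and is not shown.

```python
# Hypothetical sketch: skip NMS while training, apply it for metrics/inference.
# Boxes are assumed to be (x1, y1, x2, y2) tuples in corner format.


def box_iou(a, b):
    """Intersection-over-union of two corner-format boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def detect(boxes, scores, training):
    """Return raw predictions when training, NMS-filtered ones otherwise."""
    if training:
        return boxes, scores
    keep = non_max_suppression(boxes, scores)
    return [boxes[i] for i in keep], [scores[i] for i in keep]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
train_boxes, train_scores = detect(boxes, scores, training=True)
eval_boxes, eval_scores = detect(boxes, scores, training=False)
```

With this split, the loss can be computed on the unfiltered outputs, while validation metrics are computed on the filtered ones.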

YELKHATTABI avatar Mar 14 '23 17:03 YELKHATTABI

Given that YOLOV8 is available now, this is not high-value to add.

ianstenbit avatar Aug 04 '23 16:08 ianstenbit