Clarification on fps

Open lpkoh opened this issue 3 years ago • 4 comments

Thank you so much for this. This repo is amazing.

Can I clarify what the reported fps numbers actually measure?

As I understand, when I run ./demo_track currently, the process is something like:

  1. Import yolox model from yolox package
  2. Read video palace.mp4 (1280 x 720) and get frame
  3. Reshape video frame into desired input size (1440 x 800, e.g. for x version of yolox)
  4. Run object detection using imported yolox model
  5. Pass object detection to tracker for tracking

And the number of frames that can pass through steps 1 to 5 in a second is the fps displayed on the video output. Is this correct, or does the fps cover only the tracking stage, for example? In other words, do we count resizing and object detection time in the fps?
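One way to check this on your own setup is to time each stage separately and compare the end-to-end figure with the tracking-only figure. The sketch below is only an illustration of that idea, not the timing code used by the demo: `StageTimer`, `preprocess`, `detect` and `track` are hypothetical stand-ins that you would replace with the real calls.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> total seconds spent
        self.frames = 0                   # number of frames fully processed

    def run(self, name, fn, *args):
        """Call fn(*args) and add its elapsed time to the total for `name`."""
        start = time.perf_counter()
        result = fn(*args)
        self.totals[name] += time.perf_counter() - start
        return result

    def fps(self, *stages):
        """FPS counting only the given stages (all recorded stages if none given)."""
        names = stages or tuple(self.totals)
        elapsed = sum(self.totals[n] for n in names)
        return self.frames / max(elapsed, 1e-9)


# Placeholder stages (assumptions for illustration); swap in your own code.
def preprocess(frame):
    return frame


def detect(tensor):
    time.sleep(0.002)  # pretend the detector is the slow part
    return [(0, 0, 10, 10, 0.9)]


def track(detections):
    return [(1, det) for det in detections]


timer = StageTimer()
for frame in range(5):  # stand-in for frames read from a video
    tensor = timer.run("preprocess", preprocess, frame)
    detections = timer.run("detect", detect, tensor)
    tracks = timer.run("track", track, detections)
    timer.frames += 1

print(f"end-to-end FPS:    {timer.fps():.1f}")
print(f"tracking-only FPS: {timer.fps('track'):.1f}")
```

If the detector runs on a GPU through PyTorch, keep in mind that CUDA calls are asynchronous, so you would need to call `torch.cuda.synchronize()` before reading the clock for the per-stage numbers to be meaningful.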

Also, in the base demo, neither the yolox model nor the tracker is converted to TensorRT, right? Does this mean we can increase the fps relative to what is shown on the demo's video output by:

  1. Improving preprocessing speed
  2. Using a TensorRT-optimized yolo detection model
  3. Using a TensorRT-optimized tracker model

Your clarification would be super useful.

lpkoh, Nov 25 '21 06:11

We have released a TensorRT + C++ implementation of ByteTrack and the speed is much faster.

ifzhang, Nov 25 '21 08:11

Hi yes, thank you for that, I will give it a try.

I was mainly trying to find out what you mean by "fps" in the palace video output. Does "fps" cover the entire process, from video loading to preprocessing to detection to tracking, or does it refer only to the time taken for tracking?

lpkoh, Nov 25 '21 09:11

I actually have the same question. It's unclear whether the FPS value relates to the execution of the whole pipeline or just to a specific step of it (which would make no sense, in my opinion). I'm running ByteTracker as part of a much bigger project and the code is heavily modified, but on my V100 I get something like 3-4 FPS at most for steps 2 to 5 (measured from frame acquisition to the generation of the final result).

vcozzolino, Apr 06 '22 11:04

If I understand your question correctly, the FPS is for the tracking part only and does not include detection (i.e., it does not include yolox).

LamnouarMohamed, Apr 06 '22 12:04