feat: object tracking

Open gaurav274 opened this issue 2 years ago • 10 comments

Supported query:

SELECT id, T.iids, T.bboxes, T.scores, T.labels
FROM MyVideo JOIN LATERAL EXTRACT_OBJECT(data, YoloV5, NorFairTracker)
    AS T(iids, labels, bboxes, scores)
WHERE id < 30;

gaurav274 avatar Jan 20 '23 08:01 gaurav274

  1. DataFrame object has no attribute append
  2. Cannot infer io signature from the decorator for <class 'util.DummyObjectDetectorDecorators'>

jarulraj avatar Apr 04 '23 00:04 jarulraj

Can we use the SEGMENT construct to track sequences within a segment? Since tracking is done on a sequence of consecutive frames, calling an object tracker on arbitrary frames in the videos seems unintuitive.

I don't understand what you mean by arbitrary frames. We always call the tracker on consecutive frames, and the tracker ensures it assigns a unique id to any new object. Regarding SEGMENT: I was thinking of using GROUP BY on the object ids returned by the tracker to construct actual tracks, and maybe supporting some geometric predicates on them.
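
A rough sketch of the GROUP BY idea in plain Python, not EvaDB code. The row layout (frame id, object id, bbox) and the helper names `build_tracks` and `crosses_x` are assumptions made up for illustration. It groups per-frame tracker output by object id to form tracks, then applies one example geometric predicate to a track.

```python
from collections import defaultdict

# Hypothetical per-frame tracker output: (frame_id, object_id, bbox),
# with bbox as (x1, y1, x2, y2). This layout is assumed for illustration.
rows = [
    (0, 1, (10, 10, 50, 50)),
    (0, 2, (200, 80, 260, 140)),
    (1, 1, (14, 10, 54, 50)),
    (2, 1, (18, 11, 58, 51)),
    (2, 2, (205, 82, 265, 142)),
]


def build_tracks(rows):
    """Group detections by object id, ordered by frame id."""
    tracks = defaultdict(list)
    for frame_id, object_id, bbox in sorted(rows, key=lambda r: r[0]):
        tracks[object_id].append((frame_id, bbox))
    return dict(tracks)


def crosses_x(track, x):
    """Example geometric predicate: is the vertical line at x strictly
    between the leftmost and rightmost bbox centers of the track?"""
    centers = [(bbox[0] + bbox[2]) / 2 for _, bbox in track]
    return min(centers) < x < max(centers)


tracks = build_tracks(rows)
```

Here `tracks[1]` holds the detections of object 1 across frames 0, 1, and 2, and `crosses_x(tracks[1], 32)` checks whether that object moved past x = 32.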

gaurav274 avatar Apr 05 '23 21:04 gaurav274

One-off invocations of the tracker would generate different object-ids, correct? Let's say I run the object tracker on frames 1 to 30, then 70 to 100 in a continuous scene. Would the same objects have different object ids? The follow-up question would be, can we merge them in a subsequent call? Seems like a good use-case for supporting re-identification.

Yes, this can happen. However, if we add support for a tracker that also takes re-id features into account, we should be able to achieve it. This can be an extension of the current implementation.

I'm deliberately sending the frame data to the tracker so that it can extract re-id features if required.

gaurav274 avatar Apr 05 '23 21:04 gaurav274

@pchunduri6 Let me know your thoughts.

gaurav274 avatar Apr 06 '23 03:04 gaurav274

This leads to an interesting problem. Predicate pushdown may not work in this case. For example, when we have WHERE id < 30 or id > 70, filtering the frames first will give us different results.

xzdandy avatar Apr 06 '23 04:04 xzdandy

Yup, that is true. But isn't the user deliberately only wanting to track the objects from 30-70? We can add it as a warning in the documentation. I haven't worked on documentation yet. Was thinking of doing it in another PR.

gaurav274 avatar Apr 06 '23 05:04 gaurav274

This is not a common use case, but just as an example: suppose users are interested in vehicles before 6PM or after 7PM, so the query will have WHERE timestamp < 6PM or timestamp > 7PM. Now consider a vehicle that keeps circling around the whole time. Without predicate pushdown, the vehicle will have a single unique id associated with it (assuming no errors). With predicate pushdown, the vehicle can have two ids, because its location in the last frame for timestamp < 6PM and in the first frame for timestamp > 7PM can be quite different. There is also a risk that random vehicles in the first frame for timestamp > 7PM will be matched to random vehicles in the last frame for timestamp < 6PM, which will not happen if we don't do predicate pushdown.
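
To make the effect concrete, here is a toy sketch. It is not NorFair or the EvaDB executor. `ToyTracker`, the 5 px per frame motion model, and the 20 px matching threshold are all assumptions made up for illustration. It tracks one vehicle twice: once over every frame with the filter applied afterwards, and once with the filter applied first (pushdown).

```python
import math


class ToyTracker:
    """Greedy nearest-centroid tracker (illustrative only, not NorFair)."""

    def __init__(self, max_distance=20.0):
        self.max_distance = max_distance
        self.last_seen = {}  # object id -> last known (x, y)
        self.next_id = 1

    def update(self, detections):
        """Assign an object id to each (x, y) detection of one frame."""
        ids = []
        unmatched = dict(self.last_seen)
        for point in detections:
            best = min(
                unmatched,
                key=lambda i: math.dist(unmatched[i], point),
                default=None,
            )
            if best is not None and math.dist(unmatched[best], point) <= self.max_distance:
                obj_id = best
                del unmatched[best]
            else:
                # Too far from every known object: treat it as a new one.
                obj_id = self.next_id
                self.next_id += 1
            self.last_seen[obj_id] = point
            ids.append(obj_id)
        return ids


def vehicle_position(frame_id):
    # One vehicle moving 5 px per frame (assumed motion model).
    return (5.0 * frame_id, 0.0)


def keep(frame_id):
    # Stand-in for the WHERE predicate discussed above.
    return frame_id < 30 or frame_id > 70


def track(frame_ids):
    tracker = ToyTracker()
    return {f: tracker.update([vehicle_position(f)])[0] for f in frame_ids}


all_frames = range(100)

# Filter after tracking: the tracker sees every frame.
late_filter = {f: i for f, i in track(all_frames).items() if keep(f)}

# Filter before tracking (pushdown): the tracker sees a jump from frame 29 to 71.
pushdown = track([f for f in all_frames if keep(f)])
```

Both runs return rows for the same frames, but `late_filter` carries a single object id while `pushdown` assigns a second id after the gap, because the vehicle has moved farther than the matching threshold allows.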

xzdandy avatar Apr 06 '23 05:04 xzdandy

  1. Yes, that is a problem. I can't think of a solution right now. Not pushing down the predicate will be super expensive.
  2. I believe trackers should be able to handle it. Most trackers consume the frame id to compute flow, but it depends on the tracker. What do you suggest?

gaurav274 avatar Apr 06 '23 07:04 gaurav274

The GROUP BY idea looks good to me.

By arbitrary frames, I meant that the upstream query could potentially return any frames based on the predicate, right? For example, consider the returned ids 10, 120, 250, and 300. What would a track mean when run on such frames? It's not an implementation or design issue. From a usability standpoint, I was just wondering if there's a way to limit the use of the tracker to specific types of input sequences.
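
One possible way to characterize "specific types of input sequences" is to split the incoming frame ids into runs of consecutive ids and only track within a run. This is a minimal sketch and not part of EvaDB. The helper name `contiguous_runs` is hypothetical.

```python
def contiguous_runs(frame_ids):
    """Split frame ids into sorted runs of consecutive ids."""
    runs = []
    for fid in sorted(frame_ids):
        if runs and fid == runs[-1][-1] + 1:
            runs[-1].append(fid)  # extends the current run
        else:
            runs.append([fid])  # gap found: start a new run
    return runs


sparse = contiguous_runs([10, 120, 250, 300])
dense = contiguous_runs([1, 2, 3, 70, 71])
```

With the ids from the example above, every run has length one, so no meaningful track can be formed. An input like 1, 2, 3, 70, 71 yields two runs that could each be tracked separately.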

pchunduri6 avatar Apr 06 '23 19:04 pchunduri6

I see. We can print a warning message for now. I think re-id is needed to achieve the same results with and without predicate pushdown. Even then, there can still be errors from re-id.
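
A minimal sketch of what such a warning could look like, using Python's standard `warnings` module. The function name `warn_if_gapped` and the message text are assumptions, not existing EvaDB behavior.

```python
import warnings


def warn_if_gapped(frame_ids):
    """Warn when the tracker input skips frames; return the number of gaps."""
    ordered = sorted(frame_ids)
    gaps = sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > 1)
    if gaps:
        warnings.warn(
            f"Tracker input has {gaps} gap(s) in frame ids; object ids may "
            "not be consistent across gaps.",
            UserWarning,
            stacklevel=2,
        )
    return gaps
```

Calling `warn_if_gapped` on the frame ids that reach the tracker after filtering would flag queries such as WHERE id < 30 or id > 70, while staying silent for a single contiguous range.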

xzdandy avatar Apr 08 '23 05:04 xzdandy

@gaurav274 Please help check the notebook test case.

Sure, looking into it.

gaurav274 avatar May 13 '23 22:05 gaurav274

@xzdandy Hopefully, the changes should fix the build-related issue. Btw, do you think there is a meaningful test case for object tracking? Also, please check that I didn't accidentally remove your changes.

gaurav274 avatar May 14 '23 06:05 gaurav274

That is a good point. We should have a test case for the EVAtracker abstract class and the built-in nor_fair tracker.

xzdandy avatar May 14 '23 07:05 xzdandy

@xzdandy, why have notebooks changed? Feel free to merge it if you feel it is good to go.

gaurav274 avatar May 14 '23 19:05 gaurav274

I think running bash script/test/test.sh locally changes the notebook.

xzdandy avatar May 14 '23 19:05 xzdandy