evadb
feat: object tracking
Supported query:
SELECT id, T.iids, T.bboxes, T.scores, T.labels
FROM MyVideo JOIN LATERAL EXTRACT_OBJECT(data, YoloV5, NorFairTracker)
AS T(iids, labels, bboxes, scores)
WHERE id < 30;
- DataFrame object has no attribute append
- Cannot infer io signature from the decorator for <class 'util.DummyObjectDetectorDecorators'>
Can we use the SEGMENT construct to track sequences within a segment? Since tracking is done on a sequence of consecutive frames, calling an object tracker on arbitrary frames in the videos seems unintuitive.
I don't understand what you mean by "arbitrary frames". We are always calling the tracker on consecutive frames, and the tracker ensures it provides a unique id to any new object.
Regarding SEGMENT: I was thinking of using GROUP BY on object ids returned by the tracker to construct actual tracks and maybe support some geometric predicates on it.
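To make the GROUP BY idea concrete, here is a minimal Python sketch of grouping per-frame tracker output into tracks. It assumes the query returns one row per frame with parallel lists of object ids and boxes (the column names id, iids, and bboxes come from the query above). Representing rows as plain dicts and the helper name build_tracks are assumptions for illustration, not part of EVA.

```python
from collections import defaultdict


def build_tracks(rows):
    """Group per-frame detections by tracker-assigned object id.

    `rows` is assumed to be an iterable of dicts with keys "id" (frame id),
    "iids" (object ids) and "bboxes" (one box per object id).
    """
    tracks = defaultdict(list)
    for row in rows:
        for iid, bbox in zip(row["iids"], row["bboxes"]):
            tracks[iid].append((row["id"], bbox))
    # Sort each track by frame id so it reads as a trajectory.
    return {iid: sorted(points) for iid, points in tracks.items()}


rows = [
    {"id": 0, "iids": [1, 2], "bboxes": [(0, 0, 10, 10), (50, 50, 60, 60)]},
    {"id": 1, "iids": [1], "bboxes": [(2, 0, 12, 10)]},
    {"id": 2, "iids": [1, 2], "bboxes": [(4, 0, 14, 10), (52, 50, 62, 60)]},
]
tracks = build_tracks(rows)
```

Each value in tracks is a list of (frame id, bbox) pairs, which is the shape a geometric predicate over a whole track would need.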
One-off invocations of the tracker would generate different object-ids, correct? Let's say I run the object tracker on frames 1 to 30, then 70 to 100 in a continuous scene. Would the same objects have different object ids? The follow-up question would be, can we merge them in a subsequent call? Seems like a good use-case for supporting re-identification.
Yes, this can happen. However, if we add support for a tracker that also takes reid features into account, we should be able to achieve it. This can be an extension of the current implementation.
I'm deliberately sending the frame data to the tracker for extracting reid features if required.
@pchunduri6 Let me know your thoughts.
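As a rough illustration of how reid features could merge ids across two separate tracker invocations, here is a minimal sketch. It assumes each object id already has an appearance embedding. The embeddings, the greedy cosine-similarity matching, the 0.9 threshold, and the names cosine and merge_ids are all hypothetical and are not what this PR implements.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_ids(first, second, threshold=0.9):
    """Map object ids from a second tracker run onto ids from a first run.

    `first` and `second` map object id -> appearance embedding (hypothetical
    reid features). Returns {second_id: first_id} for confident matches only.
    """
    mapping = {}
    used = set()
    for sid, s_emb in second.items():
        best_id, best_sim = None, threshold
        for fid, f_emb in first.items():
            if fid in used:
                continue
            sim = cosine(s_emb, f_emb)
            if sim >= best_sim:
                best_id, best_sim = fid, sim
        if best_id is not None:
            mapping[sid] = best_id
            used.add(best_id)
    return mapping


# Frames 1-30 produced ids 1 and 2; frames 70-100 produced ids 7, 8 and 9.
run_a = {1: [1.0, 0.0, 0.1], 2: [0.0, 1.0, 0.0]}
run_b = {7: [0.0, 0.9, 0.1], 8: [0.9, 0.1, 0.1], 9: [0.0, 0.0, 1.0]}
mapping = merge_ids(run_a, run_b)
```

Ids left out of mapping (9 here) would be treated as new objects. A greedy match like this can still make mistakes when two objects look alike.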
This leads to an interesting problem. Predicate pushdown may not work in this case. For example, consider WHERE id < 30 or id > 70. Filtering the frames first will give us different results.
Yup, that is true. But isn't the user deliberately only wanting to track the objects from 30-70? We can add it as a warning in the documentation. I haven't worked on documentation yet. Was thinking of doing it in another PR.
This is not a common use case, but as an example: suppose users are interested in vehicles before 6PM or after 7PM, so the query has WHERE timestamp < 6PM or timestamp > 7PM. Now consider a vehicle that keeps circling the whole time. Without predicate pushdown, the vehicle will have a single unique id associated with it (assuming no errors). With predicate pushdown, the vehicle can end up with two ids, because its location in the last frame for timestamp < 6PM and in the first frame for timestamp > 7PM can be quite different.
There is also a risk that random vehicles in the first frame for timestamp > 7PM will be matched to random vehicles in the last frame for timestamp < 6PM, which will not happen if we don't do predicate pushdown.
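The id split can be reproduced with a toy example. The sketch below uses a deliberately simple tracker (greedy nearest-neighbour matching on 1-D positions with a distance cutoff), which is an assumption for illustration and not the NorFair tracker used in this PR. It compares tracking then filtering against filtering then tracking.

```python
def track(frames, max_dist=15.0):
    """Toy tracker: greedily match each detection to the nearest object
    seen in the previously processed frame, else assign a fresh id.

    `frames` is a list of (frame_id, [x positions]); returns
    {frame_id: [object ids]} in detection order.
    """
    next_id = 0
    previous = {}  # object id -> last known position
    result = {}
    for frame_id, detections in frames:
        current = {}
        ids = []
        for x in detections:
            candidates = [
                (abs(x - px), oid)
                for oid, px in previous.items()
                if oid not in current and abs(x - px) <= max_dist
            ]
            if candidates:
                oid = min(candidates)[1]
            else:
                oid = next_id
                next_id += 1
            current[oid] = x
            ids.append(oid)
        previous = current
        result[frame_id] = ids
    return result


def keep(frame_id):
    """Stand-in for a predicate such as WHERE id < 3 OR id > 6."""
    return frame_id < 3 or frame_id > 6


# One vehicle moving 10 units per frame over frames 0..9.
video = [(f, [10.0 * f]) for f in range(10)]

# Track every frame, then filter: the vehicle keeps a single id.
track_then_filter = {f: ids for f, ids in track(video).items() if keep(f)}

# Filter first (predicate pushdown), then track: the vehicle jumps
# 50 units between frames 2 and 7, so it gets a second id.
filter_then_track = track([(f, d) for f, d in video if keep(f)])
```

Here track_then_filter assigns id 0 in every kept frame, while filter_then_track switches from id 0 to id 1 at frame 7.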
- Yes, that is a problem. I can't think of a solution right now. Not pushing down the predicate will be super expensive.
- I believe trackers should be able to handle it. Most trackers consume the frame id to compute flow, but it depends on the tracker. What do you suggest?
The GROUP BY idea looks good to me.
By arbitrary frames, I meant that the upstream query could potentially return any frames based on the predicate, right? For example, consider the returned ids 10, 120, 250, and 300. What would a track mean when run on such frames? It's not an implementation or design issue. From a usability standpoint, I was just wondering if there's a way to limit the use of the tracker to specific types of input sequences.
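One possible way to limit the tracker to meaningful inputs is to split the incoming frame ids into runs of consecutive frames, then track each run separately or reject runs that are too short. The helper below is a hypothetical sketch of that check, not something in this PR.

```python
def contiguous_runs(frame_ids):
    """Split frame ids into sorted runs of consecutive ids."""
    runs = []
    for fid in sorted(frame_ids):
        if runs and fid == runs[-1][-1] + 1:
            runs[-1].append(fid)  # extends the current run
        else:
            runs.append([fid])  # a gap starts a new run
    return runs


sparse = contiguous_runs([10, 120, 250, 300])
dense = contiguous_runs([1, 2, 3, 70, 71, 72])
```

For the ids 10, 120, 250, and 300 every run has length one, so there is nothing to track. For 1-3 and 70-72 there are two runs that could each get their own tracker invocation.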
I see. We can print a warning message for now. I think re-id is needed to achieve the same results with and without predicate pushdown. Even then, there can still be errors from re-id.
@gaurav274 Please help check the notebook test case.
Sure, looking into it.
@xzdandy Hopefully, the changes should fix the build-related issue. Btw, do you think there is a meaningful test case for object tracking? Also, please check that I didn't accidentally remove your changes.
That is a good point. We should have a test case for EVAtracker abstract class and builtin nor_fair tracker.
@xzdandy, why have notebooks changed? Feel free to merge it if you feel it is good to go.
I think running bash script/test/test.sh locally changes the notebook.