Performance decreasing over time
While I'm doing batch processing, performance degrades over time.
Also, is there any way to delete dead tracklets from the trajectories?
How long is your sequence, and how large is your optimization window?
I'm running it online, so let's assume the sequence is endless. I have already verified that the major bottleneck is simply appending the results of all trajectories. I would like to ask for your help in modifying the mcf code so that really old trajectories can be deleted, allowing it to process endless sequences.
You must have been running it for quite a while; I guess the issue then really is constructing the trajectories, not solving the min-cost flow problem.
Anyway, I'll look into it sometime in the next few weeks, as I am planning to use it in "endless" runs as well.
But I guess you are still accumulating old trajectory info, which is intractable in long runs. If you're going to tackle this, I guess I can wait :)
I have adapted the mcf library to support this use case in branch feature/batch_processing_update. I haven't found the time to update the mcf-tracker to use this API beyond a few dirty tests. Essentially, you have to replace the call
# trajectories = self._graph.run_search()
trajectories = self._graph.compute_trajectories()
self._graph.remove_inactive_tracks()
here and change the surrounding code to take a dictionary that maps from track id to trajectory rather than a nested list.
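To illustrate the change in the surrounding code, here is a rough sketch (the function names and call sites are placeholders, not the actual mcf-tracker code): the old nested-list result identified tracks only by their list index, while the dictionary keeps track ids stable even after inactive tracks are removed.

```python
# Hypothetical sketch of adapting result handling from a nested list
# (old run_search() output) to a dict keyed by track id
# (new compute_trajectories() output). Not the actual mcf-tracker code.

def collect_results_old(trajectories_list):
    # Old style: trajectories is a nested list; the track identity is
    # only the (unstable) position in the list.
    return [(index, trajectory)
            for index, trajectory in enumerate(trajectories_list)]

def collect_results_new(trajectories_dict):
    # New style: trajectories maps track id -> node sequence, so ids
    # stay stable even after remove_inactive_tracks() has run.
    return [(track_id, trajectory)
            for track_id, trajectory in sorted(trajectories_dict.items())]
```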
Go ahead and try this out if you like. I won't have my computer with me until next year, so it will take a little while before I can integrate this.
I will give it a try next week. Thanks!
How can I get the track id?
It returns a dictionary. The track id is the key, the sequence of nodes in the graph/solver that are on the trajectory is the value.
for track_id, trajectory in trajectories.items():
# do something.
pass
This seems to be working; however, there is still a memory leak. I suppose it's because the _graph keeps accumulating all the nodes. Do you think it would be easy to remove 'old' nodes from the graph in the same function? I can give it a try.
I think I know why I'm facing a non-negligible memory leak. I'm using a rather large feature vector of shape (4, 128), dtype float32, and it's being stored in location_attributes_, which keeps growing.
However, I think the graph should be 'cut' once the processing window has passed.
Ah, yes. I had only tested with the C++ interface so far. These node attributes need to be pruned as well. There is a num_pruned_locations_ in the BatchProcessor class. All nodes less than or equal to this value are no longer actively optimized, but they might still be on a trajectory... Trajectories are deleted when they are fully outside of the optimization window (window_len_). I am not sure whether there is a better way to check which locations can be removed than keeping track of the returned IDs. I will think about it.
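As a rough sketch of the pruning rule described above (the function and its arguments are hypothetical; the real BatchProcessor lives in C++): a location's attributes can only be dropped once the location is at or below the pruning boundary and no cached trajectory still references it.

```python
# Hypothetical sketch of the pruning rule: locations <= the pruning
# boundary are no longer actively optimized, but they must be kept as
# long as any cached trajectory still references them. Not the actual
# mcf BatchProcessor implementation.

def prunable_locations(location_attributes, num_pruned_locations,
                       active_trajectories):
    # Collect every location that still sits on a cached trajectory.
    referenced = set()
    for trajectory in active_trajectories.values():
        referenced.update(trajectory)
    # Safe to remove: behind the pruning boundary AND unreferenced.
    return [location for location in location_attributes
            if location <= num_pruned_locations
            and location not in referenced]
```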
Also, I found an error in the code that (1) will be caught by an assertion in debug mode, and (2) sometimes causes identity switches even though a trajectory could be continued. At the moment I am looking into the second one and will let you know when the code has been updated.
Thank you! Awesome work and awesome support!
I finally found some time to work on this, although not as much as I wanted. In branch feature/mcf_update, the MinCostFlowTracker class has been changed so that process() returns a dictionary that maps track id to bounding box. This happens only if optimizer_window_len is not None.
Node attributes are also removed when they are not referenced by any trajectory in the cache and are outside of the optimization window. Let me know if you encounter issues. I'll give it a more thorough test some time later.
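For an endless run, the caller also needs to avoid re-accumulating everything on its own side. A minimal sketch of a consumer for the per-frame dictionary (track id -> bounding box) that keeps memory bounded; this is hypothetical caller code, not part of the mcf-tracker:

```python
# Hypothetical consumer of the dict returned by process() on the
# feature/mcf_update branch (track id -> bounding box). Keeps a
# bounded history per track and drops stale tracks, so memory stays
# constant in endless runs. Not part of the mcf-tracker itself.
from collections import deque


class BoundedTrackStore:
    def __init__(self, max_history=30, max_age=10):
        self.max_history = max_history  # boxes kept per track
        self.max_age = max_age          # frames before a track is dropped
        self.histories = {}             # track_id -> deque of boxes
        self.last_seen = {}             # track_id -> last frame index

    def update(self, frame_idx, frame_results):
        # frame_results: dict of track id -> bounding box for one frame.
        for track_id, box in frame_results.items():
            self.histories.setdefault(
                track_id, deque(maxlen=self.max_history)).append(box)
            self.last_seen[track_id] = frame_idx
        # Drop tracks that have not appeared for more than max_age frames.
        stale = [tid for tid, seen in self.last_seen.items()
                 if frame_idx - seen > self.max_age]
        for tid in stale:
            del self.histories[tid]
            del self.last_seen[tid]
```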