awpy
Added functions to calculate distances between game states and to allow multi-round plotting
I wrote two functions to plot multiple rounds in one gif.
One is for plotting rounds with the exact same players and the other one can be used for any combination of rounds.
For the case with the same players, each player (identified by steamid) is assigned a fixed color that they keep for all rounds.
In the other case, the first round in the list is used as the reference, and for each following round the players are assigned colors by matching them to the closest player from the first round.
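To make the color matching concrete, here is a minimal sketch of one possible reading of "closest player": a greedy nearest-neighbor match in plain 2D Euclidean distance. The function name `assign_colors`, the coordinates, and the greedy strategy are all assumptions for illustration and not necessarily what the PR does.

```python
from math import dist


def assign_colors(reference, other, colors):
    """Give each player in `other` the color of the nearest unused reference player.

    Hypothetical helper: assumes `other` has no more players than `reference`
    and that `colors[j]` belongs to `reference[j]`.
    """
    unused = set(range(len(reference)))
    assigned = []
    for pos in other:
        # Pick the closest reference player that has not been matched yet.
        nearest = min(unused, key=lambda j: dist(pos, reference[j]))
        unused.remove(nearest)
        assigned.append(colors[nearest])
    return assigned


# Made-up positions for a three-player example.
reference = [(0, 0), (10, 0), (20, 0)]
other = [(19, 1), (1, 1), (11, -1)]
print(assign_colors(reference, other, ["red", "green", "blue"]))
# ['blue', 'red', 'green']
```

A greedy match depends on the order in which players are processed; minimizing the total distance over all mappings avoids that at a higher cost.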
I overdid it a bit with the commenting and there are also a couple places with pretty obvious simplification/clarity improvements possible. It probably also makes sense to add more arguments to the functions regarding marker transparency and size for the matched rounds.
I don't know whether you think this would make a decent addition to the package.
Here are two gifs produced by the functions. Ten rounds are probably on the high end and it is probably not wise to plot both teams at the same time though.
The functions in here might also be candidates to be directly added to awpy.
These plus the functions from the pull request are shown off here: https://imgur.com/a/wdEvCDq
I have now added the precomputed distances for named areas. Those for all the tiles are here but too big for GitHub.
I also added the functionality to calculate game state distances based on either player positions or position tokens.
I also have a couple of plotting functions here that I used for verifying the results.
Most of what is here seems fine to merge. Thanks for doing some general refactoring, fixing some NAV bugs, and also adding some interesting state/frame distance metrics. I especially think the navigation module contributions here will be helpful. The only comments/concerns I have are the following:
- I can see the use case for plotting plot_rounds_same_players but not necessarily plot_rounds_different_players (mostly because of the visual clutter). I'm wondering if, given the function's size vs. its propensity to be used, it may be better introduced in an example notebook (maybe like a visualization part 2 notebook). What analytics use case did you envision here?
- How long does it take to create the area distance matrix? If it does not take long, we can just let the user call a function to produce it (maybe with the map as a parameter).
- For get_area_distance_matrix() and generate_tile_distance_matrix() we should probably use the navigation mesh nomenclature, which I believe in this case would make the function names get_place_distance_matrix() and generate_area_distance_matrix(). Each tile is an "area" and a collection of tiles is a "place". Also, is there a reason to use get versus generate for the two? I can see reasoning for having different as well as the same prefix.
- Would you be able to create a small notebook for examples/ that uses your new navigation functions? It could be called 04_Working_with_Navigation_Meshes_Advanced and just show in a few cells how to use your new functions.
Thanks for the review.
I can see the use case for plotting plot_rounds_same_players but not necessarily plot_rounds_different_players (mostly because of the visual clutter). I'm wondering if, given the function's size vs. its propensity to be used, it may be better introduced in an example notebook (maybe like a visualization part 2 notebook). What analytics use case did you envision here?
I want to try clustering rounds and use that function to verify that the results make sense. So I take 10 rounds that were all assigned to the same cluster and plot them together to see whether they are similar, at least visually. In that case the clutter shouldn't be so problematic, because ideally the rounds are all very close and the colors should hopefully be grouped together.
How long does it take to create the area distance matrix? If it does not take long, we can just let the user call a function to produce it (maybe with the map as a parameter).
I generated it after (and made use of) the tile_matrix, and there it was really quick. But in general it honestly shouldn't take too long. I can run it without using the tile_matrix to make sure.
For get_area_distance_matrix() and generate_tile_distance_matrix() we should probably use the navigation mesh nomenclature, which I believe in this case would make the function names get_place_distance_matrix() and generate_area_distance_matrix(). Each tile is an "area" and a collection of tiles is a "place". Also, is there a reason to use get versus generate for the two? I can see reasoning for having different as well as the same prefix.
Definitely agree here. I'm pretty sure I also wasn't 100% consistent in my use, at least in the comments.
I changed to generate because 'get' sounds like you just grab it from somewhere, and with the way I have it now that is done by importing them from NAV. I also felt 'get' sounds like it will be a fast thing, while generate_tile_distance_matrix() takes ~24h.
Would you be able to create a small notebook for examples/ that uses your new navigation functions? It could be called 04_Working_with_Navigation_Meshes_Advanced and just show in a few cells how to use your new functions.
i can do that no problem. but i can probably only get to it in ~2-3 weeks now because i have a conference the week after this and still have a bunch of things to prepare for that.
Okay, got it. I think seeing it in action to verify things in 04_Working_with_Navigation_Meshes_Advanced could be a nice touch.
Yea, let me know how fast these functions generally are without TILE_DIST_MATRIX (TDM). I see two directions from here: (1) if the functions are fast without TDM, we can make TDM a parameter in functions where it is used. (2) if the functions are slow without TDM, we may need to find a place to either store TDM online or rethink how we generate the TDM.
EDIT: If a user creates the TDM once, it should be in their data dir, right?
Do you think for these dist matrix functions that it would be a good idea to add map as a parameter? Doing so could also cut down on computation time by an order of magnitude (let's assume there are 10 maps).
Regarding the example notebook, great, no rush. Let's hold off on releasing until the notebook is ready, that way people will know how to use the new changes from the start. I think your changes are significant and useful, but they're not trivial, so a notebook would go a long way to helping users understand.
I also may take your PR and reshuffle into a standalone awpy.nav module. Your work here has been really significant, and we can probably merge all navigation mesh-related functions into their own separate module rather than being under awpy.analytics, since the navigation work is actually quite separate, as you have demonstrated. What're your thoughts on a standalone nav module?
Without the precomputed values they get slow pretty fast.
The position state distance has to calculate 120 mappings x 5 distances per mapping = a total of 600 A* distances for the distance between 2 frames, and that takes a bit.
If you want I can check exactly, but it gets slow, especially if you want to do something like what I want: calculating the distances between rounds, i.e. the distance between frames for ~20-120 frames per round, and then doing this not just for 2 rounds but for a lot of them.
I think ideally we would find a way to store them online somewhere (the full tile_dist matrix for all maps is below 1 GB, but GitHub has a 100 MB limit).
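The 120 x 5 count comes from trying all 5! = 120 one-to-one mappings between the five players of two frames. A minimal sketch of that counting, assuming a sum over the five matched distances and using `math.dist` as a cheap stand-in for an A* distance (the function name and return shape here are hypothetical):

```python
from itertools import permutations
from math import dist


def position_state_distance(frame_a, frame_b, distance=dist):
    """Return (smallest summed distance over all mappings, number of distance calls).

    `distance` is a placeholder: in the discussion above each call would be
    an A* path distance or a lookup in a precomputed matrix.
    """
    calls = 0
    best = float("inf")
    for perm in permutations(range(len(frame_b))):
        total = 0.0
        for i, j in enumerate(perm):
            total += distance(frame_a[i], frame_b[j])
            calls += 1
        best = min(best, total)
    return best, calls


# Made-up positions for five players per frame.
frame_a = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
frame_b = [(4, 1), (3, 1), (2, 1), (1, 1), (0, 1)]
print(position_state_distance(frame_a, frame_b))  # (5.0, 600)
```

With a precomputed matrix every one of those 600 calls becomes a lookup, which is why the functions slow down so much without it.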
If a user creates the TDM once, it should be in their data dir, right?
i would say so
Do you think for these dist matrix functions that it would be a good idea to add map as a parameter? Doing so could also cut down on computation time by an order of magnitude (let's assume there are 10 maps).
It probably makes sense to have a separate one for each map, you are right. Especially if someone is only interested in, say, train, where generating the matrix only takes 30 min, whereas overpass took me 15h. So if you don't care about the other maps you can get very significant reductions.
I also may take your PR and reshuffle into a standalone awpy.nav module.
i also feel that this would be a good idea.
I added an alternative version of the multi-round plotting that doesn't produce a gif but instead draws the trajectories as lines in a single picture.
These are the gif and png versions. I still haven't gotten around to using them to check the actual clustering results. I will probably remove the one that is less useful for that purpose unless they both offer something of value.
(Pasting my comment from the discord here because this is probably where it belongs tbh)
I was pretty busy the last few weeks, so I haven't done anything regarding the PR or the notebook so far. I started on some things today and am now splitting the generation of the distance matrices by map while combining them on import. I also changed the distance and the heuristic that the A* uses to the Euclidean distance between area centers. I feel this makes more sense and also gives more sensible results. (The first picture is with the old way and the second with the new. The difference is particularly obvious in T spawn, where the old way produces weird results.)
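For readers unfamiliar with the setup, here is a minimal A* sketch where both the edge cost and the heuristic are the Euclidean distance between area centers. The graph, the area ids, and the function name `a_star_distance` are made up for illustration and are not the awpy implementation.

```python
import heapq
from math import dist


def a_star_distance(centers, neighbors, start, goal):
    """Shortest path length from `start` to `goal` over a directed area graph.

    centers:   area id -> (x, y) center (hypothetical data layout)
    neighbors: area id -> list of directly reachable area ids
    """
    # Heap entries are (cost so far + heuristic, cost so far, area id).
    open_heap = [(dist(centers[start], centers[goal]), 0.0, start)]
    best_cost = {start: 0.0}
    while open_heap:
        _, cost, area = heapq.heappop(open_heap)
        if area == goal:
            return cost
        if cost > best_cost.get(area, float("inf")):
            continue  # stale entry, a cheaper path was found already
        for nxt in neighbors.get(area, ()):
            new_cost = cost + dist(centers[area], centers[nxt])
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                estimate = new_cost + dist(centers[nxt], centers[goal])
                heapq.heappush(open_heap, (estimate, new_cost, nxt))
    return float("inf")  # goal not reachable


centers = {1: (0, 0), 2: (3, 0), 3: (3, 4), 4: (0, 10)}
neighbors = {1: [2, 4], 2: [3], 4: [3], 3: []}
print(a_star_distance(centers, neighbors, 1, 3))  # 7.0
```

Because every edge cost is itself a straight-line distance, the straight-line heuristic never overestimates the remaining path, so A* still returns the shortest path under this cost.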
I will probably also change the frame distances to use the average instead of the sum of the individual contributions.
I was also thinking about how to handle plotting for maps with multiple levels (e.g. nuke and vertigo).
My first idea is to just plot the levels below each other. This means that if the z value is in the lower level we just subtract 1024 from the y value and plot it there. Something like this: https://imgur.com/a/lFXjymB
However, that would require changing the whole "position_transform" function, especially because it now needs x, y, z instead of just x, y.
What do you think about this?
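A minimal sketch of the stacking idea, assuming a single z cutoff separates the two levels. The function name `stack_levels`, the `z_cutoff` parameter, and the example values are hypothetical; whether the offset is applied before or after converting to image coordinates is not settled above.

```python
def stack_levels(x, y, z, z_cutoff, level_offset=1024.0):
    """Shift points on the lower level so the two levels are drawn apart.

    Hypothetical helper: points with z below `z_cutoff` count as the lower
    level and have `level_offset` subtracted from y, as described above.
    """
    if z < z_cutoff:
        return x, y - level_offset
    return x, y


# Made-up cutoff and positions.
print(stack_levels(100.0, 200.0, -500.0, z_cutoff=-400.0))  # (100.0, -824.0)
print(stack_levels(100.0, 200.0, -300.0, z_cutoff=-400.0))  # (100.0, 200.0)
```

The cutoff would have to be chosen per map, and the map image would need the lower level drawn at the matching offset.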
I made some more changes to the multi-round plotting, but they are currently only in my other repo because they don't play that nicely when done directly on frames. https://github.com/JanEricNitschke/CSGOML/blob/main/read_tensorflow_input.py#L1033
Because their application is pretty much limited to this, I think it might make sense to remove the multi-round plotting functions from here and just have them in my own repo (along with the supporting functions that are only related to them, like get_shortest_distances_mapping or trajectory_distance, which I just added to the nav module for this). What do you think about that?
I think this makes sense. Let's try to get this PR in, and then I can add a bunch of backlogged changes too.
Is there any other work that needs to be done on the functions or the tests? (Don't worry about documentation for now, we can add that later)
I will remove the plotting stuff and do some small updates on the tests and then this should be ready. I will most likely get all of this done this week.
Sounds like a good plan. For now let's leave all the nav contributions in awpy.analytics.nav, but I think the next step, which we can do with the notebook, would be to create awpy.nav.
Also, thanks so much for putting this together. The functionality of nav is going to blossom from this PR.
If you don't mind being in the acknowledgments, can you edit the README in the last section to reflect/link to you/your contribution?
This should be the final version for now.