
Added functions to calculate distances between game states and to allow multi-round plotting

Open JanEricNitschke opened this issue 2 years ago • 7 comments

I wrote two functions to plot multiple rounds in one gif.

One is for plotting rounds with the exact same players and the other one can be used for any combination of rounds.

For the case with the same players each player (identified by steamid) is assigned a fixed color that he has for all rounds.

In the other case the first round in the list is used as a reference, and for each following round the players are assigned colors by matching them to the closest player from the first round.

I overdid it a bit with the commenting and there are also a couple places with pretty obvious simplification/clarity improvements possible. It probably also makes sense to add more arguments to the functions regarding marker transparency and size for the matched rounds.

I don't know whether you think this would make a decent addition to the package.

JanEricNitschke avatar Jun 25 '22 15:06 JanEricNitschke

Here are two gifs produced by the functions. Ten rounds are probably on the high end and it is probably not wise to plot both teams at the same time though.

[Image: 742_rounds_diff_play]

[Image: 742_rounds_same_play]

JanEricNitschke avatar Jun 25 '22 15:06 JanEricNitschke

The functions in here might also be candidates to be directly added to awpy.

These plus the functions from the pull request are shown off here: https://imgur.com/a/wdEvCDq

JanEricNitschke avatar Jun 26 '22 15:06 JanEricNitschke

I have now added the precomputed distances for named areas. Those for all the tiles are here but are too big for GitHub.

I also added the functionality to calculate game state distances based on either player positions or position tokens.

I also have a couple of plotting functions here that i used for verifying the results.

JanEricNitschke avatar Aug 23 '22 14:08 JanEricNitschke

Most of what is here seems fine to merge. Thanks for doing some general refactoring, fixing some NAV bugs, and also adding some interesting state/frame distance metrics. I especially think the navigation module contributions here will be helpful. The only comments/concerns I have are the following:

  • I can see the use case for plotting plot_rounds_same_players but not necessarily plot_rounds_different_players (mostly because of the visual clutter). I'm wondering if, given the function's size vs. its propensity to be used, it may be better introduced in an example notebook (maybe like a visualization part 2 notebook). What analytics use case did you envision here?
  • How long does it take to create the area distance matrix? If it does not take long, we can just let the user call a function to produce it (maybe with the map as a parameter).
  • for get_area_distance_matrix() and generate_tile_distance_matrix() we should probably use the navigation mesh nomenclature, which I believe in this case would make the function names get_place_distance_matrix() and generate_area_distance_matrix(). Each tile is an "area" and a collection of tiles is a "place". Also, is there a reason to use get versus generate for the two? I can see reasoning for having different as well as the same prefix.
  • Would you be able to create a small notebook for examples/ that uses your new navigation functions? It could be called 04_Working_with_Navigation_Meshes_Advanced and just show in a few cells how to use your new functions.

pnxenopoulos avatar Aug 28 '22 16:08 pnxenopoulos

Thanks for the review.

I can see the use case for plotting plot_rounds_same_players but not necessarily plot_rounds_different_players (mostly because of the visual clutter). I'm wondering if, given the function's size vs. its propensity to be used, it may be better introduced in an example notebook (maybe like a visualization part 2 notebook). What analytics use case did you envision here?

I want to try clustering rounds and want to use that function to verify that the results make sense. So i take 10 rounds that were all assigned to the same cluster and plot those together to see if they are similar at least visually. In that case the cluttering shouldnt be so problematic because ideally the rounds are all very close and the colors should hopefully be organized together.

How long does it take to create the area distance matrix? If it does not take long, we can just let the user call a function to produce it (maybe with the map as a parameter).

i generated it after (and made use of) the tile_matrix, and there it was really quick. but in general it honestly shouldn't take too long. i can run it without using the tile_matrix to make sure.

for get_area_distance_matrix() and generate_tile_distance_matrix() we should probably use the navigation mesh nomenclature, which I believe in this case would make the function names get_place_distance_matrix() and generate_area_distance_matrix(). Each tile is an "area" and a collection of tiles is a "place". Also, is there a reason to use get versus generate for the two? I can see reasoning for having different as well as the same prefix.

definitely agree here. i'm pretty sure i also wasn't 100% consistent in my use, at least in the comments.

I changed to generate because "get" sounds like you just grab it from somewhere, and with the way i have it now that is done by importing them from NAV. Also, "get" sounds like it will be a fast thing, while generate_tile_distance_matrix() takes ~24h.

Would you be able to create a small notebook for examples/ that uses your new navigation functions? It could be called 04_Working_with_Navigation_Meshes_Advanced and just show in a few cells how to use your new functions.

i can do that no problem. but i can probably only get to it in ~2-3 weeks now because i have a conference the week after this and still have a bunch of things to prepare for that.

JanEricNitschke avatar Aug 28 '22 16:08 JanEricNitschke

Okay, got it. I think seeing it in action to verify things in 04_Working_with_Navigation_Meshes_Advanced could be a nice touch.

Yea, let me know how fast these functions generally are without TILE_DIST_MATRIX (TDM). I see two directions from here: (1) if the functions are fast without TDM, we can make TDM a parameter in functions where it is used. (2) if the functions are slow without TDM, we may need to find a place to either store TDM online or rethink how we generate the TDM.

EDIT: If a user creates the TDM once, it should be in their data dir, right?

Do you think for these dist matrix functions that it would be a good idea to add map as a parameter? Doing so could also cut down on computation time by an order of magnitude (let's assume there are 10 maps).
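A minimal sketch of the two ideas above combined: generate the matrix for a single map once, keep it in the user's data directory, and reuse it afterwards. The function name `load_or_generate_matrix`, the file naming scheme, and the `generate` callback are assumptions made for illustration; they are not part of awpy.

```python
import json
from pathlib import Path


def load_or_generate_matrix(map_name, data_dir, generate):
    """Return the distance matrix for one map, generating it only when no cached file exists.

    `generate` is any callable that takes a map name and returns a
    JSON-serializable matrix (hypothetical stand-in for the slow generator).
    """
    # Hypothetical file name: one cached matrix per map.
    path = Path(data_dir) / f"tile_dist_matrix_{map_name}.json"
    if path.exists():
        with path.open() as f:
            return json.load(f)
    # Slow step: runs at most once per map and data directory.
    matrix = generate(map_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(matrix, f)
    return matrix
```

With this shape, a user who only cares about one map pays only for that map's generation time, and every later call reads the cached file.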

Regarding the example notebook, great, no rush. Let's hold off on releasing until the notebook is ready, that way people will know how to use the new changes from the start. I think your changes are significant and useful, but they're not trivial, so a notebook would go a long way to helping users understand.

I also may take your PR and reshuffle into a standalone awpy.nav module. Your work here has been really significant, and we can probably merge all navigation mesh-related functions into their own separate module rather than being under awpy.analytics, since the navigation work is actually quite separate, as you have demonstrated. What're your thoughts on a standalone nav module?

pnxenopoulos avatar Aug 28 '22 17:08 pnxenopoulos

Without the precomputed values they get slow pretty fast.

position state distance has to calculate 120 mappings x 5 distances per mapping = a total of 600 A* distances for the distance between 2 frames, and that takes a bit.
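The figure of 600 follows from trying every one-to-one mapping between five players on each side (5! = 120) and summing five distances per mapping. The sketch below illustrates that count under stated assumptions: plain Euclidean distance stands in for the A* distance, and `position_state_distance` here is a simplified illustration, not the function from the PR.

```python
from itertools import permutations
from math import dist


def position_state_distance(team_a, team_b, distance=dist):
    """Smallest summed distance over all one-to-one player mappings."""
    return min(
        sum(distance(a, b) for a, b in zip(team_a, mapping))
        for mapping in permutations(team_b)
    )


calls = 0


def counting_distance(a, b):
    """Euclidean stand-in for the A* distance that counts how often it is called."""
    global calls
    calls += 1
    return dist(a, b)


frame_1 = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
frame_2 = [(4, 1), (3, 1), (2, 1), (1, 1), (0, 1)]
result = position_state_distance(frame_1, frame_2, counting_distance)
print(result, calls)  # 5.0 600
```

Every one of those 600 calls would be a path search without a precomputed matrix, which is why a lookup table pays off as soon as many frames are compared.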

if you want i can check exactly, but it gets slow, especially if you want to do something like what i want: calculating the distances between rounds, i.e. the distance between frames for ~20-120 frames per round, and doing this not just for 2 rounds but for a lot of them.

i think ideally we would find a way to store them online somewhere (the full tile_dist matrix for all maps is below 1 GB, but GitHub has a 100 MB limit)

If a user creates the TDM once, it should be in their data dir, right?

i would say so

Do you think for these dist matrix functions that it would be a good idea to add map as a parameter? Doing so could also cut down on computation time by an order of magnitude (let's assume there are 10 maps).

it probably makes sense to have a separate one for each map. you are right, especially if someone is only interested in, say, train, where generating the matrix only takes 30 min, whereas overpass took me 15h. so if you don't care about that one you can get very significant reductions.

I also may take your PR and reshuffle into a standalone awpy.nav module.

i also feel that this would be a good idea.

JanEricNitschke avatar Aug 28 '22 17:08 JanEricNitschke

I added an alternative version of the multi-round plotting that doesn't produce a gif but instead draws the trajectories as lines in a single picture.

[Images: Test_10_2_rounds_different_players_img, Test_10_2_rounds_different_players]

These are the gif and png versions. Still havent gotten around to using them to check the actual clustering results. I will probably remove the one that is less useful for that purpose unless they both offer something of value.

JanEricNitschke avatar Sep 26 '22 13:09 JanEricNitschke

(Pasting my comment from the discord here because this is probably where it belongs tbh)

was pretty busy the last weeks so i didnt do anything regarding the PR or the notebook so far. Started doing some things today and am now splitting the production of the distance matrices by map but combine them on import. I also changed the distance and heuristic that the A* uses to the euclidean distances of the area centers. i feel this makes more sense and also gives more sensible results in my opinion. (The first picture is with the old way and the second with the new. What i mean is particularly obvious in T spawn where the old way produces weird results.)
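As a rough illustration of the change described above, the sketch below runs A* over a small graph of areas where both the step cost and the heuristic are Euclidean distances between area centers. The data layout (`centers` as a dict of coordinates, `neighbors` as a dict of adjacency lists) and the function name are assumptions for this example and do not mirror the awpy navigation data structures.

```python
import heapq
from math import dist


def astar_area_distance(centers, neighbors, start, goal):
    """Length of the shortest path from start to goal, or inf if unreachable."""
    # Heap entries: (cost so far + heuristic, cost so far, area id).
    open_heap = [(dist(centers[start], centers[goal]), 0.0, start)]
    best_cost = {start: 0.0}
    while open_heap:
        _, cost, area = heapq.heappop(open_heap)
        if area == goal:
            return cost
        if cost > best_cost.get(area, float("inf")):
            continue  # stale entry; a cheaper route to this area was already found
        for nxt in neighbors.get(area, ()):
            # Step cost: Euclidean distance between the two area centers.
            new_cost = cost + dist(centers[area], centers[nxt])
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                # Heuristic: straight-line distance from the neighbor to the goal.
                priority = new_cost + dist(centers[nxt], centers[goal])
                heapq.heappush(open_heap, (priority, new_cost, nxt))
    return float("inf")


centers = {1: (0, 0), 2: (3, 0), 3: (3, 4)}
neighbors = {1: [2], 2: [3], 3: []}
print(astar_area_distance(centers, neighbors, 1, 3))  # 7.0
```

Because the straight-line distance never exceeds the length of a path made of straight segments between centers, this heuristic does not overestimate, so the search returns the shortest path under that cost.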

I will probably also change the frame distances to use the average instead of the sum of the individual contributions.

I was also thinking about how to handle plotting for maps with multiple levels (e.g. nuke and vertigo). My first idea is to just plot the levels below each other. This means if the z value is in the lower level we just subtract 1024 from the y value and plot it there. Something like this: https://imgur.com/a/lFXjymB However, that would require changing the whole "position_transform" function, especially because it now needs x, y, z instead of just x, y. What do you think about this?

[Images: de_train_4015_3131_0_8431 573564887174, de_train_4015_3131_0_8774 032603727595]
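The level-stacking idea can be sketched as a small helper that takes x, y, z and shifts lower-level points by a fixed offset. The name `stack_levels`, the `z_cutoff` parameter, and the example cutoff value are hypothetical; only the 1024 offset comes from the comment above, and the real per-map cutoff would have to come from the map data.

```python
def stack_levels(x, y, z, z_cutoff, offset=1024):
    """Shift points on the lower level by `offset` in y so both levels share one image.

    z_cutoff is a hypothetical per-map height separating the two levels.
    """
    if z < z_cutoff:
        return x, y - offset
    return x, y


# Illustrative cutoff only, not a value taken from any real map.
print(stack_levels(100, 200, -500, z_cutoff=-495))  # (100, -824)
print(stack_levels(100, 200, -400, z_cutoff=-495))  # (100, 200)
```

Keeping the level decision in a separate step like this would leave a two-argument transform untouched and only add the z handling in front of it.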

JanEricNitschke avatar Sep 26 '22 13:09 JanEricNitschke

Did some more changes to the multi round plotting but they are currently only in my other repo because they dont play that nice when doing them directly on frames. https://github.com/JanEricNitschke/CSGOML/blob/main/read_tensorflow_input.py#L1033

Due to their application being pretty limited to this i think it might make sense to remove the multi-round plotting functions from here and just have them in my own repo (along with the supporting functions that are just related to that like get_shortest_distances_mapping or trajectory_distance that i just added for this to the nav module). What do you think about that?

JanEricNitschke avatar Sep 28 '22 14:09 JanEricNitschke

Did some more changes to the multi round plotting but they are currently only in my other repo because they dont play that nice when doing them directly on frames. https://github.com/JanEricNitschke/CSGOML/blob/main/read_tensorflow_input.py#L1033

Due to their application being pretty limited to this i think it might make sense to remove the multi-round plotting functions from here and just have them in my own repo (along with the supporting functions that are just related to that like get_shortest_distances_mapping or trajectory_distance that i just added for this to the nav module). What do you think about that?

I think this makes sense. Let's try to get this PR in, and then I can add a bunch of backlogged changes too.

Is there any other work that needs to be done on the functions or the tests? (Don't worry about documentation for now, we can add that later)

pnxenopoulos avatar Oct 03 '22 20:10 pnxenopoulos

I will remove the plotting stuff and do some small updates on the tests and then this should be ready. I will most likely get all of this done this week.

JanEricNitschke avatar Oct 04 '22 06:10 JanEricNitschke

I will remove the plotting stuff and do some small updates on the tests and then this should be ready. I will most likely get all of this done this week.

Sounds like a good plan. For now let's leave all the nav contributions in awpy.analytics.nav, but I think next step, which we can do with the notebook, would be to create awpy.nav

Also, thanks so much for putting this together. The functionality of nav is going to blossom from this PR.

If you don't mind being in the acknowledgments, can you edit the README in the last section to reflect/link to you/your contribution?

pnxenopoulos avatar Oct 04 '22 13:10 pnxenopoulos

This should be the final version for now.

JanEricNitschke avatar Oct 04 '22 16:10 JanEricNitschke