ggshakeR

[Feature Request]: UEFA-compliance / autotype compliance

Open abhiamishra opened this issue 2 years ago • 17 comments

Suggest an idea or something that can be improved.

Currently, our functions are compatible with StatsBomb and Opta data. UEFA-style coordinates also exist, which was news to me, and it would be nice to support them.

An even nicer thing would be if we could detect the type of the data frame automatically... food for thought!

abhiamishra avatar Aug 04 '22 21:08 abhiamishra

Auto-identifying the dataset type is an interesting idea, but we should probably fix the bugs in the current version and do a CRAN release first, so that we're free to experiment with the idea on dev.

harshkrishna17 avatar Aug 05 '22 12:08 harshkrishna17

For now, could we just add a way to transform UEFA data to Opta within the guide itself?

harshkrishna17 avatar Aug 05 '22 12:08 harshkrishna17

Either that or we add UEFA as a data type to all functions

abhiamishra avatar Aug 05 '22 13:08 abhiamishra

Or we create a helper function that transforms datasets from UEFA to one of Opta/StatsBomb
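
A minimal sketch of what such a helper might look like, assuming UEFA coordinates span 105 x 68, Opta 100 x 100, and StatsBomb 120 x 80, and that the data already uses the x, y, finalX, finalY column names. The name convert_uefa is hypothetical, not an existing ggshakeR function, and the sketch only rescales; it does not account for any difference in axis direction between providers.

```r
# Hypothetical helper (not part of ggshakeR): rescale UEFA-style coordinates
# (assumed 105 x 68) to Opta (100 x 100) or StatsBomb (120 x 80) dimensions.
convert_uefa <- function(data, to = c("opta", "statsbomb")) {
  to <- match.arg(to)

  # Target pitch dimensions for each supported type
  dims <- list(opta = c(length = 100, width = 100),
               statsbomb = c(length = 120, width = 80))
  target <- dims[[to]]

  # Rescale only the coordinate columns that are present
  for (col in intersect(c("x", "finalX"), names(data))) {
    data[[col]] <- data[[col]] * target[["length"]] / 105
  }
  for (col in intersect(c("y", "finalY"), names(data))) {
    data[[col]] <- data[[col]] * target[["width"]] / 68
  }

  data
}

# Made-up example rows covering the full UEFA pitch
uefa <- data.frame(x = c(0, 52.5, 105), y = c(0, 34, 68),
                   finalX = c(10.5, 105, 52.5), finalY = c(68, 34, 0))
convert_uefa(uefa, to = "statsbomb")
```

Because it skips columns that are missing, the same call would also work on data that only has x and y.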

abhiamishra avatar Aug 05 '22 13:08 abhiamishra

Would {ggsoccer} have this functionality? It has a pitch type, pitch_international, for UEFA coordinates, but I'm not sure if it's supported by the rescale function.

harshkrishna17 avatar Aug 05 '22 14:08 harshkrishna17

Here's a simple internal function we could create to auto identify the dataset type. We can add in the final coordinate value specifications as well. @abhiamishra @Ryo-N7

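data_type_identify <- function(data) {

  # Largest coordinate seen on each axis, ignoring missing values
  max_x <- max(data$x, na.rm = TRUE)
  max_y <- max(data$y, na.rm = TRUE)

  if (max_x > 98 && max_x < 102 && max_y > 98 && max_y < 102) {
    # Opta pitches run to roughly 100 on both axes
    print("opta")
  } else if (max_x > 118 && max_x < 122 && max_y > 78 && max_y < 82) {
    # StatsBomb pitches run to roughly 120 x 80
    print("statsbomb")
  }

}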

harshkrishna17 avatar Aug 06 '22 08:08 harshkrishna17

Updated function

library(dplyr)

data_type_identify <- function(data) {

  # Keep only the coordinate columns and drop rows with missing values
  data <- data %>%
    select(x, y, finalX, finalY) %>%
    na.omit()

  # Largest start and end coordinate along each axis
  max_x <- c(max(data$x), max(data$finalX))
  max_y <- c(max(data$y), max(data$finalY))

  # TRUE when every value lies strictly between lower and upper
  in_range <- function(values, lower, upper) {
    all(values > lower & values < upper)
  }

  if (in_range(max_x, 98, 102) && in_range(max_y, 98, 102)) {
    data_type <- "opta"
  } else if (in_range(max_x, 118, 122) && in_range(max_y, 78, 82)) {
    data_type <- "statsbomb"
  } else if (in_range(max_x, 103, 107) && in_range(max_y, 66, 70)) {
    data_type <- "international"
  } else {
    # Without this branch, data_type would be undefined at return()
    stop("Could not identify the data type from the coordinate ranges.")
  }

  return(data_type)

}

harshkrishna17 avatar Aug 08 '22 04:08 harshkrishna17

Can't we use the uniqueness in the column names of each data frame type to find out the type of data (UEFA, Opta, SB etc) and then return the pitch accordingly?

rithwikrajendran avatar Aug 09 '22 09:08 rithwikrajendran

That can't work, unfortunately, as we already require the dataset passed to the functions to have specific column names.

harshkrishna17 avatar Aug 09 '22 10:08 harshkrishna17

I don't fully understand why the pass plot function scales them fine but the pass network doesn't scale properly. The progressive passes plot was made with the same dataset as the pass network.

robbiejdunne avatar Aug 13 '22 06:08 robbiejdunne

Hey @robbiejdunne, could you please be a little more specific about what the problem with scaling is? I would also recommend creating a new issue for this problem and continuing the conversation there.

harshkrishna17 avatar Aug 13 '22 09:08 harshkrishna17

This is part of the original issue with international/opta/statsbomb data. The pass network's dimensions only go as high as 68 in this dataset. But they seem to be just fine for the progressive passes. Does that make sense?

robbiejdunne avatar Aug 13 '22 09:08 robbiejdunne

I don't think they're fine for the progressive passes; rather, the size of the pitch in plot_pass is making them appear right when they might not be.

Try running this on your dataset before running the passnet function.

data <- data %>%
  mutate(x = x * 105/100,
         finalX = finalX * 105/100,
         y = y * 68/100,
         finalY = finalY * 68/100)

This should make it work.

harshkrishna17 avatar Aug 13 '22 09:08 harshkrishna17

I'm going to play around with the function Harsh made

abhiamishra avatar Aug 15 '22 18:08 abhiamishra

The function seems to be working well.

I'm going to add a sample UEFA dataset and push the auto-type compliance. It will take some time because I have to change the tests as well.

abhiamishra avatar Aug 16 '22 17:08 abhiamishra

A quick fix that needs to be implemented: some functions only look at x and y, and in our guides we have the user selecting only x and y. As such, the function needs to be updated so it can find the type based on just (x, y).
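
One way this could work, sketched under the assumption that the same tolerance of 2 units around each pitch's maximum is kept: check whichever coordinate columns are present, so data with only x and y still gets a type. The name identify_from_available is hypothetical and the example rows are made up.

```r
# Hypothetical sketch: identify the data type from whichever coordinate
# columns are present, so that data with only x and y still works.
identify_from_available <- function(data) {
  x_cols <- intersect(c("x", "finalX"), names(data))
  y_cols <- intersect(c("y", "finalY"), names(data))
  if (length(x_cols) == 0 || length(y_cols) == 0) {
    stop("Data needs at least one x column and one y column.")
  }

  # Largest value in each available column, ignoring missing values
  max_x <- vapply(data[x_cols], max, numeric(1), na.rm = TRUE)
  max_y <- vapply(data[y_cols], max, numeric(1), na.rm = TRUE)

  # Expected maximum (length, width) per type, as used earlier in the thread
  pitches <- list(opta = c(100, 100),
                  statsbomb = c(120, 80),
                  international = c(105, 68))

  for (type in names(pitches)) {
    dims <- pitches[[type]]
    # Every available column must peak within 2 units of the pitch size
    if (all(abs(max_x - dims[1]) < 2) && all(abs(max_y - dims[2]) < 2)) {
      return(type)
    }
  }
  NA_character_  # no known type matched
}

# Made-up shot data with only x and y columns
shots <- data.frame(x = c(88.1, 119.2, 101.5), y = c(40.3, 79.4, 12.0))
identify_from_available(shots)
```

Like the range checks above, this relies on the data reaching close to the edges of the pitch, so a small sample that never gets near the far touchline or byline would come back as NA.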

abhiamishra avatar Aug 16 '22 17:08 abhiamishra

I am going to start working on this this week and test out a much improved version that can account for the case I referenced above ^

abhiamishra avatar Sep 05 '22 19:09 abhiamishra