Calculations for delta.pitch and delta.field differ from previous calculations
I ran makeWAR() on the May data set and compared the result to the MayProcessed data set. The delta.field and delta.pitch columns in the new MayProcessed data differ from those in the original MayProcessed data set. They actually look transposed, as you can see below. I did this using dplyr 0.5.0, but I first noticed it when testing makeWAR() after refactoring for dplyr 0.7.0.
> NewMayProcessed <- makeWAR(May)
> head(NewMayProcessed$openWARPlays[,c(1:5,16, 19:23)])
batterId start1B start2B start3B pitcherId gameId delta delta.field delta.pitch delta.br delta.bat
1 476704 <NA> <NA> <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 0.3789624 NA 0.37896244 NA 0.3789624
2 519083 476704 <NA> <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.2055008 -0.04671768 -0.15878313 0.03238909 -0.2378899
3 452234 <NA> 476704 <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.3296470 NA -0.32964703 0.04026076 -0.3699078
4 493316 <NA> 476704 <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 0.2032371 0.11692098 0.08631608 -0.53123407 0.7344711
5 518626 493316 <NA> 476704 450351 gid_2013_05_01_anamlb_oakmlb_1 0.1956572 NA 0.19565721 -0.01790497 0.2135622
6 474384 518626 493316 476704 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.7097701 -0.36090191 -0.34886821 0.01234560 -0.7221157
> head(MayProcessed$openWARPlays[,c(1:5,16, 19:23)])
batterId start1B start2B start3B pitcherId gameId delta delta.field delta.pitch delta.br delta.bat
1 476704 <NA> <NA> <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 0.3789624 NA 0.3789624 NA 0.3789624
2 519083 476704 <NA> <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.2055008 -0.1588469 -0.0466539 0.03238909 -0.2378899
3 452234 <NA> 476704 <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.3296470 NA -0.3296470 0.04026076 -0.3699078
4 493316 <NA> 476704 <NA> 450351 gid_2013_05_01_anamlb_oakmlb_1 0.2032371 0.1169279 0.0863092 -0.53123407 0.7344711
5 518626 493316 <NA> 476704 450351 gid_2013_05_01_anamlb_oakmlb_1 0.1956572 NA 0.1956572 -0.01790497 0.2135622
6 474384 518626 493316 476704 450351 gid_2013_05_01_anamlb_oakmlb_1 -0.7097701 -0.3487953 -0.3609748 0.01234560 -0.7221157
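A quick way to check the transposition row by row is sketched below. It uses only the delta.field and delta.pitch values copied from the two head() outputs above; the tolerance `tol` is an assumption, chosen because the two runs differ slightly in the later decimals.

```r
# Values copied from the two head() outputs above (rows 1-6)
new <- data.frame(
  delta.field = c(NA, -0.04671768, NA, 0.11692098, NA, -0.36090191),
  delta.pitch = c(0.37896244, -0.15878313, -0.32964703,
                  0.08631608, 0.19565721, -0.34886821)
)
old <- data.frame(
  delta.field = c(NA, -0.1588469, NA, 0.1169279, NA, -0.3487953),
  delta.pitch = c(0.3789624, -0.0466539, -0.3296470,
                  0.0863092, 0.1956572, -0.3609748)
)

tol <- 1e-3  # assumed tolerance, not taken from openWAR

# TRUE where the new delta.field agrees with the old delta.field
same <- abs(new$delta.field - old$delta.field) < tol

# TRUE where the new columns match the old ones with the names exchanged
swapped <- abs(new$delta.field - old$delta.pitch) < tol &
  abs(new$delta.pitch - old$delta.field) < tol

# Rows with NA in delta.field come out as NA in both flags
comparison <- data.frame(row = 1:6, same = same, swapped = swapped)
print(comparison)
```

On these six rows, rows 2 and 6 are flagged as swapped and row 4 is flagged as the same, so the transposition does not seem to affect every play with a non-missing delta.field.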
The original MayProcessed data set was added over 2 years ago, and there have been quite a few changes to makeWAR() since then. I imagine this happened when openWAR was dplyrized. I'm pretty sure it has to do with [Line 140](https://github.com/beanumber/openWAR/blob/master/R/makeWAR.R#L140).
x$data <- mutate_(x$data, delta.pitch = ~ifelse(is.na(delta.field), delta, delta - delta.field))
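As a rough illustration of what that line does, here is a base-R sketch of the same rule applied to the first two rows of the new output above. It is not the package code, just the same ifelse() logic on a small data frame named `plays` (a name made up for this example).

```r
# Rows 1-2 of the new output above: delta and delta.field as printed
plays <- data.frame(
  delta       = c(0.3789624, -0.2055008),
  delta.field = c(NA, -0.04671768)
)

# Same rule as the mutate_() call: pitching receives whatever part of delta
# was not assigned to fielding, or all of delta when delta.field is NA
plays$delta.pitch <- with(
  plays,
  ifelse(is.na(delta.field), delta, delta - delta.field)
)

print(plays)
```

Because delta.pitch is only the remainder, any change in how delta.field is computed shows up in delta.pitch as well, which is why the two columns move together.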
So I guess it boils down to which data set is correct? Is it the original MayProcessed data set?
I actually think this is occurring in makeWARFielding, specifically Lines 365-366.
delta.field <- with(data, ifelse(endOuts == startOuts,
delta * p.hat, delta * (1 - p.hat)))
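To see how those two lines could produce a transposition, here is a small sketch with made-up numbers. The helper `field_share()` is hypothetical and simply wraps the ifelse() quoted above; `p.hat` is assumed to be a fitted probability between 0 and 1, and nothing here confirms what actually changed in the package.

```r
# Hypothetical wrapper around the two lines quoted above
field_share <- function(delta, p.hat, startOuts, endOuts) {
  ifelse(endOuts == startOuts,
         delta * p.hat, delta * (1 - p.hat))
}

# Made-up play on which an out is recorded (endOuts != startOuts)
delta <- -0.2
p.hat <- 0.75

# Fielding share as the code is written, and the pitching remainder
field_a <- field_share(delta, p.hat, startOuts = 0, endOuts = 1)
pitch_a <- delta - field_a

# Same play if the complementary probability were passed in instead
field_b <- field_share(delta, 1 - p.hat, startOuts = 0, endOuts = 1)
pitch_b <- delta - field_b

print(c(field_a = field_a, pitch_a = pitch_a,
        field_b = field_b, pitch_b = pitch_b))
```

In this toy case the fielding and pitching values trade places between the two variants, so a change in which of p.hat and 1 - p.hat is used on a given branch would be enough to make the columns look transposed.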
OK, thanks, I will take a look. The dplyr update broke all of my other packages too!
I bet it's breaking a lot of packages in the R universe. Honestly, I don't think tidyeval is all that tidy. It makes things a lot more convoluted, but I'm also just not used to it yet.