cricketr

Issue in batsmanDismissals()

Open npranav10 opened this issue 5 years ago • 7 comments

First of all, I would like to appreciate your dedication in building this package. Hats off!

Coming to the issues: I have been trying to analyze Vijay Shankar's ODI dismissals (http://stats.espncricinfo.com/ci/engine/player/477021.html?class=2;filter=advanced;orderby=start;template=results;type=batting;view=innings), and I came across the following two issues:

(I have traced the function)

  1. batsman <- clean(file). After this line executes, only one record remains in his data.

The same happens with Rishab Pant.

  2. If I manually remove the above line (batsman <- clean(file)) and continue execution line by line, there appears to be an issue that could affect any player: the percentage for each dismissal type is calculated with the total number of innings played as the denominator, rather than the total number of times dismissed.

What I mean is that his stats should read Not out: 0%, Run out: 40%, Caught: 60% (refer to the image),

as opposed to the current metrics, which display Not out: 44%, Run out: 22%, Caught: 33%.
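To make the two denominators concrete, here is a minimal sketch in base R. The `dismissal` vector is a hypothetical reconstruction (9 innings: 4 not out, 2 run out, 3 caught) chosen only because it is consistent with the percentages quoted above; it is not taken from the package or from the actual data file.

```r
# Hypothetical dismissal column: 9 innings, consistent with the figures quoted above
dismissal <- c(rep("not out", 4), rep("run out", 2), rep("caught", 3))

# Denominator = all innings played (the behaviour described in the report)
pct_innings <- round(100 * prop.table(table(dismissal)))

# Denominator = innings in which the batsman was actually dismissed
dismissed <- dismissal[dismissal != "not out"]
pct_dismissed <- round(100 * prop.table(table(dismissed)))

print(pct_innings)
print(pct_dismissed)
```

Filtering out the "not out" rows before computing proportions is one possible way to get dismissal-only percentages; whether the package should do this is for the maintainer to decide.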

npranav10 May 06 '19 05:05

Will check. Not out shows up as a '*', which I remove. Yes, transformations have to be done. Will look into it. I am currently caught up in a couple of things. We cannot remove clean(file); we have to make it work with that.

Ganesh

tvganesh May 06 '19 06:05

Sure, Mr Ganesh. I will keep an eye on this page.

Pranav

npranav10 May 06 '19 06:05

Looking at the data, I see that the rows which have been removed have Mins as '-'. This is read as NA, which R removes in clean(file). Did you check why rows 6, 7, 8 and 9 for Vijay have Mins as '-'?
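The effect described here can be reproduced with a small sketch in base R. This is an illustration under assumptions, not the package's actual clean() code: the CSV text, its column set and its values are invented, and the use of `na.strings` plus `complete.cases()` is assumed as one way rows with '-' could end up being dropped.

```r
# Hypothetical extract of a Statsguru-style batting table; all values are invented
csv_text <- "Runs,Mins,BF,Dismissal
15*,22,10,not out
46,-,41,run out
26,-,24,caught
29,44,30,caught"

# Read the table, treating "-" as a missing value
df <- read.csv(text = csv_text, stringsAsFactors = FALSE,
               na.strings = c("NA", "-"))

# Keep only rows that have no missing value in any column
kept <- df[complete.cases(df), ]

print(nrow(df))    # rows read
print(nrow(kept))  # rows left after dropping incomplete ones
```

With this filter, any innings whose Mins value is missing disappears even though its Runs and Dismissal values are present.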

tvganesh May 06 '19 06:05

The ESPNCricinfo match scorecards don't have the minutes-played statistics for India's home series vs Australia.

npranav10 May 06 '19 06:05

> Looking at the data, I see that the rows which have been removed have Mins as '-'. This is read as NA, which R removes in clean(file). Did you check why rows 6, 7, 8 and 9 for Vijay have Mins as '-'?

I can confirm that the issue exists only for players who have "-" in the "Mins" column for innings in which they batted.

Can't the clean function be executed without considering the Mins column? For example, by replacing the first line in the clean function with

    df <- read.csv(file, stringsAsFactor = FALSE)
    df = df[c(-3)]

This works fine for me, but I still have to check whether it holds good for the other batsman functions too.
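A minimal sketch of the idea in base R, under stated assumptions: the CSV text and its values are invented, the Mins column is dropped by name rather than by position (the position 3 used above depends on the layout of the real file), and `complete.cases()` stands in for whatever filtering clean() actually performs.

```r
# Hypothetical extract of a Statsguru-style batting table; all values are invented
csv_text <- "Runs,Mins,BF,Dismissal
15*,22,10,not out
46,-,41,run out
26,-,24,caught
29,44,30,caught"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE,
               na.strings = c("NA", "-"))

# Drop the Mins column by name before filtering for complete rows
df_no_mins <- df[, names(df) != "Mins", drop = FALSE]
kept <- df_no_mins[complete.cases(df_no_mins), ]

print(nrow(kept))  # rows left once Mins no longer takes part in the filter
```

Dropping by name avoids removing the wrong column if the file layout differs, but any function that later relies on Mins would need its own handling.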

npranav10 May 06 '19 08:05

You can write your own function that looks only at the dismissals column, without the clean function. I may not add this to the package, as this is an issue with the data. I cannot keep the package generic if I make changes that are specific to one case.

tvganesh May 06 '19 09:05

What I actually had in mind was this: given that the "Mins" data is inconsistent in ESPNCricinfo Statsguru, why do we need to consider it at all? Why don't we drop it for all functions? It would then be easier to fill NAs for all the other "-"s.

npranav10 May 06 '19 13:05