tidycode
Meaning of the "score" column
Hi Dr. McGowan,
I'm trying to understand the score column of the classification_tbl.csv file, but I couldn't find any documentation on what the variable means or what role it plays. I'd really appreciate it if you could explain this variable or point me to where I can find information on it. Thank you.
The `score` is the prevalence of the given classification for the function in question within a given lexicon. The lexicon in this case is either members of the Leek Lab, who classified several R functions, or "crowdsource", where we allowed anyone to classify R functions using an application we developed. For example, if the function is `library`, the classification is `setup`, the lexicon is "crowdsource", and the score is 0.67, that means 67% of the crowdsourced participants classified the `library` function as `setup`.
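To make the arithmetic concrete, here is a minimal sketch of how a prevalence score like this could be computed from raw votes. It is written in Python purely for illustration and is not the code used to build classification_tbl.csv. The vote data, the `prevalence` helper, and the second label are all hypothetical, and it assumes each participant contributes one classification per function.

```python
from collections import Counter

# Hypothetical raw votes within a single lexicon: one (function, classification)
# pair per participant. These values are made up for illustration only.
votes = [
    ("library", "setup"),
    ("library", "setup"),
    ("library", "import"),
]


def prevalence(votes):
    """Return {(function, classification): share of that function's votes}."""
    per_pair = Counter(votes)                           # votes per (function, classification)
    per_function = Counter(func for func, _ in votes)   # total votes per function
    return {
        (func, label): count / per_function[func]
        for (func, label), count in per_pair.items()
    }


scores = prevalence(votes)
print(round(scores[("library", "setup")], 2))  # 2 of 3 votes -> 0.67
```

Under these assumptions, the scores for a single function within one lexicon sum to 1.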
This paper has a few examples: https://journal.r-project.org/archive/2020/RJ-2020-011/RJ-2020-011.pdf