scholar
Suggested h-index change per year plots
Hi,
I love the package. I made a couple of additional functions to show the change in h-index over time. The code could certainly be tidied a lot, but it's functional.
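get_yearly_publications <- function(id){
  # One get_article_cite_history() call per publication; this loop is the slow part
  pub.list <- NULL
  for (i in scholar::get_publications(id)$pubid){
    pub.list <- rbind(pub.list, scholar::get_article_cite_history(id, i))
  }
  # Columns of pub.list are used by position: 1 = year, 2 = citations, 3 = publication id
  years <- min(pub.list[,1]):max(pub.list[,1])
  papers <- unique(as.character(pub.list[,3]))
  # Rows are years, columns are papers; cells stay NA where no history was returned
  pub.table <- array(dim=c(length(years), length(papers)),
                     dimnames=list(years, papers))
  for (i in seq_len(nrow(pub.list))){
    pub.table[as.character(pub.list[i,1]), as.character(pub.list[i,3])] <- pub.list[i,2]
  }
  pub.table
}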
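h_by_year <- function(pub.table){
  hyear <- numeric(nrow(pub.table))
  for (i in seq_len(nrow(pub.table))){
    # Cumulative citations per paper up to year i, most cited first.
    # drop = FALSE keeps a one-row matrix, so the first year is computed rather than forced to 0.
    cites <- sort(colSums(pub.table[1:i, , drop = FALSE], na.rm = TRUE), decreasing = TRUE)
    # h = number of papers whose citation count is at least their rank
    hyear[i] <- sum(cites >= seq_along(cites))
  }
  names(hyear) <- rownames(pub.table)
  hyear
}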
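plot_hyear_full <- function(pub.table){
  # Total citations per paper, sorted so the x axis really is paper rank
  total <- sort(colSums(pub.table, na.rm = TRUE), decreasing = TRUE)
  plot(total,
       type="l",
       xlab="Paper rank",
       ylab="Citations per paper")
  # The h-index is where a curve crosses this y = x line
  abline(0, 1, col="grey")
  shades <- colorRampPalette(c("lightblue", "darkblue"))(nrow(pub.table))
  # One curve per year, from the latest (dark) back to the earliest (light)
  for (i in nrow(pub.table):1){
    lines(sort(colSums(pub.table[1:i, , drop = FALSE], na.rm = TRUE), decreasing = TRUE),
          col=shades[i])
  }
  lines(rep(0, ncol(pub.table)))
  hyear <- h_by_year(pub.table)
  text(ncol(pub.table)*0.95, max(total)*0.95,
       paste("H =", hyear[length(hyear)]))
}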
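# ID is a Google Scholar ID string and must be defined before running this
pub.table <- get_yearly_publications(ID)
hyear <- h_by_year(pub.table)
plot_hyear_full(pub.table)
# names(hyear) are years stored as strings, so convert them for the x axis
plot(as.numeric(names(hyear)),
     hyear,
     type="b",
     xlab="Year",
     ylab="H index")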
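The per-year calculation can be checked without contacting Google Scholar. Below is a minimal sketch on a made-up table with the same layout (rows are years, columns are papers, cells are citations received that year). The numbers and the helper name `h_index` are invented for illustration and are not part of the scholar package.

```r
# Toy citation table (made-up numbers): 3 years, 3 papers, NA = no data
pub.table <- matrix(c(1, 0, NA,
                      2, 1, 0,
                      3, 2, 1),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(2019:2021, c("a", "b", "c")))

# Hypothetical helper: h-index of a vector of citation counts
h_index <- function(cites) {
  cites <- sort(cites, decreasing = TRUE)
  # Count the papers whose citation count is at least their rank
  sum(cites >= seq_along(cites))
}

# Treat NA as zero, accumulate citations down each column (paper),
# then compute the h-index for each row (year)
cumulative <- apply(replace(pub.table, is.na(pub.table), 0), 2, cumsum)
h_per_year <- apply(cumulative, 1, h_index)
h_per_year  # values 1, 1, 2 for 2019, 2020, 2021
```

By 2021 the cumulative counts are 6, 3 and 1, so two papers have at least two citations each and the h-index is 2.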
Hi @TS404, late response, but these plots are certainly nice. One option might have been to contribute a vignette with these as examples. They do need a bit of cleanup. In particular, there seem to be multiple calls to get_publications inside the for loop in your first function.
Yes, I was unable to work out a way to avoid the loop (which is by far the slowest part and also risks triggering throttling on the server side). Any ideas? I'd be happy to help put together a vignette once the code is a bit tighter.
My mistake — I misread — there’s only a single call to get_publications and the multiple calls to get_article_cite_history are inevitable.
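Since the per-article calls cannot be avoided, one possible way to lower the throttling risk is to pause between requests. The sketch below is only an illustration: `get_cite_histories`, its `pause` argument and the one-second default are hypothetical, and the `fetch` argument exists so the loop can be tried with a stand-in function instead of a live request. It makes the same number of requests as before; it only spaces them out and collects results in a list rather than growing a data frame with rbind inside the loop.

```r
# Hypothetical helper (not part of scholar): fetch citation histories one
# paper at a time, sleeping between requests.
get_cite_histories <- function(id, pubids, pause = 1,
                               fetch = scholar::get_article_cite_history) {
  histories <- vector("list", length(pubids))
  for (k in seq_along(pubids)) {
    histories[[k]] <- fetch(id, pubids[k])
    # Assumed delay in seconds; skipped after the last request
    if (k < length(pubids)) Sys.sleep(pause)
  }
  # Bind once at the end instead of on every iteration
  do.call(rbind, histories)
}

# Dry run with a stand-in fetcher, so nothing is sent over the network
fake_fetch <- function(id, pubid) {
  data.frame(year = 2020:2021, cites = c(1, 2), pubid = pubid)
}
demo <- get_cite_histories("some-id", c("p1", "p2"), pause = 0, fetch = fake_fetch)
```

With the real package the call would be something like get_cite_histories(ID, scholar::get_publications(ID)$pubid), and the result could replace pub.list in the first function.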