DataStructures.jl icon indicating copy to clipboard operation
DataStructures.jl copied to clipboard

accumulator and nlargest / most common

Open ym-han opened this issue 4 years ago • 2 comments

  1. The order of the arguments for nlargest(acc::Accumulator, [n]) is different from other variants of nlargest, e.g. nlargest(n, arr; kw...). I find this a bit confusing and unexpected: is it not possible to use the same convention throughout?
  2. Relatedly, the code for nlargest for accumulators seems different from the implementation of most_common in the python counter class. It would be helpful if there were some documentation on the design decisions and time/space complexities.
  3. Finally, I think it'd might be worth mentioning the nlargest function in the docs for accumulators; right now I think the only mention of nlargest is on the page 'functions using heaps', and it isn't obvious from that that it'd be possible to do it with accumulators.

I could potentially help with some of these.

ym-han avatar Dec 20 '20 19:12 ym-han

We should also check if we need to have nlargest or if partialsort world do.

oxinabox avatar Dec 20 '20 20:12 oxinabox

nlargest for accumulators is currently implemented using partialsort!. I think it'd make sense to keep the nlargest around even if we want to stick to the current implementation, since users might not think to use partialsort.

ym-han avatar Dec 20 '20 20:12 ym-han