Combinatorics.jl icon indicating copy to clipboard operation
Combinatorics.jl copied to clipboard

Multisets?

Open TotalVerb opened this issue 9 years ago • 3 comments

A user on stack overflow asked about whether Julia's partitions function could compute multisets. It can't, and shouldn't.

But a multisets(n, k) iterator might actually be useful in some cases. A naive, eager implementation would be:

multisets(n, k) = map(A -> [sum(A .== i) for i in 1:n],
                      with_replacement_combinations(1:n, k))

julia> multisets(2, 1)
2-element Array{Array{Int64,1},1}:
 [1,0]
 [0,1]

julia> multisets(3, 5)
21-element Array{Array{Int64,1},1}:
 [5,0,0]
 [4,1,0]
 [4,0,1]
 [3,2,0]
 [3,1,1]
 [3,0,2]
 [2,3,0]
 [2,2,1]
 [2,1,2]
 [2,0,3]
 ⋮      
 [1,2,2]
 [1,1,3]
 [1,0,4]
 [0,5,0]
 [0,4,1]
 [0,3,2]
 [0,2,3]
 [0,1,4]
 [0,0,5]

which is not so bad to write but horrible memory-wise. It'd probably be fine if map were lazy, but I couldn't find a decent lazy map implementation that did not make everything linked lists (and hence even worse).

Would there be room for such a function in this library? Alternatively, might we (or some other library) provide a fast lazy map, so that efficient versions of these functions can be written? I can prepare a pull request if there is demand.

(As an aside, I think multiset_combinations is named incorrectly: it means combinations_of_multisets. The current with_replacement_combinations is more akin to the multiset combinations (multichoose) operation.)

TotalVerb avatar May 19 '16 00:05 TotalVerb

I think part of the confusion in naming is how in mathematics, there is only one kind of multiset, while in computing, there are two competing representations of multisets—as the elements themselves, and as a tuple of multiplicities. One is more efficient than the other depending on whether n or k dominates.

Maybe we could create a Multiset type, which has collect and iteration defined to give elements one-by-one, and also with a multiplicities() method to get the multiplicities of each element. Then with proper renaming, we might do

map(multiplicities, multichoose(n, k))

or even

multiplicities.(multichoose(n, k))

with the new v0.5 syntax.

TotalVerb avatar May 19 '16 00:05 TotalVerb

This would definitely be useful to me.

Stivanification avatar Oct 26 '16 12:10 Stivanification

As it turns out, this is implemented with the function multiexponents in this module, that also returns an interator. Some quick and dirty performance tests for the two solutions:

julia> @time multisets(7,4)
  0.000108 seconds (3.78 k allocations: 208.578 KiB)
210-element Vector{Vector{Int64}}:
 [4, 0, 0, 0, 0, 0, 0]
 [3, 1, 0, 0, 0, 0, 0]
 [3, 0, 1, 0, 0, 0, 0]
 [3, 0, 0, 1, 0, 0, 0]
 [3, 0, 0, 0, 1, 0, 0]
 [3, 0, 0, 0, 0, 1, 0]
 [3, 0, 0, 0, 0, 0, 1]
 [2, 2, 0, 0, 0, 0, 0]
 [2, 1, 1, 0, 0, 0, 0]
 [2, 1, 0, 1, 0, 0, 0]
 [2, 1, 0, 0, 1, 0, 0]
 ⋮
 [0, 0, 0, 0, 2, 0, 2]
 [0, 0, 0, 0, 1, 3, 0]
 [0, 0, 0, 0, 1, 2, 1]
 [0, 0, 0, 0, 1, 1, 2]
 [0, 0, 0, 0, 1, 0, 3]
 [0, 0, 0, 0, 0, 4, 0]
 [0, 0, 0, 0, 0, 3, 1]
 [0, 0, 0, 0, 0, 2, 2]
 [0, 0, 0, 0, 0, 1, 3]
 [0, 0, 0, 0, 0, 0, 4]

Compared with the built-in implementation:

julia> @time collect(multiexponents(7,4))
  0.000051 seconds (842 allocations: 57.641 KiB)
210-element Vector{Any}:
 [4, 0, 0, 0, 0, 0, 0]
 [3, 1, 0, 0, 0, 0, 0]
 [3, 0, 1, 0, 0, 0, 0]
 [3, 0, 0, 1, 0, 0, 0]
 [3, 0, 0, 0, 1, 0, 0]
 [3, 0, 0, 0, 0, 1, 0]
 [3, 0, 0, 0, 0, 0, 1]
 [2, 2, 0, 0, 0, 0, 0]
 [2, 1, 1, 0, 0, 0, 0]
 [2, 1, 0, 1, 0, 0, 0]
 [2, 1, 0, 0, 1, 0, 0]
 ⋮
 [0, 0, 0, 0, 2, 0, 2]
 [0, 0, 0, 0, 1, 3, 0]
 [0, 0, 0, 0, 1, 2, 1]
 [0, 0, 0, 0, 1, 1, 2]
 [0, 0, 0, 0, 1, 0, 3]
 [0, 0, 0, 0, 0, 4, 0]
 [0, 0, 0, 0, 0, 3, 1]
 [0, 0, 0, 0, 0, 2, 2]
 [0, 0, 0, 0, 0, 1, 3]
 [0, 0, 0, 0, 0, 0, 4]

As such, I would suggest closing this issue.

Stivanification avatar Dec 26 '22 13:12 Stivanification