v
v copied to clipboard
Datatype `Set` implementation
Just simple implementation of Set.
I think part of the following implementations should be added in the future:
Core set-theoretical operations
- [x]
union(S,T): returns the union of setsSandT. - [x]
intersection(S,T): returns the intersection of setsSandT. - [x]
difference(S,T): returns the difference of setsSandT. - [x]
subset(S,T): a predicate that tests whether the setSis a subset of setT.
Static sets
- [x]
is_element_of(x,S): checks whether the valuexis in the setS. (In our case it's calledexists) - [x]
is_empty(S): checks whether the setSis empty. - [x]
size(S): returns the number of elements inS. - [ ]
iterate(S): returns a function that returns one more value ofSat each call, in some arbitrary order. - [ ]
enumerate(S): returns a list containing the elements ofSin some arbitrary order. - [ ]
build(x1,x2,…,xn,): creates a set structure with valuesx1,x2,...,xn. - [ ]
create_from(collection): creates a new set structure containing all the elements of the given collection or all the elements returned by the given iterator.
Dynamic sets
- [x]
create(): creates a new, initially empty set structure. - [ ]
create_with_capacity(n): creates a new set structure, initially empty but capable of holding up to n elements. - [x]
add(S,x): adds the elementxtoS, if it is not present already. - [x]
remove(S, x): removes the elementxfromS, if it is present. - [ ]
capacity(S): returns the maximum number of values thatScan hold.
Additional operations
- [x]
pop(S): returns an arbitrary element ofS, deleting it fromS. - [x]
pick(S): returns an arbitrary element ofS. Functionally, the mutator pop can be interpreted as the pair of selectors(pick, rest), whererestreturns the set consisting of all elements except for the arbitrary element. Can be interpreted in terms ofiterate. - [ ]
map(F,S): returns the set of distinct values resulting from applying functionFto each element ofS. - [ ]
filter(P,S): returns the subset containing all elements ofSthat satisfy a given predicate P. - [ ]
fold(A0,F,S): returns the valueA|S|after applyingAi+1 := F(Ai, e)for each elementeofS, for some binary operationF.Fmust be associative and commutative for this to be well-defined. - [x]
clear(S): delete all elements ofS. - [x]
equal(S1', S2'): checks whether the two given sets are equal (i.e. contain all and only the same elements). - [ ]
hash(S): returns a hash value for the static setSsuch that ifequal(S1, S2)thenhash(S1) = hash(S2)
Other operations can be defined for sets with elements of a special type:
- [ ]
sum(S): returns the sum of all elements ofSfor some definition of "sum". For example, over integers or reals, it may be defined asfold(0, add, S). - [ ]
collapse(S): given a set of sets, return the union. For example,collapse({{1}, {2, 3}}) == {1, 2, 3}. May be considered a kind ofsum. - [ ]
flatten(S): given a set consisting of sets and atomic elements (elements that are not sets), returns a set whose elements are the atomic elements of the original top-level set or elements of the sets it contains. In other words, remove a level of nesting – like collapse, but allow atoms. This can be done a single time, or recursively flattening to obtain a set of only atomic elements.For example,flatten({1, {2, 3}}) == {1, 2, 3}. - [ ]
nearest(S,x): returns the element ofSthat is closest in value tox(by some metric). - [ ]
min(S),max(S): returns the minimum/maximum element ofS.
Why not just use a map to make a Set? What is/are the use(s) of using a separate type?
Why not just use a
mapto make aSet? What is/are the use(s) of using a separate type?
I don't know for sure how map is implemented in V, but, in a number of other languages, map is less productive than array.
Also, a number of additional functions will make this type more convenient to use in specific cases.
In addition, I saw that in the README.md package datatypes, there is a type Set, which someone considered necessary to implement. I took it as welcome to contribution :-)
Okay 👌 Your job looks really good
Why not just use a
mapto make aSet? What is/are the use(s) of using a separate type?I don't know for sure how
mapis implemented inV, but, in a number of other languages,mapis less productive thanarray.Also, a number of additional functions will make this type more convenient to use in specific cases.
In addition, I saw that in the
README.mdpackagedatatypes, there is a typeSet, which someone considered necessary to implement. I took it as welcome to contribution :-)
Have you tried to benchmark your solution vs map solution?
I was going to suggest the same as the rest did. Using a map solution will be better in terms of performance! Just for reference, here there is a Set implementation I did in the past in PHP 😅 https://github.com/MBHFramework/structures/blob/385e1e3d49d8b953fa2354a6b522a3cf6de4c24f/Mbh/Collection/Set.php#L26
just for reference, using map, the set should be something like this:
pub struct Set<T> {
mut:
table map[T]int // or map[T]bool or any type in the value
}
then Set will be just a wrapper of map when the keys are the values of the Set and most of the operations will be really efficient
I was going to suggest the same as the rest did. Using a map solution will be better in terms of performance! Just for reference, here there is a Set implementation I did in the past in PHP sweat_smile https://github.com/MBHFramework/structures/blob/385e1e3d49d8b953fa2354a6b522a3cf6de4c24f/Mbh/Collection/Set.php#L26
just for reference, using
map, the set should be something like this:pub struct Set<T> { mut: table map[T]int // or map[T]bool or any type in the value }then
Setwill be just a wrapper ofmapwhen the keys are the values of the Set and most of the operations will be really efficient
Maybe voidptr better than int?
Maybe
voidptrbetter thanint?
voidptr can be 64bit, int is 32 vbit in V, you can save half the memory for each element. otherwise you can use a simply u8 to reduce the memory footprint.
voidptr can be 64bit, int is 32 vbit in V, you can save half the memory for each element. otherwise you can use a simply u8 to reduce the memory footprint.
Perhaps it makes sense to take not only the type of value, but also the type of key?
pub struct Set<K, V> {
// maximum number of values that Set<K, V> can hold.
capacity int
mut:
elements map[V]K
}
Perhaps it makes sense to take not only the type of value, but also the type of key?
If this is a pure set() I don't think that the type of the value of the map has a meaning.
You can simply choice the smallest one like u8.
@vladimirmyshkovski I think there is no need of having a capacity for the Set
@ulises-jeremias
What would be a better name for the method of adding elements from an array to a set? For example, in Python it is called update. I'm thinking about the name of the method that returns a copy of the set: copy or clone?
Also, union is a reserved name in V therefore, the method must be called something else. Right now I'm calling it union_.
Also,
unionis a reserved name inVtherefore, the method must be called something else. Right now I'm calling itunion_.
The existing convention is to use fn ... @union(...) ... { for names that are also keywords, for example:
vlib/sync/sync_default.c.v:93:pub fn (mut m Mutex) @lock() {
vlib/net/common.v:31:fn @select(handle int, test Select, timeout time.Duration) ?bool {
vlib/net/unix/stream_nix.v:55:fn (mut s StreamSocket) @select(test Select, timeout time.Duration) ?bool {
All tests pass now.
Merged, thanks!