v
v copied to clipboard
Datatype `Set` implementation
Just simple implementation of Set.
I think part of the following implementations should be added in the future:
Core set-theoretical operations
- [x]
union(S,T)
: returns the union of setsS
andT
. - [x]
intersection(S,T)
: returns the intersection of setsS
andT
. - [x]
difference(S,T)
: returns the difference of setsS
andT
. - [x]
subset(S,T)
: a predicate that tests whether the setS
is a subset of setT
.
Static sets
- [x]
is_element_of(x,S)
: checks whether the valuex
is in the setS
. (In our case it's calledexists
) - [x]
is_empty(S)
: checks whether the setS
is empty. - [x]
size(S)
: returns the number of elements inS
. - [ ]
iterate(S)
: returns a function that returns one more value ofS
at each call, in some arbitrary order. - [ ]
enumerate(S)
: returns a list containing the elements ofS
in some arbitrary order. - [ ]
build(x1,x2,…,xn,)
: creates a set structure with valuesx1
,x2
,...,xn
. - [ ]
create_from(collection)
: creates a new set structure containing all the elements of the given collection or all the elements returned by the given iterator.
Dynamic sets
- [x]
create()
: creates a new, initially empty set structure. - [ ]
create_with_capacity(n)
: creates a new set structure, initially empty but capable of holding up to n elements. - [x]
add(S,x)
: adds the elementx
toS
, if it is not present already. - [x]
remove(S, x)
: removes the elementx
fromS
, if it is present. - [ ]
capacity(S)
: returns the maximum number of values thatS
can hold.
Additional operations
- [x]
pop(S)
: returns an arbitrary element ofS
, deleting it fromS
. - [x]
pick(S)
: returns an arbitrary element ofS
. Functionally, the mutator pop can be interpreted as the pair of selectors(pick, rest)
, whererest
returns the set consisting of all elements except for the arbitrary element. Can be interpreted in terms ofiterate
. - [ ]
map(F,S)
: returns the set of distinct values resulting from applying functionF
to each element ofS
. - [ ]
filter(P,S)
: returns the subset containing all elements ofS
that satisfy a given predicate P. - [ ]
fold(A0,F,S)
: returns the valueA|S|
after applyingAi+1 := F(Ai, e)
for each elemente
ofS
, for some binary operationF
.F
must be associative and commutative for this to be well-defined. - [x]
clear(S)
: delete all elements ofS
. - [x]
equal(S1', S2')
: checks whether the two given sets are equal (i.e. contain all and only the same elements). - [ ]
hash(S)
: returns a hash value for the static setS
such that ifequal(S1, S2)
thenhash(S1) = hash(S2)
Other operations can be defined for sets with elements of a special type:
- [ ]
sum(S)
: returns the sum of all elements ofS
for some definition of "sum". For example, over integers or reals, it may be defined asfold(0, add, S)
. - [ ]
collapse(S)
: given a set of sets, return the union. For example,collapse({{1}, {2, 3}}) == {1, 2, 3}
. May be considered a kind ofsum
. - [ ]
flatten(S)
: given a set consisting of sets and atomic elements (elements that are not sets), returns a set whose elements are the atomic elements of the original top-level set or elements of the sets it contains. In other words, remove a level of nesting – like collapse, but allow atoms. This can be done a single time, or recursively flattening to obtain a set of only atomic elements.For example,flatten({1, {2, 3}}) == {1, 2, 3}
. - [ ]
nearest(S,x)
: returns the element ofS
that is closest in value tox
(by some metric). - [ ]
min(S)
,max(S)
: returns the minimum/maximum element ofS
.
Why not just use a map
to make a Set
? What is/are the use(s) of using a separate type?
Why not just use a
map
to make aSet
? What is/are the use(s) of using a separate type?
I don't know for sure how map
is implemented in V
, but, in a number of other languages, map
is less productive than array
.
Also, a number of additional functions will make this type more convenient to use in specific cases.
In addition, I saw that in the README.md
package datatypes
, there is a type Set
, which someone considered necessary to implement. I took it as welcome to contribution :-)
Okay 👌 Your job looks really good
Why not just use a
map
to make aSet
? What is/are the use(s) of using a separate type?I don't know for sure how
map
is implemented inV
, but, in a number of other languages,map
is less productive thanarray
.Also, a number of additional functions will make this type more convenient to use in specific cases.
In addition, I saw that in the
README.md
packagedatatypes
, there is a typeSet
, which someone considered necessary to implement. I took it as welcome to contribution :-)
Have you tried to benchmark your solution vs map solution?
I was going to suggest the same as the rest did. Using a map solution will be better in terms of performance! Just for reference, here there is a Set implementation I did in the past in PHP 😅 https://github.com/MBHFramework/structures/blob/385e1e3d49d8b953fa2354a6b522a3cf6de4c24f/Mbh/Collection/Set.php#L26
just for reference, using map
, the set should be something like this:
pub struct Set<T> {
mut:
table map[T]int // or map[T]bool or any type in the value
}
then Set
will be just a wrapper of map
when the keys are the values of the Set and most of the operations will be really efficient
I was going to suggest the same as the rest did. Using a map solution will be better in terms of performance! Just for reference, here there is a Set implementation I did in the past in PHP sweat_smile https://github.com/MBHFramework/structures/blob/385e1e3d49d8b953fa2354a6b522a3cf6de4c24f/Mbh/Collection/Set.php#L26
just for reference, using
map
, the set should be something like this:pub struct Set<T> { mut: table map[T]int // or map[T]bool or any type in the value }
then
Set
will be just a wrapper ofmap
when the keys are the values of the Set and most of the operations will be really efficient
Maybe voidptr
better than int
?
Maybe
voidptr
better thanint
?
voidptr can be 64bit, int is 32 vbit in V, you can save half the memory for each element. otherwise you can use a simply u8 to reduce the memory footprint.
voidptr can be 64bit, int is 32 vbit in V, you can save half the memory for each element. otherwise you can use a simply u8 to reduce the memory footprint.
Perhaps it makes sense to take not only the type of value, but also the type of key?
pub struct Set<K, V> {
// maximum number of values that Set<K, V> can hold.
capacity int
mut:
elements map[V]K
}
Perhaps it makes sense to take not only the type of value, but also the type of key?
If this is a pure set()
I don't think that the type of the value of the map
has a meaning.
You can simply choice the smallest one like u8
.
@vladimirmyshkovski I think there is no need of having a capacity for the Set
@ulises-jeremias
What would be a better name for the method of adding elements from an array to a set? For example, in Python
it is called update
. I'm thinking about the name of the method that returns a copy of the set: copy
or clone
?
Also, union
is a reserved name in V
therefore, the method must be called something else. Right now I'm calling it union_
.
Also,
union
is a reserved name inV
therefore, the method must be called something else. Right now I'm calling itunion_
.
The existing convention is to use fn ... @union(...) ... {
for names that are also keywords, for example:
vlib/sync/sync_default.c.v:93:pub fn (mut m Mutex) @lock() {
vlib/net/common.v:31:fn @select(handle int, test Select, timeout time.Duration) ?bool {
vlib/net/unix/stream_nix.v:55:fn (mut s StreamSocket) @select(test Select, timeout time.Duration) ?bool {
All tests pass now.
Merged, thanks!