festival
festival copied to clipboard
`Collection` patching
What
Due to index invalidation, the Collection
cannot be mutated.
Assuming that most Collection
resets only contain a small amount of data change (adding an album or two), it makes more sense and would be much cheaper to re-use the inner structures, append/remove data, and re-calculate the indices rather than creating a new Collection
from scratch.
This will be referred to as "patching".
Steps
The steps would be something like:
- Store PATHs associated with the current
Collection
- Actively watch and record any file changes to those PATHs
- When a new
Collection
is created, diff the PATHs requested with our modified PATH history:
old paths new paths
--------- ---------
a
b b
c c
d *d
e e
In the above example:
-
a
was removed -
b
,c
, ande
are the same -
d
is the same PATH, but something about the underlying file changed
This means we only actually have to scan d
, and should forget about a
in our new Collection
.
- Disassemble the current
Collection
into the individual structures, leaving behind the data that is no longer needed, and keeping the data the user still wants:
|---------------------------| |-----| |------------|
| Old Collection | | Old | | Collection |
|---------------------------| |-----|-|-------------| |-----|
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | | 1 | 2 | | 41 |
| [...] | ---> |-----|-|----| |----| | |
| | | | | 12 | | 33 | | 22 |
| |----------------------| |-----|------| |----| |-----|
| | some irrelevant data | | 42 |
|---------------------------| |-----|
- After this, the new PATHs would be scanned, and we would go through the process of "The Loop", appending the data to the structures we already have if possible:
|----------------|-----------------|
| Old Collection | new data |
|-----|----------|-----| |--|
| 19 | | 1 | 2 | 42 |--| 22 |
|-----|-----| |-----| | |---|
| 66 | | 12 | | | 33 41 |
|-----|-----|--|--| |--|-------|
| 999 | new data | new data |
|-----|--------|----------|-------|
The data would be in a cohesive structure, but the indices would be all over the place and would be pointing at the wrong things.
Problems
The main problems that needs to be solved:
- How will the indices be reconnected?
- How will irrelevant data be pruned? (
.drain()
into newVec
?)
Viable Situations
This will be faster than from-scratch creation as long as the delta between the new PATHs and old PATHs are <50%.
If there are many fewer or many more new PATHs, it would make more sense to just ditch the old and start from scratch.
Since most Collection
resets are likely to only contain small changes, this patching approach would probably be viable most of the time, and would lead to major performance improvements since we would be only need to scan the new PATHs, and shift some numbers around (indices).