Order.separateArtists does not preserve the track ordering
I have been using Order.separateArtists in a few programs and did not notice this issue until now, because I had not examined the results. What I have found is that the function has unexpected effects on the track ordering. I was expecting the first instance of a track by an artist to remain at the same place (or higher if other tracks were moved previously). This is not the case at all. In fact, it seems the processing starts from the end of the tracklist, and has the side effect of causing the first instances to be removed if separation cannot be achieved (as expected). Maybe it is even more unexpected than this, with what seems like randomness thrown in (or maybe some processing goes one way, while other processing goes the other way).
For example: I am trying to produce a tracklist keeping only the most popular tracks, with only the most popular track of each artist. This is what I get without artist separation:
// popularity reverse sort (most popular first)
// keep first 10
77: Aventura : Obsesion
76: Aventura : Dile al Amor
78: Romeo Santos : Eres Mía
76: Romeo Santos : Sobredosis (feat. Ozuna)
75: Romeo Santos : Odio (feat. Drake)
75: Romeo Santos : Necio (feat. Carlos Santana)
74: Romeo Santos : El Pañuelo
74: Aventura : El Perdedor
74: Natti Natasha : La Mejor Versión de Mi - Remix
73: Prince Royce : Darte un Beso
(first column is popularity, second artist name, third track name) As you can see, the source has many artist duplicates, which I need to remove. The easiest way to do it should be to apply artist separation, preserving the order, with a separation space of the desired size. This is what it produces:
// popularity reverse sort (most popular first)
// artist separation
// keep first 10
0: Kiko Rodriguez : Ya Te Olvide
0: Daniel Santacruz : Se Busca un Corazon
0: Juan Manuel : Si Me Dejas No Vale
0: Kewin Cosmos : Solo Quise Quererte
0: Allendi : Te Mal Informaron (HD Digital Remastered)
0: Manny Jhovanny : Lagrima de Amor
0: Juan Luis Guerra y 440 : Burbujas de Amor
0: Jhonny Evidence : Hoy Te Vas
0: Prince Royce : Culpa al Corazón
0: Luis Vargas : Loco De Amor
Those are actually the least popular tracks!
Reversing the processing improves results but is not quite right:
// popularity sort (least popular first)
// artist separation
// keep last 10
// popularity reverse sort (most popular first)
75: Romeo Santos : Odio (feat. Drake)
74: Natti Natasha : La Mejor Versión de Mi - Remix
66: Ozuna : Señor Juez
62: Aventura : Ciego De Amor - Featuring Anthony Santos
53: C. Tangana : Bobo
51: Christian Daniel : Ahora Que Te Vas (Versión Bachata)
49: Johnny Sky : En Todo Fuiste la Mejor
47: Chavi Leons : Otra Vez
43: DJ Khalid : Perfecta - Bachata Version
11: Joan Soriano : La Mamandela
Compared to the first result, you can see that the Romeo Santos and Aventura tracks are not the most popular. If I request more than 10 results, I can also see other tracks from other artists have disappeared altogether.
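To make the expectation concrete, here is a minimal sketch of the ordering described above: a forward-only pass where the first track of each artist keeps its place and later tracks by the same artist are pushed back until the gap is large enough. The helper name `separateArtistsForward` and the sample data are hypothetical; this is not how the library implements separateArtists, and it assumes `space` means the number of other tracks required between two tracks by the same first artist.

```javascript
// Hypothetical helper, NOT part of the library: forward-only, order-preserving separation.
// Assumption: `space` = number of other tracks required between two tracks
// that share the same first artist.
function separateArtistsForward(tracks, space) {
  const result = [];
  const deferred = []; // tracks waiting for enough distance, kept in original order
  const artistOf = (track) => track.artists[0].id;

  // A track can be placed if its artist is absent from the last `space` placed tracks.
  const canPlace = (track) => {
    const start = Math.max(0, result.length - space);
    for (let i = start; i < result.length; i++) {
      if (artistOf(result[i]) === artistOf(track)) return false;
    }
    return true;
  };

  // Retry deferred tracks, earliest first, until none of them fits.
  const placeDeferred = () => {
    let placed = true;
    while (placed) {
      placed = false;
      for (let i = 0; i < deferred.length; i++) {
        if (canPlace(deferred[i])) {
          result.push(deferred.splice(i, 1)[0]);
          placed = true;
          break;
        }
      }
    }
  };

  for (const track of tracks) {
    if (canPlace(track)) {
      result.push(track);
    } else {
      deferred.push(track);
    }
    placeDeferred();
  }
  // Tracks that never fit are returned separately instead of being silently dropped.
  return { separated: result, leftover: deferred };
}

// Made-up sample, already sorted by decreasing popularity.
const sample = [
  { name: 'a1', artists: [{ id: 'A' }] },
  { name: 'a2', artists: [{ id: 'A' }] },
  { name: 'b1', artists: [{ id: 'B' }] },
  { name: 'a3', artists: [{ id: 'A' }] },
  { name: 'c1', artists: [{ id: 'C' }] },
];
const { separated, leftover } = separateArtistsForward(sample, 2);
console.log(separated.map((t) => t.name).join(', ')); // a1, b1, c1, a2
console.log(leftover.map((t) => t.name).join(', '));  // a3
```

With this ordering, the most popular track of each artist stays ahead of that artist's other tracks, and only the later instances are left over.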
I think the logic in the separateArtists function is incorrect. I wrote a similar function in Python a while ago, and I think it was doing it in the correct order. I will try to dig it up and see if I can help fix the code.
For reference, here is the code I used to produce those results:
function zipdjClassicTracks() {
  let sourcePlaylist = '7eufyawBpqQMvsIN4fgQB1'; // Bachata
  // let sourcePlaylist = '7htbYn8f0q01n5br4XhQR9'; // Cumbia
  // let sourcePlaylist = '3Bsz7QecNExg8bsDdW4iGW'; // Merengue
  // let sourcePlaylist = 'playlist:0o2GXyncPfGLmGSiOqSINx'; // Salsa
  let numTracks = 20;
  let archivePlaylist = '1BQCyWrAkSLiDsra9TZgn8'; // [zipdj] all songs
  let targetPlaylist = '1uLOtPdWnESn9zbgzzZDlh'; // [zipdj] workbench
  let targetName = '[zipdj] workbench';

  // Load tracks already published on zipdj
  let publishedTracks = Source.getPlaylistTracks('', archivePlaylist);

  // Load tracks from source playlist
  let tracks = Source.getPlaylistTracks('', sourcePlaylist);

  // Remove published tracks from source playlist
  Filter.removeTracks(tracks, publishedTracks);

  // Sort tracks by decreasing popularity
  tracks.sort((a, b) => b.popularity - a.popularity);

  // Separate artists by the number of tracks to publish, to ensure artist uniqueness
  Order.separateArtists(tracks, numTracks, false);

  // Keep only the number of tracks desired
  Selector.keepFirst(tracks, numTracks);

  // DEBUG LOG
  for (let i = 0; i < tracks.length; i++) {
    Logger.log(tracks[i].popularity + ': ' + tracks[i].artists[0].name + ' : ' + tracks[i].name);
  }

  // Save tracks to workbench playlist, appending to the existing ones
  Playlist.saveWithAppend({
    name: targetName,
    id: targetPlaylist,
    tracks: tracks,
    public: false
  });
}
You are probably using the wrong function for your purpose. To keep only one track per artist, you need to use dedupArtists.
// items: an array of artists or tracks
Order.sort(items, 'artist.popularity', 'desc')
Filter.dedupArtists(items)
The purpose of separateArtists is to reorder the array so that items by the same artist are spaced apart.
You are right: if I only wanted to keep 1 track per artist, Filter.dedupArtists is the way to go. I had actually forgotten about that function, and achieved the same result by using
Order.separateArtists(tracks, tracks.length, false);
This does not change the fact that the resulting ordering and selection are unexpected with separateArtists. Ideally though, I would have wanted to allow up to the 2 top tracks from each artist, which I was expecting to work with:
Order.separateArtists(tracks, Math.round(numTracks/2), false);
That is how I discovered the problem.
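For reference, the goal stated here (at most two tracks per artist, keeping the popularity order) can be sketched in plain JavaScript without separateArtists. The helper name `keepTopPerArtist` and the sample data are hypothetical, and the sketch assumes the input is already sorted by decreasing popularity and that only the first artist of each track matters.

```javascript
// Hypothetical helper, NOT part of the library.
// Keeps at most `count` tracks per first artist and preserves the incoming order.
function keepTopPerArtist(tracks, count) {
  const seenPerArtist = new Map(); // artist id -> number of tracks seen so far
  return tracks.filter((track) => {
    const key = track.artists[0].id;
    const seen = (seenPerArtist.get(key) || 0) + 1;
    seenPerArtist.set(key, seen);
    return seen <= count;
  });
}

// Made-up sample, already sorted by decreasing popularity.
const sortedTracks = [
  { name: 'a1', artists: [{ id: 'A' }] },
  { name: 'a2', artists: [{ id: 'A' }] },
  { name: 'b1', artists: [{ id: 'B' }] },
  { name: 'a3', artists: [{ id: 'A' }] },
  { name: 'c1', artists: [{ id: 'C' }] },
];
const topTwo = keepTopPerArtist(sortedTracks, 2);
console.log(topTwo.map((t) => t.name).join(', ')); // a1, a2, b1, c1
```

Unlike separation, this does not space the artists apart; it only caps how many tracks each artist contributes.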
separateArtists can re-insert tracks in both the forward and backward directions.
For example, look at the array and the resulting log below.
let array = [
{ id: 'lion', followers: 10 },
{ id: 'parrot', followers: 15 },
{ id: 'cat', followers: 99 },
{ id: 'dog', followers: 95 },
{ id: 'dog', followers: 45 },
{ id: 'monkey', followers: 20 },
]
Order.separateArtists(array, 2, false);
console.log(
array
.map(a => `${a.followers} ${a.id}`)
.join('\n')
)
// space = 2
45 dog
10 lion
15 parrot
99 cat
95 dog
20 monkey
With space = 2, the 45 dog is re-inserted at the first position because:
- after 95 dog there is only one item
- but when 45 dog is in first place, the gap until 95 dog is more than 2 items, which satisfies the condition
That's why the Romeo Santos and Aventura tracks you see at the beginning aren't the most popular ones. When an item cannot be inserted anywhere, it is deleted. So I still think that separateArtists is the wrong function for your purpose.
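The spacing condition in the example above can be checked with a small standalone sketch. The helper name `minGapBetweenSameIds` is hypothetical, and it assumes that space counts the number of items sitting between two entries with the same id, which is how the explanation above reads.

```javascript
// Hypothetical helper, NOT part of the library.
// Returns the smallest number of items found between two entries sharing the same id,
// or Infinity when no id repeats.
function minGapBetweenSameIds(items) {
  const lastSeen = new Map(); // id -> index of its most recent occurrence
  let minGap = Infinity;
  items.forEach((item, index) => {
    if (lastSeen.has(item.id)) {
      minGap = Math.min(minGap, index - lastSeen.get(item.id) - 1);
    }
    lastSeen.set(item.id, index);
  });
  return minGap;
}

// Order before and after separateArtists(array, 2, false), copied from the example above.
const beforeIds = ['lion', 'parrot', 'cat', 'dog', 'dog', 'monkey'].map((id) => ({ id }));
const afterIds = ['dog', 'lion', 'parrot', 'cat', 'dog', 'monkey'].map((id) => ({ id }));

console.log(minGapBetweenSameIds(beforeIds)); // 0, the two dogs are adjacent
console.log(minGapBetweenSameIds(afterIds));  // 3, which satisfies space = 2
```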
You need to group by artist in order to keep more than one top track per artist. Something like the function below. Note that it groups by the first artist of each track.
function keepTrackCountByArtists(tracks, count) {
let groups = tracks.reduce((artists, track) => {
let key = track.artists[0].id
artists[key] = artists[key] || []
artists[key].push(track)
return artists
}, {})
Combiner.replace(
tracks,
Object
.values(groups)
.map(group => Selector.sliceFirst(group, count))
.flat(1)
)
}
Interesting approach... I will try it, thank you.
Here is how I did it in Python. I will try to port it to JS and see how it behaves. I wrote it quite a while ago, so I am not sure how it behaves; it might very well have the same problem. But if I remember correctly, I did it in such a way that it does not remove tracks if separation cannot be achieved.
import random


def score_artist_separation(tracks, separation=4):
    """
    This function is used internally by separate_artists.

    Calculate the artist separation score, and
    produce a list of indexes for the tracks that
    need separation.

    Args:
        tracks: list of spotify track objects
        separation: desired artist separation (optional, default=4)

    Returns:
        score: separation score (0 = fully separated)
        separation_index: index of tracks that need separation (list)
    """
    score = 0
    indexes = set()

    # iterate through all the tracks
    for x in range(len(tracks) - 1):
        # get the main artist id for the current track
        artist_id_x = tracks[x]['track']['artists'][0]['id']

        # iterate through all the subsequent tracks
        for y in range(x + 1, len(tracks)):
            # once we reach the desired separation distance, exit
            if y - x >= separation:
                break

            # get the main artist id for the compared track
            artist_id_y = tracks[y]['track']['artists'][0]['id']

            # identical artists found, closer than the desired separation
            if artist_id_x == artist_id_y:
                # add the inverted distance to the score
                score += separation - (y - x)
                # add both tracks to the index of tracks that need separation
                indexes.add(x)
                indexes.add(y)

    # transform the set of indexes to a list, and sort it
    separation_index = list(indexes)
    separation_index.sort()

    return score, separation_index


def separate_artists(tracks, separation=4):
    """
    Separate tracks according to the main artist.

    The order of the tracks returned is random,
    but due to the way the separation score is
    calculated and processed, multiple runs will
    produce similar results.
    For true random mode, you need to shuffle the
    tracks before calling this function.

    Args:
        tracks: list of spotify track objects
        separation: desired artist separation (optional, default=4)

    Returns:
        list of artist-separated tracks
    """
    max_tries = 1000
    max_swaps = 100
    swaps = 0

    # initialise the separation score for the list of tracks, and the index of tracks that need separation
    cur_score, indexes = score_artist_separation(tracks, separation)

    # try to swap tracks up to a maximum limit of attempts
    for i in range(max_tries):
        # if we already have total desired separation, or we have no more tracks to attempt separation on, stop
        if cur_score == 0 or len(indexes) == 0:
            break

        # pick one random index (track that needs separation), and one random track
        swap_1 = random.choice(indexes)
        swap_2 = random.randint(0, len(tracks) - 1)

        # swap tracks
        tmp = tracks[swap_1]
        tracks[swap_1] = tracks[swap_2]
        tracks[swap_2] = tmp

        # calculate the new artist separation score
        new_score, new_indexes = score_artist_separation(tracks, separation)

        # if the new score is not better (equal or higher) than the current one, revert the swap
        if new_score >= cur_score:
            # swap tracks back
            tmp = tracks[swap_1]
            tracks[swap_1] = tracks[swap_2]
            tracks[swap_2] = tmp

            # increment the swap attempt counter, and terminate if we have exceeded the max number of swaps
            swaps += 1
            if swaps > max_swaps:
                break
        else:
            # the new score is better (lower)
            # reset the swap attempt counter
            swaps = 0
            # update the current score
            cur_score = new_score
            # update the list of indexes for tracks that need separation
            indexes = new_indexes
            # if we reach a score of 0, total desired separation is achieved, and we can stop
            if cur_score == 0:
                break

    print("Artist separation - Desired:", separation, " - Score:", cur_score, " - Remaining:", str(len(indexes)))
    return tracks
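As a starting point for the JS port mentioned above, here is a possible translation of the scoring function only. It is a sketch under assumptions: the name `scoreArtistSeparation` is hypothetical, and tracks are assumed to expose `artists[0].id` directly (as in the JS snippets earlier in this thread) rather than through the `['track']` wrapper used in the Python version.

```javascript
// Hypothetical port of score_artist_separation, NOT part of the library.
// Assumption: each track exposes its main artist as track.artists[0].id.
function scoreArtistSeparation(tracks, separation = 4) {
  let score = 0;
  const indexes = new Set();

  for (let x = 0; x < tracks.length - 1; x++) {
    const artistX = tracks[x].artists[0].id;

    // Only look ahead while the distance is below the desired separation.
    for (let y = x + 1; y < tracks.length && y - x < separation; y++) {
      if (tracks[y].artists[0].id === artistX) {
        // Closer pairs are penalised more heavily.
        score += separation - (y - x);
        indexes.add(x);
        indexes.add(y);
      }
    }
  }

  // Sorted list of the track indexes that still need separation.
  return { score, indexes: [...indexes].sort((a, b) => a - b) };
}

// Made-up sample: artists A, A, B, A.
const scoreSample = ['A', 'A', 'B', 'A'].map((id) => ({ artists: [{ id }] }));
console.log(scoreArtistSeparation(scoreSample, 3)); // { score: 3, indexes: [ 0, 1, 3 ] }
```

A score of 0 means every pair of tracks by the same main artist is at least `separation` positions apart, so the same function can also be used to check the output of any other separation routine.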