jukebox
How do we edit metas for genre/artist fusions
To make genre and artist fusions, such as 50% pop 50% jazz, or 50% Sinatra 50% Fitzgerald, how should we edit metas in sample.py? Can you show us some examples?
Yeah, we didn't support this feature, because it is a little messy. The style embedding is given by y_cond here. To interpolate them, you could add these lines and pass appropriate values of another_y and alpha when get_cond is called:
+def get_cond(self, z_conds, y, another_y=None, alpha=0.5):
     ...
     y_cond, y_pos = self.y_emb(y) if self.y_cond else (None, None)
+    if self.y_cond and another_y is not None:
+        assert y.shape[0] == another_y.shape[0], "Label batch size is different."
+        n_labels = another_y.shape[1] - self.n_tokens
+        another_y = another_y[:, :n_labels]
+        another_y_cond, _ = self.y_emb(another_y)
+        y_cond = y_cond * alpha + another_y_cond * (1.0 - alpha)
     x_cond = self.x_emb(z_conds) if self.x_cond else y_pos
     return x_cond, y_cond, prime
To construct another_y:
- Choose other_metas like this (it doesn't matter what lyrics you choose here, because they are not used).
- other_metas -> other_labels
- other_labels -> another_y (start is used to approximately locate the lyric window, and can be set to a dummy value like 0 for the purpose of mixing styles)
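To make the role of alpha concrete, here is a minimal sketch of the same weighting on plain Python lists. The function name blend and the toy vectors are made up for illustration; in the patch the operands would be the y_cond and another_y_cond tensors.

```python
def blend(y_cond, another_y_cond, alpha=0.5):
    """Linear mix: alpha weights y_cond, (1 - alpha) weights another_y_cond."""
    if len(y_cond) != len(another_y_cond):
        raise ValueError("embeddings must have the same length")
    return [a * alpha + b * (1.0 - alpha) for a, b in zip(y_cond, another_y_cond)]


# Made-up stand-ins for two style embeddings (not real Jukebox values).
pop = [1.0, 0.0, 2.0]
jazz = [0.0, 4.0, 2.0]

print(blend(pop, jazz, alpha=0.5))   # equal mix of both styles
print(blend(pop, jazz, alpha=1.0))   # only the first style
print(blend(pop, jazz, alpha=0.75))  # mostly the first style
```

So alpha=0.5 gives a 50/50 fusion, and values closer to 1.0 lean toward the style passed as y.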
Thanks! Would spherical interpolation be a better choice?
Yeah, it's possible. You're welcome to try other variations
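For anyone who wants to try that variation, here is a rough sketch of spherical interpolation (slerp) on plain Python lists. This is not something Jukebox ships; the function name slerp, the eps threshold, and the fallback to a linear mix are all assumptions, and the same formula would have to be ported to tensors before it could replace the linear line in get_cond.

```python
import math


def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation from a (t=0) to b (t=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle is undefined for a zero vector, so use a linear mix.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_omega = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel or anti-parallel vectors: use a linear mix.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

To match the convention in the patch, where alpha weights y_cond, you would call it as slerp(another_y_cond, y_cond, alpha). Whether it sounds better than the linear mix is something to find out by listening.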
Hi, I'm trying to implement this and got as far as
another_y_cond, _ = self.y_emb(another_y)
when I got the following error. I can't really diagnose what's happening here, @heewooj. Any thoughts? Thanks in advance!
/usr/local/lib/python3.7/dist-packages/jukebox/prior/conditioners.py in forward(self, pos_start, pos_end)
     89         # Check if [pos_start,pos_end] in [pos_min, pos_max)
     90         assert len(pos_start.shape) == 2, f"Expected shape with 2 dims, got {pos_start.shape}"
---> 91         assert (self.pos_min <= pos_start).all() and (pos_start < self.pos_max).all(), f"Range is [{self.pos_min},{self.pos_max}), got {pos_start}"
     92         pos_start = pos_start.float()
     93         if pos_end is not None:
AssertionError: Range is [1049580.0,26460000.0), got tensor([[1048576.], [1048576.], [1048576.]], device='cuda:0')
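OK, I think I have a solution. Calling y_emb() runs the forward() function of the underlying LabelConditioner class, which also computes the positional embedding, and that is where the range assertion fails. We don't need the positional embedding for another_y, so I switch it off around the call and restore the flag afterwards: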
y_emb_time_signal = self.y_emb.include_time_signal
self.y_emb.include_time_signal = False
another_y_cond, _ = self.y_emb(another_y)
self.y_emb.include_time_signal = y_emb_time_signal
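If an exception is raised between those lines, the flag stays switched off. A small context manager is one way to make sure it is restored; this is only a sketch, and both temporarily_set and DummyConditioner are hypothetical names (the dummy class stands in for self.y_emb so that the snippet runs on its own).

```python
from contextlib import contextmanager


@contextmanager
def temporarily_set(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the old value."""
    old = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, so the flag is always restored.
        setattr(obj, name, old)


class DummyConditioner:
    """Hypothetical stand-in for self.y_emb; only the flag matters here."""

    def __init__(self):
        self.include_time_signal = True


y_emb = DummyConditioner()
with temporarily_set(y_emb, "include_time_signal", False):
    inside = y_emb.include_time_signal  # False while the block runs
after = y_emb.include_time_signal       # back to True afterwards
print(inside, after)
```

In get_cond, the call another_y_cond, _ = self.y_emb(another_y) would go inside the with-block, with self.y_emb passed in place of the dummy object.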