
Audio Features customized threshold

[Open] Foons opened this issue 2 years ago · 8 comments

Hello everyone. I've been following Marekkon5's work for less than a year, and it seems like a fantastic project.

First of all, thanks to Marekkon5 and everyone who helped build this project.

I would like to contribute an idea that could help music lovers like me segment and classify music in a more dynamic way, building on what Spotify's audio features provide.

There is a lot of information online about how to classify the valence and energy of a song and the different models that can be used.

The most frequently mentioned of these models is Russell's Circumplex Model of Affect.

Combining this model with a song's valence and energy (arousal), it is possible to determine, with better than 66% accuracy, where to place the song in order to classify it by mood.

In my case, to study my music library I unfortunately have to do this manually, since OneTagger does not segment the audio feature descriptions finely enough.

I was wondering if there is a way to segment the audio features further, to classify them in what I believe would be a more effective way.

That is, the thresholds could be made customizable and more finely segmented, so each user can adapt them to their own needs, if the idea seems good to you.

For example, for Energy, instead of 0-20 low energy, 21-79 medium energy, 80-100 high energy, it could be:

0-20 lowest, 21-40 lower, 41-60 mid, 61-80 higher, and 81-100 highest.

For Valence, instead of 0-15 negative, 16-84 balanced, 85-100 positive, it could be:

0-20 very sad, 21-40 sad, 41-60 neutral, 61-80 joyful, and 81-100 very joyful.
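To make the binning concrete, here is a minimal sketch of what I mean (Python, just for illustration; the label names and the fixed 20-point steps are only my proposal, not anything OneTagger does today):

```python
# Map a 0-100 audio-feature value to one of five labels in 20-point steps.
# These label sets are the ones proposed above and are only examples.
ENERGY_LABELS = ["lowest", "lower", "mid", "higher", "highest"]
VALENCE_LABELS = ["very sad", "sad", "neutral", "joyful", "very joyful"]

def bin_label(value: int, labels: list[str]) -> str:
    """Return the label for a 0-100 value, using five equal 20-point bins."""
    index = min(value // 20, len(labels) - 1)  # keep 100 in the top bin
    return labels[index]

print(bin_label(73, ENERGY_LABELS))   # -> "higher"
print(bin_label(12, VALENCE_LABELS))  # -> "very sad"
```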

It seems that the Acousticness audio feature is also relevant for classifying with this model.

In my music library I also have danceability classified into these ranges:

0-20, 21-40, 41-60, 61-80, 81-100

As you can see, I segment all the audio features in 20% steps, since that fits the model described above.

Finally, depending on the quadrant, I use rekordbox to assign a color to each song, so I can quickly pick songs that resemble the one currently playing because they are classified in the same quadrant.

I attach some documentation in case it is of interest to you or the community.

Thank you very much.

https://scholar.smu.edu/datasciencereview/vol5/iss3/3/

[Image: Figure 3 - Discretization of the interactions and intensities of stimuli and emotions (Paltoglou)]

Attachments: 1506.05012.pdf, 20130377_finalpaper.pdf, Alternative Metric Sources for Spotify® Algorithm.pdf, KDIR2016a.pdf, Panda et al. - 2021 - How Does the Spotify API Compare to the Music Emotion Recognition State-of-the-Art.pdf

Foons · Jun 29 '22 20:06

Basically you want 5 degrees instead of 3, to specify things in more detail, right? The idea is nice, but I'm not totally convinced we can hear this difference in Spotify's values.

Can you put together (based on Spotify values) a list of tracks where you are convinced these 5 degrees are indeed represented, per audio feature? (You can use https://songdata.io or the tags written by OneTagger.)

Also, I like the simplicity of the naming scheme as it is now. Example: energy-low, energy-med, energy-high is simple and effective. energy-lowest, energy-lower, energy-med, energy-higher, energy-highest seems not as easy to remember. very-acoustic, acoustic, some-acoustic, bit-acoustic, electronic? Doesn't do it for me.

But I'm open to seeing what you can bring to the table. Thanks!

bascurtiz · Jun 29 '22 22:06

Hello again. I'll try to explain the idea properly, so you can see that it could be an improvement to the project for all users. As you ask, I will add several examples of segmentation by valence and energy.

Sorry for my English; I'm from Spain and I may make mistakes.

As you can see in the attached image, the graph is divided horizontally by valence, from negative to positive (in Spotify it goes from 0 to 1). The vertical axis, from bottom to top, represents the energy of the song, which Spotify also expresses from 0 to 1. Each quadrant corresponds to a state of mind, so songs can be differentiated depending on where their Spotify valence and energy values fall. I attach an image that shows this in a simpler way, so the idea is clear.

[Images: take1, take7]

Therefore, according to the table above, we can determine from Spotify's data, with a medium-to-high success rate, that the following songs belong to quadrants Q1, Q2, Q3 and Q4 according to their valence and energy. Bear in mind that neither Spotify's algorithm nor Russell's model was conceived to be combined with the other, but together they can help us. Something I have noticed is that the music genre is also relevant: house is more cheerful than techno and can shift a song up or down, left or right, as can the major/minor mode and the key of the song itself. This is something I am still studying.

Later I will explain the colors; they are the ones I use in rekordbox, together with smart lists and the 20% segmentation I mentioned. You will soon see why:

Q1:

Orange: https://open.spotify.com/track/5H0KVeTf9p0nodyHvWWv72?si=e6b1784827bf4816 https://open.spotify.com/track/4mJCOTvInk4glr2PxRFU06?si=f653050e0ba243b0

Yellow: https://open.spotify.com/track/0RpSdArlp75WDKl5nsFSHK?si=50d5cec43f914d28 https://open.spotify.com/track/75SS50WWBZINsZgccsSnGs?si=2cdd4f285b45448e

Q2:

Red: https://open.spotify.com/track/2c4eRbqYvTwlsuiLeDvnN6?si=8b9d91a9c61c4a37 https://open.spotify.com/track/5GxmhdfQgoWkAH3CmPHe7Y?si=06c568986c0743c6

Pink: https://open.spotify.com/track/4KXuPESFA1It3nQiNpTYVf?si=d031f6a365404198 https://open.spotify.com/track/7td77R6MNM1GL3nh7QQ0KH?si=4216ac869b5143ad

Q3:

Purple: https://open.spotify.com/track/4osLYKjGajPZP2GQlhjpCj?si=61e71d01091c4ef0 https://open.spotify.com/track/3c85Hd4bEqTqzCgB4SmpbW?si=ca92fc49f73240ab

Blue: https://open.spotify.com/track/5DNljtoJGRb5qLimlNkirn?si=a36d5fa322914972 https://open.spotify.com/track/3xJGa91yPrWZ24PRUM2T6i?si=f22c1e2ebace4e56

Q4:

Green: https://open.spotify.com/track/7C8GR9DIPP67US9qBDRWZQ?si=96bbd3e5c5bc43a7 https://open.spotify.com/track/7clVsGVT8qgPptuh0eUOVp?si=3bc845b1dc594bcb

Aqua:

I don't have songs of this type, sorry.

There would be a fifth quadrant, for when both valence and energy are in the middle, between 41% and 60%. I have also noticed that some dark music falls in that band.
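As a rough sketch of the quadrant assignment I am describing (Python, purely for illustration; the function name is made up, and the central "balanced" band is my own addition on top of Russell's four quadrants):

```python
def classify_quadrant(valence: float, energy: float) -> str:
    """Assign a circumplex quadrant from Spotify's 0-1 valence/energy
    values; a central band on both axes counts as 'balanced'."""
    # Fifth "quadrant": both values in the middle band (roughly 41-60%).
    if 0.40 <= valence <= 0.60 and 0.40 <= energy <= 0.60:
        return "balanced"
    if energy >= 0.5:
        return "Q1" if valence >= 0.5 else "Q2"  # excited vs. tense/angry
    return "Q4" if valence >= 0.5 else "Q3"      # calm vs. sad/gloomy

print(classify_quadrant(0.90, 0.85))  # -> "Q1"
print(classify_quadrant(0.20, 0.15))  # -> "Q3"
```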

I attach several images where I mix the colors with the quadrants, so you can finally see the whole picture. If it seems relevant and a good idea, it would be a great help to be able to segment OneTagger's audio features further, so songs can be classified according to the quadrants or their default colors; you are the ones who can change that. I hope you like the idea.

Thank you very much.

[Images: take2, take3, take5]

Foons · Jun 30 '22 20:06

A few questions:

  1. In your earlier post you talk about 5 ranges; in the previous post you talk about 4 ranges/quadrants. Which do you think fits best?

  2. Do you want a translation where valence = high + energy = high = a certain label, and valence = low + energy = high = another label?

  3. I've converted the raw values from One Tagger with Mp3tag, divided them into 5 sections for now, and marked what pops out the most. To do this yourself, see: https://community.mp3tag.de/t/how-to-convert-value-ranges-to-a-description/57790 If you need help, let me know.

Will inspect further tomorrow.

Here are the results for your example songs:

[Screenshots: q1-orange, q1-yellow, q2-pink, q2-red, q3-blue, q3-purple, q4-green]

bascurtiz · Jul 01 '22 23:07

Hello again. That's right: in the first post I proposed 5 ranges; I gave the example with 4 quadrants because that is where the differences are most pronounced.

In my opinion there should be a fifth quadrant, "balanced"; at least that is how I have it in my music collection.

When the valence is between 41% and 60% and the energy is between 41% and 60%, we could say the songs are balanced. There are not many differences to highlight there, so we can classify these songs as #balanced.

Now let me go through each of the quadrants:

Q1 is made up of the orange and yellow colors. I will always explain things from top to bottom and from right to left.

Let me explain a bit about those colors:

Orange has an energy range between 81% and 100%, and its valence is divided into two ranges: from 81% to 100% and from 61% to 80%.

The 81% to 100% valence range should be music that excites us. In rekordbox, when a song's energy is between 81% and 100% I call it Highest, and when its valence is between 81% and 100% I call it Very Joyful; finally, I color it orange using rekordbox smart lists.

Here is a sample of my rekordbox screen for orange; the valence here is between 81% and 100% (Very Joyful):

[Image: Take8]

Here is another sample, also orange, but this time with valence between 61% and 80% (Joyful):

[Image: take9]

In the 2 previous examples the energy is between 81% and 100%.

I have digitized the classification I made some time ago, to answer the first question: 5 ranges work better for me than 4.

[Image: take10]

I would have assigned more colors, but rekordbox only has 8. In my classification I use a ninth "color", which is no color at all; it coincides with the songs in the 41%-60% range of both valence and energy, which I have classified as balanced.

[Image: take6]

The fifth quadrant is what this image represents in gray; in my music collection, the songs that fall in those ranges have no color in rekordbox.

Second question:

Well spotted. It does help me personally determine the label, not only from valence and energy but also from the music genre. Here are some examples from my classification, since I have also classified music by record label:

https://www.beatport.com/label/erased-tapes/12955 :

[Image: take11]

https://www.beatport.com/label/azzur/57347 :

(Here the first 5 songs are from Beatport and carry my manual assignment; the sixth to tenth songs are MP3s and have the classification made by OneTagger in the comments.)

[Image: take12]

To sum up the second question: I think OneTagger could be improved by assigning to each range a mood, a color, or both, according to the song's values and the quadrant or sub-quadrant it falls in.

Summing it all up, here is the full grid (energy rows from Highest down to Lowest; valence columns from Very Joyful across to Very Sad, i.e. from right to left and from top to bottom in the circumplex figure):

| Energy \ Valence | Very Joyful (81-100%) | Joyful (61-80%) | Neutral (41-60%) | Sad (21-40%) | Very Sad (0-20%) |
| --- | --- | --- | --- | --- | --- |
| Highest (81-100%) | Orange1: Excited | Orange2: Aroused, Astonished | No color 1 (new): Alarmed, Tense | Red1: Angry, Annoyed | Red2: Hateful, Defiant |
| Higher (61-80%) | Yellow1: Delighted | Yellow2: Light Hearted | No color 2 (new): Jealous, Impatient | Pink1: Frustrated | Pink2: Disgusted, Bitter, Loathing |
| Mid (41-60%) | No color 3 (new): Happy, Pleased | No color 4 (new): Impressed, Expectant, Passionate | No color 5 (new): Balanced | No color 6 (new): Distrustful | No color 7 (new): Startled, Disappointed, Miserable |
| Lower (21-40%) | Green1: Serene, Solemn | Green2: Attentive | No color 8 (new): Worried | Purple1: Taken Aback, Guilty, Ashamed | Purple2: Sad, Gloomy, Depressed |
| Lowest (0-20%) | Blue1: Relaxed, Calm | Blue2: Polite, Compassionate | No color 9 (new): Tired, Sleepy | Dark Blue 1: Bored | Dark Blue 2: Anxious |

The "No color (new)" cells are the new sub-quadrants, shown in gray in the previous images; they have no rekordbox color. In the Highest/Neutral cell (No color 1) I have found dark music.

Keys (Camelot) found in most of the songs of each color:

- Orange: 4A, 6A
- Red: 5A
- Yellow: 6A
- Pink: 8A
- Green: ?
- Purple: 6A, 7A
- Blue: ?
- Dark Blue: 5A

Or, what amounts to the same thing, shown graphically in colors:

[Image: take13]

This idea could be used in a simplified and automated way, like what OneTagger already has in the Quick Tag section with colors and mood.

The user could choose whether or not OneTagger assigns the color and mood to the song automatically, according to the Spotify audio feature values and the quadrants and sub-quadrants described in this post.

Note: throughout this post, when I give two ranges together, the first percentage range is the song's energy and the second its valence.
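For what it's worth, here is a minimal sketch of what that automatic assignment could look like (Python, purely for illustration: the grid just transcribes my table above, the names are invented, and this is not how OneTagger works today):

```python
# 5x5 grid transcribed from the table above. Rows: energy bins from
# lowest to highest; columns: valence bins from very sad to very joyful.
# Each cell is (rekordbox color, suggested moods); "none" = uncolored.
GRID = [
    [("dark blue", "Anxious"), ("dark blue", "Bored"),
     ("none", "Tired, Sleepy"), ("blue", "Polite, Compassionate"),
     ("blue", "Relaxed, Calm")],                                    # lowest
    [("purple", "Sad, Gloomy, Depressed"),
     ("purple", "Taken Aback, Guilty, Ashamed"), ("none", "Worried"),
     ("green", "Attentive"), ("green", "Serene, Solemn")],          # lower
    [("none", "Startled, Disappointed, Miserable"), ("none", "Distrustful"),
     ("none", "Balanced"), ("none", "Impressed, Expectant, Passionate"),
     ("none", "Happy, Pleased")],                                   # mid
    [("pink", "Disgusted, Bitter, Loathing"), ("pink", "Frustrated"),
     ("none", "Jealous, Impatient"), ("yellow", "Light Hearted"),
     ("yellow", "Delighted")],                                      # higher
    [("red", "Hateful, Defiant"), ("red", "Angry, Annoyed"),
     ("none", "Alarmed, Tense"), ("orange", "Aroused, Astonished"),
     ("orange", "Excited")],                                        # highest
]

def color_and_mood(energy: float, valence: float) -> tuple[str, str]:
    """Look up (color, moods) from Spotify's 0-1 energy/valence values."""
    row = min(int(energy * 5), 4)    # 0.0-1.0 -> bin 0..4; 1.0 stays on top
    col = min(int(valence * 5), 4)
    return GRID[row][col]

print(color_and_mood(0.90, 0.90))  # -> ("orange", "Excited")
print(color_and_mood(0.10, 0.15))  # -> ("dark blue", "Anxious")
```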

I know the acousticness value is relevant, but I still haven't understood why. I suppose it could be used to place songs in the quadrants more accurately, but I am still studying it.

As for the images I showed regarding danceability, it also works a high percentage of the time; that is why in my collection I have classified songs from A to E, with E being the most danceable, following the same 20% pattern.

Third question:

I know Mp3tag, but I prefer OneTagger ;). On Sunday, with a little more time, I will review more songs to compare.

Thank you.

Foons · Jul 02 '22 11:07

Hello. To my understanding, you want a more customizable AF (Audio Features) section, which would allow a more precise separation of values in the prominent tag, and more values for it. However, this would require quite a big refactor of the AF section, and OneTagger's frontend currently uses Vue 2 and JavaScript, which is being replaced by Vue 3. I would like to do that migration before any big refactors/changes/features, so implementing this will take a longer time.

Marekkon5 · Jul 04 '22 13:07

Hello, there is no rush. Meanwhile I will continue studying the music with the proposed quadrants; that requires a little time too.

I have borrowed your approach via Mp3tag; it helps me assign the moods from Spotify's audio feature values. It works really well: first replacing the values with numbers, and then reassigning those numbers to the moods.

[Images: Take 14, Take 15]

Afterwards I will map danceability to the Popularimeter (POPM) tag so I can use it in rekordbox as the Rating, and finally I will map Acousticness to Composer (TCOM) in case it is of any use to me in the future.

Unfortunately, both My Tags and the colors in rekordbox belong to the program's own (SQL) database. I am studying whether it is easy to increase the number of colors to match the number of moods, but first I will see which moods are the most relevant and eliminate those that could cause confusion. If I make progress and find something relevant, I will let you know.

Greetings and thanks.

Foons · Jul 04 '22 22:07

Hello @Foons, can you share the action script you use in Mp3tag? I would like to test this way of classifying my music on a small sample (playlist) to see if it is really useful. Thanks.

flovanc · Feb 06 '23 14:02

Hello @flovanc, of course, here you are. In my action groups I have the following:

Step 1:

Mood field (first pass, integer code):
`$replace($div(%1T_ENERGY%,20),0,0,1,1,2,2,3,3,4,4)$replace($div(%1t_valence%,20),0,A,1,B,2,C,3,D,4,E)`

Composer field (same expression):
`$replace($div(%1T_ENERGY%,20),0,0,1,1,2,2,3,3,4,4)$replace($div(%1t_valence%,20),0,A,1,B,2,C,3,D,4,E)`

In the Mood field I then substitute those codes with the mood descriptions, as in screenshot 1. In the Composer field I change them back so I can see them in rekordbox (screenshot 2).

Comment field:
`$replace($div(%1T_ENERGY%,20),0,#Lowest,1,#Lower,2,#Mid,3,#Higher,4,#Highest) $replace($div(%1T_VALENCE%,20),0,#Very Sad,1,#Sad,2,#Neutral,3,#Joyful,4,#Very Joyful) $replace($div(%1T_DANCEABILITY%,20),0,#A,1,#B,2,#C,3,#D,4,#E) $replace($div(%1T_ACOUSTICNESS%,20),0,#Min-Acoustic,1,#Low-Acoustic,2,#Med-Acoustic,3,#High-Acoustic,4,#Max-Acoustic) $replace($div(%1T_POPULARITY%,25),0,#Deep,1,#Somewhat Popular,2,#Very Popular,3,#Top Popular) %Mood% $replace($div(%1T_SPEECHINESS%,33),0,#Music,1,#Both,2,#Spoken) $replace($div(%1T_INSTRUMENTALNESS%,20),0,#Vocal-Max,1,#Vocal-high,2,#Vocal-med,3,#Vocal-low,4,#Vocal-min) $replace($div(%1T_LIVENESS%,20),0,#Echo-Dry,1,#Echo-Min,2,#Echo-med,3,#Echo-Wet,4,#Echo-Live)`

Rating MM field (this updates the stars in rekordbox):
`$replace($div(%1T_ENERGY%,20),0,0,1,1,2,2,3,3,4,4)`

Original Artist field:
`$replace($div(%1T_DANCEABILITY%,20),0,#A,1,#B,2,#C,3,#D,4,#E)`
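To clarify how these expressions work, if I'm not mistaken: `$div` keeps only the integer part of the division, so `$div(%1T_ENERGY%,20)` turns the 0-100 value into a bucket from 0 to 4, and each `$replace` then maps that bucket to a label. For example, an energy of 73 gives `$div(73,20)` = 3, which the Comment action turns into `#Higher`.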

Step 2:

Composer field:

`$replace($div(%1T_ENERGY%,10),0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9)$replace($div(%1t_valence%,10),0,A,1,B,2,C,3,D,4,E,5,F,6,G,7,H,8,I,9,J)`
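If I'm reading it right, this second pass produces a finer 10x10 grid: a digit from 0 to 9 for energy plus a letter from A to J for valence. For example, energy 85 and valence 34 give `$div(85,10)` = 8 and `$div(34,10)` = 3, which maps to D, so the code is `8D`.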

The result of the segmentation looks like the screen clipping below.

Also, depending on the values and that segmentation, I use colors and duplicate the values in rekordbox's "My Tag", which belongs to the program itself and is only accessible through its database, but which is very useful when you want to build a session with songs of similar or identical characteristics.

It also makes it possible to use smart playlists, which are every bit as good as the track-suggestion feature, and which I use as well.

I am still working on a finer segmentation, but for lack of time I have not been able to continue. I think my next step will be to segment songs the way the discontinued Beatport Pro did on Mac; I still have it installed and use it, but I have no idea about the beatport.com API.

None of this would be possible without this fantastic program that is OneTagger ;)

Greetings.

[Screenshots: Screen1, Screen2, Rekordbox Track Suggestion]

Foons · Feb 06 '23 15:02