Clustering Analysis


To better understand how our K-means model categorizes tracks and their attributes, we used Tableau to visualize important characteristics of each cluster. The dropdown selects any of the 10 clusters, and the selected cluster is then shown in two views side by side. On the left, a treemap displays the hierarchical arrangement of attribute averages in that cluster. On the right, a bubble chart shows the 10 most frequent genres in that cluster.
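Both views rest on simple per-cluster aggregations: the mean of each attribute, and a count of genres. The following is a minimal sketch of those aggregations, not the project's actual pipeline. The field names (`cluster`, `genres`, `energy`, `valence`), the sample rows, and the helper `summarize_clusters` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample rows; the field names are assumptions, not the project's schema.
tracks = [
    {"cluster": 0, "genres": ["pop", "rock"], "energy": 0.8, "valence": 0.6},
    {"cluster": 0, "genres": ["pop"], "energy": 0.6, "valence": 0.4},
    {"cluster": 1, "genres": ["jazz"], "energy": 0.2, "valence": 0.5},
]
ATTRIBUTES = ["energy", "valence"]


def summarize_clusters(rows, attributes, top_n=10):
    """Return attribute averages and the top_n most frequent genres per cluster."""
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row)

    summary = {}
    for cluster, members in sorted(by_cluster.items()):
        # Mean of each attribute over the tracks in this cluster (treemap input).
        averages = {a: mean(r[a] for r in members) for a in attributes}
        # Count every genre tag over the tracks in this cluster (bubble chart input).
        genre_counts = Counter(g for r in members for g in r["genres"])
        summary[cluster] = {
            "averages": averages,
            "top_genres": genre_counts.most_common(top_n),
        }
    return summary


summary = summarize_clusters(tracks, ATTRIBUTES)
```

Here `summary[c]["averages"]` would feed the treemap for cluster `c`, and `summary[c]["top_genres"]` the bubble chart.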

The following table lists, for each cluster, the track closest to the cluster center, which is arguably the most representative track of that cluster as a whole.
Note: dtcc = distance to cluster center
| track_name | artist | cluster | dtcc_0 | dtcc_1 | dtcc_2 | dtcc_3 | dtcc_4 | dtcc_5 | dtcc_6 | dtcc_7 | dtcc_8 | dtcc_9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 抱歉 | ['Sam Lee'] | 0 | 0.072 | 0.590 | 0.545 | 0.414 | 0.556 | 0.966 | 0.848 | 1.119 | 0.683 | 1.030 |
| Meditation | ['Doris Day'] | 1 | 0.563 | 0.059 | 1.003 | 0.977 | 0.584 | 0.967 | 1.201 | 0.850 | 0.961 | 0.900 |
| Milostna | ['Daniel Landa'] | 2 | 0.566 | 1.001 | 0.044 | 0.441 | 0.595 | 1.062 | 0.853 | 1.458 | 0.695 | 1.136 |
| Way Back Home - Original | ['Bag Raiders'] | 3 | 0.403 | 0.937 | 0.396 | 0.063 | 0.746 | 1.076 | 0.757 | 1.368 | 0.656 | 1.208 |
| Breathless | ['Nick Cave & The Bad Seeds'] | 4 | 0.557 | 0.538 | 0.621 | 0.808 | 0.072 | 0.910 | 1.069 | 1.116 | 0.788 | 0.811 |
| 008 - Auf der Spur der Vogeljäger - Teil 08 | ['TKKG Retro-Archiv'] | 5 | 0.973 | 0.961 | 1.099 | 1.156 | 0.904 | 0.097 | 1.359 | 1.350 | 1.045 | 1.196 |
| חתולים | ['Berry Sakharof'] | 6 | 0.797 | 1.117 | 0.816 | 0.734 | 0.992 | 1.283 | 0.121 | 1.027 | 0.905 | 0.858 |
| Oltre la collina... | ['Mia Martini'] | 7 | 1.075 | 0.825 | 1.420 | 1.369 | 1.100 | 1.338 | 1.079 | 0.083 | 1.378 | 0.602 |
| Résiste (Live au Zénith, 1985) - Remasterisé en 2004 | ['France Gall'] | 8 | 0.734 | 0.977 | 0.753 | 0.701 | 0.810 | 1.041 | 1.009 | 1.400 | 0.104 | 1.209 |
| Quien Sena (Sway) | ['Al Caiola'] | 9 | 0.929 | 0.817 | 1.052 | 1.152 | 0.747 | 1.150 | 0.847 | 0.622 | 1.095 | 0.097 |
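The dtcc values above can be read as distances from one track's feature vector to each of the 10 cluster centers, with the smallest value falling in the track's own cluster. The sketch below shows one way such a selection could be made. It assumes Euclidean distance over the same (scaled) features used for clustering; the helper names and the toy two-dimensional data are hypothetical, not the project's code. If the model was fit with scikit-learn, `KMeans.transform` returns the same kind of distance-to-center matrix.

```python
import math


def distances_to_centers(point, centers):
    """Euclidean distance from one feature vector to every cluster center."""
    return [math.dist(point, center) for center in centers]


def representative_tracks(names, points, centers):
    """For each cluster, keep the track whose distance to that center is smallest.

    Returns {cluster_index: (track_name, [dtcc_0, dtcc_1, ...])}.
    """
    best = {}
    for name, point in zip(names, points):
        dtcc = distances_to_centers(point, centers)
        # A track belongs to the cluster whose center is nearest.
        cluster = min(range(len(centers)), key=dtcc.__getitem__)
        if cluster not in best or dtcc[cluster] < best[cluster][1][cluster]:
            best[cluster] = (name, dtcc)
    return best


# Toy data (assumed): three tracks, two features, two cluster centers.
names = ["track_a", "track_b", "track_c"]
points = [(0.1, 0.1), (0.0, 0.05), (0.9, 1.0)]
centers = [(0.0, 0.0), (1.0, 1.0)]

result = representative_tracks(names, points, centers)
for cluster, (name, dtcc) in sorted(result.items()):
    print(cluster, name, [round(d, 3) for d in dtcc])
```

Each returned row has the same shape as a row of the table: the track name, its cluster, and one distance per cluster center.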

This is a student project; we are not collecting any data. Spotify widgets do collect user data; see Spotify's terms.