In the deliverable ‘D3.2 – Predictive analytics and recommendation framework v2’, Thomas Lidy, Adrian Lecoutre and Khalil Boulkenafet from Musimap, together with Manos Schinas, Christos Koutlis and Symeon Papadopoulos from CERTH, presented the work conducted in our project to meet the requirements related to predictive analytics and recommendations, aiming to produce popularity-oriented results for artists, tracks and genres.

Regarding track popularity estimation and prediction, the team described that overall track popularity is estimated from the following sources: Deezer rank, Spotify popularity, views and likes on YouTube, and global airplay counts provided by our partner BMAT. The team compared k-Nearest Neighbors (kNN) against Long Short-Term Memory (LSTM) deep learning models and concluded that the LSTM performed slightly better, but at a dramatically higher computational cost; they therefore decided to use the kNN implementation instead. More importantly, the authors explained that in both cases, since the models are trained on historical crawled signals, they should in theory become more accurate over time. In general, the different analyses have shown that a simple trend or seasonality can be easily predicted over a relatively short prediction interval.

Nevertheless, the authors point out that ‘the difficulty can rise significantly when the source signal is almost static (such as in the case of Spotify popularity), the predicted target goes farther in the future, or when an unexpected trend appears without any detectable pattern in the previous days of history in the data’. As the authors mention, ‘we have studied that a multivariate LSTM approach can potentially overcome these issues when more data becomes available’. In the current implementation, the kNN approach is used to predict a track’s popularity up to 21 days into the future, based on up to 28 days of history.
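To make the idea of kNN-based forecasting more concrete, here is a minimal sketch in Python. It is not the implementation described in the deliverable: the function name `knn_forecast`, the Euclidean distance, the value of `k` and the plain averaging of neighbours are all assumptions chosen for illustration. Only the default window sizes (28 days of history, 21 days of horizon) come from the text above.

```python
import math


def knn_forecast(series, history=28, horizon=21, k=3):
    """Forecast the next `horizon` values from the last `history` values.

    Hypothetical helper: it looks for the k past windows most similar to the
    most recent one and averages what happened right after each of them.
    """
    if len(series) < history + horizon:
        raise ValueError("series too short for the requested history and horizon")

    query = series[-history:]
    candidates = []
    # Slide over every past window whose continuation is fully observed.
    for start in range(len(series) - history - horizon + 1):
        window = series[start:start + history]
        future = series[start + history:start + history + horizon]
        candidates.append((math.dist(window, query), future))

    # Keep the k windows closest to the query (Euclidean distance, an assumption).
    candidates.sort(key=lambda pair: pair[0])
    nearest = [future for _, future in candidates[:k]]

    # Average the continuations of the nearest windows, day by day.
    return [sum(day) / len(nearest) for day in zip(*nearest)]


# Toy daily signal with a weekly pattern (synthetic data, for illustration only).
daily_signal = [day % 7 for day in range(70)]
forecast = knn_forecast(daily_signal, history=14, horizon=7, k=2)
```

On a signal with a clean weekly seasonality such as this toy one, the nearest windows match the query exactly, which illustrates why simple seasonal patterns are easy to predict. An almost static signal would make all windows look alike, so the neighbours would carry little information.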

For artist popularity, our team proposed a non-linear aggregation method to combine diverse sources of popularity information, such as Spotify followers, YouTube views and playcounts. The method leverages the geometrical shapes formed by the normalized metric values obtained for each artist and combines them into a fraction whose numerator corresponds to the artist under study and whose denominator corresponds to the best possible case, i.e. the most popular possible artist. The results showed that this method outperformed both the most natural choice, a simple average, and other non-linear metric aggregation methods in terms of correlation, rank correlation and rank distance with respect to the ground truth.
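The text does not say which geometrical shape is used, so the following Python sketch relies on one possible reading and should be treated as an assumption: the normalized metrics are placed on the axes of a radar chart, and the score is the area of the artist's polygon divided by the area of the polygon of an ideal artist whose metrics all equal 1. The function name `polygon_popularity` is hypothetical.

```python
def polygon_popularity(normalized_metrics):
    """Ratio of the artist's radar-chart polygon area to the best possible area.

    Assumption: metrics lie in [0, 1] and sit on equally spaced axes. The area
    of such a polygon is 0.5 * sin(2*pi/n) * sum(r_i * r_{i+1}); the constant
    factor cancels in the ratio, and the ideal polygon contributes n.
    """
    n = len(normalized_metrics)
    if n < 3:
        raise ValueError("at least three metrics are needed to form a polygon")
    if any(not 0.0 <= value <= 1.0 for value in normalized_metrics):
        raise ValueError("metrics must be normalized to the range [0, 1]")

    # Sum of products of neighbouring axes, wrapping around to close the polygon.
    neighbour_products = sum(
        normalized_metrics[i] * normalized_metrics[(i + 1) % n] for i in range(n)
    )
    return neighbour_products / n


# Made-up normalized values for four sources (illustration only).
balanced = polygon_popularity([0.5, 0.5, 0.5, 0.5])
lopsided = polygon_popularity([1.0, 0.0, 1.0, 0.0])
```

Both toy artists have the same simple average (0.5), yet the balanced one scores 0.25 and the lopsided one 0.0, which shows how an area-based ratio behaves non-linearly. Under this particular reading the result also depends on the order of the axes, a design choice the source does not discuss.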

Turning to the impact of events such as an album release, a TV show appearance or an interview on an artist’s popularity level, it was notable that no significant changes were observed after the events in popularity metrics such as YouTube views/subscribers and playcounts, whereas changes were observed in streaming activity (Spotify, iTunes, Deezer streams). The FuturePulse team compared two different methods that estimate the level of impact an event has on future popularity values/streaming activity. The segmented linear regression method showed good performance, accurately identifying the upcoming changes after an event.
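As an illustration of segmented linear regression around an event, the Python sketch below fits one least-squares line to the days before a known event and another to the days after it, then reports the change in slope and the jump in level at the event. It is a simplified version with a fixed breakpoint. The function names and the returned fields are assumptions, not the method evaluated in the deliverable.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def event_impact(values, event_index):
    """Compare the linear trend before and after a known event day."""
    if event_index < 2 or len(values) - event_index < 2:
        raise ValueError("need at least two observations on each side of the event")

    days = list(range(len(values)))
    slope_before, intercept_before = fit_line(days[:event_index], values[:event_index])
    slope_after, intercept_after = fit_line(days[event_index:], values[event_index:])

    # Evaluate both segments at the event day to measure the immediate jump.
    level_before = slope_before * event_index + intercept_before
    level_after = slope_after * event_index + intercept_after
    return {
        "slope_change": slope_after - slope_before,
        "level_jump": level_after - level_before,
    }


# Synthetic daily streams: flat before the event on day 5, growing afterwards.
streams = [10, 10, 10, 10, 10, 20, 22, 24, 26, 28]
impact = event_impact(streams, event_index=5)
```

For this synthetic series the slope rises by 2 streams per day and the level jumps by 10 at the event. A metric that does not react to the event would give values close to zero for both quantities.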

Last but not least, for the estimation of genre popularity and growth, the authors elaborated on how they tackled the problem of data sparsity. By analysing genre occurrences across Spotify artists with a graph embedding technique, our team identified sub-genre associations between genres. As they explained, ‘that information is then used to count genre appearances in music charts’. Although the first results obtained with these associations were promising, the identified associations and the popularity scores generated still need further evaluation.
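One way such associations could reduce sparsity when counting chart appearances is sketched below in Python. The mapping `parent_of` is hand-written and hypothetical (in the deliverable the associations come from a graph embedding of Spotify artist genres, which is not reproduced here), and crediting each chart entry to all ancestor genres is an assumption about how the counts are propagated.

```python
from collections import Counter


def count_genre_appearances(chart_entries, parent_of):
    """Count chart appearances per genre, crediting parent genres as well.

    chart_entries: list of genre lists, one list per chart entry.
    parent_of: hypothetical sub-genre -> parent genre mapping.
    """
    counts = Counter()
    for genres in chart_entries:
        credited = set()
        for genre in genres:
            current = genre
            # Walk up the association chain; the set also guards against cycles
            # and ensures an entry counts at most once per genre.
            while current is not None and current not in credited:
                credited.add(current)
                current = parent_of.get(current)
        counts.update(credited)
    return counts


# Made-up chart and associations, for illustration only.
parent_of = {"deep house": "house", "house": "electronic", "techno": "electronic"}
chart = [["deep house"], ["house"], ["techno"], ["deep house", "techno"]]
genre_counts = count_genre_appearances(chart, parent_of)
```

In this toy chart no single tag appears more than twice, but once the associations are applied "electronic" is credited with all four entries and "house" with three, so sparse sub-genres still contribute to the broader genres they belong to.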

As for future activities, the authors mentioned the following goals:

  • Analyze playlists and develop a methodology to detect similar playlists based on co-listening patterns, content similarity and music genres
  • Update track popularity estimation and prediction by adding more sources, e.g. country-wise airplay, charts, playlists, Spotify analytics data etc.
  • Investigate multivariate approaches (e.g. LSTM) again with more data available
  • Update and evaluate genre popularity estimation, by using genre associations
  • Combine in a principled way the several artist popularity estimations developed separately for each of the use cases
  • Co-inform track popularity estimation and artist popularity estimation mutually

You can read the D3.2 – Predictive analytics and recommendation framework here.


Source: Deliverable D3.2 Predictive analytics and recommendation framework v2 - Authors: Thomas Lidy (MMAP), Adrian Lecoutre (MMAP), Khalil Boulkenafet (MMAP), Manos Schinas (CERTH), Christos Koutlis (CERTH), Symeon Papadopoulos (CERTH) - Contributor/s: Vasiliki Gkatziaki (CERTH), Emmanouil Krasanakis (CERTH), Polychronis Charitidis (CERTH) - Deliverable Lead Beneficiary: MMAP


Co-funded by the European Commission

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 761634. This website reflects the views only of the Consortium, and the Commission cannot be held responsible for any use which may be made of the information contained herein.