Do you know what it takes to make a hit song? Mathematicians from Bristol University in the UK claim that their ‘Hit Equation’ is extremely accurate at predicting what will and will not be a hit.
They have a site called Score-a-Hit that describes all of this. Here is an excerpt:
To quantify the hit potential of a song, we make use of the regression technique. Mathematically, the hit potential (peak UK chart position) of a song is denoted by a variable y and a set of audio features x of the song are also presented. A pre-trained classifier f(x)=w’x is then used to estimate the hit potential.
So far, so easy. Now is the critical point – which learning machine is most appropriate for hit potential estimation? Since the hit potential likely depends on the era, it makes sense to express the learning machine as a function of time. We thus organized the dataset chronologically and employed the Time-Shifting Ridge Regression (TSRR) as the learning agent. The algorithm of TSRR is outlined in the following table
Intuitively, the taste of music listeners evolves through time and songs in an era should be more helpful in exploiting coeval trends than past music. This is modelled by means of the memory parameter in TSRR, allowing the learning machine to “forget” past trends and adapt to new ones.
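To get a feel for what a "memory parameter" could do, here is a minimal sketch of my own. It is not the researchers' TSRR algorithm (their table isn't reproduced here). It assumes the forgetting works by giving a song that is `age` periods old a weight of `memory ** age` inside an ordinary ridge regression, then predicts with f(x) = w'x. The function names, the two features and all of the numbers are invented for illustration, and it sticks to the standard library so it runs anywhere.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weighted_ridge(xs, ys, ages, ridge=1e-6, memory=0.9):
    """Ridge regression with down-weighted old samples.

    ASSUMPTION: a sample that is `age` periods old gets weight memory ** age.
    This is a stand-in for the "memory parameter", not the published TSRR.
    """
    d = len(xs[0])
    # Accumulate A = sum(weight * x x') + ridge * I and b = sum(weight * x * y).
    a = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for x, y, age in zip(xs, ys, ages):
        weight = memory ** age
        for i in range(d):
            b[i] += weight * x[i] * y
            for j in range(d):
                a[i][j] += weight * x[i] * x[j]
    return solve(a, b)


def predict(w, x):
    """Evaluate the linear model f(x) = w'x."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Invented toy data: features are [danceability, loudness].
# In the "old" era (age 2) only loudness matters; in the "new" era (age 0)
# danceability matters as well.
xs = [[1, 0], [0, 1], [1, 1], [1, 0], [0, 1], [1, 1]]
ys = [0, 1, 1, 1, 1, 2]
ages = [2, 2, 2, 0, 0, 0]

w_all = fit_weighted_ridge(xs, ys, ages, memory=1.0)     # never forgets
w_recent = fit_weighted_ridge(xs, ys, ages, memory=0.1)  # forgets quickly

print("danceability weight, no forgetting:  ", round(w_all[0], 3))
print("danceability weight, fast forgetting:", round(w_recent[0], 3))
print("score for a danceable, loud song:    ", round(predict(w_recent, [1, 1]), 3))
```

With no forgetting, the danceability weight lands halfway between the two eras. With a small memory value, it follows the recent era almost entirely, which is the behaviour the excerpt describes.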
Here is the trend over time:
The study found some interesting trends, such as:
Before the eighties, the danceability of a song was not very relevant to its hit potential. From then on, danceable songs were more likely to become a hit. Also the average danceability of all songs on the charts suddenly increased in the late seventies.
In the eighties slower musical styles (tempo 70-89 beats per minute), such as ballads, were more likely to become a hit.
The prediction accuracy of the researchers’ hit potential equation varies over time. It was particularly difficult to predict hits around 1980. The equation performed best in the first half of the nineties and from the year 2000. This suggests that the late seventies and early eighties were particularly creative and innovative periods of pop music.
Up until the early nineties, hits were typically harmonically simpler than other songs of the era. On the other hand, from the nineties onward hits more commonly have simpler, binary, rhythms such as 4/4 time.
On average all songs on the chart are becoming louder. Additionally, the hits are relatively louder than the songs that dangle at the bottom of the charts, reflected by a strong weight for the loudness feature.
Watching the data there is intriguing. At times two seemingly contradictory parameters carry about the same weight, such as ‘Harmonically Simple’ and ‘Non-Harmonic’ in 1972. Also, for 2011 the tempo indicator is >190 … which is something akin to speed-metal. So not a perfect model, but interesting nonetheless.
What do you think about this?