
CMU@TGE Top Questions: How are Shazam-like technologies quietly revolutionising the music business?

By Chris Cooke | Published on Thursday 19 April 2018


With The Great Escape now just a month away, over the next fortnight we’ll be considering ten questions that will be answered during the three CMU Insights conferences that are set to take place there this year: The Education Conference (16 May), The AI Conference (17 May) and The China Conference (18 May). Today: How are Shazam-like technologies quietly revolutionising the music business?

Shazam probably remains the highest profile of all the technologies that can recognise music. When Apple announced its plan to buy the company last year, we were reminded just how long Shazam had been telling people what tracks they were listening to – the service having launched long before the smartphone, initially informing users of a track’s name by SMS.

Over the fifteen years that Shazam has been live, lots of other companies have been developing technologies that can also identify your favourite tunes. Some have tried to compete head-on with Shazam by offering music recognition services to consumers, either via their own apps or by bundling their technology into other people’s applications. Other start-ups dabbling in audio-recognition, however, have business-to-business, rather than business-to-consumer, ambitions.

Platforms offering audio-recognition are usually based around what are referred to as ‘digital acoustic fingerprints’, or some variation of that term. The platform creates a ‘condensed digital summary’ of each piece of audio it is exposed to. That ‘condensed digital summary’ is unique to that track, hence ‘fingerprint’. Metadata is then attached to each fingerprint to identify the audio and provide other key information about it.

Once a database of fingerprints has been built, when the audio-recognition platform is re-exposed to a piece of audio it should be able to identify which fingerprint the track is associated with. It can then deliver the accompanying metadata to the user.
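
To make that build-then-look-up idea concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how Shazam, Content ID or any other commercial system actually works: the 'fingerprint' is simply the set of pairs of loudest frequencies in consecutive chunks of audio, and the names `FRAME`, `dominant_bin`, `fingerprint`, `FingerprintDB` and `tone_sequence` are all invented for illustration. The example tracks are synthetic tones rather than real recordings.

```python
import cmath
import math
import random
from collections import Counter, defaultdict

FRAME = 64  # samples per analysis frame (an arbitrary choice for this sketch)


def dominant_bin(frame):
    """Return the index of the strongest frequency bin, using a naive DFT."""
    n = len(frame)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def fingerprint(samples):
    """Condense audio into a set of (peak, next_peak) pairs: the toy 'summary'."""
    peaks = [dominant_bin(samples[i:i + FRAME])
             for i in range(0, len(samples) - FRAME + 1, FRAME)]
    return set(zip(peaks, peaks[1:]))


class FingerprintDB:
    """Maps fingerprint entries to track IDs, with metadata attached per track."""

    def __init__(self):
        self.index = defaultdict(set)
        self.metadata = {}

    def add(self, track_id, samples, metadata):
        self.metadata[track_id] = metadata
        for entry in fingerprint(samples):
            self.index[entry].add(track_id)

    def identify(self, samples):
        """Return metadata for the track sharing most entries, or None."""
        votes = Counter()
        for entry in fingerprint(samples):
            for track_id in self.index.get(entry, ()):
                votes[track_id] += 1
        if not votes:
            return None
        best, _ = votes.most_common(1)[0]
        return self.metadata[best]


def tone_sequence(bins, noise=0.0, seed=0):
    """Synthesise test audio: one frame of a pure tone per entry in bins."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * b * i / FRAME) + rng.uniform(-noise, noise)
            for b in bins for i in range(FRAME)]


db = FingerprintDB()
db.add("A", tone_sequence([3, 7, 5, 9, 4]), {"title": "Track A"})
db.add("B", tone_sequence([10, 12, 11, 13, 15]), {"title": "Track B"})

# A short, slightly noisy excerpt from the middle of Track A.
excerpt = tone_sequence([7, 5, 9], noise=0.05)
print(db.identify(excerpt))
```

The excerpt here lines up exactly with the frame boundaries of the original, which real-world audio never does. Production systems have to cope with arbitrary offsets, compression and background noise, which is where much of the engineering effort goes.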

From a technical perspective, advances in the audio-recognition domain include the ability to more quickly identify a track from a smaller sample of the recording being identified, and being able to ID a track regardless of sound quality and background noise, or where the track has been slightly altered in some way.

Then there is the separate challenge of recognising songs rather than specific tracks, so that a platform can identify new and live versions of songs as well as officially released recordings. Recognising new versions of existing songs is obviously a little more challenging than matching an already logged sound recording.

Commercially speaking, the biggest potential for audio-recognition is probably in business-facing technology.

Perhaps the highest profile B2B use of this technology so far is YouTube’s Content ID. YouTube’s system is designed to allow copyright owners to more easily identify and manage user-uploaded videos that contain their content. In the case of music, that might be user-uploads of official music videos, user-generated content soundtracked with someone else’s tune, or a cover version of an existing song.

In theory, Content ID means that artists, labels, songwriters and publishers need only upload their music once into the YouTube system. That system should then automatically spot if that content is included in any other people’s videos. Whoever controls the copyright in the music can then either decide to block that user-uploaded video or share in the ad revenue it generates.

Although Content ID is probably the best known, other user-upload sites have developed or bought in similar audio-recognition systems.

Such websites are obliged to provide copyright owners with some tools to remove uploads that contain their content without permission. If they don’t, said websites could be held liable for copyright infringement for hosting unlicensed copyright material.

However, these tools don’t currently have to include anything as sophisticated as audio-recognition. The music industry would like that obligation to be added to copyright law, especially in Europe where a new copyright directive is being negotiated.

Even without the legal obligation, those user-upload sites which want to engage with the music industry have usually had to invest in audio-recognition, in order to make their proposition – “let our users exploit your music and we’ll share our ad income with you” – attractive to the wider music community.

This means more and more sites are looking to develop ever more sophisticated audio-recognition tech. Even more will do likewise if copyright law does indeed change.

Perhaps the really exciting use of audio-recognition technology in music is in public performance – ie when music is performed or played in a public space. Royalties are due whenever music is used in this way, and that money is usually collected from the venue or concert promoter by the local collecting societies, which then pass the cash on to their members.

Although less high profile than CDs, digital and sync, that income has been slowly growing over the years, even when other key recorded music revenue streams were in freefall. And as copyright regimes and collecting societies are ramped up in key emerging markets, even more live and public performance royalties should be unlocked.

But how does the collecting society know what music has been used and therefore who to pass the money on to?

When artists perform their own songs they can be expected to report that back to their collecting society (not that they always do, but they should). But what about small gigs where people perform other people’s music? What about clubs? What about bars, cafes, gyms, shops and workplaces?

The truth is we often don’t know what music is being played in these places. Until now, actively monitoring what songs and recordings were being used would have cost more than the royalties these businesses pay in. Therefore market research and market share data have often been used to distribute this income.

Clever use of audio-recognition could change all that – ie little internet-connected boxes with some audio-recognition technology inside could be listening to all the music played and then reporting back to HQ. As the cost of these technologies comes down, while the accuracy of such systems goes up, that is starting to become a reality. It’s very much early days, but some collecting societies are now experimenting with all this.
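
The difference between the two approaches can be illustrated with some simple arithmetic, sketched here in Python. Everything in it is hypothetical: the licence fee, the track names, the market share figures and the play counts are invented, and the `distribute` function is just a plain pro-rata split. Real collecting society distribution rules are considerably more involved.

```python
from collections import Counter
from fractions import Fraction


def distribute(pool_pence, weights):
    """Split a pool of money pro rata to the weights, in whole pence.

    Any pence left over after rounding down go to the largest remainders.
    """
    total = sum(weights.values())
    if total == 0:
        return {name: 0 for name in weights}
    exact = {name: Fraction(pool_pence * w, total) for name, w in weights.items()}
    shares = {name: int(value) for name, value in exact.items()}
    leftover = pool_pence - sum(shares.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - shares[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical licence fee paid by one venue: 100 pounds, held in pence.
pool = 10_000

# Proxy approach: assume the venue's music mirrors overall market share.
market_share = {"Major Label Hit": 70, "Indie Track": 25, "Local Artist": 5}

# Monitored approach: what a listening box in the venue actually detected.
detected = Counter(["Local Artist"] * 6 + ["Indie Track"] * 3
                   + ["Major Label Hit"] * 1)

print(distribute(pool, market_share))
print(distribute(pool, detected))
```

In this made-up case the local artist, who dominates the venue's actual playlist, would receive far more under monitoring than under the market share proxy, which is the kind of shift that more accurate data could bring about.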

Which is how audio-recognition technology is quietly revolutionising the music business. At The AI Conference at The Great Escape next month we’ll be looking at all this in much more detail.

Music lawyer Sophie Goosens from Reed Smith will update us on what extra obligations the new European copyright directive is likely to place on user-upload sites. And we’ll talk to Rebecca Lammers from Laika Network and Gideon Mountford from Believe about Content ID and Facebook’s Rights Manager.

Plus Russell Chant from PPL and Tim Arber from PRS For Music will discuss their pilot project with DJ Monitor, using audio-recognition technology in clubs. Come join the revolution!

The AI Conference takes place on Thursday 17 May – more info here. See more questions we’ll answer at The Great Escape here.
