Understanding Podcast Audience Measurement
Counting listeners: Whether it's television, radio, or podcasts, knowing your audience is often essential. How does this work for podcasting?
This article was originally written in French for issue 3 of “Le Podcast Magazine” (the first magazine dedicated to podcasting in France).
Since their emergence in the early 2000s, podcasts have continued to grow, reaching ever larger and more diverse audiences. A significant milestone was reached in 2014 with 'Serial', a podcast hosted by American journalist Sarah Koenig (a spin-off of This American Life, produced with WBEZ Chicago), which quickly exceeded 5 million downloads and ultimately reached an astronomical 340 million downloads by 2018! The podcast advertising market, rather nascent until then, rapidly developed and strengthened, much like the radio advertising market had a century before. Reliable audience measurement tools thus became indispensable: 'How many listeners heard my advertising message?', 'When?', 'Where?' are legitimate questions that advertisers expect answers to. However, providing this information is not as simple as one might expect.
Indeed, two challenges make podcast audience measurement... frankly complicated!
First challenge: podcasts are decentralized. Unlike the centralized platforms born after 2000 (such as Facebook, YouTube, etc.), podcasts are not controlled by a single actor. Even though Apple Podcasts has long dominated, podcast technology has always been open and interoperable. This is excellent news because it results in an eclectic, thriving, and free ecosystem. In practice, this means that podcasts are not all stored on one central system but are spread across hundreds of different hosts, chosen by the podcasters, and listeners can choose from hundreds of listening applications! There is no single point where all listens could be measured, as there is with YouTube, for example. Measurements at a particular host provide no data on podcasts hosted elsewhere, and measurements in a given listening application provide no data on listeners using other applications: podcasts are decentralized, and so is their audience measurement.
However, akin to Google Analytics for websites, solutions that sit between hosts and listening applications, often called 'Podcast Prefixes', have emerged. Among them are Chartable, Podtrac, and OP3 (there are others). The mechanism is clever in its simplicity: the address of the measurement service is placed in front of the MP3 file's address in the RSS feed, so that every download request first passes through the service, which records it and then redirects the application to the host.
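To make the prefix mechanism concrete, here is a minimal sketch in Python. The prefix address `https://prefix.example/e/` and the function names are assumptions made up for illustration; each real service documents its own prefix format, which may differ.

```python
from urllib.parse import urlsplit

# Hypothetical prefix used for illustration; a real service publishes its own.
PREFIX = "https://prefix.example/e/"


def add_prefix(enclosure_url: str, prefix: str = PREFIX) -> str:
    """Return the MP3 address as it would appear in the RSS feed."""
    parts = urlsplit(enclosure_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL: {enclosure_url!r}")
    # Drop the scheme so that the original host and path follow the prefix.
    rest = enclosure_url[len(parts.scheme) + len("://"):]
    return prefix + rest


def original_url(prefixed_url: str, prefix: str = PREFIX) -> str:
    """Return the address the service would redirect to after logging the request.

    Assumes the file is served over HTTPS by the host.
    """
    if not prefixed_url.startswith(prefix):
        raise ValueError("URL does not use this prefix")
    return "https://" + prefixed_url[len(prefix):]


feed_url = add_prefix("https://host.example/show/episode-1.mp3")
print(feed_url)
print(original_url(feed_url))
```

The listening application only ever sees the prefixed address; the redirect to the host happens behind the scenes, which is why the podcaster can add or remove a prefix without moving any file.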
Second challenge: podcasts were born before the era of “hyperconnectivity”. The original podcast technology (MP3 files in an RSS feed) relies on technologies from the 90s, when Internet connections were not permanent. I'm talking about a time that those under twenty may not know: dial-up modems tied up the telephone line while connected, and Internet access was billed by the minute! 'Streaming' (i.e., downloading files while listening) was unthinkable, and files had to be downloaded in advance to be listened to once disconnected. And this technology is still used today. Most podcast audience measurement tools cannot count 'listens' (which would be possible with streaming); they count 'downloads'. Since it is technically impossible to know whether a downloaded file will ultimately be listened to, listening statistics are in fact only download statistics.
On top of this, a podcast can be downloaded completely anonymously, without ever having to create a user account. In an age where GDPR struggles to protect us from systematic profiling, this seems barely believable: a podcast can still be listened to without any 'login' or 'cookie'. Demographic data (age, gender, etc.) are therefore generally not known, except within certain listening platforms. (For example, it is impossible to use Spotify without giving your birth date. And while you are not required to specify your gender, the question is still asked…)
But then, how do you measure a podcast's audience? There are several ways to proceed. Each will measure one of the links in the podcast chain. Each implements its own methodology, each with its advantages and disadvantages.
Measurements can thus be made:
- Directly on the hosting platform
- Between the host and the listening application – the famous 'Podcast Prefixes'
- In the listening application
- By survey.
System | Type | Source | Fee | Advantages | Disadvantages |
---|---|---|---|---|---|
IABv2 | Guidelines | On the hosting platform or between the hosting platform and listening apps | Free | Free and widely implemented | Algorithms can be improved |
IABv2 | Certification | On the hosting platform or between the hosting platform and listening apps | Paid | Globally recognized | Relatively expensive |
Hosting services | Service | On the hosting platform | Free or Paid | Directly integrated into your podcast, can follow IABv2 standard and be IABv2 certified | - |
Podcast Prefix | Service | Between the hosting platform and listening apps | Free or Paid | Free to start, IABv2 certified | You lose control of your data |
OP3 | Service | Between the hosting platform and listening apps | Free | Free, Open-Source, Open-Data, follows IABv2 guidelines | Not very well known yet |
Apple Podcasts | Service | On the listening app | Free | More detailed information | Only Apple listens |
Deezer | Service | On the listening app | Free | More detailed information | Only Deezer listens |
Spotify | Service | On the listening app | Free | More detailed information | Only Spotify listens |
Google Podcasts Manager | Service | On the listening app | Free | - | Discontinued in 2024! |
RAD | Guidelines | On the listening apps | Free | More detailed information, universal | Never implemented! |
Let's start with some bad news: the most technically advanced solution, RAD (Remote Audio Data), proposed by the American public radio network NPR, never really saw the light of day. It was boycotted by Apple and Google before it was ever implemented. It would have allowed for reliable and precise listening data. In simple terms, it was meant to insert time markers into an MP3 file (for example, at the moment an advertisement plays) and to notify a server each time a listening application passed one of these markers.
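The marker idea can be illustrated with a small Python sketch. This is not the RAD specification: the class names, the marker labels, and the `reported` list (which stands in for the notification sent to a server) are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Marker:
    label: str         # e.g. "ad-start" (hypothetical label)
    at_seconds: float  # position of the marker in the audio file


@dataclass
class PlayerSession:
    markers: list[Marker]
    reported: list[str] = field(default_factory=list)  # stands in for server calls
    position: float = 0.0

    def advance_to(self, new_position: float) -> list[str]:
        """Play up to new_position and return the markers newly passed."""
        crossed = [
            m.label
            for m in self.markers
            if self.position <= m.at_seconds < new_position
            and m.label not in self.reported  # report each marker only once
        ]
        self.reported.extend(crossed)
        self.position = new_position
        return crossed


session = PlayerSession([Marker("ad-start", 30.0), Marker("ad-end", 60.0)])
print(session.advance_to(45.0))  # the listener reaches the advertisement
print(session.advance_to(90.0))  # the listener hears it to the end
```

The key difference from download counting is that the report comes from the listening application during playback, so it reflects what was actually played rather than what was fetched.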
Regarding free third-party services that also offer a paid option, utmost caution is advised: these generally provide a basic service for free. Then, they charge for access to your own data, data that they can further capitalize on by publishing global overviews on hundreds of podcasts.
An exception should be made for OP3: although it is also a free third-party service, it has three unique features:
- There is no paid offer: the free offer gives access to all your data.
- It is open-source, meaning all its programs are freely accessible.
- It is open-data, meaning the data it generates are accessible to everyone (not just to podcasters).
In the long run, OP3 could well become a reference and provide completely new information.
As for hosts, almost all provide comprehensive statistics, often at no additional cost. Some can even aggregate their data with partial but informative data (age, gender, etc.) retrieved from listening applications (Apple Podcasts, Spotify, Deezer, etc.). Finally, it is sometimes possible to download all your raw statistics (with Castopod for instance 😉).
Note that almost all hosts follow the IABv2 standard.
This IABv2 standard, now used almost universally, deserves some explanation.
It is a free standard, freely available on the IAB Tech Lab website, implemented by many solutions. It can be implemented either by a 'Podcast Prefix', i.e., a third-party measurement service, or directly by a host (which is the case for most of them).
It can lead to certification, meaning that for a few thousand euros, the IAB will certify that the measurement tools in place comply with the standard.
The goal is to compare audiences, ensuring homogeneity in the counting method. Indeed, if you thought that one download number was just like another, think again: there are many ways to count.
Note also that the IABv2 standard was primarily designed to reassure advertisers who want to be sure that when they pay for a thousand listens, they get their money's worth.
In short, the rules to follow, as much as possible, are:
- Count only downloads that cover a sufficient duration (to ensure the advertisement could be heard). Downloads corresponding to less than one minute of audio are therefore excluded.
- Count only once the episodes downloaded multiple times by the same listener over a 24-hour period.
- Exclude all downloads made by robots.
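As an illustration, the three rules above could be sketched as follows in Python. This is a simplified reading, not the official IAB algorithm: the `Request` fields, the short list of robot keywords, the constant bitrate of 128 kbps, and the use of IP address plus application name to recognize the same listener are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Illustrative keywords only; real implementations rely on maintained lists.
BOT_HINTS = ("bot", "crawler", "spider")


@dataclass(frozen=True)
class Request:
    ip: str
    user_agent: str
    episode: str
    when: datetime
    bytes_sent: int


def min_bytes_for_one_minute(bitrate_kbps: int) -> int:
    """Bytes needed to hold one minute of audio at a constant bitrate."""
    return bitrate_kbps * 1000 // 8 * 60


def count_downloads(requests, bitrate_kbps=128, window=timedelta(hours=24)):
    threshold = min_bytes_for_one_minute(bitrate_kbps)
    last_counted = {}  # (ip, user agent, episode) -> time of last counted download
    total = 0
    for r in sorted(requests, key=lambda r: r.when):
        if any(hint in r.user_agent.lower() for hint in BOT_HINTS):
            continue  # rule 3: robots are excluded
        if r.bytes_sent < threshold:
            continue  # rule 1: less than one minute of audio
        key = (r.ip, r.user_agent, r.episode)
        previous = last_counted.get(key)
        if previous is not None and r.when - previous < window:
            continue  # rule 2: same listener, same episode, within 24 hours
        last_counted[key] = r.when
        total += 1
    return total


t0 = datetime(2024, 1, 1, 8, 0)
log = [
    Request("192.0.2.1", "PodcastApp/1.0", "ep1", t0, 5_000_000),
    Request("192.0.2.1", "PodcastApp/1.0", "ep1", t0 + timedelta(hours=2), 5_000_000),
    Request("198.51.100.7", "ExampleBot/2.1", "ep1", t0, 5_000_000),
    Request("203.0.113.5", "OtherApp/3.0", "ep1", t0, 100_000),
]
print(count_downloads(log))  # only the first request is counted
```

Out of four requests in this made-up log, a single download remains: the second is a repeat, the third comes from a robot, and the fourth is too short.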
We say 'as much as possible' because this standard was designed to work in the podcast ecosystem, which we have seen is quite restrictive. We are talking about the number of 'downloads' and not the number of 'listens'.
Moreover, unlike websites, podcast listening applications do not have cookies or other trackers: it is technically impossible to accurately identify a listener. Therefore, the listener is identified by the combination of their IP address and the listening software used (its 'user agent'). If a listener changes IP address, for example by switching from their professional connection to their personal connection, they will be seen as... two different listeners.
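A short Python sketch shows why a change of connection creates a second listener. The function name, the hashing step, and the sample addresses are assumptions for illustration; actual tools may build their identifier differently.

```python
import hashlib


def listener_key(ip: str, user_agent: str) -> str:
    """Build an anonymous identifier from the only two clues available."""
    raw = f"{ip}|{user_agent}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]


# The same person, with the same application, on two connections.
at_work = listener_key("192.0.2.10", "PodcastApp/1.0")
at_home = listener_key("198.51.100.20", "PodcastApp/1.0")

unique_listeners = {at_work, at_home}
print(len(unique_listeners))  # counted as 2 listeners
```

The opposite error also exists: several people sharing one connection and the same application would produce a single identifier.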
However, in the end, the errors don't matter as long as everyone makes the same ones: what counts is having a common reference point for comparison. Indeed, even with an ultra-accurate measurement of listens, how can we guarantee that the listener was attentive and focused at the time of the message that matters to us? This is part of the pitfalls of any measurement and is not specific to podcasts. The challenge is more about the comprehensiveness and availability of data than the way they were generated. In this respect, OP3 seems to meet all our expectations, provided this solution is widely adopted.