2021 State of audio AI consumption report
Exciting stuff - our yearly audio AI consumption report is here!
The superpower of sound and listening started at the dawn of humankind and oral stories. Today, it continues in new ways - with radio, podcasts, and of course - audio AI.
Here at Trinity Audio, we are on a mission to help every piece of amazing original content reach as many people as possible. We do it through the use of audio AI technology, which allows us to offer smart audio experiences in 125 languages and accents, and with over 600 natural-sounding voices.
There was a lot of listening going around in 2021 so we’ve dug deep into our data to show you the ongoing audio AI listening habits across the globe.
Based on data collected from over 100 publishers in 49 countries that feature our audio solutions, our main findings for 2021 are:
Audio content is engaging: almost half of all listeners listened to the entire content, with the completion rate clocking in at an impressive 48.87%.
People embrace audio ads: audio ads had a whopping LTR of 96%, once again proving that listeners are extremely receptive to audio advertising and listen to entire ads during their audio experience.
Most listening is done in afternoon hours: mid to late work hours are the peak listening times, with the largest amount of playback happening around 15:00.
There is high engagement in English-speaking places: listeners in countries where English is a primary or secondary language, such as the USA, Jamaica, and the Scandinavian countries, engage the most via eardrums.
Right before we dive in, here’s a quick glossary of sorts that will help you navigate smoothly.
Player load - the instance in which an audio player is loaded on a page, making it possible to play the audiofied content.
CTR or click-through rate - the percentage of times audiofied content was played, as a subset of the times the audio player was loaded.
LTR or listen-through rate - the average percentage of the audiofied content that was listened to.
Completion rate - the percentage of the audiofied content that was listened to all the way through, from start to finish.
Pre-roll ad - an ad that is inserted before the start of the content.
Mid-roll ad - an ad that is inserted during the content, usually around the middle.
Trinity Player - our audio AI player that converts textual content into audio in just a few clicks.
Trinity Pulse - a separate unit that displays bursts of respective top trending audio content alongside popular podcasts and radio shows from across the web.
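For readers who like to see definitions as arithmetic, here is a minimal Python sketch of how the three rates in the glossary could be computed. It is an illustration only, not our production pipeline: the Playback record, its field names, and the choice to count a play as completed once the listened time reaches the content length are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Playback:
    """One audio play (hypothetical record shape, assumed for this sketch)."""
    content_seconds: float   # length of the audio version
    listened_seconds: float  # how much of it was actually heard


def click_through_rate(player_loads: int, playbacks: int) -> float:
    """CTR: playbacks as a percentage of player loads."""
    return 100.0 * playbacks / player_loads if player_loads else 0.0


def listen_through_rate(plays: List[Playback]) -> float:
    """LTR: average percentage of the content that was listened to."""
    if not plays:
        return 0.0
    # Cap each play at 100% so replays or seeking cannot inflate the average.
    fractions = [min(p.listened_seconds / p.content_seconds, 1.0) for p in plays]
    return 100.0 * sum(fractions) / len(fractions)


def completion_rate(plays: List[Playback]) -> float:
    """Completion rate: percentage of plays listened to all the way through."""
    if not plays:
        return 0.0
    completed = sum(1 for p in plays if p.listened_seconds >= p.content_seconds)
    return 100.0 * completed / len(plays)


# Made-up sample plays, purely to exercise the functions above.
sample_plays = [
    Playback(content_seconds=100, listened_seconds=100),
    Playback(content_seconds=200, listened_seconds=100),
    Playback(content_seconds=100, listened_seconds=25),
    Playback(content_seconds=50, listened_seconds=50),
]
```

Note that LTR and completion rate answer different questions: a play that stops at the halfway mark still contributes 50% to LTR but nothing to the completion rate, which is why LTR is normally the higher of the two.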
Now, let’s get into the nitty-gritty.
Engagement with audio AI content is off the charts
Arguably the big news here is that people really dig audio versions of content narrated by AI voices.
Our audio units (Trinity Player and Trinity Pulse) were loaded slightly more than four billion times, automatically generating audio content.
The click-through rate (CTR) was 2.03%, or 79.3 million audio playbacks. From there, 48.87% of our text-to-speech audio versions of news articles and blog posts were listened to from start to finish.
Furthermore, there was a listen-through rate (LTR) of 69.14%, which means that once a user clicked to listen, they had clear intent - more than two-thirds of the audio content was listened to on average.
All of this clearly suggests two things.
The first one is that human ears have become not only tolerant of “mechanical” voices but comfortable with them.
Powered by neural technology, we live in the age of the most natural and human-like synthesized voices ever developed. The advantage of neural text-to-speech is learning from training data, which results in smoother speech with no audibly stringed units of sound, proper rhythm, and intonation of the voice depending on the intended use case - if the context is conversational or informational.
This type of synthesized speech has seamless transitions, whether it’s a natural pause between paragraphs, the switch in the dialogue between different characters, and so on.
Secondly, the high LTR and completion rates show that the ability to consume content via audio is serving a market need and that audio AI technology has become very important in providing the best content experience.
Mobile wins the usage battle against desktop
In what is perhaps the most unsurprising finding of the report, digital audio listening occurs primarily through mobile devices. They are a major factor in digital audio’s growth, driving the steady rise in the share of time spent listening.
Our data shows the same: 71.1% of listening was done via mobile, while the remaining 28.9% falls on desktop devices.
With wearable tech such as earbuds and smartwatches among the largest and fastest-growing segments in the IoT market, we expect mobile listening to gain an even greater share this year.
Times when ears are most occupied
In terms of when listeners are tuning in the most, 14:00 to 17:00 local time are the peak hours, registering slightly above 10k playbacks per hour each day.
Afternoon work hours generally had a steady listening rate, followed by evening hours between 21:00 and 23:00 as the time of day when listeners sought out audio content.
Taking everything into account, listening is fairly evenly spread throughout the day. On average, there are almost a million more plays at the peak hours compared to 7:00 and 8:00 AM, which registered the lowest listening rates.
What’s interesting is that morning hours between 6 AM and 9 AM have the highest LTR at 73%, despite having the fewest content plays. In comparison, peak listening hours had a 67% LTR, which is slightly below average.
The point here is that listeners have formed a specific habit and are alternating between light and heavy listening in their daily media diet.
Since this is a yearly review, here’s some anecdotal data about monthly listening.
Curiously, most listening in 2021 happened during late fall and winter. December, November, and January were the busiest listening months of the year, in contrast to June and July.
This is likely because the change in seasons affects our mood, bringing different physical and psychological circumstances, emotions, and challenges. The carefree days of summer, when school is out and vacation mode is on, mean less listening, and they are slowly replaced by shorter days and colder temperatures.
In many places around the globe, that shift in temperature and light marks a time for reflection, which often leads to more focused listening due to audio’s highly personal and emotional nature.
Countries with the most audio-hungry listeners
For this, we’ve looked at countries that registered at least 100,000 content plays throughout the year.
Out of 37 such countries, listeners in Jamaica were by far most keen to play the audio version of the content they were consuming, registering a 7.15% CTR. Denmark and the US are next with 3.75% and 3.04%, respectively.
In terms of LTR, Denmark leads the list with 79.97%, followed by its neighbor Sweden at 77.46%. Israel is in third place with a 73.61% LTR, just ahead of the United States, where the highest number of content plays was registered.
Looking at the completion rates, things are vastly different. Ghana had the highest completion rate with 58.57% of all audio content being listened to from start to finish. Next up was Sweden with 54.77% and Israel with 54.18%, with the United States and Poland rounding off the top five.
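For the curious, the ranking approach described above (keep only countries above the plays threshold, then sort by the metric of interest) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not our actual reporting code: the function name, the row layout, and every value in the sample rows are hypothetical placeholders. Only the 100,000-play threshold comes from this report.

```python
from typing import Dict, List

MIN_PLAYS = 100_000  # threshold used in this report


def rank_countries(stats: List[Dict], metric: str,
                   min_plays: int = MIN_PLAYS) -> List[Dict]:
    """Drop countries below the plays threshold, then sort by metric, highest first."""
    eligible = [row for row in stats if row["plays"] >= min_plays]
    return sorted(eligible, key=lambda row: row[metric], reverse=True)


# Hypothetical rows with invented numbers, used only to show the mechanics.
sample_stats = [
    {"country": "Country A", "plays": 250_000, "ctr": 2.1, "ltr": 70.0, "completion": 50.0},
    {"country": "Country B", "plays": 90_000, "ctr": 9.0, "ltr": 80.0, "completion": 60.0},
    {"country": "Country C", "plays": 1_200_000, "ctr": 3.0, "ltr": 65.0, "completion": 55.0},
]

top_by_ctr = [row["country"] for row in rank_countries(sample_stats, "ctr")]
top_by_ltr = [row["country"] for row in rank_countries(sample_stats, "ltr")]
```

In this made-up sample, Country B has the best numbers on every metric but is excluded because it falls below the threshold, which is the point of the cutoff: a handful of plays should not be able to top the list. The sample also shows how the order can change from one metric to another.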
As data suggests, listening is a truly global pastime, covering almost every continent (we are still waiting on data from Antarctica, though). It can be said that audiences have developed listening habits, whether it’s for information or entertainment purposes, that are now being actively maintained.
The fact that there is a high engagement in countries where English isn’t the native language - but people still listen to English-based content - supports this.
The majority of this behavior was observed in traffic from Europe, most notably Scandinavia and Israel where English has widespread use in education and society in general.
One conclusion is that having an audio solution is relevant in countries where audiences are no strangers to English but still have some difficulty reading a full article - comparable to having English subtitles alongside the original English audio to make it easier to follow.
Audio advertising works
The narrative surrounding audio ads is that listeners are generally more open to them and more receptive in comparison to viewers.
But can a screenless link between a brand or a business and its audience be monetized?
Our data provides a resounding ‘yes’ as an answer.
Thanks to our programmatic advertising capabilities, every piece of audio content was accompanied by an audio ad, dynamically inserted in pre-roll (before audio content) and mid-roll (during a section in the relevant place within the content) positions.
In 2021, the average ad LTR was a whopping 96%, or 105.8 million ads listened to.
In other words, almost everyone who listened to the content in question also listened to an advertisement in full. This goes to show just how resilient the audio advertising market is, and how rare it is to see and hear such high engagement rates.
Breaking ad consumption down by content length, it appears that whether the content is long or short, ads are listened to at very high listen-through rates - no less than 94%!
Hence, audio ads are becoming a more important part of the media strategy as they allow the evolution of marketing and advertising strategies in a natural way. They offer a unique ability to reach a highly targetable and mobile audience in a brand-safe environment and fill otherwise untouchable voids in the user’s buying journey - places where screens are simply not an option.
It’s worth noting that in October, we launched the first-ever audio AI ad campaign in the United States with our partners McClatchy - native audio ads entirely driven by artificial voices. The campaign featured both pre-roll and mid-roll spots alongside automatically generated audio content of a story, and the results were equally great:
pre-roll ads had an over 99% completion rate
mid-roll ads had a completion rate of 75%
Because both the ad and content are generated and voiced by AI, the audio quality is the same throughout the entire length of playback. This was a major factor for high completion rates as it creates a similar feeling to podcasting where the host is reading the advertising sponsorship of their podcast.
In the true native sense, ads are integrated into the listener’s audio experience.
So why is all of this a big deal?
With all the innovation that’s happening, constant connection to the Internet, and the proliferation of IoT in our lives, users are ever closer to being online 24/7.
As a medium, audio represents an escape from all the ocular bombardment but more importantly - it’s experienced as something that completely occupies everyday life.
With content coming at us from every angle, all day, people continue to uncover all that audio has to offer.
Convenience of use
We could go on and on.
The point is: our visual attention is maxed out.
Whether you’re a publisher, content creator, or advertiser, audio is a trusted and versatile medium that offers magnificent benefits.
The fact that you can dial in the AI and ML to cater to diverse and multilingual audiences provides two key things:
Flexibility in how listeners can consume content
The ability for anyone to offer an audio experience at scale
It is our hope that this quick little report will open your eyes and ears to a massive opportunity.
Starting with January 1st and ending with December 31st, 2021, data was compiled from 100 publications in 49 countries across the globe. The audio player was loaded slightly over 4 billion times during the period, with a total of 79.3 million audio playbacks, serving approximately 112 million users*.
*user data is calculated based on internal numbers and does not correlate with commonly used web methods