Introducing Octopus – the audio AI tool for content creators

Ron Jaworski
Apr 9, 2021
Updated: Nov 29, 2021

What do you call an octopus who plays guitar? A rocktopus!

There’s a perfectly good reason why that pun is my opening line. We are starting off 2021 with a modest bang and introducing our latest rock star, Octopus: an audio solution that helps content creators extend their “content arms” all across the web via the magic of audio.

Bitchin’ name aside, Octopus is our latest addition to Trinity CMS that targets two distinct groups within the media ecosystem: authors, content creators, and editors on one side and listeners on the other.

In a way, Octopus takes care of both content creation and content distribution for publishers, brands, bloggers, and the rest who want to step up their audio game and provide an even better listening experience to a growing listenership. It’s the next level of listening powered by constant advancements in audio AI. Let’s dive in.

What listeners and content creators are gaining with this next generation of audio creation

For those unfamiliar with how Trinity Player usually works: we integrate our solution into a website, it analyzes the textual content, and then it automatically provides an audio version. Users get the option to listen instead of read, and can do so while doing something else, on or off their computer or smartphone (playback continues in the background).

With Octopus, we come full circle and provide authors with a hybrid text editor within the CMS where they can:

work directly on the resulting audio content
customize it to the tiniest of details
distribute across the audio and voice landscape

With a robust solution on one end that turns all textual assets into audio and a fully customizable service on the other, we enable content creators to audiofy an article in a matter of seconds, edit it in our editor to give it a personal touch, and distribute it via custom news feeds. It doesn’t matter if the content in question was published an hour ago or 10 years ago.

Wild, right? Let’s break the magic down.

Image: The hybrid editor

Pronunciation – to-may-to, to-ma-to? It’s now the content creator’s call

You can fine-tune almost every aspect of the listening experience, starting with pronunciation. If a word or phrase doesn’t sound quite as you’d like, you can change the way it is pronounced. Some of the things you can do are:

Enter a custom pronunciation, which lets you perfect every single word, especially if it comes from a foreign language (e.g. surnames);
Spell out each digit individually;
Interpret the numerical text as a cardinal number, ordinal number, fraction, or measurement;
Spell out each letter of the text;
Interpret the text as part of a street address or as a date;
Interpret the numerical text as a 7-digit or 10-digit phone number;
Interpret the numerical text as duration in minutes and seconds;
“Beep out” the text if it contains an expletive or an otherwise inappropriate word.
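
For readers who think in markup, the options above line up closely with the say-as and phoneme elements of SSML (Speech Synthesis Markup Language), the W3C standard that many text-to-speech engines accept. The sketch below is only an illustration under that assumption: this post does not say what Octopus produces under the hood, and the option names, the INTERPRET_AS table, and the helper functions are all hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from the editor options listed above to SSML
# "interpret-as" values. That Octopus uses SSML at all is an assumption.
INTERPRET_AS = {
    "digits": "digits",        # spell out each digit individually
    "cardinal": "cardinal",
    "ordinal": "ordinal",
    "fraction": "fraction",
    "measurement": "unit",
    "letters": "characters",   # spell out each letter
    "address": "address",
    "date": "date",
    "phone": "telephone",
    "duration": "time",
    "beep": "expletive",       # "beep out" the text
}


def say_as(text: str, option: str) -> str:
    """Wrap text in a <say-as> element for one of the options above."""
    value = INTERPRET_AS[option]
    return f'<say-as interpret-as="{value}">{escape(text)}</say-as>'


def custom_pronunciation(text: str, ipa: str) -> str:
    """Wrap text in a <phoneme> element carrying an IPA transcription."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'


def speak(*parts: str) -> str:
    """Join already-marked-up parts into one <speak> document."""
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(speak(
        "Call ",
        say_as("5551234", "phone"),
        " and ask for ",
        custom_pronunciation("Jaworski", "jaˈvɔrski"),
        ".",
    ))
```

Keeping the mapping in one table means a new editor option is a one-line change, and escaping the text keeps characters such as “&” from breaking the markup.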

Empowering content creators to adjust voice styles and features

Choose the voice you feel best suits your content in terms of gender and language;
Assign different voices to different paragraphs for a more varied listening experience, for example in interviews;
Select a voice style that best suits the type of your content:

Regular: suitable for all types of content

News: particularly suitable for news content

Conversational: particularly suitable for story-like content

Explore different features for each voice such as the general speed of the content or parts of it, breathing (add breathing sounds via our automated algorithm or manually), pauses (add a pause anywhere in the text and customize the length of it), and tone (in terms of the emphasis, volume, rate, pitch, and timbre for a specific word, phrase, or sentence), among other features.

Image: Options for changing the tone of a voice

As you can see, there are loads of options to make the voice sound more natural and human. In a nutshell, you can choose the default voice on the channel level and let our AI do what it does best, while retaining the option to have small fixes as you see fit. The end result is audio content that can be personalized to a specific listener group, making it feel like it is just for them.
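
To make the pause, speed, and tone controls a little more concrete, here is a minimal sketch of how such settings are commonly expressed in SSML, using the standard break, prosody, and emphasis elements. It assumes an SSML-style representation, which this post does not confirm for Octopus; the function name and the sample sentences are made up for illustration, and breathing and timbre are left out because they are not part of core SSML.

```python
import xml.etree.ElementTree as ET


def build_paragraph(pause: str = "700ms", rate: str = "slow") -> str:
    """Build a small SSML document with emphasis, a pause, and a slower passage.

    Hypothetical helper: the element and attribute names are standard SSML,
    but nothing here is taken from Octopus itself.
    """
    speak = ET.Element("speak")

    # One sentence with a strongly emphasized phrase in the middle.
    p = ET.SubElement(speak, "p")
    p.text = "The final score was "
    strong = ET.SubElement(p, "emphasis", level="strong")
    strong.text = "three to nothing"
    strong.tail = "."

    # A pause of customizable length between the two passages.
    ET.SubElement(speak, "break", time=pause)

    # A passage read more slowly, slightly lower, and more quietly.
    aside = ET.SubElement(speak, "prosody", rate=rate, pitch="-10%", volume="soft")
    aside.text = "Nobody saw it coming."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_paragraph())
```

Building the markup as a tree rather than by string concatenation keeps the result well formed no matter what text an editor types in.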

Another benefit for the author is that this helps them add an immersive layer to the story. Because every word and every part of the content is open to changes, authors can shape what the story is about and give it more of a personality, so to speak. This is particularly useful for so-called developing stories, where modifications in real time make it possible to deliver the full story and experience.

Audio content distribution on a whole new level

Get this: all of the audio content created using Octopus will automatically be distributed in a flash briefing format. Users can invoke flash briefings on a smart speaker or any other voice-enabled device that features Alexa, Google Assistant, or Siri, or go to their favorite streaming platform and play the content from there.

At the center are Splash pages: web pages that are easy to export and embed, built to promote said content in all of its glory. Here’s an example of one such Splash page from one of McClatchy’s newsrooms, the famous Miami Herald, where listeners can subscribe to channels or simply choose between them and start listening instantly:

So how can one leverage this? For instance, journalists can create their own news feeds featuring topical information. If someone is covering a certain local team, they can create a feed and update it regularly with relevant news. A blogger can have a daily briefing on current happenings or a weekly summary of their best content. There are all kinds of possibilities to create custom-made content and have it portable and easily updatable to stay fresh and relevant.
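
For a rough idea of what a custom feed like that could look like as data, here is a small sketch of one audio item in a flash-briefing-style JSON feed. The field names follow the JSON feed format Amazon documents for Alexa Flash Briefing skills, to the best of my knowledge; whether Octopus emits exactly this shape is an assumption, and the function name and example.com URLs are placeholders.

```python
import json
import uuid
from datetime import datetime, timezone


def briefing_item(title: str, audio_url: str, article_url: str,
                  published: datetime) -> dict:
    """Describe one audio item for a flash-briefing-style feed.

    Hypothetical helper. `published` must be a timezone-aware datetime.
    """
    return {
        # A stable ID derived from the article URL, so re-publishing the
        # same story updates the item instead of duplicating it.
        "uid": str(uuid.uuid5(uuid.NAMESPACE_URL, article_url)),
        "updateDate": published.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        "mainText": "",            # left empty here because the item is audio
        "streamUrl": audio_url,    # the generated audio file
        "redirectionUrl": article_url,
    }


if __name__ == "__main__":
    item = briefing_item(
        "Local team wins the derby",
        "https://example.com/audio/derby.mp3",
        "https://example.com/sports/derby",
        datetime(2021, 4, 9, 12, 0, tzinfo=timezone.utc),
    )
    print(json.dumps([item], indent=2))
```

A reporter covering a local team would append one such item per story, and the feed stays fresh simply by being regenerated whenever something new is published.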

Voice technology is an evolving medium that offers exciting opportunities to innovate, especially when it comes to narration and storytelling, areas where smart speakers excel and are already being embraced by users across the globe. Combined with McClatchy’s cross-industry content and storytelling experience, Octopus allows the company to create individualized streams of information and further build experiences designed for eardrums.

UX gets louder – and better

Octopus represents a move toward an audio-infused publishing interface that is easy to use and highly scalable at the same time. This is something that a powerhouse such as McClatchy, which until now relied on human narration to convey nuance and emotion, wasn’t able to accomplish previously.

In providing an AI solution to its reporters, the company (and, hopefully, other publishers) gets a bunch of options to make the user experience more pleasant and accessible. In other words, it gets a better and stronger presence in the ears of listeners. Because people speak in different languages and speaking styles, the latest text-to-speech systems mimic human behavior as closely as possible to adapt to different contexts and improve the overall customer experience.

So, what Octopus brings to the table is:

The true promise of scalability, with content tailored to specific audience segments;
Full editorial control to shape the audio content at will;
A sense of exclusivity, where each publisher decides how to create and distribute content: audio-only, audio accompanying text, or audio accessible only via smart speakers, streaming platforms, or both. There are plenty of options to experiment with and come up with a winning combination;
New distribution options via integrations with leading streaming audio platforms and voice assistants;
New monetization options such as audio ads, subscription models, and more;
Reduced costs, as creating and distributing audio takes mere minutes without the cost of professional studios, narrators, producers, and so on.

As technology advances, there will be even more options to improve the listening experience on a granular level, as well as to distribute and monetize the audio content behind it. Because audio is easy to listen to whenever and wherever, it doesn’t take much for publishers to help form a new media habit and instill loyalty along the way. The simple fact is that many people prefer to listen to content, so going forward, audio will definitely be one of the answers.

That’s it for now from the Trinity Audio team and me. We are extremely proud of this product and grateful for the collaboration and support from our friends at McClatchy. You’ll be hearing more about new products from the Trinity dev lab this year, but for now, enjoy all the tentacles of Octopus.

Make sure you’re following me on Twitter for ongoing updates, tips, and industry takeaways!
