Crafting Mission: Generate – a Podcast About Gen AI, Built With Gen AI
When you hear about generative AI, you probably think of textual content. Maybe a smart reply in your inbox, an AI chatbot, or even automated news articles. Today we’re going to talk about something a bit different: audio.
At Mission Cloud, we've been deep at work building AI solutions for a while, and we were even invited to be a beta partner for Amazon Bedrock, AWS's native service for working with large language models. So we saw a great opportunity to showcase our expertise and use these technologies to make a podcast: a show where the voice you hear isn't exactly human... at least not all the time.
The Challenge of Recording
The first step in creating a good podcast is finding a captivating voice and an authority on the subject. Dr. Ryan Ries, our AI practice lead, was the perfect fit, not only for his 20+ years of experience in AI and machine learning but also for his commanding baritone voice. The challenge? Ryan's schedule is jam-packed from working with AWS and helping customers bring their ideas to reality. But if we could get him to speak without needing him to record each episode, we could scale this.
What am I alluding to here? Well, we've built our podcast, Mission: Generate, with entirely AI-voiced versions of Ryan and me, a "RyAIn" and "CAIsey," if you will, who speak through a combination of natural language processing (NLP) and generative AI.
To get this voice mimicry to sound good, I had to start with high-quality recordings. So I used FFmpeg to extract the audio from some professional videography we did, and refined it with Audacity to make it crisp and podcast-quality. (Generative AI advised me on how to tune EQ and adjust the right settings, despite my not being an audio engineer!) For my own voice, I recorded a laid-back conversation with our graphic designer, Krista, just discussing music and our favorite company obsession, Music League.
I came to this approach after failing to get a good sound from any other style of recording myself. It turns out that natural conversational tones are dynamic, filled with the pauses, changes in speaking rate, and emotion that help the algorithm approximate your speaking style much more effectively. So recording one side of a natural conversation can be a great starting place. To be perfectly honest, Ryan's natural monotone actually threw a bit of a wrench in our plans here (haha, sorry, Ryan!). But it’s clear that capturing as much dynamism as we can is key to ensuring the final output isn’t monotonous.
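For anyone curious what the audio-extraction step might look like, here is a minimal sketch in Python. It is not the exact command I ran: the file names, the 44.1 kHz mono WAV settings, and the helper names `build_extract_command` and `extract_audio` are all assumptions for illustration, and it presumes the `ffmpeg` binary is installed and on your PATH.

```python
import subprocess


def build_extract_command(video_path, wav_path, sample_rate=44100, channels=1):
    """Build an ffmpeg argument list that extracts a video's audio track to WAV.

    The sample rate and channel count are illustrative defaults, not the
    settings used for the podcast.
    """
    return [
        "ffmpeg",
        "-i", str(video_path),    # input video file
        "-vn",                    # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",   # uncompressed 16-bit PCM audio
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", str(channels),     # number of audio channels (1 = mono)
        str(wav_path),            # output WAV file
    ]


def extract_audio(video_path, wav_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_command(video_path, wav_path), check=True)
```

Calling `extract_audio("interview.mp4", "voice_sample.wav")` (hypothetical file names) would produce an uncompressed WAV that you can then open in Audacity for cleanup.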
Writing the Script: AI Assisted, Human Perfected
With the voices ready, the next challenge was scripting. While we used generative AI to summarize and transform technical content we’d already produced into a podcast format, the initial results felt... robotic and a little inauthentic. It wasn't quite the way Ryan talked or the level of detail we were always aiming for. Sometimes it was too jokey, other times too straightforward. So I hand-edited all the scripts myself, sprinkling in colloquialisms and natural phrasing, the "you knows" and "likes" that make speech sound human. It was a dance of sorts: generative AI would produce a working draft of content and then I'd rewrite, often replacing much of it, but using it as a kind of outline to ensure we covered what we needed.
A Blend of Past, Present, and Future
In the podcast, we dive deep into the technical subject matter, including AWS's own services for machine learning, data, and infrastructure, and also the shifting landscape of potential that is generative AI itself. Episode 1 focuses on Intelligent Document Processing (IDP), a well-known machine learning topic but one that has gained some new capabilities when combined with generative AI. Because we've been working with customers for years on their machine learning algorithms and have already built several full-fledged generative AI solutions, we get to offer a real-world builder’s perspective on all of this. So the podcast is not just a showcase of technology or folks pontificating on what might be possible. We're drawing from our own hands-on experience building with these technologies for real customers.
The unveiling at our company-wide all-hands was a blast. As Ryan’s voice boomed across our Zoom, detailing the magic of IDP, most people did not see the big reveal coming: it wasn’t Ryan speaking, but an AI. That's how convincing this technology already is. Even colleagues who'd worked with Ryan for years were surprised by the fidelity.
Here are two takeaways from this adventure for anyone thinking about building their own podcast using generative AI:
- Quality is derived from emotion: If you're going to build a podcast this way, your input recordings need to be of high quality. But more importantly, they must carry a range of emotions and dynamic cadence to give the AI the broadest range of natural sounding speech for each voice. You should expect voices will sound slightly toned down once extrapolated.
- Humans aren't just information delivery units: AI was useful for outlining scripts, but to sound natural you need to add your own warmth and authenticity to the script. It helped that I've worked closely with Ryan for over a year and can somewhat mimic his speaking style in text. Real humans don’t just barrel ahead perfectly; they make asides and guesses, and they leave room for indecision.
Generative AI is powerful, transformative, and, as you've discovered today, a reasonably human podcast host. What we also see is that, just like with many of the solutions we build for our customers, the interplay of human touch, insight, and creativity with this technology is what gives it its magic. (And yes, even this blog had a little gen AI magic sprinkled on it…)
If any of this has piqued your interest, why not give our first episode a listen here and tell us what you think? Is "RyAIn" human enough to keep your interest? Want to find out where our next episodes will go?
Stay tuned. And as we say in our outro: best of luck out there and happy building!