Main Highlights:
- AI-powered speech and video enable substantially quicker, easier, and less expensive content creation, not to mention multilingual adaptability.
- These technologies are also incredibly accessible via text-to-voice services, which means that nearly anybody can use AI to create content without needing a studio or a large amount of specialized equipment, leading to a boom in demand in the entertainment industry.
- Clear rules for digital identity protect individuals and enterprises alike as highly disruptive AI-powered virtual identities become increasingly common.
With the significant advancements in AI/ML technology deployment, one of the most fascinating, contentious, and fast-growing developments concerns the human voice. One particular instance immediately comes to mind as capturing the plethora of challenges and emotions associated with AI-powered representations.
Last summer, artificial intelligence technology was used to voice portions of Anthony Bourdain’s writing. He never spoke or read these words aloud, but voice cloning technology brought the text to life in Roadrunner: A Film About Anthony Bourdain. Some audience members felt they had been fooled into believing the voice was Bourdain’s own. Others thought the move was a misstep because Bourdain was not alive to give permission for his voice to be used in this manner, while still others felt it was simply a clever storytelling device.
The Bourdain example illustrates two critical questions that will shape how AI-based speech technologies are used in the future. First, who owns a voice and controls how it is used, both now and later? Second, is it morally acceptable for someone’s voice to be used in the public realm after their death, when they have no influence over how it is used or what is said?
These concerns have arisen as AI-based speech technology begins to gain traction; much time and money have been spent on research and development to make machine-generated voices seem “real.” They can now transmit the emotions, texture, and cadence associated with human speech, the natural rise and fall, and various other distinguishing indicators (not to mention song). This is game-changing since it has become more difficult for listeners to distinguish between human and computer speech.
As such, we’ve reached a tipping point in the evolution of the technology, when we must establish fundamental standards and build guardrails, or else, as with so many previous breakthroughs, speech technology’s applications will be utilized in ways they were never intended.
Digital identity ownership
We’ve evolved into a global community hungry for great content experiences, whether through movies, television, and streaming services or through user-generated content on platforms like YouTube and TikTok. And, in the not-too-distant future, the metaverse will introduce even more novel modes of engagement with content. All of these paths create significant prospects for AI-driven speech and video.
AI-powered voice and video make content creation significantly faster, simpler, and less expensive, not to mention adaptable to several languages. These technologies are also highly accessible via text-to-voice services, which means that virtually anybody can harness AI for content production without needing a studio or a lot of specialized equipment, resulting in a surge in demand in the entertainment business.
At the same time, there is much worry about owning and monetizing one’s virtual identity. In a world rife with deep fakes, misrepresentation, and identity theft, it’s natural for people to ask what happens if their digital identity is co-opted for someone else’s gain. Not only would the individual lose control over how their likeness is used, along with any revenue or brand recognition connected with it, but it might also be used in improper, even criminal, ways. Or so the thinking goes.
This is quite improbable, however. Each human voice, and each human face, has its own distinct imprint, made up of tens of thousands to millions of traits. With improved fraud detection and control systems in development, it should be fairly simple to defend AI-powered identities. What is more difficult is managing that digital identity over time. That becomes less about business and more about a series of intricately linked ethical judgments.
The ethics of virtual representation and artificial intelligence-assisted identities
Was it permissible for the director to use Bourdain’s digital voice in his film? The director secured authorization to use the AI-cloned voice to deliver the words in question, but from whom? Who ultimately has the final say?
Similarly, the AI-powered voice of renowned South Korean folk-rock singer Kim Kwang-Seok was recently used to release a new song. Although the musician has been dead for 25 years, a broadcasting firm negotiated a contract with his family to use artificial intelligence to clone his voice and apply it to something new, much to the public’s delight. There are several such instances of entertainment firms and content producers attempting to bring back the voice and likeness of celebrities for concerts or films. However, is it ethically acceptable?
On the surface, it appears to be a straightforward matter that can be resolved through licensing agreements and contracts with the entertainer’s estate or, preferably, with the artist while still alive. As the practice becomes more widespread, we should expect to see wills include a clause covering a person’s name, image, voice, or likeness, particularly one that sets out their posthumous wishes or appoints a manager to oversee the career of their virtual self, much like the business manager they have in life.
Virtual identities are not reserved for celebrities alone
While it is understandable for superstars to explore such content and management partnerships, what about the average person? What about those who mourn the loss of a loved one, such as one mother who lost her young daughter to an illness? The mother met an avatar of her daughter in a virtual reality setting, where the two reportedly visited a version of paradise and held a birthday celebration.
While the moment was significant to the young mother and her family, the interaction was not genuine. Some businesses and customers are opposed to providing such experiences because they alter the child’s look and personality. Others see an opportunity to bring solace and closure to bereaved families.
And how about developing new educational virtual experiences, such as the award-winning Interactive Holograms: Survivor Stories Experience? When students and communities debate whether the Holocaust occurred or what it meant to be an actual Nazi, isn’t there room for such technology to be used for good? What are the acceptable limits of creative license?
Embracing an AI-enabled future with AI-enabled identities
There are no simple answers for virtual or artificial intelligence-powered identities. We are on the verge of a paradigm shift in content production, where celebrities and everyday people alike will soon be challenged to consider how their voice and image may be utilized not only today but long after they are gone.
Virtual identity will evolve into an asset comparable to a person’s physical assets, one for which individuals may declare their intentions in life and after death and select managers and executors to authorize its future use. This may seem far-fetched, yet neither synthetic voices nor avatars age. With the mainstreaming of the metaverse, our virtual selves can survive far longer than our physical identities.
It will become a new requirement for everyone to establish and clearly define criteria for their digital identity. Similarly, firms that provide platforms for developing AI-powered voice and video content must establish explicit regulations for adopting and using a specific virtual AI-powered identity. Thus, both individuals and businesses are protected from falling down a slippery slope as highly disruptive AI-powered virtual identities grow more prevalent.