Eps 1559: Audio books generated by Artificial Intelligence.

The too lazy to register an account podcast

Host image: StyleGAN neural net
Content creation: GPT-2, transformers, CTRL

Host

Daisy Shelton

Podcast Content
In essence, an audiobook is a vocal recording of a book's text, which can be listened to instead of read. Audiobook listeners select audiobooks because of the presentation of the material, but also because of the narrative. Because of the expense of the narrator and the audio production, most titles are never made into audiobooks, especially at smaller publishers, said Bryan Carroll, managing director, Indiana University Press, on rights. Nearly 1,000,000 books are published each year in the U.S., yet only about 40,000 of them are converted to audiobooks, mostly because of costs and production time.
While demand is steadily increasing, the audiobook industry faces a number of procedural challenges. Consider the challenges of producing an e-book from a printed book compared with producing an audiobook. The difficulty in meeting that demand, or exceeding it, lies in producing substantially more audiobooks. Even if it does not, if the quality of the audiobooks generated by computers is good enough, there seems little reason to make exceptions to that rule.
That is, incorporating artificial voices and eliminating human narrators would drive audiobook volume, leading to higher margins and a faster time to market. Many countries lack a developed ecosystem for producing audiobooks, so AI voices will help them produce audio content at scale. AI narration is not going to replace quality audiobooks produced by actors and vocalists who are at the peak of their skills. It really should come as no surprise, then, that the audiobook industry is looking towards AI to streamline its narration processes.
Perhaps nobody is approaching the revolution of AI narration quite like Speechki, a new audiobook-recording platform that uses synthetic AI voices to produce audiobooks in as little as 15 minutes.
DeepZen, a company in London, produces the machine voices used in audiobooks, which are based on real voices from human actors. To assist in conversion, DeepZen, which is part of NVIDIA's startup incubator program, Inception, is developing a system powered by deep learning that is capable of creating human-like, emotionally expressive full-voice recordings for books and other speech-related applications. Its AI-powered features use machine learning to convert text to audiobooks. Google is testing an automated narration service of its own, which publishers can use to create English audiobooks free of charge, using over 20 different synthetic voices.
With Google's offer, you can select one of over 35 different voice actors, send in your EPUB file, and get back audio tracks that can be edited. According to industry sources, both Apple and Amazon have worked with narrators on developing their own vocal corpora, and Google's offer is currently available in free beta. It would be cost-prohibitive to produce different versions of an audiobook featuring varying voices, but this can be achieved using AI narration. A rough audiobook can therefore be created very quickly, including with different voices for different speakers.
If you are using Google to build an audiobook with an automatic voice, there is a tool that can be used to instruct the AI narrator on how to pronounce certain words. Imagine listening to an audiobook and having a function that lets you choose the narrator's accent. As far-fetched as this sounds, the feature could come in handy when you have found an audiobook you are interested in, but the narrator has an accent that makes listening difficult.
There is one obstacle currently standing in the way of easier audiobook production with AI narration, and it comes from Amazon. This means, overall, that you can feel fairly confident that Audible is not going to sell you an AI-narrated audiobook. The biggest benefit of AI audiobook voices to publishers is the savings on human voice actors.
Every audiobook listener I know fears that the market for their preferred human voices is going to shrink. There are many episodes of the VOBoss podcast about artificially intelligent voice acting, with opinions ranging from the view that it should not be allowed, that everything sounds awful, and that AI will never communicate as well as humans, to the view that we should accept AI, license our voices, and use this tool to promote our own personal brands and earn more money. For those of us who listen to audiobooks daily, who dearly love our favorite narrators, and who essentially live and breathe audiobooks, this whole AI-voice-for-audiobooks discussion might seem like a little slap in the face, because an audiobook is such a deeply personal experience. AI voices are also still nowhere near there.
Narrators could potentially use the technology to produce an early draft of an audiobook and then polish the finished recording, the way translators do with AI-assisted translation. An intriguing use case for the Pozotron technology is creating foreign-language versions of audiobooks using the same voices as the English-language narrators. Today, Google announced yet another variant of its own.
Some publishers are looking at synthetic voices as a way to capitalize on growing audiobook demand, a healthier segment than some other parts of the book business. Audiobooks account for 10 to 15 percent of Canbury Press's per-book revenues. With Murf's realistic AI voices, you can generate emotionally rich voiceovers for your audiobooks across multiple accents, and also add polish to your narration through customisation features such as pitch, pause, emphasis, and pronunciation. You can select from 130+ voices in 20+ languages and accents, with a wide tonal range and audio tuning options, to produce professional, natural-sounding, studio-quality audiobooks within minutes, right in your own home.
When done well, voice actors infuse audiobooks with a narrative sense and emotions that complement the written words. AI also reads at a steady pace, whereas a good audiobook narrator plays with the pacing, hanging back in certain scenes while racing through others.