OpenAI is offering limited access to Voice Engine, a text-to-voice generation platform it developed. The platform can create a synthetic voice from a 15-second clip of someone's voice. The AI-generated voice can then read text prompts on command, either in the speaker's own language or in a number of other languages. In a blog post, OpenAI said these small-scale deployments are helping to inform its approach, its safeguards, and its thinking about how Voice Engine can be used beneficially across a variety of industries.
Companies with access include education technology company Age of Learning, visual storytelling platform HeyGen, leading health software maker Dimagi, AI communication app creator Livox, and health system Lifespan.
In these samples posted by OpenAI, you can hear what Age of Learning is doing with the technology: generating pre-scripted voice-over content, as well as reading out “real-time personalized responses” to students as they are written.
First, the English reference audio:
And here are three audio clips that the AI generated based on that sample.
OpenAI said it began developing Voice Engine in late 2022 and that the technology already powers the preset voices for its text-to-speech API and ChatGPT's text-to-speech functionality. In an interview with TechCrunch, Jeff Harris, a member of OpenAI's Voice Engine product team, said the model was trained on “a combination of licensed and publicly available data.” OpenAI told the publication that the model is available to only about 10 developers.
AI text-to-audio generation is a continually evolving area of generative AI. Most efforts focus on instrumental or natural sounds, while fewer focus on voice generation, in part because of the questions OpenAI raised. Names in this space include companies such as Podcastle and ElevenLabs, which offer AI voice cloning technology and tools that The Vergecast explored last year.
OpenAI said its partners agreed to abide by its usage policies, which prohibit using voice generation to impersonate a person or organization without their consent. Partners are also required to obtain the “explicit and informed consent” of the original speaker, may not build ways for individual users to create their own voices, and must disclose to listeners that the voices are AI-generated. OpenAI also adds watermarks to the audio clips to trace their origin and actively monitors how the audio is used.
OpenAI proposed several measures that it said could limit the risks surrounding such tools, including phasing out voice-based authentication for access to bank accounts, policies protecting the use of people's voices in AI, increased education about AI deepfakes, and the development of tracking systems for AI content.