PROVIDENCE, R.I. (AP) — Before last summer, Alexis “Lexi” Bogan's voice was booming.
She loved belting out Taylor Swift and Zach Bryan ballads in the car. She was always smiling, whether she was corralling misbehaving preschoolers or debating politics with friends around the backyard fire pit. In high school, she sang soprano in the choir.
Then the voice disappeared.
In August, doctors removed a life-threatening tumor near the back of her brain. When her breathing tube was removed a month later, Bogan had trouble swallowing and strained to say “hello” to her parents. Months of rehabilitation helped her recover, but her speech impairment remains. Her friends, strangers, and her own family struggle to understand what she is trying to convey.
In April, the 21-year-old got her old voice back. It's not the real thing, but a voice clone generated by artificial intelligence that she can summon from a phone app. Trained on a 15-second time capsule of her teenage voice (sourced from a cooking demonstration video she recorded for a high school project), her synthetic but surprisingly realistic AI voice can now say almost anything she wants.
She types a few words or sentences into her phone and the app instantly reads them out loud.
“Hello, can I have a grande iced brown sugar oatmilk shaken espresso, please?” said Bogan’s AI voice as she held her phone up to her car window at the Starbucks drive-thru.
Experts warn that rapidly advancing AI voice-cloning technology could amplify phone fraud, disrupt democratic elections and violate the dignity of people, living or dead, who never consented to having their voices reproduced to say things they never said.
It has been used to produce deepfake robocalls to New Hampshire voters imitating President Joe Biden. In Maryland, authorities recently charged a high school athletic director with using AI to generate a fake audio clip of the school's principal making racist remarks.
But Bogan and a team of doctors at Lifespan Hospital Group in Rhode Island believe they've found a use that justifies the risk. Bogan is one of the first people, and the only one with her condition, to successfully recreate a lost voice using OpenAI's new Voice Engine. Other AI providers, including the startup ElevenLabs, are testing similar technology with people who have speech disorders and speech loss, including lawyers who use voice clones of themselves in court.
“We expect Lexi to be a pioneer in advancing the technology,” said Dr. Rohaid Ali, a neurosurgery resident at Brown University School of Medicine and Rhode Island Hospital. He said millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit.
“We need to be aware of the risks, but we cannot forget about the benefits to patients and society,” said Dr. Fatima Mirza, another medical resident involved in the pilot. “We were able to help give Lexi back her true voice, so she can now speak in the terms that are most true to herself.”
Mirza and Ali, who are married, caught the attention of ChatGPT maker OpenAI with their previous research project at Lifespan, which used an AI chatbot to simplify patient medical consent forms. The San Francisco company approached them earlier this year when it was looking for potential medical applications for its new AI voice generator.
Bogan was still slowly recovering from surgery. Her illness began last summer with headaches, blurred vision and a drooping face, alarming doctors at Hasbro Children's Hospital in Providence. They discovered that a vascular tumor the size of a golf ball was compressing her brainstem and was intertwined with her blood vessels and cranial nerves.
“It was a battle to control the bleeding and remove the tumor,” said pediatric neurosurgeon Dr. Konstantina Svokos.
Svokos said the location and severity of the tumor, as well as the complexity of the 10-hour surgery, damaged Bogan's tongue muscles and her control of her vocal cords, interfering with her ability to eat and speak.
“When I lost my voice, it felt like part of my identity was taken away,” Bogan said.
Her feeding tube came out this year. Her speech therapy continues and she is now able to speak intelligibly in a quiet room, but there is no sign that she will regain the full clarity of her natural voice.
“At some point, I started forgetting what my voice sounded like,” Bogan said. “I've gotten pretty used to how I sound now.”
Whenever the phone rang at her family's home in North Smithfield, a suburb of Providence, she would hand it to her mother to take the call. She felt that she was a burden to her friends every time they went to a noisy restaurant. Her father, who has hearing loss, had a hard time understanding her.
Back at the hospital, doctors were looking for a pilot patient to experiment with OpenAI's technology.
“The first person who came to Dr. Svokos' mind was Lexi,” Ali said. “We reached out to Lexi to see if she would be interested, but we didn't know how she would react. She was game to try it out and see how it would work.”
Bogan had to go back several years to find a suitable recording of her voice to “train” the AI system on how she spoke. It was a video in which she explained how to make pasta salad.
Her doctors intentionally fed the AI system just a 15-second clip. Cooking sounds make other parts of the video imperfect. It was also all that OpenAI needed, an improvement over previous technology that required longer samples.
They also knew that getting something useful out of 15 seconds could be critical for future patients who have left no trace of their voice on the internet. A short voicemail left for a relative may suffice.
When they tested it for the first time, everyone was surprised by the quality of the voice clone. Occasional glitches, such as a mispronounced word or a missing intonation, were almost imperceptible. In April, doctors equipped Bogan with a custom phone app that only she could use.
“Every time I hear her voice, it touches me so much,” said her mother, Pamela Bogan, with tears in her eyes.
“I think it's great to be able to have that sound again,” Lexi Bogan added, saying that it “boosted my confidence a little bit to where it was before all of this happened.”
She now uses the app about 40 times a day and provides feedback in hopes of helping future patients. One of her first experiments was talking to the children at the preschool where she works as a teaching assistant. She typed “Hahahaha,” expecting a robotic response. To her surprise, it sounded like her old laugh.
She has used it to ask where to find items at Target and Marshalls. It has helped her reconnect with her father. And it has made ordering fast food easier.
Bogan's doctors have begun cloning the voices of other willing Rhode Island patients and hope to bring the technology to hospitals around the world. OpenAI said it is working cautiously to expand the use of Voice Engine, which is not yet publicly available.
Many smaller AI startups already sell voice-cloning services to entertainment studios or make them more widely available. Most voice generation vendors say they prohibit impersonation and abuse, but companies vary in how they enforce their terms of use.
“We want to make sure that everyone whose voice is used in our services has given ongoing consent,” said Jeff Harris, head of product at OpenAI. “We want to make sure that it is not used in a political context, so our approach is to be very specific about who we provide the technology to.”
Harris said OpenAI's next steps include developing a secure “voice authentication” tool that would allow users to replicate only their own voice. That “could be limiting for patients like Lexi, who suddenly lose the ability to speak,” he said. “So we think we need to build a high level of trust, especially with healthcare providers, to provide a little more unfettered access to the technology.”
Bogan has impressed her doctors with her focus on how the technology could help people with similar or more severe speech impairments.
“Part of what she's done throughout this process is to think about how to tweak this and change this,” Mirza said. “She was a great inspiration to us.”
For now, she has to fiddle with her phone to get the Voice Engine to talk, but Bogan envisions an AI voice engine that improves on older remedies for speech recovery, such as the robotic-sounding electrolarynx or a voice prosthesis, by melding with the human body or translating words in real time.
She is less sure about what will happen as she grows older and her AI voice continues to sound the way she did as a teenager. Perhaps the technology could allow her AI voice to “age,” she said.
For now, she said, “I don't have my voice completely back, but there are things that are helping me find my voice again.”
___
The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI to access part of AP's text archives.