How artificial intelligence helped Val Kilmer

Yes, technology is the answer! Val Kilmer can speak again, and not just in the new movie “Top Gun: Maverick”.

“My voice, as I knew it, was taken away from me. People around me now struggle to understand me when I speak,” American actor Val Kilmer said in a video shared on YouTube late last year. Operated on for throat cancer in 2015, the actor almost completely lost his voice, a voice that surviving members of The Doors admitted they mistook for that of their late frontman, Jim Morrison, when they heard him sing in Oliver Stone’s 1991 film. Years followed in which he struggled to communicate with those around him and feared, at every turn, that his career was finally over. Then artificial intelligence (AI) helped him be heard and understood again.

A Photoshop for voice

Let’s face it: even the most famous artificial voices based on those of real people, such as Siri or Alexa, have always sounded, and still sound, fake. In the meantime, though, the infamous deepfakes have emerged (the generic name for seemingly real audio or video recordings, created with AI, in which someone appears to say or do something they never actually said or did), and digitally generated voices have become more credible and more natural.

Without going into too much technical detail, behind every successful deepfake are algorithms created and trained to manipulate human faces and voices. The algorithm behind a deepfake video, for example, overlays the movements and words of a real person (A) onto those of the faked character (B). In essence, the artificial intelligence generates a new video in which B moves and speaks just like reference A. The more video and audio recordings the algorithm learns from, the harder the result is to identify as fake.
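
For readers curious what “learning from recordings” looks like in practice, here is a deliberately tiny sketch in Python (PyTorch) of the underlying idea: a small network is trained to map acoustic frames of a source speaker onto a target speaker. The network shape, the sizes and the random stand-in data are all illustrative assumptions, not Sonantic’s or any specific deepfake tool’s actual method.

```python
# Toy voice-conversion sketch: learn a mapping from speaker A's acoustic
# frames to speaker B's. Data here is random noise, purely for illustration.
import torch
import torch.nn as nn

N_MELS = 80  # mel-spectrogram bins, a common choice in speech models

converter = nn.Sequential(           # tiny frame-to-frame conversion network
    nn.Linear(N_MELS, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N_MELS),
)

optimizer = torch.optim.Adam(converter.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

# Stand-ins for time-aligned spectrogram frames of speakers A and B.
frames_a = torch.rand(1024, N_MELS)
frames_b = torch.rand(1024, N_MELS)

for step in range(100):
    pred_b = converter(frames_a)      # B-style frames predicted from A
    loss = loss_fn(pred_b, frames_b)  # distance from the real B frames
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The more (and more varied) paired data the model sees, the closer its output
# gets to the target speaker, and the harder the fake becomes to spot.
```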

To speak again, Kilmer worked with the British start-up Sonantic, the creator of a text-to-speech solution that John Flynn, the company’s co-founder and chief technology officer, describes as a “Photoshop for voice,” able to express subtleties such as teasing or flirting. The key element, company officials say, “is the incorporation of non-speech sounds into the audio stream, by training AI models to recreate those small intakes of breath that give real speech the imprint of biological authenticity.” The platform lets users adjust the pacing of lines, try out different emotions and change the inflection of sentences by altering the pitch of each spoken word.
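
Sonantic’s editing tools are proprietary and not documented here, but the W3C SSML standard, supported by many commercial text-to-speech engines, exposes the same kinds of controls the article describes: pacing, pauses and per-word pitch. The minimal Python sketch below simply builds such markup; the function name, tag values and sample line are illustrative assumptions, not Sonantic’s API.

```python
# Minimal sketch: wrap each word of a line in SSML prosody tags so a generic
# TTS engine can vary pitch per word and pacing per line.

def line_to_ssml(words_with_pitch, rate="medium", pause_ms=120):
    """Build an SSML string from (word, pitch_shift) pairs."""
    parts = [
        f'<prosody pitch="{pitch}">{word}</prosody>'
        for word, pitch in words_with_pitch
    ]
    body = f'<break time="{pause_ms}ms"/>'.join(parts)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'

# A teasing read might slow the line down and lift the pitch of the last word.
print(line_to_ssml([("You", "+0%"), ("again", "+0%"), ("huh", "+15%")],
                   rate="slow"))
```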

“Val wanted us to help him digitally reconstruct his voice so that he could keep creating. Which we did,” says co-founder Zeena Qureshi, who is also Sonantic’s chief executive officer. Usually, when the company creates a voice model with an actor, it records the actor reading prepared scripts. Those recordings are then uploaded to the company’s voice engine, which uses them to train the AI model. Kilmer’s case, Sonantic explains in a post on the company’s blog, was a bit more complex and involved “more manual labor.”

The first step was to gather old audio recordings of the actor’s voice, which were “cleaned” of any trace of background noise. Even so, the data obtained this way was too sparse for the algorithm to reproduce Kilmer’s natural way of speaking, and new algorithms had to be created and deployed, with whose help Sonantic ultimately generated more than 40 different voice models, including the one used in “Top Gun: Maverick.”
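
Sonantic has not published its pipeline, but the “cleaning” step described above can be approximated with open-source tools. The sketch below assumes the librosa, noisereduce and soundfile Python libraries; the file names are hypothetical placeholders.

```python
# Rough sketch of audio clean-up before training a voice model: load archival
# recordings, trim silence and apply spectral-gating noise reduction.
import librosa
import noisereduce as nr
import soundfile as sf

SOURCE_FILES = ["archive_interview_1997.wav", "audiobook_take_03.wav"]  # placeholders

for path in SOURCE_FILES:
    audio, sr = librosa.load(path, sr=22050)            # load and resample
    audio, _ = librosa.effects.trim(audio, top_db=30)   # drop leading/trailing silence
    cleaned = nr.reduce_noise(y=audio, sr=sr)           # reduce background noise
    sf.write(path.replace(".wav", "_clean.wav"), cleaned, sr)
```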

Speaking from beyond the grave

A sequel to the 1986 blockbuster, “Top Gun: Maverick” has shattered expectations since its theatrical release on May 27, taking in nearly $127 million in its opening weekend in the United States. And for many ticket buyers, the return of Val Kilmer as Tom “Iceman” Kazansky was an undeniable highlight. But his appearance had long been a big question mark. In the end, the writers intertwined Kilmer’s story with that of his character: Iceman, too, has throat cancer and communicates mainly through typed messages. But he also speaks to Cruise, in the unmistakable voice he had before the illness.

As for the AI model created for Kilmer, John Flynn says the actor will be able to use it in both his professional and personal life. “Kilmer can take part in TV or film productions, with his voice recordings produced through the Sonantic application. He can license those recordings to various productions and studios. The voice model can also help him communicate in everyday life, as a personalized replacement for robotic speech-generation devices,” Flynn explains.

And this is not the first time AI has been used to create an artificial voice for a real person. Film production companies and dubbing studios have used AI models in movies and TV shows to produce voices for the same actor at different ages, as well as to bring back the voices of dead celebrities.

In the documentary “Roadrunner: A Film About Anthony Bourdain” (2021), for example, the AI-generated voice of the well-known chef, who took his own life in 2018, is heard “reading” an excerpt from an e-mail he sent to his friend, the artist David Choe. The message is real (“My life is kind of shit now. You’re successful, I’m successful, and I’m wondering, are you happy?”), but director Morgan Neville was accused of using a deepfake to produce an audio recording of a sentence Bourdain never actually spoke aloud. “I used a modern storytelling technique at one point in the story when I felt it was important to bring Tony’s words to life,” Neville said.

Mark Hamill, the actor who played Luke Skywalker in the “Star Wars” films, is still alive and perfectly able to act. Even so, Disney preferred to use an AI algorithm from a Ukrainian company to reproduce the voice of his youth and give it to a CGI (computer-generated imagery) alter ego four decades younger, for a cameo in the sixth episode of the new series “The Book of Boba Fett.”

Sonantic has also worked with other actors, but prefers not to reveal their names. Active for four years, the company works mainly with game studios such as Obsidian Entertainment and Remedy Entertainment, and often licenses its artificial-voice service to studios, allowing them to edit synthetic voices to get natural tones and inflections for a particular exchange of dialogue or a particular scene.

Much ado about a lot of money

With obvious potential to help people with speech difficulties, as in Kilmer’s case, the new technology also raises legal, ethical and economic questions and fears, especially among voice actors, who worry about losing their livelihoods. Deepfake algorithms have also been used by anti-fake-news campaigners to make videos of politicians such as Donald Trump and Barack Obama, precisely to highlight the dangers of a technology that can make people appear to say things they never said.

“As an actor, I decide whether or not to stand behind the content of the lines I read. It would be devastating for any actor to know that his voice is out there somewhere, saying things he might not necessarily endorse,” says Jay Britton, a voice actor who plays animated characters in the Netflix children’s series “Tit! Tit! Matei Mașinescu” and in a long list of video games.

Many actors are increasingly worried that they will no longer be paid fairly, or that they will lose control of their voice, which is their brand and their reputation, says a spokesperson for SAG-AFTRA, the union representing voice actors in the United States. Such fears have already been at the heart of a lawsuit against TikTok, filed by Canadian actress Bev Standing after the Chinese platform included a synthetic copy of her voice in its app without asking her permission. The two sides reached a financial settlement, but its terms were not made public.

Standing’s experience echoes that of Susan Bennett, the original voice of the Siri virtual assistant, which was first developed by the company of the same name and then bought by Apple in a deal market analysts estimated at over 200 million dollars. Bennett was paid for the recordings behind Siri’s voice, but they were made for another software maker, ScanSoft. Apple, she now says, used her voice without announcing it, without paying her, without any agreement and without even admitting that the voice is hers.

Sonantic insists that its algorithm was not created to replace actors. The company’s website claims it can “reduce production times from months to minutes,” promising “convincing, realistic performances with expressive AI-generated voices, for games and movies,” while admitting that all of this could reduce the number of paid hours human actors spend in recording studios.

From a legislative point of view, there is no express provision prohibiting technology companies from generating synthetic voices. There is, however, a general framework for discouraging those who want to profit from a resemblance to a celebrity. In a 1990s case of voice theft, singer Tom Waits sued US snack maker Frito-Lay for using a sound-alike voice in an ad and received $2.6 million in damages.

But things remain in an area with more than fifty shades of gray. “If a company reproduces well-known voices without permission, it may violate their right to privacy and risk a lawsuit. Doing it as a parody or an artistic routine is not a violation. Doing it for commercial purposes is,” says lawyer Peter Raymond of Reed Smith in New York, who specializes in intellectual property and copyright.

But could it violate laws other than copyright law? In the United States, for example, the legislation in force is quite complicated and differs from state to state, lawyer Mitchell Schuster, a partner at Meister Seelig & Fein LLP, explained for Fortune. States such as California also recognize a celebrity’s post-mortem rights to their likeness (the use of their name, voice and so on), while under New York law those rights cease at the moment of death.

Moreover, in California the entertainment industry has begun lobbying to update the legislation so that it also protects against deepfakes. Several other US states have recently passed laws targeting non-consensual deepfake pornography and deepfakes that could interfere with the electoral process.

Far from all this tumult, Val Kilmer is content: “Sonantic restored my voice in a way I never imagined possible. The chance to tell my story, in a voice that feels authentic and familiar, is an incredibly special gift.”

This article appeared in issue 142 of . magazine.

PHOTO: Getty