How To Digitally Recreate Darth Vader’s Voice From A War Zone
11:54 minutes
James Earl Jones played Darth Vader for 45 years. But this September, he officially stepped down from the role. Fear not, Star Wars fans—the villain isn’t gone for good. Instead, the filmmakers have teamed up with the Ukrainian AI company Respeecher to recreate his voice.
Respeecher can convert one person’s speech into the voice of another. The company’s work has appeared in the Star Wars canon already, as Young Luke Skywalker in “The Mandalorian” and “The Book of Boba Fett.” And just last month, they debuted their Darth Vader mimic in the T.V. show “Obi-Wan Kenobi.”
They always knew that it would be challenging to recreate Vader’s iconic voice. But their job got a whole lot harder when Russian troops invaded their nation.
Respeecher chief technology officer Dmytro Bielievtsov and sound engineer Bogdan Belyaev join guest host Kathleen Davis to talk about their work.
Dmytro Bielievtsov is Chief Technology Officer at Respeecher in Kyiv, Ukraine.
Bogdan Belyaev is a sound engineer at Respeecher in Lviv, Ukraine.
KATHERINE WU: This is Science Friday. And I’m Katherine Wu.
KATHLEEN DAVIS: And I’m Kathleen Davis. I don’t know about you, but when I think of famous cinematic moments, one of the first that comes to mind is Darth Vader’s signature line.
DARTH VADER: I am your father.
[BREATHING HEAVY]
KATHLEEN DAVIS: James Earl Jones voiced the Star Wars villain for 45 years until last month when he officially stepped down. If you’re wondering who’s replacing him, well, that’s complicated. It’s not an actor, but an AI mimic, one that recreates James Earl Jones’ voice from nearly 50 years ago.
The Star Wars filmmakers teamed up with a Ukrainian AI company called Respeecher. Respeecher converts performances from one actor into the voice of another. Take a listen.
JAMES: Hey, this is James. I’m an actor, and this is my real voice. In this video, I’ll demonstrate a couple of new features of Respeecher’s speech-to-speech voice-cloning engine. Obviously, I can speak with another person’s voice. But notice that now the sound is 44.1 kilohertz.
KATHLEEN DAVIS: The company’s work has appeared in the Star Wars canon already as young Luke Skywalker in The Mandalorian and The Book of Boba Fett. And just last month, they debuted their recreation of Darth Vader in the TV show Obi-Wan Kenobi. Check this out.
[BREATHING DEEP]
DARTH VADER: I am what you made me.
KATHLEEN DAVIS: Pretty uncanny, right? The company’s path to the big screen has not been easy. Sure, they knew it would be hard to make a perfect mimic of such a legendary actor, but what they didn’t expect was to have to do so under air raids and gunfire as Russian troops invaded their nation. Joining me to talk about all of this are my guests, Dmytro Bielievtsov, chief technology officer at Respeecher, based in Kyiv, Ukraine, and Bogdan Belyaev, sound engineer at Respeecher, based in Lviv, Ukraine. Welcome both of you to Science Friday.
DMYTRO BIELIEVTSOV: Yeah, thanks for having me.
BOGDAN BELYAEV: Yeah, hello.
KATHLEEN DAVIS: Bogdan, let me start with you. Are you a Star Wars fan?
BOGDAN BELYAEV: I’m a fan from my childhood. When I was a kid, I watched the second episode of Star Wars. I didn’t start from the first one. After the premiere, I was into it, let’s say.
KATHLEEN DAVIS: Yeah, I imagine it would be really exciting to learn that you would be working on the Star Wars shows then?
BOGDAN BELYAEV: Oh yeah. I wouldn’t believe it if I knew that, you know?
KATHLEEN DAVIS: So what materials do you need to recreate a voice?
BOGDAN BELYAEV: Basically, we need recordings for the source speaker, it’s like the actor who is going to be converted into the target voice. And the target recordings, which are recordings of the voice that we are going to convert to.
KATHLEEN DAVIS: And where do you actually get these target voice samples from?
DMYTRO BIELIEVTSOV: Yeah, that really depends on the situation. So there are projects where we could just take ideal recordings like from ADR if we’re working on a movie. But if we’re working on a historical character, or someone whose young voice we want to make instead of their current voice, well, in that case, we’ll have to go back and look for old recordings. And for the big movie projects, usually the client would have some internal archival recordings that we would end up using.
KATHLEEN DAVIS: And how much tape is enough to build a good replica of a person?
DMYTRO BIELIEVTSOV: Yeah, that really depends on how good the data is. If we have great ADR recordings, then we would be totally fine with 20 to 30 minutes of recordings. But in practice, especially with these characters that we don’t have good homogeneous recordings of, like for those cases, we would have to use as much data as we can. And like an hour or two hours would be great in these cases because some data is corrupted, some data has some noises, but we could still kind of pick out a half an hour of good material out of it.
KATHLEEN DAVIS: OK, so you’re using this tape to build a model of a target voice, and you have the performance of an actor whose voice you want to change. What aspects of speech does this model retain from the original source performance, and what might get changed during that conversion?
DMYTRO BIELIEVTSOV: We take content, we keep the content, and we keep the performance, like the intonations, the level of arousal, whether the voice is whispering or half whispering, or the projection. So we take all that from the source, and then we replace their vocal apparatus in a way, and we change the timbre. Also, we change slight phonetic kind of habits. So when someone tends to have a very peculiar S or F, the network would replace that. So you don’t need to actually try to mimic that as an actor.
KATHLEEN DAVIS: So it sounds like you still need, at the core of this performance, a great performance. You still need that source actor to put on a show, right?
DMYTRO BIELIEVTSOV: Oh yeah, for sure. So you just need to use the linguistics, and use the intonations, and the style of acting, but you don’t need to try to imitate any physical aspects of how that person sounds.
KATHLEEN DAVIS: Interesting. So as we heard a little bit earlier, your product really does sound like a real voice. I think if I were watching that Star Wars show, I might not even realize that it is a cloned voice. Can you as the creators of this technology tell the difference between a real voice and your conversions?
DMYTRO BIELIEVTSOV: Yes, but I think– I was hoping that Bogdan would say yes immediately.
[LAUGHTER]
BOGDAN BELYAEV: I was hoping that you would do the same.
[LAUGHTER]
OK. So yeah, I just wanted to say that our ears are having the lower threshold of like detecting the conversions, but I had a few times when I missed the files and listened to our conversions as the real recordings, and didn’t notice the difference. So yeah, it’s sometimes tricky.
[LAUGHTER]
KATHLEEN DAVIS: Yeah, I mean, it does sound a little bit scary. And in the news– I would say here in the US, we hear a lot about AI deep fakes that are used in scams or political propaganda. How does Respeecher make sure that this technology doesn’t get abused for those more nefarious purposes?
DMYTRO BIELIEVTSOV: Right. Yeah, there are two components of this. One is that we never give anyone the actual code so that they can run the technology. We keep everything in-house, and we always make sure to obtain a permission from the actor whose voice we’re cloning.
KATHLEEN DAVIS: Interesting. So let’s talk a little bit about the timeline for this Darth Vader project. So if I have done my math right, and I have my dates correct, your team was working on this Obi-Wan Kenobi TV show right around the time that Ukraine was invaded by Russia. What sort of precautions and accommodations did you have to make to keep everybody safe?
DMYTRO BIELIEVTSOV: Right. Yeah. So probably one of the most important parts of it happened exactly when the invasion happened. So what we did as a company is a couple of weeks before the invasion, we pretty much decided to kind of distribute the team a little bit. So we relocated part of the team to a different city, to Lviv. Bogdan also went to Lviv to work from there just in case something bad happens, which unfortunately did happen.
KATHLEEN DAVIS: Did either of you personally have to relocate?
DMYTRO BIELIEVTSOV: Yes, I had to. I think I stayed in Kyiv for some time after this happened, but then I went to my parents’ place for a couple of weeks, I think for four weeks or something, and then I came back.
BOGDAN BELYAEV: Yeah, I had to relocate to Lviv because, currently, I cannot come back to my hometown because of occupations.
KATHLEEN DAVIS: Yeah. And Bogdan, I heard that your hometown was actually invaded. Do you remember what was going on, what you were doing when you heard that news? I mean, what was that day like for you?
BOGDAN BELYAEV: Yeah, I remember that I woke up at around like 4:00 or 5:00 because I heard that my wife was talking with our family members, and I heard that her voice is shaking. And I just directly understood that it happened. We were shocked as everyone for the first half of the day, but, yeah, we were prepared.
KATHLEEN DAVIS: I think that in a situation like this, a lot of people would not be thinking about work, but did you keep working during this time on this project?
BOGDAN BELYAEV: Yeah. Yeah, it surprised a lot of people. But yeah, every time when I think about these days, the first weeks of a full-scale invasion, I still have more yes but no not to do that.
KATHLEEN DAVIS: Yeah. I mean, for me, I would almost feel like it’s the one thing that I would be able to control, right? Is what I’m doing with my work. Is that something that maybe you were thinking about?
BOGDAN BELYAEV: Yeah, exactly. Yeah, the thing that came into my mind that our army is working at this moment. And we have electricity, and we have internet connection, and you can even go outside and get some bread, or water, or whatever. So everything is working. And, for me, it was like a light when everything is dark around.
KATHLEEN DAVIS: Well, all of your hard work on this project really paid off last month when this Darth Vader voice actually aired on the Obi-Wan Kenobi TV show. I mean, Bogdan, as the resident Star Wars fan, how did it feel to watch these voices air?
BOGDAN BELYAEV: It’s a big mixture of feelings, like with happiness, and some kind of fear, and excitement, and all full of this stuff. My wife said, like, do you understand that it’s forever? And like, oh, come on, yeah. Like, it’s safe and captured, and, yeah, it will be somewhere like in 20, 30, 100 years.
KATHLEEN DAVIS: So now that you have seen that your product works, are you getting interest from other Hollywood productions?
DMYTRO BIELIEVTSOV: Yeah, definitely. It’s kind of a confirmation or sanity check for other companies that we’re not messing around, and we’re making a technology that’s worth their attention.
KATHLEEN DAVIS: Dmytro, Bogdan, thank you so much for sharing your stories with us today.
DMYTRO BIELIEVTSOV: Thanks a lot for having us here, as well.
BOGDAN BELYAEV: Thank you.
KATHLEEN DAVIS: Dmytro Bielievtsov, chief technology officer at Respeecher, based in Kyiv, Ukraine. And Bogdan Belyaev, sound engineer at Respeecher, based in Lviv, Ukraine.
Copyright © 2022 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/.
Jason P. Dinh is Climate Editor at Atmos Magazine in Washington, DC. He previously was an NSF-funded intern at Science Friday.
Kathleen Davis is a producer and fill-in host at Science Friday, which means she spends her weeks researching, writing, editing, and sometimes talking into a microphone. She’s always eager to talk about freshwater lakes and Coney Island diners.