# 173 Hollywood-Grade AI Dubbing with Deepdub CEO Ofir Krakowski
What we can take from ChatGPT and take to my world is build a product, not a technology. And what we're actually doing in Deepdub is actually building a product that people can use for a very simple and accurate purpose. And hi everyone, and welcome to SlatorPod. Today we're really excited to have Ofir Krakowski on the podcast. Ofir is the co-founder and CEO of AI dubbing company Deepdub. Deepdub was actually featured in our 50 under 50 list of startups that we should watch, that you should watch.
The company raised $20 million last year and has been growing since. Hi Ofir, and thanks so much for joining us today. Hi Florian, and great to be here. Absolutely. Where does this podcast find you today? What country, what city? What office? So I've been travelling these past weeks, but currently, I'm in Tel Aviv in our R&D facilities. Great.
You got the most awesome backdrop we ever had on the podcast. I got to say, for those that are only listening, Ofir has a backdrop that says "May the Force be with you," with a giant picture of Yoda on the right side. So that's very cinematic and speaks to, I guess, what the company does. So tell us more about Deepdub in a nutshell, elevator pitch. So Deepdub was essentially founded to bridge the gap of languages.
So eventually what we have developed is a platform that enables you to take content from one language into multiple languages simultaneously and do it in a very affordable and very easy way. So this is what we do. We started by aiming for how we can make it theatrical using technology and bringing more efficiency into these processes. And this is essentially how we started, by aiming for the highest grade, because we actually figured that if we can solve how we can do a Disney movie, then we can go down the line and serve everybody else. That's great. I really want to talk to you about that because that's what's kind of striking to me, that Deepdub, like, aims for basically the Holy Grail, to stick with, I guess that might be an Indiana Jones metaphor, which is now kind of current again.
Anyway, so how did you end up doing this? This is not your first company. What's your professional background and why AI Dubbing? What's been the journey to that particular place? So, just to start, I can tell you that actually building this company combines two of the things that I like. This is creating content, creating good stories, and creating great technologies.
So this combines, at last, I found something that I'm really enthusiastic about. But for a lot of years, I was the head of the machine learning and innovation department in the Israeli Air Force. And in there, I've learned deep learning and generative AI and machine learning technologies. And taking this knowledge, I wanted to contribute, to see how this technology can impact globally. And the first company that I was involved in was in healthcare.
And as I went from this venture, I searched with my partner Nir, who is actually my younger brother. We worked for almost a year and a half exploring different ideas for how generative AI can actually impact globally. It struck us one time that actually content is always created in one language and for a specific culture. And when you want to watch it in a different region, it's actually, in most of the cases, not accessible. And by content I mean audiovisual content, which is very important because there are a lot of kinds of content. But audiovisual is currently what's driving most of the businesses because we're at the age of YouTube Shorts, TikTok and Instagram versus the text platforms.
Now, Israel, is that a dubbing or a subbing market? What was your personal exposure from day to day? Because in the Israeli army you probably weren't so much in the content side of things. But yeah. How about Israel? It's funny because as you think about Israel, Israel as a relatively small country, it's a sub country for adults. But since I have kids, I understood that they are not exposed to content, which is not dubbed because till the age of nine, they cannot actually read very well different languages and even in their own language.
So actually they are not exposed. It's not just entertainment, it's even education. Because one of my sons, he wanted to learn Python. It's kind of a computer language.
But actually most of the YouTube videos out there are in English. So it's not accessible to him. So this kind of understanding came into place that content is not available to most of the people around the world. It's for billions. And obviously, if you know English, most of the content will be available to you. But if you don't and this is the case for a lot of people and some of them cannot read, even, so, the subs doesn't work for them.
So content is not accessible. And going forward, we understood that starting with theatrical-grade dubbing would solve a lot of problems not only for people who cannot read, but also for people who cannot see. So it will impact a lot of people.
And this is why we chose this area to start with. Very good point. Yeah, this kind of point resonates with me a lot because I have kids at that age who are just starting to read really well. One of them is a little older, he can read the subs. The younger one not yet fully, and it's really tiring.
So, yeah, I guess the kids component for a lot of the international people, I guess we are like speak English. We very effortlessly watch an English video. It's sometimes hard, it's hard for people like us to really understand that there is this huge universe of people that wouldn't be able to watch any English video and kids bring this home, drive this point home quite easily. Now, beyond the kids and the dub content there, what are the key users and client segments that you have currently? I mean, you're aiming at the theatrical, the biggest possible or the Holy Grail of dubbing, but what's your current user base, client segments, etc.?
Actually, as I told you, we started with most of the big Hollywood studios, working on different projects, different use cases. So starting with creation of voices, new voices, and also dubbing content. It means we have content out there which is part of theatricals that were actually in theaters two months ago, and content that is currently streamed on the streaming platforms, but also small productions that could not afford until now to dub their content. And actually, there is a lot of content that goes into the US. So actually, most of the foreign content was not accessible to US audiences.
So most of our work is done into Latin America and the US but also to European countries. So you see content is flowing from different countries. I'm very proud that a lot of foreign content, good foreign content, is actually now accessible through our technology to American audiences.
It enables American audiences to be exposed not only to the content, but also to different cultures. For example, if you see a great TV series like The Bridge, it's a cultural thing. Everybody in Europe knows this TV series. Actually, when it first came to the US, it was remade. They remade it, and it's a totally different view of the same show. But now audiences are seeing the original one and enjoying Danish and Swedish cultures, in essence.
And it's easy to digest, not just when you are seeing it with the original languages. Sometimes for people who are not used to seeing things with subs, it's more difficult to digest: digest the culture, digest the language, digest the story. So we're making it a little bit easier. It doesn't work for everybody, but a lot of people find it very comfortable to start with. Now, on your website it says, Deepdub plugs into the post-production process of content owners and takes complete ownership of all their localization needs. That implies you do tech, but also some of the services component, kind of a managed service.
So can you just walk us through how you support the clients? And also, is it all synthetic voices or are you also kind of including maybe human voices and then using AI to make the process more efficient or, just give us kind of an overview of that. This is a great question because we have different workflows for different customers. So I started with the high-end theatrical customers. For them, we're also providing professional services. Actually, these kinds of customers, the high-end customers, want someone to curate the results of machines.
So we're incorporating humans in the process to curate the result. Machines in the translation part currently cannot provide that; they can get to between 70% and 90% accuracy. But to drive to 100%, which is really, really needed in high-grade content, drama content, you need a person to go in and curate the machine. Whatever somebody tells you about machines being able to do it, they currently cannot do it.
So every AI solution needs someone to curate the results of a machine. And this is what we provide for the high-end solution. And we provide the professional services on top of the platform, and we're doing it very efficiently on the voices part. So our machine technology, we have different voice models. Some of them are voice-to-voice models, so they allow someone that curates the text just generate the new voice.
But also we can generate the voice from text only. So it allows a lot of flexibility, generating exclusively the emotion parts. This is how we have results that can go for theatrical grade, but can scream, can shout, and can even talk while eating. So this is a very difficult issue for AI technologies, but we combine these technologies together and enable everybody to work on this technology. So actually, even the dubbing director, and we have a dubbing director for the studios, can go, "I want it differently," and he can do it.
That's fascinating. There's two things I want to follow up here. You say 90%, like 70% to 90% just on the translation component. Do you feel the rest is literally it's just a human creativity element that like a machine, it's almost like uncomputational.
It's uncomputable basically, that last 10%, because you need to take a lot of creative liberties and almost act like an author. Or is it an actual technology problem that you think is going to be solved in the next two to four or five years? I am not a prophet, but I think a lot of these issues will be solved. But there is another part of creativity, that is part of human creativity, that you cannot take out of the equation. For example, a joke. A joke is a very regional, in some cases a very cultural thing. And when you go from one language to another, you need to adapt it.
And currently the machine cannot adapt jokes very well. So this is a very understandable problem. But there are even simpler problems. For example, if I say I went to 7-Eleven to buy Coke, for example, and I'm doing free advertising here. But anyway, if I run this, even in Google Translate, 7-Eleven in Sweden maybe does not exist.
It's another store, right? So translation would say seven one one because it doesn't understand it's a store. So we have light technologies to compensate for this kind of thing. So this is the kind that could be solved, but jokes is a different level of creativity. So in essence, if I look on the process, human creativity still needs to be in the process.
And this is why we built a platform and not just a fully automated tool that you can work with for the high end. Obviously, the tool allows you to go in a very automated way, but you'll get to a certain level of quality on the translation, which in some cases is pretty good. And translation is only one part of the challenge in AI dubbing. I was going to ask you, how do you even rank these different challenges in terms of complexity? I mean, like you just mentioned, emotions, now we have the translation component.
There's just so much in AI dubbing that you need to tackle. And you as a company tackling all of this, you kind of probably need to prioritize in terms of the types of features and services you roll out. Could you rank those challenges for us? Maybe top three. Top five? Like, which one is the one you're working on, the hardest one, etc.
We started with what we thought was the highest priority: creating a very authentic, natural-sounding voice, so that if I let a person hear it, he would not notice it's not a human voice. Why is that? There is a simple reason. Because three years ago, most of the voice generation technology was actually owned by four to five companies.
And currently, it's more widely available. But three years ago, it was less available out there, the technology and even the people that can develop this kind of technology. So we have an internal research and development team that works on this space. And the second part, I think, is the translation. I believe the translation is the second hardest part.
And we solve this part partly with technology. So incorporating new technology on top of existing technology: we don't need to invent something that thousands of engineers at Google or Amazon or Microsoft are doing in translation. But on top of that, we're incorporating domain-specific technologies to get the accuracy of the translation higher. But to get to 100%, we decided to develop a platform which is kind of an Adobe Premiere for dubbing and creation of voices. So it interacts in a very simple way with a human being, and enables him to take it from 80% or 90% to 100% in a very simple way. I think these are the highest challenges. Obviously, when you are talking about high-grade content and you want it to be very automatic, you also need to tackle the mix part.
We also tackle the mix part. So we're doing automatic mix. So this part is also handled on our platform and it allows you different tools that enables you to do the work very fast and make changes. So you interact with the machine and you make changes to the voices and you can mix them back together into the soundtrack. And this is what the clients can do on your platform, kind of in a self-serve mode as well? Currently working with the studios, they don't want to work on the platform actually.
They want the end product. This is why we are providing professional services on the platform. So we're providing an end-to-end solution, this is what you read on the website we're providing. You just bring your content, we're plugging into your file system and you'll get at the end the result, the dubbed content. But if you want to work on a platform, you can work on a platform. It's a web-based platform.
We figured out after COVID that everybody wants to work from their home. So actually you don't have to have any studio. We are a very secure platform, but it's web-based; all you need is Chrome. You don't have to download anything to your computer. And then maybe next will be like a pricing page and like a SaaS type of offering and something like that. Because right now it's very enterprise, like studio-focused, right? You get a call and then set up.
So recently we ran, I think, a press release that you guys were added. You received a Trusted Partner Network certification, established by the Motion Picture Association. What is the TPN certification and how does it help you as an AI dubbing firm? Like, what kind of doors would it open? So, TPN is something that is actually needed to be able to get pre-release materials. It qualifies you actually as a secure vendor that can handle very sensitive materials. Studios invest a lot of money in creating those movies or TV series.
They don't want it to be leaked before time. So when we get content, we get it ahead of time, actually, to start localizing it at a point in time when the actual content is not ready for release, sometimes months before. And working on our platform, we enable it to streamline some of the use cases. For example, a very simple use case is test screening. Currently, test screening is only done in English, right? So if you create content in English, most of the theatricals are created in English, actually, the big tentpole theatricals are created in English.
But when they are created in English, they are test-screening it in English, right? But you don't know how your content would play and how the audience would react in Asia or Europe or Africa. Different cultures react to content differently, to jokes differently, and sometimes you want to be able to adjust it in early stages. Because if you do it in later stages, it will cost you a lot without technology. So our technology enables you to do it very fast and very cost-effectively in early stages. This is why we need a TPN approval, and TPN is the equivalent of what SOC 2 is in the software industry.
It's kind of a security assessment, a very thorough security assessment that not only assesses your technology but also assesses your facilities. And actually, we're one of the few companies that got TPN approved worldwide, because our platform is spread worldwide, because we're working with people in different regions across the world. Yeah, it's one of the industries where confidentiality is just incredibly important.
I used to work for an LSP that served banks and so there it was of course super important as well. But like on your side it's just as important. There was this story about an Ant-Man script leaking on Reddit or something like that. And I think they're even trying to sue Reddit to disclose who leaked it, right? So the stakes are quite big. I want to take a bit of a detour to YouTube. I'm not sure if you've done any work on YouTube content, but we recently had someone on the podcast, Farbod, from a company that mostly focuses on YouTube videos, and he told us that engagement is higher for human-dubbed versus AI-dubbed content on YouTube.
So it's almost as high as for the original. Have you done any research on this? Have you seen that? Or is this not a topic for you at all kind of YouTube content? We serve some of the biggest YouTubers, influencers on a platform. Obviously, we're very discreet on our customers currently, so you won't see on our website where our customers are. And this is being respectful to our customers. It's not just being awarded TPN, it's being respectful to our customers. YouTube is obviously one of the goals by enabling this platform and building this platform because a lot of the content is going into YouTube and to TikTok and this kind of platforms currently.
And regarding whether people like watching human dubbing versus AI dubbing, it depends on the quality of the dubbing. Even humans can do lousy dubbing. You can watch a TV series and not like it just because the dubbing was lousy or the translation was lousy, right? So it depends.
With AI dubbing there are multiple things that you can benefit from. For example, you can change the text of what was said in an instant. You see that something doesn't work, somebody responded, you don't have to bring somebody to the studio just to record the line; you change it and you upload it instantly. So there are different use cases we are supporting currently for some of the customers.
So this is a very good question because it leads you to another thing that you mentioned: how this platform can benefit other people. So actually we're about to launch this platform and actually enable people to work on the platform, and this will go live a week from now; it will be available for other people to use. So obviously we aim for people like influencers on YouTube to use the platform, have their content dubbed, and do it in a very affordable way. So we actually learned the secrets of the trade on the theatrical side and now we bring and democratize this knowledge to everybody. So this is going to be like a subscription? Yes. Okay, so I can go, I can get, starter, medium, pro, and I get X amount of hours, or? So you will start with a freemium, just go, test it for free and see if you like it.
We have two kind of views. One is a very simple one. It's only text, so you work only at text and get the end result. And one is a very professional view. It's the same platform, but you can edit your audio, you can edit your text, you can add emotions, you can enhance emotions of the text, and you can even get professional services on the platform. So obviously you'll get the full thing.
You can actually localize the content, but you can also create the content. So you have just the text and you want to create the content. You can also create the content just from text. Where do you see the biggest market for this, if you stack rank, maybe a TikTok ten-second video versus a big YouTuber that has, 10 million subscribers? Where would you see the biggest potential for you guys in terms of the platform served? If that's even a question that makes sense. We are serving two kind of customers. One is the enterprise kind of customers.
We can categorize them on the entertainment side, but not only on the entertainment side because there are a lot of companies, for example, you mentioned the bank, but other companies have company videos, so they want to work on the platform. They have localization teams, they are spread globally and they want to work on the platform. This kind of customers, they will have professional services and they will be served in a very high-quality manner. But there are a lot of people that can use the platform and localize the content.
And currently, they actually cannot afford it, it's not accessible to them. Either, it's so convoluted hiring those dubbing studios across the world. If you want to dub it into 16 languages, you'll have to hire a lot of dubbing studios in different regions, do contracts, get the materials in different timing, handle - It's a project. On our platform, you can get all the 16 languages in one time.
No hassle. It's very simple. So I believe that a lot of people that cannot currently afford the hassle or the convoluted process, not only the pricing, will use the platform, and I believe that it will actually democratize the way people are sharing content and reaching new audiences. It's very interesting because with my job,
I meet people that create content in different languages which are not English, and they are very successful in their country. Like, they have millions of subscribers, but nobody would watch them in a different region. Right. I created something in French, and I have like 20 million subscribers that know French. Nobody in the US would watch my content, right, if it's not dubbed? Or in Latin America, or in Asia, right? So it enables them to reach wider range of audiences and bring their content, which is sometimes a very educational content. There is a lot of content which you learn through this content.
So it's very interesting times. And I think that we'll see a lot of new customers that currently are not in this space of translation and localization. The link is so direct.
Like, the link between making it accessible in another language and then basically, enabling growth and revenue growth and more business for creators. It is just so much more direct than in many other areas of localization where it's like you think there is an ROI, but here it's like immediate. Like you just know there is an ROI to it. Right. Everybody heard Mr Beast. He saw a direct ROI, right? That's right. And he got into the business.
He loved the ROI so much. Yeah, definitely. I'm not sure how much you can comment on this, but you said before that, well, I'm not going to reinvent the machine translation wheel. Where do you sit on buy versus build, because, again, your solution brings together so many different strands of technology.
What do you think you need to own in-house? What do you think you can connect to via an API? So the basic parts of transcription and translation, we're buying through API, we don't develop. There are some companies, we understand, that are doing it in-house. Actually, we intentionally don't do it because you need to own a lot of data. And obviously, as a company, we work in a very ethical and very responsible way. So in order to build this kind of technology, you need to have a lot of data, right? So either it's your customers' data, as some companies have, or you scrape the data. In essence, we are buying these technologies through an API and on top of these technologies, we provide complementary technologies that bring it from 70% to the highest grade on the translation part.
Got it. And some of the other components, maybe. I mean, the rest you tend to want. The rest we developed from scratch. As I told you three years ago, actually, voice technology was owned by the four companies, the four big companies, technology companies: Meta and Amazon and Google and Microsoft. So, obviously, you cannot buy this technology.
And if you bought it, it was not aimed for the highest grade. It was aimed for a very narrow range of use cases because it supported a narrow range of emotions and basically mostly narration style. Now, there's been this massive shift in all things AI since maybe was it November when ChatGPT launched? I mean, anything with .ai domain just got like ten X and just so much more general awareness. How are you keeping up with what's relevant for you and your roadmap? And what do you filter out as noise? Just how does this influence you? Because we as observers at Slator it's a little bit of an overload for us, even, but we don't have to make any critical product decisions.
But how is this for you currently? This is a great question. Actually, generative AI, since it emerged through ChatGPT, came into the consciousness of everybody. It actually enabled us to explain better how generative AI is actually a product that someone can work with, because text is something that everybody works with, and everybody understood through ChatGPT how they can make their processes more efficient, and it became widespread. And when we're working with our technology, we started with the entertainment side, and entertainment is a very traditional industry.
So obviously, in some cases, they were averse to generative AI technology. In some cases, we had to prove that actually, we're creating voices that are real. With one customer we did a lot of POCs, they showcased it to a lot of people, they had audiences watch the content, and obviously, we passed all of this, but it took a lot of time. ChatGPT just emerged and you could test it and it was free, so everybody could understand. So, in essence, I think that what we can take from ChatGPT and take to my world is: build a product, not a technology.
And what we're actually doing in Deepdub is actually building a product that people can use for a very simple and accurate purpose, and not just building some nice technology. Generative AI can build a lot of nice use cases. But do they aim to do something good? Do they aim to do something that somebody can understand? It makes the processes more efficient. This is something that I took from ChatGPT.
Everybody could understand it very easily. So this is in essence, if I look at ChatGPT and generative AI, I think that the companies that build good products or good platforms out of generative AI, ones that people can work with, will win this landscape. Yeah. And there's a big pot of gold at the end of that rainbow and there's so much competition right now, it's so exciting to see. And I agree with you.
I mean, it's really about building a product somebody can use rather than having, in theory, all these capabilities that you as the kind of day-to-day, even individual consumer or enterprise consumer can't really use, because there's always something missing somewhere, right? So if you solve that beautifully for your clients, that'll be a big payoff. I have a good example. There are a lot of companies that are offering to create the voice. Actually, voice is not dubbing. Dubbing is more than creating voice. Voice creation has already been there.
It was offered through APIs by the big companies, and nobody created dubbing out of it. So creating voice is not dubbing. It's not the end product.
Actually, what people want is the end product. I want the result of generative AI and not generative technology. So, actually, dubbing is comprised of the whole process: translation and adaptation of the content, creating the voices, and mixing it back together with the video. This is the whole package. If only a seemingly small part is missing, it basically jeopardizes the entire product, the entire offering. It just wouldn't work for somebody - be like, well, I like it, but you're 85% there, and if you don't deliver the other 15%.
I'm in a traditional industry. I can't go forward with this because it's not solving my problem, right, like, end-to-end. Speaking of traditional industry, have you seen any pushbacks from actors, maybe around AI dubbing or anything related? They'd be like, no, I need this particular voice actor to dub me, or my lip movements can't be changed, or anything like that. Have you seen any pushback from that side? So I think that there is a great concern in the traditional industry regarding AI. And I acknowledge this because this is a new technology and people actually been talking to us.
For example, we were invited to speak on a panel by the BFFS last week at the Munich Festival. So actually, the BFFS, for people who don't know, is the union of the voice actors. So, actually, I think this is a great move because actually, we want to work with those kinds of organizations and find a way that combines human creativity with technology. We think that technology is empowering human creativity. It doesn't take work away from people. It actually brings more work to people.
People that work with us actually get more work because we're kind of enabling them to get past their own limitations. For example, if you have a voice that doesn't fit a specific show, with our technology, it can fit, because you are acting very well; we can change your voice.
You can do this work. But we acknowledge that in some regions, there are established voices. We actually worked with established voice actors in the regions. It doesn't relate to technology.
It relates to the way dubbing is done in those regions. For example, you have a very well-known American actor. He will have an established voice in Germany or Latin America.
So we can work with those voice actors. We don't take their job. Actually, we don't replace their voices because nobody would actually know the new voice. But at the same time, there is a lot of library content that is currently not dubbed because in some regions there are not enough dubbing studios, not enough voiceovers, and actually it's very convoluted. And there are some regions that are not served because of the pricing. So in some regions they are not dubbing at all just because it doesn't make sense economically to dub.
So they are deprived of content in some cases. So we're aiming for that content, those regions and those kinds of content, and not replacing the people that would dub the next Squid Game. You mentioned that certain regions are really struggling because there's simply no talent. But for you, it's also going to be hard to hire talent. It's a competitive market. I think you're looking maybe for dubbing directors, like language adapters, machine learning roles, things like that.
How are you going about this? How do you find the market environment now, maybe versus like twelve months ago? Yeah, the hiring environment. I believe that market is acting as the need is out there. I don't find that there is a lot of difficulty getting people to work on a platform.
It's very simple. It allows you to work after your work hours. So, in essence, we're enabling people to make more money because they can work from their home, on their weekend in some cases, so they can do more work.
So at daytime they can work in a regular studio. At the nighttime they're working with us. We enable people from another region to work because currently they cannot work or they are in a region, they are not close to the studio, so they cannot work in the studio. They work in different cities, so now they cannot. We have calls, for example, with voice actors. They told us, I am going on vacations for two months.
Can I work from my summer house? Definitely, yes. Technology enables you to work whenever you want, wherever you are. That's an easy pitch and you have the global market opening up.
So you mentioned the new product that you're launching, a) is there a name, and b) is there anything further that you have in the pipeline that you can share 2023 or beyond anything you're working on, anything that's on the roadmap? So the platform would be called Deepdub GO. GO - it's a great name, just go and do it, localize it, but it also means global outreach. So this is why I like this name because we're enabling people to actually reach more audiences and this is what we aim for. 2023 would be a great year for Generative AI. You'll see a lot of innovation in Generative AI.
We are going to launch more capabilities on the platform, obviously taking the technology that we have developed for the Hollywood studios and bringing it in a very simple way to other people to use. And we will also enable more and more languages to be available on our platform. We aim to end up at the end of '23 with 60 languages. So actually, you can dub. Most of the dubbed languages will be available on our platform. And as '24 goes, we will aim at enabling more languages.
For example, in India, there are more than 100 official languages. Most of the content is already dubbed to one to four languages. So actually, most of the people in India are actually watching content in their second language. Yeah, that's a big frontier market that if you can tap into it's huge. I mean, there's a number of AI dubbing startups also from India, because it's just such an obvious big market to tap. Well, look, I'm going to sign up for Deepdub GO, get the freemium version, check it out.
Who knows, maybe we'll dub the podcast at some point. That would be my holy grail. Definitely. We'll reach more audiences. It's a lot of jargon, though.
Sometimes I'm wondering, like, I couldn't really host this podcast in German. It'd be like, oh, what are all these terms in German? I don't know. It'd be a struggle. Unlike this podcast, that was fascinating, not a struggle at all. Thanks so much, Ofir for taking the time today, and good luck with the launch. Thank you, Florian.
I had a great time talking to you. Thank you.
2023-07-14 17:57