#100 The Social Impact of Language Technology With Andrew Bredenkamp

Overall stock markets were up super strong. Super fragmented, not only on the supply side but also from a buyer perspective when you think about the requirements for the countries. Our mission is to give people access to information and have their voices heard, so it's about these two-way conversations that we're trying to set up.

And welcome everyone to the 2022 SlatorPod #100. Hello there. Hey, Florian.

Happy New Year. Happy New Year on this January 12th and it's episode #100 Esther. It's the century pod.

Yes, it is. Yes, it is indeed. It's the century pod. We have done 100 podcasts.

Yeah. We haven't yet. We're about to, so, great guest today. The CEO of CLEAR Tech, formerly known as Translators without Borders, so a fitting guest for today's episode #100, is Andrew Bredenkamp and of course, Andrew is also known as the Founder and former CEO of Acrolinx, the content governance platform. Stay tuned for a fascinating conversation.

We had Andrew actually as one of the last speakers for our in-person conferences back in 2019 in Amsterdam. That's right. More than three years ago, believe it or not, good old days of in-person conferencing. Good old days. It's crazy that two years on there are still some ambitious people who have scheduled in-person conferences. I think even GALA is trying to do something in San Diego, repeating the San Diego conference, but with the Omicron variant and all of these continued restrictions, it's kind of doubtful, so let's see.

But first the news, I mean, we got three weeks to catch up. I checked before. This is probably the longest pod break we've ever taken. Actually, I don't know why, long Christmas break and now we're back after three weeks, but a lot has happened, so let's get right into it. So first, just want to briefly talk about the best and worst-performing LSPs. All right, it's a little bit clickbaity.

It's the best and worst-performing listed LSPs. Okay. Yeah, somewhat reduced number.

Yeah, it's a bit of a reduced number and then we're going to talk about Unbabel's surprise acquisition, Language I/O fundraise, Straker M&A, and GLOBO's investment, so a lot of financial topics, of course, because there was no holiday breaks for the investors and the companies looking to buy other companies. So who was top, Esther? Who was the top-performing listed, meaning quoted and traded on the stock market LSP of 2021? Where should I put my money? It sounds like you're going to do a drum roll. No. Well, okay. I'll just say drumroll, it was ZOO Digital. Yes.

The London, well, London listed, but Sheffield based ZOO Digital, media localization company did pretty well. They finished top of the list among LSPs. So if I had invested 100 quid at the start of 2021, how many quids would I have now? Are you going to make me do some maths? It's just double. What is it? It's up like, okay. There you go. I was going to say it's up a hundred and something percent, so yes.

So, I would have 200 quid. I would like double the amount of money. Yeah.

Brilliant. That's not a bad return, I would say. It's not a bad return, so if we doubled this again, we get 400 quid, 800 quid, and then in 10 years or something, we'd retire to an island and watch Netflix, which was subtitled and dubbed by ZOO Digital.

All right. Let's not belabor it further. Number two was not an LSP, but basically, a company that owns an LSP, AMN Healthcare. They own, who do they own? Stratus Video, right? Yes. Yeah, they've rebranded, but that's who they acquired. Rebranded and they were up, let me just check the notes here, they were up 77%, quite good performance.

Number three was Teleperformance who own LanguageLine, so another kind of remote interpreter. They were up 40%, but of course, Teleperformance, a much bigger conglomerate. Number four is RWS.

Number five is Honyaku Center, still in the positive. Then Straker just eked out a gain of like 5% or something. Keywords is flat. Uphealth, those are the ones that also like a kind of a healthcare.

They only just began trading, didn't they? Only just began trading, so this is not for the full 2021. Then Rozetta, going down to the negative. AI Media was down 27%, so you would've left with fewer quid and then Appen, which kind of tanked last year is like minus, less than half or 56% down, and then THG Holdings. I actually don't know what's going on with THG. They own... who do they own, Language Connect? THG Fluently.

THG Fluently, you're right. They rebranded and so they were down, but that's not going to be connected to the LSP holding, so mixed bag. Overall stock markets were up super strong in 2021, one of the best years, I think for a while and so, ZOO Digital did well, AMN did super well, and the rest was kind of a mixed bag, so I also checked...

Well, we've got a new one to add, haven't we? Got a new one to add to the charts. Who's that? Tell us more. It's STAR7, our friends in Italy. It's a little bit complex because they're partially owned by a STAR Group, but they listed some of their shares at least on, I think it was in Milan on the Euronext growth, so we'll track them. We'll start to track them and see how they do in the long term. Yeah.

I would love to have like a financial expert on the pod and just walk me through this maneuver, so it's a Swiss company that owns, or that acquired an Italian company, and then that Italian company floats 20 or 25% of their shares in the stock market and raises like 15 million euros, which isn't exactly a lot so, yeah, that's a complicated maneuver. Well, it's the company that acquired LocalEyes, isn't it? You're right. As well, quite recently, so they were also in the news for that.

I'm sure there probably was some part of the financing of the LocalEyes, like eyes, like your eyes, eyes, the see eyes, and that was part of that IPO of that company or that listing of those 25% shares that probably financed that LocalEyes deal. Anyway, complicated. Don't really get it. Amounts seem a little small to go through a listing process, but hey, here they are, and let's go and track them. A firmly private company is Unbabel. Who did they buy, Esther? They acquired Lingo24.

And why was this a surprise? Not long ago. Why was it a surprise? Well, I suppose last thing before Christmas, we weren't sure if any other deals were going to come in, no. More so to do, I think, with the profile of the companies really, so you've got Unbabel, sort of heavily tech-enabled, et cetera, buying a more, well, decades-old LSP which obviously is also tech, but more kind of on the traditional language services provision side of things. This is one of the first deals where we see one of those kind of AI agency, well-funded, VC-level startups buying a more traditional LSP, I believe, especially I think in terms of the size it is probably the biggest so far and so again, Unbabel based in Lisbon / California. Lingo24, I think is based in Edinburgh. They have an operational hub in Central Europe as far as I remember, and the business was founded and it was probably majority-owned by Christian Arno, who I met a couple of times at our conferences and I think he had stepped back from the CEO leadership role a couple of years ago, maybe two, three years ago and now they sold the company to Unbabel.

Again, kind of AI agency / tech-enabled. We had Vasco on the podcast, of course, so Vasco, the CEO of Unbabel, he said that the acquisition will allow them to "expand beyond customer service faster and deliver more comprehensive multilingual solution for customer experience, starting with the first touchpoint in the customer journey: marketing content". So, Lingo24 has a nice customer base on the marketing content side so the core kind of vertical for Unbabel was that customer service, content, automation, et cetera, and now they're going a little bit further and also covering marketing content. They also said that the brand, the Lingo24 brand will stay over the short term, which means over the mid to long-term.

Probably going to disappear. Yeah. Vasco also told us that they didn't have any kind of prior business relationship before the deal, so yeah, they didn't have any crossover there and he also mentioned that Lingo24 comes with a wealth of customer relationships and experience delivering to enterprise customers, so this is probably one of the key drivers of this acquisition that you can get a very nice, very diverse enterprise customer base in a deal, right. So, that's when then Unbabel...

it takes forever to establish these enterprise relationships. We know that you need a ton of salespeople, a ton of marketing dollars and if you can do this via an acquisition, that's great, and then Unbabel can go in and really roll out their kind of tech efficiencies. I would assume that's what they want, right. Yeah, so interesting move by this AI agency to acquire more kind of traditional LSP. We talked about the kind of customer portfolio and I'm really curious if we're going to see more of that. I mean, maybe Lilt's going to try to buy something at some point, who knows, right.

So, interesting dynamics here if you bring in these kind of VC-funded companies into the M&A mix, right, so who knows, maybe Smartling will continue to come back on the acquisition trail at some point as well. All right, Esther, so there's another company that we're very familiar with that we also had on the pod, that is a little earlier in the startup journey. Tell us more about that, and how much money they raised.

Yeah, so you're talking about Language I/O, and we had the company's Co-Founders on the pod, so yeah, they've just announced a raise. I think it was, well, a couple of days ago, 11th of January, they announced they had raised 6.5 million US dollars in a Series A. I think they had a seed round in March 2021 of about 5 million and total raised to date is just over 12 million US dollars. Yeah, startup in the sense that, I mean, they got going several years ago, but they've been looking more so onto the customizable MT solution side and they focus also on customer service content, high volume customer service content. Basically, just to pause you there, I mean, basically, Unbabel's original vision, right? Sure.

Yeah, exactly. The kind of customer service content, high volume, they offer sort of a ton of connectors as well, so trying to automate that process. I mean, I think, to begin with, we were kind of referring to them as an AI agency, so this like sort of putting them in a similar bracket.

But I would say, sorry, sorry to pause you again. I don't think it's an AI agency. I mean, they're pure tech. I mean, they may have a little bit of managed services, of course, but I think this is pure tech.

There's no agency model. It's not like they're sending, I also remembered from the podcast, they're not sending a lot of content out to translators. This is mostly just technology.

Yeah. No good point. I mean, I think, yeah, they've got a workflow.

I think they even provided us with like an example of some of the workflows that we put in the article, so anyone who's interested have a look at that, but it kind of relies on MT coupled with kind of extensive glossaries and lots of other things as well. But, so their round was led by Gaurav Tewari from a VC firm called Omega Venture Partners based in Silicon Valley. The round was also participated in by existing investors and one new VC investor called Caruso Ventures.

Apparently, the CEO, Heather Morgan Shoemaker, told us that the round was actually oversubscribed and they were the ones who kept it at 6.5 million dollars. What to do with all that money. No, this is prudent.

It's prudent. I mean, it's like, if you can raise more, but you don't really have an immediate, I guess, maybe use or plan or strategy for it, like why right? Eventually, you need to deliver on all that money raised and you probably want to be prudent, so. Yeah. Yeah. Well, so what to do with the money? Good question. They're going to be doing some active hiring, more active hiring.

They've hired a CTO, well, Chief Information Security Officer, and also a VP of Partnerships recently, but they're now also actively hiring in the UK, in Europe, as well as in the home market of the US. They want to also scale sales and marketing, which I'm sure they'll use some of that money for, they're going to also hire into R&D, so research and development with the aim of expanding into kind of conversational voice, so I suppose more so on the, yeah, like chatbot type thing I would imagine. Yes. We need to bring them back on the pod.

Congrats to Language I/O and moving to a cross-continental deal, our friends at Straker bought another boutique LSP. This time they bought IDEST, which was founded in 1990, not 1999, 1990. Currently employs 18 staff based in Belgium. Annual revenues, currently at 4 million euros. EBITDA around 400,000 euros, so a 10% EBITDA margin.

Transaction closed on January 1st, 2022, and so yeah, Straker is known for acquiring relatively small boutique LSPs and kind of moving them onto their platform, et cetera. Of course, now they also have Lingotek, and so the IDEST CEO, Jean-Paul Dispaux told, well, he was quoted in a press release rather, he didn't tell us, but they quoted him in a press release that they first met Straker CEO Grant Straker in 2017. Talked about an acquisition, at the time they said, well, it's maybe a little early, we'd like to focus on growing the business. Growing the business they did, scaling from 1.8 million euros in 2020 to 2.8 and now, well, four kind of run rate, I guess.

Straker will pay... This is an interesting deal, more than half is actually in what's called an earn-out, right. For some of the CEOs that are listening to this pod that are looking to sell a company, Straker typically goes quite heavy on the earn-out. Meaning part of the compensation you're getting for selling your company is in deferred payments depending on the performance, so they say, well, you need to hit X, Y, Z revenue targets and if you do, we're going to pay you an additional amount, right. So in this case, they pay 1.7 million euros in cash and 250,000 euros in shares, Straker shares, and then over two years, the founder, the IDEST CEO could get another 2.5 million in earn-outs, so that puts the entire price tag at around, what is it? 4.5 million euros, which is quite a lot, it's kind of more than one times revenue and like 11 times EBITDA, so stretchy, stretchy for a boutique LSP.

Also, the clients are mostly, I think, public sector, so very heavily with the European Commission, given the location, of course, and other institutions. You said they're in Belgium. Is it Belgium? Yes, they're in Belgium. I hope I don't get this wrong.
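For anyone who wants to check the back-of-the-envelope deal math, here is a minimal sketch in Python. It uses only the figures quoted in the conversation (cash, shares, maximum earn-out, annual revenue, EBITDA). The function name `deal_summary` and the dictionary keys are made up for illustration, and it assumes the full earn-out is paid, which depends on targets being hit.

```python
# Figures as quoted in the conversation, in euros.
# Assumption: the full earn-out is paid out; in practice it depends on performance.
CASH = 1_700_000
SHARES = 250_000
MAX_EARN_OUT = 2_500_000
ANNUAL_REVENUE = 4_000_000
EBITDA = 400_000


def deal_summary(cash, shares, max_earn_out, revenue, ebitda):
    """Return the headline price and the simple multiples implied by it."""
    total = cash + shares + max_earn_out
    return {
        "total": total,                          # maximum total consideration
        "earn_out_share": max_earn_out / total,  # portion that is deferred
        "revenue_multiple": total / revenue,     # price relative to annual revenue
        "ebitda_multiple": total / ebitda,       # price relative to EBITDA
        "ebitda_margin": ebitda / revenue,       # profitability of the target
    }


summary = deal_summary(CASH, SHARES, MAX_EARN_OUT, ANNUAL_REVENUE, EBITDA)
for name, value in summary.items():
    print(f"{name}: {value:,.3f}")
```

With these inputs the total comes to 4.45 million euros, which is the "around 4.5 million" mentioned, with a bit over half of it in the earn-out, roughly 1.1 times revenue and roughly 11 times EBITDA.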

Yeah, so I mean, Belgium, Brussels, right. No, they're in Brussels, based in Brussels, so European Commission, kind of obvious. Interesting move from Straker. Did not see them buying anything in the public sector space before, so good luck responding to those lengthy RFPs that certainly are coming down the pipe.

All right, so next up is another former podcast guest, GLOBO, healthcare interpreting, and other interpreting. What happened there while we were away? Yeah, well this is quite recent news as well. I think it broke also this week, so GLOBO, as you said, it's a US-based interpreting or on-demand interpreting and services provider.

They secured a growth investment from VSS Capital Partners, so it was a minority growth investment. We don't know the specifics yet, or maybe never, around the amount, the terms of the valuation, et cetera. But, yeah, I mean, it's interesting in the sense that as you said, GLOBO operates a lot within healthcare, so they would work for clients, including hospital systems, physicians, healthcare insurers. Also across other sectors, such as financial services so I think when Gene spoke at SlatorCon recently, he was talking a little bit about sort of the insurance providers, mortgage service providers also, banks, things like that. So, alongside also technology, education, and public services more generally would be their current end markets, and they supply services, as I said, on-demand both telephone and video interpreting as well as onsite and sign language interpreting, so that's kind of the core services there in addition to providing, I suppose, translation of documents, emails, texts, chats. Probably sort of a significantly less or lower part of the business there.

They have their own cloud-based platform as well called GLOBO HQ, so that's kind of the hub for managing everything through and yeah, they plan to expand the service, expand into other markets, sorry, so other end markets, so both. We'll find out a bit more as to what the plans are specifically for which markets to target and where the biggest traction is at the moment. But interestingly, VSS, I mean, they're kind of big into healthcare themselves, the investors, so their current portfolio, I mean, they say their focus is healthcare, business services, and education so you can kind of see clearly where GLOBO fits into that focus as well. I think, I mean, this is mostly US-centric, right? I still think the European remote interpreting, not okay, like video remote / generally just remote, even OPI market is one of the few remaining, I wouldn't say greenfields, but I think massive opportunities for somebody who's willing to invest in it.

I mean, the US you're seeing all this activity. All of these big conglomerates buying interpreting providers now, GLOBO are raising money. Talked about Propio as well, I think. You're right. We don't have that on the agenda, but I think we literally just published it, what minutes ago? Yeah, you're right.

They acquired a couple but they also had some growth equity sort of six months or so ago to your point of investors coming into that market. And again, that's US, right? So, and then we have all of these, I mean, companies, kind of national companies, along kind of national boundaries in Europe, right. You got the Swedish... DigitalTolk. DigitalTolk.

Yes. I mean, we have thebigword in the UK, et cetera. But I mean, if anybody could scale this across Europe, huge opportunity, huge opportunity. It's super fragmented though like we were talking about this at SlatorCon, weren't we, on the panel, just super fragmented, not only on the supply side but also from a buyer perspective when you think about the requirements for the countries. That's the challenge.

That is the challenge and the opportunity, I suppose. Seriously, there are other areas. I mean, there are many, many, many, many other business services that have successfully managed to grow across all of the European Union and the UK and Switzerland, et cetera, like this should not be an exception. Is it complex to tailor your requirements to all of these national legislations and language access laws, and what have you and all of these different countries? Absolutely. But I mean, there would be a massive opportunity to have a consolidated back-end and kind of a consolidated linguistic kind of pool of qualified resources, et cetera. So just putting it out there, get in touch if you want to have some strategy advice.

Just kidding. Cool. All right, so that was it for today's news roundup, and we'll head over to talk to Andrew Bredenkamp.

Welcome back, everybody. Welcome back to SlatorPod episode 100. Very special guest today for this century episode. Joining us today is Andrew Bredenkamp, CEO of the NGO CLEAR Tech and Chair of CLEAR Global, parent company, of course. Hi, Andrew.

Hi, there. Hi, everyone. Hi, Andrew.

Welcome. So, Andrew, usually I ask people where this podcast finds them. In your case I know because you're joining us, I guess, from the place near Zurich where I actually, I grew up in.

Yes, your school. Your former school is a short walk up the hill from me. Small world.

Familiar shores there. Hey Andrew, so we met a couple of times, of course, in person actually, since the pandemic, but last time you also presented at our Amsterdam conference and you had a great story to tell there around language data, et cetera. You have a very long background in the language technology space, so before we go into CLEAR Global and what the mission is there, why don't you tell us a bit more about your professional background and kind of your journey in this space? Yeah, happy to. I think first thing I would say is really, although I've spent the last 25 years or so doing language technology, I'm really a language person. My first degree was in languages, then in translation, and then in linguistics, and I only late on got into the language technology and AI, particularly in natural language processing.

So, the reason I mentioned that is that I really do come from the angle of why this stuff is useful and what it's all about rather than being a technologist, having a screwdriver, and looking for screws everywhere, or hammers and nails, whatever the analogy is. So, I come at it from a desire to communicate and a passion about crossing language barriers and how we can communicate more effectively. Technology is just a fascinating way, another piece of the puzzle as to how to do that well, so my background is really, I ended up having studied lots of things, ended up doing a PhD in Natural Language Processing.

Then moved to Germany in the nineties, the late nineties to work at the German Research Center for AI. I then led a couple of projects there. I ran the Transfer Center and out of that came a spinoff that we ran in Berlin for 20 years or so.

Alongside that, I started working with Translators without Borders, as was, I was a founding board member of the US entity. Lori Thicke, founded the French entity, and then as we expanded, we founded a US entity, and then I took over as chair when Lori stepped down, became more and more involved in that. As I stepped away from my company, the end of 2019, I suddenly got very involved with TWB.

We made a transition to CLEAR Global, and we'll talk a bit more about that later and I've since sort of stepped in to join, to really help to drive the technology piece of the puzzle that we're now doing at CLEAR Global, and very exciting times indeed, so it's a good place to be. Before we jump into sort of CLEAR Tech, CLEAR Global, I think many of our listeners will be familiar, very familiar with Translators without Borders. I mean, you were there for more than a decade, why don't you tell us a little bit about the kind of pre-CLEAR Global era and some of the key milestones and challenges along the way with TWB. So, yeah, I think there's probably been three phases to it, to the history of TWB and now in CLEAR Global. The first phase was very much, so the whole thing started when Lori Thicke asked some of her translators if they would volunteer their time to work with humanitarian organizations in the Paris area so it became very much if you would work for free, could we do translations for these local NGOs. As it grew, it grew into a large community of volunteers who were offering their time, spare time largely as translators, between jobs or evenings and weekends to help with this content that needed to be translated.

But it was very much a volunteer effort and as it grew, it became harder and harder for us to meet the needs of our partners, the big UN agencies, huge international organizations who are trying often to respond in crisis situations where they needed guaranteed turnaround times and really well-run projects and so there was a need to put a layer in between that of professional project management and technical resources, et cetera. All of the things, you'll know if you've worked in an LSP or similar, you'll know that these things, it becomes more complicated than just a translator translating stuff and so there's all those layers at scale that we had to put on top of that. So we started, there was a shift from pure volunteer to hiring professional staff and hiring project managers who could help to marshal the resources and work with the community to meet the needs of our major partners. We work with all the big UN agencies, all the big international NGOs, as well as increasingly local organizations worldwide.

So that was one big shift that we made and moved into phase two, which is where we were still largely translating but it was translating as a much more professional organization and then came, I think, it didn't come all of a sudden, it came gradually over time that we evolved into adding way more things to our portfolio of services we were offering companies, or organizations, partner organizations that we were working with, so language services really meant more than just translation.

It meant software localization, but it also meant advising them on communication strategies. It meant designing posters and doing radio broadcasts and subtitling, a whole range of other activities that went beyond classic translation, localization, and technology came into the mix. We were being asked, could we provide machine translation for some of these language pairs that no one else was doing for low resource languages? Could we provide voice technology, speech recognition? I'll talk more about that in a second. But the whole technology thing started and at the same time people were asking us, so we're going into Cote d'Ivoire, which languages do we need? And suddenly we needed to research. We needed to have a research profile where we could find out how can we quickly discover what the language need is? How do we reach people if we want to communicate to people in languages they'll understand, so out of that came a need for research, a need for technology, and the traditional sort of language services piece that became also often very embedded in programs so we had offices in countries where we're providing language services, as well as this global remote community that was working so it became a bigger thing out of that. Way more than translators, and the translation remains a huge part of what we do, but we were growing outside of it and that was the need for sort of the idea, the drive behind needing to give space for those other pieces to grow and I think, so that's the first part, the name change.

The second part of the name change is really about the without borders piece, the without borders concept is very, it captures the idea very well if you're flying doctors from Europe or the global north into Africa for some crisis or the Philippines for a hurricane or a typhoon or whatever, but our whole approach is deeply local first, right. It's really about understanding the local needs and having local people, helping local people meet those local needs. So, it's not about us coming in and saving the people in these places who don't know how to help themselves, they do.

All we're doing is enabling some of that or helping some of that or bringing together conversations, making them happen so the without borders thing felt a bit, was a bit uncomfortable to us. In the sense, it doesn't capture our approach and our ethos, which is very much local first so that was kind of what was behind the desire to shift away from that slowly. Translators without Borders remains the name for our community, which is really a global, a huge global community, which works across borders and they are largely translators so we felt that was a good name for that community, but the other pieces now have a larger mission and room to grow into it.

Got it, so you had this evolution and then kind of launched into that brand expansion also. Can you give us a sense of the, I mean, the size and the breadth of the organization? I mean, what you've already described is vast. But I mean, in terms of the number of offices, volunteers, the mission, et cetera. Sure. We were ready for COVID before it hit so we've been a virtual organization since the beginning. We have no big brick-and-mortar office anywhere.

But now we've grown the community or the community has grown, I'd like to say we can take all the credit, but in fact, it's largely, I think the community has grown itself. People have been flocking to us to volunteer their time and effort, which is hugely appreciated. We've grown the community, I think from when I took over as chair, I think we were about 3000 people when we hired Aimee, our Executive Director and she and the team have grown it now to over 80,000 people and they are in 148/149 countries.

Something like that. We have hundreds of language pairs, over 200 languages covered and it's really, it's become a huge global thing that we are now starting to really get our, yeah, get our heads around what we can, how we can work together with this community. I think up until now, it's been largely a resource that has offered their time for language services but there are many other things we'd like the community to be doing too. How big is kind of the admin group of the organization? Maybe, can you just speak a little bit about financing and funding and where you've kind of been most successful? So, we have a team of people who manage the community, I wouldn't say manage the community there. It's sort of nurturing their community and working with the community.

It's a very collaborative sort of effort and largely the funding for our work comes from two or three different sources. So, the first major part of it is partnerships we have with the big international organizations, so with the big UN agencies. We have global partnership agreements where we, annual agreements where we agree to offer them language services of various kinds. We also have similar agreements with big international NGOs and the like so, Save the Children and all those kinds of organizations, so that's the sort of partnership model really for language services. Very similar to a sort of client-vendor situation with an LSP. Although of course, we're negotiating sort of, unlike an LSP we're very focused on the mission part of this, so we don't do their press releases, we don't do their websites.

We don't do any fundraising activity for them. We're working on the content that they need, the information that they need that helps drive the mission. So, it's really operational content, mission-oriented content designed to reach people who aren't normally getting access to information, so translation for us has to be for a reason.

If you just want to translate your website from English into French, we're not going to provide those services for you. There's plenty of commercial vendors that will do that. That's not our mission. We're focused on delivering that content where we can provide a humanitarian development aspect to that where it's not currently being done because it's not commercially viable, so we have those kinds of packages.

The second kind of work we do is largely project funded, program funded where big international donors, institutional donors, big governments, the government agencies that fund international development and crisis response. They provide funding for us in many contexts to support large-scale crisis response, so in Northeast Nigeria, for example, in DRC around the Ebola crisis, in Bangladesh around the Rohingya situation. Many of those kinds of situations, there will be a huge international response and we will be there to support that response with their communication needs, translation, localization, and other communication needs around that, so the programs will be funded as kind of a shared service across those responses.

And then when you think about the group, the CLEAR Group, you've got a couple of different sort of things going on there, you've got CLEAR Tech, CLEAR Insights. Maybe walk us through how you collaborate across the organization and if you're thinking specifically about one of, like a recent project or something, can you give us an example of how some of the organizations work together across those groups? So, yeah. I can give you an idea of how it should work in practice.

In theory, it never quite works like this, but the broad idea is that we're an evidence-based organization, so we don't want to do anything unless we understand why we're doing it, is there a genuine justification for doing it, and are we doing the right things to have the most impact? So, our mission is to give people access to information and have their voices heard, so it's about these two-way conversations that we're trying to set up.

So, first thing is we need to do some research to understand what languages are spoken, what channels are available, what's the digital access? Do people have access digitally to information or is everything on radio or how do they get access to information? What are the levels of literacy and how do we reach the most marginalized people who get left behind in many of these other programs? So, the first thing is to understand what we're doing. Often people come and say, we want a chatbot and we'll go, well, really? Are you sure you want a chatbot? And they want the new shiny toy, but there's no data to suggest that that will actually reach more people or have more impact, so we want to do the research to understand: are we going to have impact by doing this thing? And that will often be a collaborative thing and that would give us confidence that what we're going to do is the right way of reaching these people and engaging them in these conversations.

That's the first thing. The second thing is to design a program which might include a technology piece, but it might not, right.

The best way to reach people might be to make some posters and print them out and stick them on the wall, right. It's unlikely that that simple approach would work. Typically, you want to have multiple channels, and where we do do technology, it's usually folded into other forms of community engagement or accountability. Accountability is about asking people, giving people access, letting them tell you what they're thinking and how they're engaging with the work that you're doing.

Community engagement is really about getting them on board, so both of those things, multiple strategies are usually needed, so you want to have multiple channels, but you need to know what they are. So, the research will tell us what they are, and then we'll design those in a combined package. That would then be the right way, the most effective way to reach people, and then we have the program work that needs to happen, and that will normally be a collaboration.

Most of our projects involve partnerships where we work with big international organizations or preferably also local organizations that understand the context really well and can understand how all the bits and pieces need to work. So, the strategy is kind of research, so we know what we're doing. Design programs, which will often be supported by technology, which gives you that extra scale and that extra reach, and then partnerships with whoever we need to work with in order to get stuff done on the ground because, as I said, we're largely a virtual organization. We do have staff in countries but typically most of our work or a big part of our work is provided remotely, so they're the sort of key three bits to it and underlying that, of course, there's also the community who supports all of these activities as we go along.

You mentioned a couple of countries before and projects, like where's the current focus and like, how do you pick those projects? Do you pick them, do they pick you? Do you have like a list of 100 things that you could work on and then you have to prioritize?

So, we've recently added a few areas, so traditionally our focus has been in South Asia, especially Bangladesh, and Sub-Saharan Africa, but over the last year or so we've been running projects in South and Central America, and have now recently started a project in India. We've done projects remotely in over 80 countries.

As in, we have worked together with partners who are doing work in those countries where we've supported their work, so our reach has been into over 80 countries. We have offices, real brick-and-mortar offices in three countries but we have reached into many, many more, and I would say Central America, especially South America also, but Central America is going to continue to grow. We're doing a lot of stuff around the Venezuela refugee, migrant situation. South Asia is a huge area of so many different aspects of our work, we're getting engaged in there and Sub-Saharan Africa is, yeah, we're expanding in there as well. Kenya, Nigeria, DRC, Democratic Republic of Congo, but now yeah, working in Uganda and Rwanda and other countries as well, so it's an expanding situation. Just starting a project for Amharic, for example, which will involve not people in Ethiopia, but will involve us working with Ethiopian staff.

You mentioned Venezuela, so when I think of Venezuela, I think Venezuelans crossing the border into Colombia, but that's Spanish-Spanish. Where's the language component? Right. Well, I mean... Don't want to put you on the spot.

So, there are regional languages. There are non-Spanish speakers, but you're right. Spanish, it's not a, obviously, then it's not a translation problem that we're facing, but it is an access to information problem.

So, that's one aspect of it. First of all, what we've done is opened up a new channel of communication through conversational AI, through a chatbot that we made available in Peru, Ecuador, and Mexico. An important aspect of that is that it's really about providing people access to information through a new channel, so it's not translating anything. The information is all there, but this channel is making it accessible to more people, that's the first thing. The second thing is the streams of people who are flowing from the South towards the US.

They're not just Venezuelan migrants or refugees. Because we were listening as well as speaking, because it was a two-way conversation, we were able to discover that there were lots of French speakers in this flow of humanity. It turns out that there is an established flow of people from West Africa who are going through this and joining these channels and trying to get to the US, so there were a lot of questions coming in French around information that they were trying to get ahold of. On top of that came Haitian Creole, because the continued crisis in that country has led those people to join that flow as well, so these things are never as simple as they look at first.

You mentioned chatbots there, so can you tell us how that works exactly? Like how do chatbots enter the picture and when, because you said it's not always appropriate, but how and where do you use them?

Yeah, so, chatbots have become a thing.

They were a huge hype, and rightly, because I think they're a fascinating new channel for communication with people, but they have suffered the fate of every other technology, which means massively over-hyped and then people get disappointed with them because they make assumptions about them which the technology can't live up to. So, a few things. I think chatbots are huge. It's not a very well-defined thing, right. Chatbots can be incredibly dumb.

They can be very transaction-oriented, like they can just say, I can help you open a bank account or move some money from one account to another or whatever in banking. They can help answer your support question if you go onto a company's website, whatever. They can be incredibly dumb things, but if they're used properly, then they can really be used for two-way interaction, for conversation.

So, conversation's a big word, but what we mean by that is that we listen as well as speak, so we're actually making the effort to elicit information and hear what the person has to say as well as just saying, we're going to tell you some stuff. Now, obviously, compared to a poster, this is a two-way conversation, right, automatically, and that's a really interesting area, that we can build really interesting experiences like that. Second thing is they can be multilingual, so everyone in Sub-Saharan Africa is multilingual, right, so you can't just put something out in one language and expect it to work for everybody and so, what the tech community has just discovered is this thing called code-switching, which has been around forever.

But it's this idea of people switching languages as they speak and this is really hard for traditional tech to work with, and so what we've been working on is building multilingual chatbots that can exist in that kind of environment, can be two-way and can be comfortable with switching languages in the middle of a conversation, et cetera.

The last part of them that I think is key in our approach has been that we really take the listening piece seriously, so we look at the conversations. We work out what the conversations are about and we were able to discover, for example, in our first chatbot, which went out in the early days of COVID in Sub-Saharan Africa, in DRC, we took the WHO's information and put it into a chatbot basically as an FAQ, so you could ask it things about what the WHO was saying about COVID. Now, what it turned out was that young mothers, especially, were really interested in, can I pass it to my child, or if I'm pregnant, will the child get it? Or what's the risk to my baby? WHO had said nothing about that yet, but we were able to give them feedback that this is a huge content gap that you have because we were listening to what the conversations were about, and then we were able to plug that content gap and move on, so that listening piece, it's not just a slogan, it's actually something that makes us better in the way we engage with people. So, we're learning.

Sorry, just one last thing. I think we're learning how this works. We've done now a series of chatbot programs in South/Central America, in India.

We're doing one in Kenya and Nigeria and whatever. Every one's been different, right, so we're still learning what the different ways of using this tool are depending on the context.

And what's the interface there? I mean, is it a smartphone or like a semi smartphone, and if it's a smartphone, does it have to be connected to the internet or can you kind of from time to time reconnect it?

Yeah. There are a couple of different ways. I mean, I think what we've done so far, largely, has been on smartphones, with text messaging on Facebook Messenger or WhatsApp or Telegram, whatever.

We can also do it via SMS gateways and so going into lower level connectivity, people have that. We also have a thing, I know this won't go out on phone, but we have a hardware solution which we call Tiles, which is a Raspberry Pi with a little screen on it. You can put conversational AI onto a Raspberry Pi, which is a little credit-card-sized computer, and then you can put that out in health centers or hardware stores, or places of worship or wherever people gather who have no connectivity or perhaps no literacy or low levels of literacy, and give them the opportunity to literally talk to it and have a conversation. So, the conversational AI as an engine can be used in lots of different modes.

What do you see as coming next? I mean, where are you sort of spending time and money on R&D when it comes to sort of AI and machine learning-based language technology?

So, we work with existing technology wherever we can, so if language technology exists we're not going to build it again.

But in many cases, especially for the most marginalized people, they speak languages that aren't well supported by commercial software and so we spend time understanding how we can quickly build language technology for those languages, often with very little data available, so often we have to create the data, which is another context where we want to mobilize our community to help us create data more quickly for these languages. So, we built core technology for voice recognition, for the underlying machinery for chatbots, as well as machine translation, which we continue to work on. But at the same time, we're also looking at not just building the engines, but actually building applications, so opening up channels of communication in those languages and so it's a multi-layer thing. There are amazing networks now of local experts, so much expertise, especially in Sub-Saharan Africa.

There's a network called Masakhane, which develops natural language processing technology. We're collaborating with them on developing expertise, working with local experts and often researchers or young students who are doing their Masters or their PhDs in these areas, and helping them also understand where their technology can be used to have social impact. So, it's a very collaborative environment and the research is really around those key three areas in low resource languages.

We also work with Big Tech, with Google and Microsoft and others, Amazon and Facebook, and others on collaborative efforts, open-source collaborative efforts to drive more availability of technology for low resource languages, so yeah, there are many players in this.

Can you tell us a bit more about Africa specifically? And I think you traveled there pre COVID, not sure since then, but like what's kind of the language technology environment there? Also from a talent perspective, is it hard to find these people or convince them to help you, work for you? Are you competing with Big Tech, like how does that work?

Yeah, I would never say they work for us. They hopefully work with us, which is maybe a pedantic distinction, but an important one. There are huge networks so, when we started the whole technology piece, we were the only people doing this, seriously, nobody else. We helped Google and Microsoft build their first Swahili machine translation capabilities back in the day.

But now, we are a tiny part of a much, much larger network of experts. Some experts, as well as a lot of people who are getting into this, so any young student who gets into machine learning and AI wants to do either natural language processing or vision, they're the two things that you do. So, there's been an explosion of expertise and talent, and so it's become less, we need to do this on our own and more, we need to work out how to work with this expertise to help it have social impact, as I say, and not all just remain as research papers.

So, I mean, I was really interested in what you said there, with kind of collaborating a little bit with Big Tech.

I mean, can you tell us a little bit more about how that might work in practice? Is it sort of they're feeding information or data into you? Do you channel anything back once a project is closed? Where's the benefit for both parties?

Yeah. I mean, clearly, it has to be a win-win. We only work with them on projects which are aligned with our mission and where we're seeing impact for us. One major initiative was a project we ran together with those big organizations I just mentioned around COVID information. Most of the existing machine translation engines didn't work very well with COVID information.

They didn't know what social distancing was or all of the other terminology and jargon that we suddenly invented when confronted with COVID, and so we worked in a collaborative effort with all of those organizations in a thing called TICO-19, where we built language data to train all of their engines on, or to improve all of their engines with respect to COVID information, so that all of their engines were getting better. All of the content was open-sourced, released under Creative Commons, so it is widely available. It wasn't just us working for someone and giving them content that belongs to them and was locked away, so it was an effort that was for the common good and helps drive things forward. Of course, lots of people still use Google Translate and Microsoft translation software in order to get information, so it's not something we want to ignore just because they're commercial companies that make lots of profit. That's something that's not interesting to us.

What's interesting is are we achieving our mission?

Over the past two to three years, have you noticed an improvement in all those low-resource language MT engines and applications? Because we've noticed that there's been a huge interest, especially also from Big Tech, around low resource and then you have these massive multilingual models, which are supposedly kind of translating anything to anything and like, is this PR, is it research PR or do you feel there's an actual breakthrough happening here?

Great question. Great question. I think it's over-hyped.

I mean, these are serious research efforts and the results are really impressive and, directionally, really, really interesting. What it doesn't mean is that we now have machine translation capability for 100 languages like we have for English, French, Spanish, and the like, which are now almost at human levels. The translation quality deteriorates quite quickly once you get past those first few languages and so, yeah, often people will think that there is machine translation capability in all of these languages. It's not usable in the way that it is for the big languages, but I'm old enough to remember what machine translation was like for English-French 20 years ago, so it's a journey, right.

It's not a binary, suddenly it works or it doesn't work. It's a journey that we're on and these things live by data. They live by getting used and getting people engaging with them and so it's a huge milestone to say, at least we have capability in those languages and it will start to get used. The key thing is to make sure it gets used and doesn't stay in the research community because that way it will grow, it will get better and ultimately it will get there. It will get to the place where English, French, Spanish is now, but it's not happening quite as fast as some people would like you to believe.

How can our listeners get involved with your efforts? How can they support? What should they do? Yeah, how can they be of use?

So, your donations are always very welcome.

We like, obviously, we need money to run the operation as Florian said and so donations are always very welcome, but I think from anyone with an expertise or a talent in language we have a huge range of things that we're engaged in and you can sign up as a volunteer on our website and join the community and we'll be in touch to discuss how we can engage with things. I mean, there's, as I say, there's a lot for us to do. We have grown incredibly fast over the years, mainly because there is an almost unlimited demand for what we do.

This idea of giving people access to information and reaching the most marginalized people has enormous resonance for anyone with a background in language who understands how much of a barrier it can be to be stuck in a situation where you don't understand what's going on around you. Imagine that multiplied by a thousand and imagine the world going on, the internet happening and you're not able to be part of it, and that's really a huge motivation for people to get involved. The way in which they get involved will change over time, but join our community and become part of it. I would strongly recommend it.

So with so much going on, what are the top two or three exciting initiatives for you in 2022? For CLEAR Tech. Let's limit it to CLEAR Tech.

So, we have some new channels, this Tiles device, where literally last year, for the first time, we were able to get voice recognition and conversational AI on a Raspberry Pi.

This is a new development that is happening and we don't know where that will go. That's now getting down to the size where for $15 you can buy a little computer on a half-size credit card-sized chip and put tech on it, so that, we're really interested in where that's going. Where can we put these devices? What kind of reach can we have with those kinds of technologies? And chatbots have only just started, we've scratched the surface on that, so those kinds of interactions are really interesting. A big area for us is really how do we mobilize our community for some of these efforts as well? How can we mobilize them to help create language technology for their own languages? So, our community of Hausa speakers in the north of Nigeria, how can we work with them to help build more Hausa technology and validate the data that we're seeing and label it and help us build better language tech.

So, that's one aspect to it and I think on the research side as well, how can we gain more insights from our community that will help us do the right kinds of things? So, I think engaging the community and really bringing the technology, we used to have this team called Special Projects who were working on little tech things to try out, proof of concept type thing, and I think last year, and now really this year is going to be the year where that moves into mainstream and that's going to be a lot of fun.

Exciting. Well, thanks, a big thanks to you and to the whole team for the really valuable work you do, really appreciate it and hopefully, we can help a bit by spreading the word again and helping with the positioning of CLEAR Global now heading into 2022. So, Andrew, thanks so much for joining us today.

Thank you for your time. I really appreciate it. Thanks for giving me the opportunity to talk to your audience. Absolutely. Super interesting.


2022-01-24 21:41

