#tradumatica20_perspectives
Pilar: All right, good afternoon everyone. My name is Pilar [Music], and I am currently the editor of Revista Tradumàtica. As you probably already know, this is one of the series of panels we have organized to commemorate the 20th anniversary of the journal. The series is organized by Celia Rico, who is right here. Celia, hello, would you like to introduce the session we have right now? We are starting, unless you need a couple of minutes to gather your thoughts. I'll go ahead, then.

The series has been organized by Celia Rico, Manuel Mata and myself. Up to now we have been talking about training, translation and interpreting training, and also about research in the field of translation studies. The session today is on future perspectives: what we have to expect is going to happen with translation technologies, and how they are going to evolve in the future and affect translation practice and the translation sector. I'd like to thank Celia and Manuel for having invited me to moderate this panel in particular, but above all I would like to thank our two guests for agreeing to take part in this session. As this is a round table with two guests, we are going to try to establish a dialogue between them, so feel free to respond to each other's positions. But before explaining how this dialogue will develop, I would like our guests to give us a glimpse of their ties to the world of translation. Let's start with Mr Jurgen Gren, currently Director of Resources at the Directorate-General for Translation at the European Commission. So Jurgen, please, the floor is yours.

Jurgen: Thank you so much. I think you said it all already in just that sentence, but yes, my name is Jurgen Gren, I'm a Swedish national and I'm the Director of Resources in DGT, the Directorate-General for Translation. Basically I'm responsible for human resources, but I'm
also responsible for eTranslation, the machine translation system of the Commission, which is used widely across the European Union. So this is really my biggest link, my current link at least, to translation. But in the past and in my life I've always had a very great passion for languages, for trying to make myself understood in many different languages and to communicate across borders, and this is why I actually work in the European Commission, as a sort of convinced Europhile. This is why I'm here. So the connection, if you will, with the world of languages has been there since a very early age, from seven or eight years old, something like that. So I'm very happy working here in DGT as Director of Resources, with a really great team who is trying to make the best possible machine translation. If you ask me, it is the best machine translation system in the world. I'm just putting the bar at a certain level here, you see. But that's really my connection and my background.

Pilar: Thank you. You have brought some topics to the floor. Now it is the turn of Mikel Forcada, who is a professor of computer science at the University of Alacant, and I think he has something to do with Apertium as well.

Mikel: Yeah, I've been teaching translation technologies at the University of Alacant since 1995. I'm currently a professor there. I also started a small company called Prompsit, which does language technologies, and as part of our research and development on machine translation we started the Apertium project, which is still alive. It's an open-source rule-based system, as you know. And for about six years I was the president of the European Association for Machine Translation, so I've been quite involved in organizing conferences: the EAMT conference, which we design in a way that brings together all of the actors in the machine translation arena, so users, translators, companies, researchers,
vendors and so on. So I'm very much involved in these things, and I always say that this is the best thing in my life, because I already liked languages and computers when I was a kid and I'm doing exactly that, so who could ask for more?

Pilar: We are very lucky to have you both here. Thank you again for accepting this invitation. As I already said, in this panel we would like to address the future perspectives of translation regarding technologies. Throughout the previous round tables in the series, and in other forums, we have pointed out the most disruptive technology issues, or those that currently have the greatest impact on translation teaching, professional practice and industry. In this panel we aim to approach the future perspectives of translation technologies from an in-depth knowledge of their present. We will try to provide answers to some of the most common questions raised by the academic and professional community, and also raise questions that may not yet have an answer. The current development of machine translation and its consequences are likely to occupy a large part of this dialogue, but we also want to broaden the focus and pay attention to other technologies related to natural language processing or ICT that also have an effect on translation.

So I suggest we start at the end: let's start with the users' perception of translation. Based on the experience you both have in European institutions or in any other context, now that we all have mobile devices with translation apps, what do you think the end user's perception of translation is, and of interpreting, and of translators and interpreters themselves? Jurgen, would you mind starting with this topic?

Jurgen: No, not at all. Maybe I should say in this context as well that before I started here as Director of Resources I was driving the Digital Single Market agenda for the Commission with Vice-President Ansip. This is why perhaps I'm also
here in this role, to, shall we say, bring the digital aspects also into DGT, which is already very digital of course, so it's not that it started with me in any way, but that was my background, just to say. There were a couple of questions which were sent round by Mikel beforehand, and I think it's an interesting way to start, to be honest, to ask: what would a person in the street reply if you asked what a translator is? I was thinking about that and I was looking around, and I found that many people, or several people, even here in the house, would have a kind of negative view and say: ah, they don't know who we are, nobody really cares about us, that kind of attitude. I wouldn't say that it was the majority, or anywhere near that, but a little bit like this. And I would say I completely disagree. It is a positive view: if you ask people who are not in this profession, there is a very, very positive view. People know what translators do, people know what translation means. It's actually one of the few professions, in my humble opinion, which has a very clear profile of what they do. If somebody asks somebody on the street what a Commission official such as a Director of Resources does, that would be a little bit more tricky to answer, I can promise you. But for a translator it's very clear.

There are, however, perhaps two, not necessarily caveats, but nuances here. One is that the differentiation between interpreter and translator may not always be there; if you ask the man in the street or the woman in the street, this will be a little bit fuzzy. But they do understand that both of these professions ensure understanding, that's exactly what they do, comprehension. They understand that these guys and girls are helping me to understand something which I otherwise wouldn't understand, and that clarity of purpose for
me is key. And it's very interesting, in my humble opinion as well, to see how translators are extremely tech-savvy, extremely. I worked many years in other parts of the Commission where people would feel that they are tech-savvy, and they would work in Word or in Excel or some PowerPoint, something like this. But here they have a whole CAT environment, a huge number of plugins, a huge number of different, difficult technological instruments or tools that they use every day, and they combine four or five of them, highly specialized. So I think again it's a very, very clear profile. And maybe for me the conclusion here is that human translation is not replaced by machine translation. There is a difference and it will never go away. I have huge confidence in the future of translation. Maybe I'll stop here, because there will be other questions coming along, so I don't want to monopolize this, but that's my sentiment.

Pilar: Probably one of the topics that we will be dealing with during the session is how we perceive machine translation. But Mikel, you may also want to add something on this topic.

Mikel: Yeah, I'm sorry but I'm not that optimistic, and not because I'm a tech guy, but I was thinking probably about young people. Maybe older people still know more or less what a translator or interpreter is. I don't have the statistics, so maybe we should do this survey, asking the young man or woman on the street, the boys and girls, what a translator is. They might come up with one professional, which is the interpreter, so the interpreters are more visible, I think. But when you say "no, no, not those", they say "Oh, you mean Google". So my guess, which I don't have any data to back, is that "translator" is more and more each day meaning a machine translation system. I agree with Jurgen that it's a profession with a clear profile, but I think it's a profession
that is hidden from most regular people. They might not know exactly how it works. If you work in a European institution maybe you deal with translators and translation all the time, but I'm not completely sure that people know, and I'm actually worried about the invisibility of translation professionals, text-to-text translators. I'm a bit worried about that in view of what I hear people say, which is not statistically significant perhaps.

"Translators are not going to be replaced by machine translation": well, let's see, because there are some areas where we are doing without professional translators already. One could contend that the areas where machine translation has started to be used without any human intervention have been new areas, areas where there was nothing before, like for example translation of customer reviews, user-generated content, reviews on Amazon or on TripAdvisor and things like this. So probably machine translation is being used where no human translation or no professional translation was used before. But there is also the question of how good machine translation is getting and what kind of translation jobs it will leave, and we will talk about that during this session. But I think the profession is going to change a lot. There is a very strong industrialization of translation as a profession, which means that translators, who are the workers of that industry, may not be as visible as I think would be desirable.

And also, I didn't want to forget to congratulate you on your 20th anniversary, as you mentioned this was the reason for this. And regarding the question about my opinions: I am not a very good visionary. I was explaining this to Pilar before: when you talk to me and ask me about the future, I'm going to get
it wrong, absolutely wrong I think, or at least take it with a grain of salt, a pinch of salt, because for example I didn't see neural machine translation coming, and I was doing neural machine translation in the 90s, starting to play with it, and I didn't see it coming, and now it's here to stay. So not a good predictor, just in case anyone wants to take my opinions too seriously.

Pilar: I think sometimes raising the right questions allows us to develop the right research that we need, so no worries, we will pick up your questions at least. So do you think that, from the user perspective, now that they've got these mobile devices where they can translate short messages as soon as they need to, on the road, do you think they have the feeling that they can do without translators? Jurgen, what do you think?

Jurgen: I think, since this is also a dialogue, I'd like to come back a little bit on what Mikel is saying as well, which is linked to what you just asked now: mobile devices, translation, etc. In my opinion it's not really a threat to translators, because they never had that market. This market was non-existent, and I think you said it yourself, Mikel: you said, well, let's be careful about things that are happening, and then you said, in markets which have never used translators before. So I'm not being controversial, Mikel, I'm just saying yes, but one could see it in a slightly different way as well. You can say: hang on a minute, my translation capacity, my skills, my expertise are still needed in a world with machine translation. And in a world with machine translation you have hundreds of billions of pages which are being translated every day, which actually brings into the field that I have chosen in my life, making people understand other languages, even more possibility for people to understand and to comprehend. So that in itself is, I would say, a good thing for me. But of course,
I mean, you are right, the translation profession is changing, and this is why I also mentioned that I used to be in the Digital Single Market, because that means that I was driving, at least at the political level, part of the dynamics behind digitizing the European economy. And I saw what actually happened there, and I saw that if you're not on this high-speed train, if you're not trying to master these tools and use them (you don't become slaves to them, you use them), if you don't do that, well, then unfortunately this train will either run over you or run away from you. And that is something which I think is even worse than moving into and harnessing, like we said in the Digital Single Market, the digital evolution.

And just two things more to say, about demand and value. So you are right, something has changed. The demand has changed, it is huge, it's more than ever. The value that some parts of the market put on some parts of the demand, if you will, is also changing. So in non-specialist areas they don't put a very high value on translation, this is true. Many of these areas didn't exist before, or didn't use professional translators before, like, I don't know, tourist menus and stuff like this; you've seen the horror examples, I'm sure. But I think that in the area where professional translators are the best, which is the professional translation of high-value documents, this is not going away, and that's why I'm, in my opinion, positive. That's why I have great confidence in the future.

Pilar: Yeah, well, on this plan of digitalizing Europe: in Spain in particular there is a programme which is called the PERTE, and right now the commissioner for this programme is looking for initiatives to fund, so that's why we need the right questions, probably. Mikel, please.

Mikel: Yeah. [Music] As to machine translation and adding value, of course,
I mean, we know the translation profession is there because it adds value to documents which, for example, would be unreachable for other people, so there is clearly a value added here. And the amount of value added, and the actual price that the market puts on that, depends a lot on what people perceive about it. So the perceptions of people are very important.

You know, when boomers like me got married, that was 33 years ago in my case, one of the pieces of equipment that we bought for our house, and we were very, very careful in selecting it, if we had the money, or just making the most of our money, was a hi-fi system: turntable, amplifier, maybe a radio tuner, and we wanted the best speakers, because we wanted hi-fi, high-fidelity sound. Now we listen to music on our mobile phones, in compressed formats, in the car, with crappy headphones, and we are getting used to "good enough". We have much more music at our fingertips, at our ears, but we are happy with the reduced quality. So this might be another change in the perception of translations by people. People may have started to get used to dealing with machine translation and making sense of bogus translations that they see on an Amazon page or on a TripAdvisor page, because they know: oh, this is machine translation, or whatever, they didn't know how to translate this Chinese stuff, and I know what I'm buying, so I'll just check around. So they know how to deal with this kind of output. So there's something new, particularly with the younger people; they're just very practical about all these things.

Of course there are areas where translators will always be needed, and when I say always I mean at least in my lifetime, but I said my predictions are usually wrong. So I think they will be needed, and I'm absolutely in favor of seeing machine translation and all of these CAT tools as tools. That's what
they are, and they should be taught as early as possible to professional translators, who should get used to them, because as Jurgen said, that's a train: you either jump on it or just watch it leave, that's your choice. And it's the elephant in the room, I mean, no one can deny this exists and the whole profession is changing. I'm not a translator, so I don't know, but that's from my conversations with translators, in conferences, etc. So yes, it's a new market, and this changes the profession, because professions are related to what the market demands. And finally, well, I'll leave that for the next question, probably.

Pilar: Thank you, because we've got a lot of topics we would like to deal with. Please remember that you've got access to the chat. All the attendees may write their questions, comments or suggestions there, and Celia will bring your questions to the floor. And whenever you want, because this is a kind of dialogue, if you see that there is anything that might be interesting at any point, just please jump in.

Speaking of perceptions, I would like to invite you to share with us some of your perceptions concerning translators and interpreters, in particular about how translation and interpreting professionals are adapting to the new scenarios resulting from the development of technology. Are they technologically empowered? What are the technological reasons that guide translators' and interpreters' decisions? May I start now with Mikel, for instance?

Mikel: Thank you. Yes, I think professionals, and even my students, are adapting to technologies. I mean, it is either that or die; you're going to miss the train. The problem I see is how they are adapting to these technologies, and I think as teachers, as instructors of translation technologies, we should think about that. I have always seen my
teaching of translation technologies as a case of trying to help future translators empower themselves technologically, so that they can make the right choices and they can assess the different technologies for their needs, and not get carried away by perception, but rather try to measure reality, try to check exactly how much a tool is actually helping them or not. So that's a lot of teaching to do, because it means you have to teach them about economy, you have to teach them about timing, about productivity, about how to measure themselves, how to make sense of numbers, which I have found very challenging in my teaching.

And there's also a problem: machine translation is changing. We have a new kind of output now, and I think post-editing is completely different now, because machine translation systems, which are now neural, produce new words, they invent words, they produce deceivingly fluent output which looks so good but is probably not a translation. So you have to check carefully whether the output is exactly a translation, or whether something is missing or, even worse, something has been added creatively by the system, in what we like to call hallucinations. So this is something that is actually happening, and this is changing the post-editing business completely, because if the canonical way of doing professional translation with machine translation is post-editing its output, then we have to think about that.

But the technological empowerment also involves some kind of, I would say, professional awareness about translation, and the technological ingredient of that professional awareness, which I think we will have time to discuss later. But I think it's very important: I don't think students are currently aware, or as aware as I would like, of the fact that this is turning into an industry and
many of them are going to be just workers in an industry, and they're going to be a piece in the machine, and they need to find their way there so that they don't just get lost. So all of these things I think are very important too, and the technology plays a very important role in the way the professional relationships are now changing in the translation profession. So this is a very important thing that we should be teaching. I'm talking about students, that's what I deal with, but I think maybe professional translators would agree or not with this, I don't know.

Pilar: Jurgen, do you see them as workers, in the sense that Mikel was pointing out?

Jurgen: I hope we all are, but I can come back to that maybe later. These are not the kind of notions or concepts that speak to me so much. What I really have more ease with understanding is the word empowerment, and to see whether the empowerment of translators is there, whether it's high, whether it's fast. I would say the empowerment is quite high, and I don't think that the evolution is very fast, to be honest. I think the empowerment is that the translators are trying and testing the different tools that are put at their disposal, and they start using, almost organically, a lot of these different tools which are coming along: language memories, terminology databases, CAT environments, machine translation proper, etc. So all of these things are coming in, I think, quite organically, which in my vocabulary at least means that empowerment is quite high.

And also, Mikel, you asked a question in one of your mails, saying: is technology rammed down their throats? I think it's a very good question, it's an excellent one. And I would say no, but I can understand that the perception will be a little bit like: yeah, I have to use technology all the bloody time. But then I ask: would you like to translate without any technology
tools? No. The ones that they are getting used to are the ones that they want. It's a companion which they don't want to live without. There is no translator in this house who wants to translate without language memories, zero, and very few want to translate without machine translation. And if we think about it, these are some of the most advanced AI-based systems in the world, and they are used by translators organically. So I would say empowerment for me is quite high, but I'm not entirely sure that translators themselves think in that sense, to be honest. I'm not a translator, but I think that they don't see how tech-savvy they actually are, and I think that's an interesting observation. At least from my horizon it seems that way, but I know, having worked in the Digital Single Market before, how incredibly tech-savvy these people are. So for me, empowerment is high.

Mikel: I just wanted to mention that Jurgen brought up this question about forcing technologies down the throats of translators. This sentence comes from a translator, who was actually complaining about that. So there's at least a sector of translators who think that this is something that is being decided for them, that they're not participating. And the European Union has, at least in its documents for Horizon 2020, I seem to remember, a concept of responsible research and innovation, which meant that any innovation, to be a success, should involve all actors in the innovation. So this was the sense of my question, in the sense that you cannot just force this technology onto people. They have to really appreciate that this is going to help them, and the best way is to show them numbers. But then, to show them numbers, there has to be some numerical literacy, which is not that good in the students I get, in the sense that they don't
know how to, for example, figure out how much the cost of a word would be for them if they want to make 30 euros an hour, for example. So these are very simple things, very simple proportions and arithmetic, that they don't master. And the only way to convince people about technologies is: hey, this is saving you time and money. I remember in 2010 there was a study by Autodesk where they asked translators: how do you think you work better, by post-editing, by translating from scratch, you don't know, or the same? And you got responses all over the place. Some people believed that post-editing made them slower, other people believed that post-editing made them faster. The measurement said that post-editing made all of them faster. So perception of productivity is not the same as productivity. Productivity should be measured, and this means numbers, counting time and money. So that's the part of the empowerment which I find more challenging when I teach in class.

Jurgen: Yeah, but I also wanted to perhaps mention that what you said, Mikel, which is a very good word, is mastery. I think the mastery of technology is key, and this can be more or less difficult, to be honest. I'm not saying that everybody comes in here and becomes a computer whiz kid straight away, this is not the case. It takes some training, it takes a long time, but then you get used to it, and then you use it more and more, and you see that it helps you.

And I want to link back a little bit to the demand as well, Mikel, because the demand is just growing all the time, and there is no way that human translation can even face this. Just inside the Commission, we can't face the demand which is here. It's just impossible for us, we simply cannot do it. If I look at how many pages we translate with eTranslation every year, we are at 200 million, something
like this. The capacity for our translators to translate by hand, and with all the tools that we have, is around one and a half, 1.7 million pages per year. Okay, so there's a little bit of a gap between what's going on in the machine translation world and what's going on in the human translation world, you know, and I include also the pages that have been partly translated by machine translation in that. So it's simply impossible, and that's why I come back as well to this about the mastery.

The future is difficult to foresee now. I mean, we have crisis on crisis on crisis. Sorry, this is a parenthesis: I feel really sorry for politicians who've been elected in the past four or five years. The stuff that they have had to manage, oh God, it's difficult, really difficult. But that was a parenthesis. So it's of course difficult to foresee, but what I can see is that human translation, that notch of perfection that machines cannot give you, in texts with high value where you actually need it, that is not going away, no. And I say the same thing to the 50, 100, sorry, stagiaires that we have in the Commission, in DGT, every year, 50 plus 50.
Every year we have that, and I hope that they listen, I have to say, because at least that's my observation of reality. Now, whether you will have the best possible foresight in the world [unclear], but this is at least what I can see.

Pilar: Okay, so let's go back to the floor. Let's talk about machine translation directly. It has already been mentioned; many of us work regularly with machine translation. So what do you think are the main handicaps of machine translation right now? What are the next borders, the next challenges that machine translation will have to face in the short-term future? Jurgen, if you don't mind.

Jurgen: Sure. I think one of the things with machine translation as it is today, NMT, the main concern is actually how fluent it is. So the MT fluency kind of lulls end users and professionals into some kind of sense of security, which then brings on an increased need, if you will, for focus from translators. So this I think is one drawback. The other drawback is of course that it isn't human, so it doesn't produce the level of perfection that is needed for certain types of documents.

The other thing, and where we are going now, is that NMT, neural machine translation, was really a revolution, a step change. We are now waiting, if you will, for the next step change to come. We are having decreasing returns, if you will, on machine translation. We can make it better, but the improvements are somewhat in the margins, I would say. What is important now for us, or generally speaking, I would say, for the sector, is data, data quality, ensuring that we have a lot of data, a lot of corpus that we can use. We can come back to this later, but this is also even more important for smaller languages, that this corpus is made available as clean data. Mikel has
much more experience in this than I do, but I think I'm not completely off the mark when I say that the quality of data is something which is really important. The next step for us as well is to see whether we can use high-performance computing to have an additional step change, if you will, in the quality of machine translation. So those, I would say, would be the main challenges out there. There are certainly others, but for us these are the things that we are trying to work with right now.

Mikel: I'll jump in on challenges. One challenge I see is that the current technology for machine translation, neural machine translation, is heavily if not completely dependent on proprietary technologies that run the GPUs. So the GPUs, the processing units that you use for MT, need proprietary firmware; you don't really know what's going on in your computer exactly. So there's currently very little to be done with neural machine translation in an open architecture with open software. I see that as a challenge, because this means that we depend on particular vendors.

Another challenge, of course, is that neural machine translation is extremely intensive computationally, which means two things. One thing is that it's extremely expensive. I mean, you need very strong machines, and I don't need to tell Jurgen about this, because he runs the translation factory, so he knows how expensive these machines are. But the other thing is that it's very expensive for the environment, because to produce all the electricity you need to run this you are going to end up producing a lot of CO2, unless you source your energy entirely from renewable sources.

Another problem, which is related to something that Jurgen said, is the quality of the corpora that you use to train it. Jurgen knows that we are in a series of projects where we are actually trying to generate clean corpora, and it's actually very
challenging, and one of the most challenging things is detecting machine translation when you think you're dealing with professional output, because you ideally need your corpus to contain professional translations and not machine translation. Otherwise, to put it graphically, your system would be drinking its own urine. Okay? So it's very, very important, and neural machine translation is so hard to distinguish automatically from professional translation that this is a challenge. It's a very important challenge, lots of people are doing research in this, and it's quite challenging. And yeah, that's basically it. The other thing: the fact that machine translation basically relies on very computationally intensive technologies and hardware on the one hand, and a lot of data on the other hand, has as a result an enormous concentration of machine translation power in a few hands. Okay? So this is something which is not good, or at least I don't see it as good. I mean, eTranslation is a nice island, a nice exception to that, because it concentrates machine translation power, but fortunately it's public power, so at least SMEs can use it, and the EU uses it to break the digital barriers and the language barriers between Europeans. So that's a success story, I think. But then we have the Googles, the DeepLs, the Microsofts, and they harness a lot of machine translation power, and it's in a few hands. So I don't like that.

Jurgen, would you like to add something to that?

No, I mean, yes. This is something that I saw also in the other areas I worked with, in the more Digital Single Market areas, and this was a problem also there. So you're right again, and I thank you for mentioning that there is an island of hope somewhere and that we are dealing with this here in the European Commission. It is indeed a
secure public system, able to be used by SMEs across the whole of the European Union, and it is used by all the institutions, and it's used, I would say, very extensively: as I said, 200 million pages. Okay, it's nothing compared to Google Translate, of course, but it's still quite big. And the good thing about it is that it's secure, and that's something that, Mikel, perhaps you want to come back on as well, because security of the machine translation system is actually an issue which is sailing up on the agenda. For us, what we do is that we take the data and delete it straight after you have your translation. We don't keep any data whatsoever, so it's fully GDPR compliant, if you will, whereas if you send it to Google or even to DeepL, they keep the data and they use it to improve the machine translation systems. And I've seen, or heard of anyway, public services who send full documents to Google Translate which are not necessarily published yet, and of course Google has access to them. It's in the contract, you just read it, it's there: they can access it and use it in any way they want. So let's be clear also on the fact that security is an issue.

Yeah, security is an issue, of course. We already mentioned a research paper presented at the Angela 2018 about the investment and the funding invested by Europe in research in machine learning, while the biggest ROI is sometimes being obtained by companies somewhere else, outside Europe. That means that this is a big business as well, and that's another point of view to be considered when talking about machine translation. The last conference, the NeTTT conference of 2022 that took place in Rhodes, in which Mikel also took part, dealt with some of the improvements that seem to be approached by machine translation developers, like preparing data to be better understood, if I
may, by machines: tagging terminology to help machines make the decisions that they are not able to make, formality, terminology, things like that. That would mean that the translator's skills might be needed to prepare data for machine translation. Is that inspiring? Do you see that as a possibility, Mikel?

Yes. I mean, this is the old story of controlled languages also. We always seem to think that professional translators are going to come after machine translation does its job, just to clean up the mess, but there's a lot they can do in many other ways. They can interact with the system, for example in interactive translation prediction, where the system kind of helps you write the translation, or with pre-editing, which makes a lot of economic sense when you're going to be translating into many languages. One thing I forgot before, which I think is slightly related to this, is that another challenge for machine translation now is adapting to the user. Currently, I'm not aware of any machine translation system that profiles the user who is reading the output in such a way that it generates the output that that person needs or wants. If he or she is a post-editor, maybe they have a specific post-editing style and they want the output to match that style. If they're just end readers, maybe the fact that they know or don't know other languages may make a difference in the way they read the machine-translated text. So getting into the reader's shoes, something that professional translators try to do every time they're translating, particularly literature, where they have a model of their reader in their mind and they say "this is the reader I am writing for", is something that, at least that I know of, seems to be missing. What we would have to check is whether Google Translate, for
example, produces different output for different people. I wouldn't be surprised, because they're profiling us up and down. They know everything about us, so they could just generate different output. I have never checked that. I know they produce different output depending on the country. So this is a challenge that I didn't want to forget. It's related to this because maybe you want your output to look like a particular kind of output.

Yeah. There are some things in the chat as well which I have a little bit of difficulty answering, perhaps. One is about fully automatic high-quality translation, which is perhaps a question related to universities. But going into what you said, Mikel, and the question that you just posed: we're here talking about metadata, partly at least, and also the role of translators in creating the best possible data. For us, what we're trying to do is to go towards Euramis, which is our language memory, and build as much metadata in there as possible. Metadata is data about the data: it informs you about what's going on with the data, who wrote it and things like this, and this can be important when it comes to deciding whether the quality is good, whether it's an experienced in-house translator or a newbie trainee, whatever. These kinds of things are important for us, and we are working on this kind of metadata. But the other part of what was said here is about translators and their role as data curators, if you will. That is also something which is coming along. The profession is changing. There are other things to do when you are a translator than just having to translate. And even if you had this education and you did a master's or whatever, things change in your life as well. Maybe you are interested in technology, maybe you want to go into trying these new things. These are
very high value-added types of professions as well, so money-wise it's kind of interesting. So these kinds of things are also, I think, open for translators. What I usually say in the house here is that eTranslation is the system of the translators: you are doing the system, this is your system. This was perhaps a new way of presenting things when I came in three or four years ago. But every single day we have 2,000, sorry, I said two thousand, 200,000 segments going into Euramis, our language memory, every night. These 200,000 segments are used to improve eTranslation at regular intervals. So it's not every night that we improve it, but at regular intervals, every six months or every ten months or something like this, we improve the different engines with this high-quality creation by the translators and the freelancers. It is top-of-the-range translations which are then used to make eTranslation even better. So it is sincerely not my IT teams, no, it's the translators' system: they do it, they improve it, they use it. I think that is also something to bear in mind with this. Maybe this is only for us, but maybe it's also more generally spread.

Can I say something about this? Because the thing that Jurgen said is very, very important. One thing that people are not aware of, and this would be part of their technological empowerment, is that the current machine translation systems learn to translate from text translated by translators. Okay? So it's their work that is being repurposed into machine translation. That's one important thing, and I think we'll come back to that if we have time. That's very important because it changes completely the perspective of translators when they are translating things, particularly things that are going to end up somewhere public,
because they will be crawled and filtered and used to train machine translation systems. The other thing I wanted to say is something I just forgot, so I'll just let it run. Oh yes, I know: data curation. That's another key thing, so, Jurgen, I'm glad you brought this up. Many administrations in multilingual countries like Spain, because Spain is a multilingual country, or at least some of the areas of Spain are, are not aware of the fact that they are generating a lot of value by translating, and that that value is being just thrown down the drain because they are not keeping a proper translation memory as the EU is doing with Euramis. And the EU is not only doing that, it's also publishing a part of it so that people can train their systems and do machine translation. That's the DGT Translation Memory, which is an invaluable resource. So this is an example of very good data curation and data maturity. But most administrations in Spain, where, as I said, more than a quarter of our population lives in areas where there's more than one official language, are just translating everything over and over again. Fortunately, for at least some of the languages in Spain machine translation is very fast and quite accurate, so people can easily generate translations. Well, think about Basque. Fortunately, the Basque Country government already published some translation memories, and that's a very good idea. But I think this is particularly important for public institutions, the EU and other public institutions: they should make the most of taxpayers' money by actually making these resources available to society, so that people can improve things, they can make their own machine translation, they can do things. Because one of the keys to security is having your own system, or one you trust. But if you don't trust anyone, you have it in your house and you run it there, so you
need to train it, and you need to train it in-house, because otherwise you don't have any. So, depending on your level of security, all of these things are very related, and I think there's very little awareness about some of these issues. At least when my students come to my translation technologies class, no one has been talking to them about this, and then I just have a limited amount of time to tell them. But there are a lot of things that are very important in their empowerment, knowing about these things, and one is data curation: just keep your data clean, ordered and well available, so that you can use them in the future. Even just as a freelance translator, that makes a complete difference.

That's really very interesting, what you're saying, Mikel, and I agree 100 percent. One of the fights that we are trying to have is to make member states understand that they need to invest a little bit in a language programme, and to invest a little bit in the cleaning of that data, and to invest a little bit more to ensure that somebody can actually use this data. So we have an open data portal, and we have public sector information types of initiatives whereby we are trying to get the member states to ensure that whatever language data they have, they make it public. We are trying our best as well, making part of Euramis available. We can't give all of it, but part of Euramis, 20 percent, something like this. Still, we're talking about 2.5 billion segments, so it's quite a lot. We are making that available across the board. And yes, I can see it, thank you, Maria, for sending the link. So this is something which is really important, I think. But there's another discussion perhaps to be had in terms of digital extinction and things like this, and how important this kind of work is for smaller languages. I mean, we have worked with the Icelandics, and they are truly afraid of being digitally
extinct, for real. There are, what is it, 450,000 people living in Iceland, something like this, speaking Icelandic, so less than half a million, and nobody in Iceland uses Icelandic when they go on the internet. This is what digital extinction means, basically. So of course to have eTranslation and things like this is hugely important for them, so they are great sponsors of ours. But they also see, shall we say, you could almost say that language technology is part of nation building in the future, or nation preservation in the future. You can almost draw those kinds of parallels if you want to go to a very high level of reflection. But okay, maybe I'm moving away from the real area here.

I think I would like to deal with that, which is the balance between different languages, major and minor languages. Are they represented as well in all the systems that we usually use? Is machine translation building a bridge between major and minor, or from minor to major, or, I don't know? What do you think about that, Mikel, if you want?

We have a stake in that, because we've been working on Apertium, and one of the main tenets of Apertium was that we loved small languages. So we started to write machine translation systems for languages which still don't have any other machine translation system, unless Google has published them yesterday: for example Asturian, or Aragonese, or even Occitan. These are languages which are in terrible danger of being lost, and a language is lost in this world today if it's lost in the digital world. So I completely agree with Jurgen here: you have to be alive on the internet if you want to be alive as a language. I see that, of course, the big concentration of machine translation power, let's call it that, in big companies usually means that either some languages are
neglected or some languages are actually adopted, and I'm not completely sure this is good either, because this means that currently, if you want to translate from some languages in the world, you're going to resort to a technology that has been trained by someone with who knows which texts, with who knows which technology, and that doesn't leave anything but machine-translated output in the hands of those language communities. So language communities don't have anything but a dependency on a particular provider. My experience is with Apertium, which is a rule-based system, and I know rule-based systems are kind of old, but the good thing about rule-based systems is that they encode the knowledge about the language in a computational way, and the encoding of that language is explicit: you write rules, dictionaries and stuff like that. And if you release those things with an open licence, as Apertium did, it can make a difference for some languages. In fact, you only have to check the Occitan Wikipedia, how it's growing after we released the Occitan machine translation systems, to get an idea of how important this is. Icelandic, yeah. I remember I went to LREC 2014, and the mayor of Reykjavik was explaining that Icelandic was an endangered language, and I was like, oh boy, I was completely shocked. And Jurgen said that there are 400,000 people in Iceland that speak Icelandic. I don't think everyone living in Iceland speaks Icelandic, and most of them speak English. So that's really a problem for the language, and it's great for Iceland to be part of initiatives regarding corpora and eTranslation and things like these, because this may give Icelandic a chance as a language. But it also means that there's a lot of fake, machine-translated Icelandic out there on the internet, which you have to filter. So it's always difficult, but it's... I
mean, wouldn't you agree, Mikel, it's both good and bad at the end of the day. It's great that you can actually safeguard smaller languages and ensure some form of understanding, comprehension, of these to anyone in the world, as it were. So that's a good point. The other side, of course, and I've noticed this also in my past life, is that there's such a predominance of one language on the internet that everything comes through the prism of English and an Anglo-Saxon perspective, if you will, always. There was somebody who told me that English is so prevalent on the internet that, it was a Polish guy, they even have courses in Polish in, what was it, AI testing, sorry, in AI development. The courses are in Polish but the tests are in English, because this is the only data that they can get. So courses in Polish, tests in English. This is the thing, and let's not even start on the different biases that you will get through using data which is only in English, et cetera. So, again, why do we have so many pages in eTranslation? It's also because a lot of data mining is going on in a huge variety of languages. That's also why we have so many pages in eTranslation.

Okay. Well, after approaching the different languages that are involved in machine translation, let me introduce another concept that is nowadays also going around. When we translate, we translate texts, but we also translate apps and videos and live streaming and many other formats that somehow condition the way we approach translation. A new concept arose some months ago: augmented translation, or translating with tools that integrate other NLP technologies, I don't know, speech recognition, terminology management in a more interactive way, and so on and so forth. What main technological improvements or changes do you
think will have an impact on the translation profession, the translation business, apart from machine translation? Jurgen, what are you working on right now in the Commission?

So for us, right now, if you go outside of machine translation, eTranslation, and trying to improve that, the language memories and all of these things that are directly related to the translation profession and the translators, we are also working quite extensively on speech-to-text, which is important of course as the next step for us in trying again to communicate with people, trying to build understanding about what we do in the European Commission and the European Union more generally speaking. So speech-to-text is something that we are working on right now. It works reasonably well. We're also working on things like, but this is perhaps for policy professionals, you know, they would need summaries, multilingual summaries, of certain textual studies or things like this that they have difficulty getting otherwise. A human cannot translate, easily at least, or swiftly, a study of 150 pages in Polish, but it's kind of useful to have it when you talk about, I don't know, the copyright directive or something like this. With the different technologies that we are also looking at, you can actually have a multilingual summary of that, just as an example.

I have a hard time talking about technologies which are not machine translation, because you know how specialization works: you know more and more about less and less, and you end up knowing almost everything about nothing. But I was thinking more about how things can change in the translator's workstation or environment. One thing that I see is this adapting to the particular translator. Now, in a CAT tool, you may have a lot of different technologies at your fingertips, available there to help you translate, and
but you are different from the translator next door, so you want to have your personal broker that decides where your time is invested so that you produce your translation faster. So that's an interesting thing: having something, which is probably going to be learned from your behaviour, that actually selects the technology you need for each segment, something like that. This is one of the visions in my mind. It's probably already happening in some sense when you have a threshold for machine translation and translation memories, but that's very, very simple. I was thinking about something more: even selecting different machine translation systems, or deciding what kind of translations basically annoy you and just keeping them out of your way, and things like this. So this is something which I see. As regards speech, I was just reminded of a Nigerian scholar, well, scholar and artist, called Tunde Adegbola. Tunde Adegbola says that for many African languages maybe writing is not a choice at this time, so we should go from speech to speech and video to video and things like these, because many languages are never going to be written. And we're talking about big languages, you know, they're not small. Nigeria is a big country, it is probably going to be the second or third country in population 50 years from now. It has about 200 million now, and they have languages which are as small as 40 million and 30 million, like Igbo and Yoruba and things like this. Many of these languages are oral, strongly oral. Writing is very bad for them because, for example, the tonal system is very important in understanding these languages. You've probably heard about the talking drums. The talking drums basically emulate the tone pattern of sentences. Basically you throw away the consonants and the vowels and you still have a
message that's
2022-12-29 04:34