Telling the big stories with a bit of help from AI - Pierre Romera Zhang

[Applause] Hello, hello. Thanks for this very nice introduction. I'm very glad to be here today, and I'm going to talk to you about ICIJ and how we use technology and different data sources to do our work.

First, a small introduction. You probably don't know me, and I think it's better that way. I'm the chief technology officer at ICIJ, and I've been working with leaks for 15 years. I started many years ago with WikiLeaks and other similar organizations, and since then I've been working on global investigations that involve a lot of journalists and a lot of technology, because this is probably what all those investigations have in common. I will give you my contact details at the end, but if you want to reach out after this conference, don't hesitate to send me an email. I'm on every social network; you can even contact me on TikTok if you want. I'm happy to hear your stories if you have any, or if you want to share any documents with us.

So let's start with ICIJ. I don't know how many people in this room have heard about the Panama Papers; can you raise your hand? Not bad. The Uber Files? More confidential, but still important. And ICIJ itself? Just a few of you. This is because we work in the shadows: most of the time you hear about Le Monde, The New York Times, The Washington Post, our media partners, but you don't really know about the organization that runs this kind of investigation.

So, we are ICIJ. It's a pretty old organization that started to shift to global investigations in 2014. Basically, we are an organization made of three groups: first the staff, like me; second the members; and third the partners. The staff is about 40 people, a bit more now, across four continents. We mix knowledge between journalism, data analysis, coding, design and IT support, and basically our role is to coordinate those big investigations.

ICIJ started in, I don't remember the exact date, 1997 I think. At first it was a group of members that basically exchanged tips and knowledge. It's very funny: at the very first workshop, 25 years ago, they all met to learn how to use PGP. The journalists in this group are all very famous investigative journalists in their own countries; in France we have Fabrice Arfi, for instance, and there are many famous investigative journalists from many countries. At the beginning they just exchanged tips, but they were not really working together. So, because ICIJ wanted to make use of that member group, they created partnerships. Those partners are journalists already working with us during an investigation; some of them are members of ICIJ, some are not. For instance, in France we work with Le Monde, with Radio France, with Cash Investigation, and most of them are not members of ICIJ; they just work with us during an investigation, and we consider them partners.

As you can imagine, it's a pretty big job to coordinate the work between that many people and that many organizations, and this is what we do at ICIJ. It is really our speciality, and I think we kind of gave everyone the playbook for coordinating this kind of investigation.
Before ICIJ there were some great efforts by different news organizations to work together, but really ICIJ made it professional. So today I'm going to talk to you about four, no, in fact five ICIJ investigations using AI. I will describe these investigations very briefly; if you want to know more about them, you can go to our website and you will find all our stories. I will explain how we used AI over the years, along with other technology, to enable investigative work.

The first one is the Implant Files, in 2018. It was a global investigation into medical devices. Medical devices can be many things, a pacemaker for instance. We realized that it was a very unregulated sector and that there were a lot of problems with those devices around the world. But when you don't have any numbers, when you don't have data, you cannot really give a diagnosis of a problem; it's very hard to know whether there is truly a problem with medical devices or with the regulation. So we worked on several data sources, and one of the most important came from the FDA in the US. Basically we got what they call adverse events, which are reports of events where a device is suspected of having caused serious injuries or even the death of the patient.

The problem with those adverse events was that they were all redacted; there was just text, and it's very hard to extract numbers and statistics when you only have text. There was also what we called under-reporting: instead of saying someone died, the reports would say something like "the patient expired", phrasing that was obvious for a human but much harder for a machine to analyze. On top of that, all the data in these reports was plain text with absolutely zero structured data, zero spreadsheets, zero columns. It was very hard to analyze.

That was the first time ICIJ used machine learning, to try to identify which reports were talking about someone who died and which were not. We also wanted to know whether there was some sort of discrimination, a population that was more often the victim of these problems, so we also tried to extract genders from the reports. We didn't do ethnicity, because most of the time it was not in the report, but that would have been an interesting aspect of the analysis. We listed a lot of terms that we knew were related to death, and we trained our machine learning algorithm to identify the reports describing a death and, in effect, read the descriptions for us. Thanks to this process we managed to extract more than 2,000 cases where the patient died.

It was quite a success for a first attempt, but it was also very expensive. This was 2018; at the time machine learning was good, but not as good as today, and it made a lot of mistakes. So in fact our reporters read all the cases that were flagged as positive by this analysis. Basically we did machine learning, and then we had reporters read the reports to confirm that the machine learning was right. You might think: what the [ __ ], why do you need machine learning if you read everything anyway? Well, it was an experiment, so we had to try, and we had to verify the work of the machine learning.
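To make this concrete, here is a minimal sketch of that kind of classifier, not ICIJ's actual pipeline: a text model trained on a handful of hand-labeled reports, whose positive hits are flagged for reporters to read. The example reports and labels below are invented.

```python
# A minimal sketch (not ICIJ's actual pipeline): train a text classifier on
# hand-labeled adverse event reports, then flag likely deaths for human review.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented labeled examples (1 = the report describes a death).
reports = [
    "the patient expired two days after implantation",
    "device explanted, patient recovered fully",
    "pt deceased, cause of death under investigation",
    "minor skin irritation reported at the implant site",
]
labels = [1, 0, 1, 0]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # bigrams help catch "patient expired"
    LogisticRegression(),
)
model.fit(reports, labels)

# The score is a hint, not a fact: every flagged report still goes to a reporter.
for text in ["the patient passed away shortly after surgery"]:
    proba = model.predict_proba([text])[0, 1]
    verdict = "flag for reporter review" if proba > 0.5 else "skip"
    print(f"{proba:.2f}  {verdict}: {text}")
```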
Because ICIJ has one secret recipe: fact-checking. Everything we publish is fact-checked three or four times, which ensures that when we publish something, when we publish a number like this one, we are certain that the number is true. And the result is quite significant: ICIJ has been doing this kind of investigation for many years, and we have never been sued by anyone. We get threats, of course, we get threats every day, but nobody has been able to sue ICIJ, because every analysis we did was fact-checked and backed by a team of reporters.

A bit later, almost the same year I think, we did the Mauritius Leaks. We got tons of documents from an offshore law firm (we really like offshore law firms), and we had to coordinate the effort with our African partners to explore those documents. The problem at the time was the size of the team: we didn't have many partners in this project, something like 30, which is not a lot. Usually when we do an investigation it's more like 400 partners. So we really needed to explore those documents in a smart manner; we needed to create a way for those journalists to quickly identify interesting documents.

We worked with Quartz, a news media outlet. I think it has closed; it used to be great, but they drastically reduced the staff over the years and I think they don't exist anymore. At the time, though, they worked with us to identify similar documents. So what are similar documents? It's a very interesting aspect of the research: when you have many, many documents, you really need to be able to classify them. So we classified them to narrow down the research. We created several categories, like tax returns or business plans, and we trained our models to identify those documents. If a journalist wanted to get all the business plans from the leak, they were able to do it, and then do more research.

It was a very interesting approach, because for the first time, instead of using existing models and asking our own team to do the labeling, we involved the partners and asked them to help us flag documents by category. We did what is called supervised learning: many, many reporters helped flag documents, we taught the model that a document belonged to a certain category, and then we were able to scale and analyze all the documents automatically. As I said, this significantly accelerated the process of exploring the documents, and it was also the very first time we used machine learning not to produce any sort of analysis, but mostly to speed up the research.

You might wonder what this looks like. This is Datashare, our search engine for leaked documents. It's a technology we created many years ago, it's open source, and you can use it on your own computer. You can install it in a few clicks, especially if you are on Linux; it's super easy. But you can also run it on a server: ICIJ uses it on its servers to explore millions of leaked documents, terabytes of data. We started developing it in 2015, basically to distribute the work of reading the documents and putting them into an index. Here we created a very simple feature called tags, which you can see on the left, where we created clusters of documents. We classified some documents by cluster so that, as I said, journalists were able to quickly filter documents falling into a certain category.
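As an illustration of the "similar documents" idea, and not the actual model built with Quartz, here is a tiny sketch: vectorize every document, then rank the rest of the leak by cosine similarity to one a reporter has already tagged, say a business plan. The document texts below are placeholders.

```python
# A tiny illustration of "find documents similar to this one" with TF-IDF and
# cosine similarity; the real project trained models on documents flagged by
# partner reporters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [  # placeholders; in practice, the extracted text of every file
    "Business plan: projected revenue, market analysis and growth strategy",
    "Tax return for fiscal year 2016, taxable income and deductions",
    "Five year business plan with financial projections and strategy",
    "Minutes of the annual shareholders meeting",
]
seed = 0  # index of a document a reporter tagged as a business plan

matrix = TfidfVectorizer(stop_words="english").fit_transform(documents)
scores = cosine_similarity(matrix[seed], matrix).ravel()

# Rank the other documents by similarity to the seed.
for i in scores.argsort()[::-1]:
    if i != seed:
        print(f"{scores[i]:.2f}  {documents[i][:60]}")
```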
Then it got bigger. The year after that we got another leak; this investigation was called the Luanda Leaks. It was basically a lot of records, a lot of files related to Isabel dos Santos, the daughter of the former president of Angola, and very quickly we realized that she was using her power and her network to get money out of Angola, even public funds. But, as I said, again there were a lot of documents, and the problem (it's not a problem for Angolan people, but it is a problem for American or French journalists) was that the documents were in Portuguese. So how do you coordinate an investigation that involves hundreds of journalists if they don't speak Portuguese?

You might say: easy, let's use Google Translate, right? In fact we can't do that. First, we don't want Google to have our documents, and we are legally not allowed to do it, because it would be like sharing our documents with Google. And while we are protected because we are journalists, Google is not, so Google could be sued by Isabel dos Santos because at some point they processed our documents on their servers. Preserving the secrecy of the investigation while allowing many journalists to explore it was very important.

So we decided to use offline translation. At the time we used an open source technology called Apertium. It's very famous and very fast, but it's not very good at translating from Portuguese to English; in fact it has no model to translate from Portuguese to English. So we did something very dirty: we translated from Portuguese to Spanish, and then from Spanish to English. As you can imagine, the translation was very, very bad. Honestly, I'm not proud of it. But it was still useful, because with this translation and with Datashare our reporters were able to search through hundreds of documents: whether they typed words in English or in Portuguese, they were able to find documents. And even if the translation was not good, they could either ask a Portuguese speaker to help them or just try to make sense of it themselves.

Since 2020 we have really improved the technology we created to translate large numbers of documents. We are now using Argos Translate, which is much better at translating from Portuguese to English. So now the translations are great, but the models are also much slower, so it takes forever to translate that many documents. That means we still use Apertium when we need to be fast: if we have millions of documents we will probably need Apertium, but if we have fewer documents, and I will talk about that later, we can use Argos, which provides a great translation, as you can see.

You will see a lot of links at the bottom of the screen: we published the technology we use to translate all those documents. As you can imagine, we cannot run that on one single server; we had to distribute it. So we created a technology that distributes the computation between different servers, using a bus to share the work, and that extracts the text from an Elasticsearch index. We called it elasticsearch-translator. It's open source, like everything we do, and you can find it on our GitHub, like many other tools.
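To give a feel for what such a pipeline does, here is a rough single-machine sketch, not the actual elasticsearch-translator: scan an Elasticsearch index, translate each document's text offline with Argos Translate, and write the translation back. The index and field names are invented, and the Argos and Elasticsearch client calls follow their current Python APIs.

```python
# A rough single-machine sketch of "translate what's in the index"; the real
# elasticsearch-translator distributes this work across servers via a bus.
import argostranslate.package
import argostranslate.translate
from elasticsearch import Elasticsearch, helpers

# One-time, online step: install the Portuguese -> English model.
# Translation itself then runs fully offline.
argostranslate.package.update_package_index()
pkg = next(p for p in argostranslate.package.get_available_packages()
           if p.from_code == "pt" and p.to_code == "en")
argostranslate.package.install_from_path(pkg.download())

es = Elasticsearch("http://localhost:9200")

# Walk every document that has extracted text but no translation yet
# (index and field names are invented for the example).
query = {"query": {"bool": {
    "must": [{"exists": {"field": "content"}}],
    "must_not": [{"exists": {"field": "content_en"}}],
}}}
for hit in helpers.scan(es, index="leak-documents", query=query):
    translated = argostranslate.translate.translate(
        hit["_source"]["content"], "pt", "en")
    es.update(index="leak-documents", id=hit["_id"],
              doc={"content_en": translated})
```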
Then, the year after that, started the biggest investigation in journalism history, not because of the impact (although the impact was huge) but in terms of the number of people involved: more than 400 journalists worked with us during that investigation. We started to receive the leak in 2020, during the lockdown, but the investigation itself lasted almost two years. We had a lot of documents, almost 12 million, coming from 14 different offshore providers. An offshore service provider is basically a company that helps you set up an offshore company or offshore entity. There are many of them, and their customers are usually very rich people who don't want to pay taxes. So, 14 of them: great, lots of stories to tell.

The problem: all those files were pretty complex to read. Most of them were in English, that's easy, but a lot of them were just ink on paper: scans, handwritten text, or very long reports with a lot of information. So we needed, once again, to identify some very specific types of documents, and also to extract structured data from them.

For the first step, we again used Datashare to analyze the documents and search through them, and we used machine learning to identify documents by category, just as we did for the Mauritius Leaks, but more than ten times bigger. The problem with this machine learning operation was the cost. When you use machine learning algorithms you have to store vectors to speed up the process, and when we tried to store the vectors for all those documents, it was a massive amount of data. In the end, combining the storage and the computation, it cost ICIJ about $50,000 just to do this document classification. Not ideal, and not something I would do every day, but it was worth a try, because it allowed us to search through the documents faster.

As I said, we also wanted to extract structured data. It's important to understand that in such a big leak most files are PDFs, emails and Word documents; some are spreadsheets or databases, but that's a very tiny portion. And the problem with this kind of leak is that there is so much personal data in it that you cannot just release it to everyone; if you do, people will be in danger. In this leak we had the ID card of Shakira; we had personal emails, poems, love letters, that kind of thing. Very personal. You don't want to publish that on the internet, because people would be in real danger and there would be a huge confidentiality issue. Yet there are a lot of stories we cannot tell, because we are still a small organization, and even if we involve a lot of journalists, once the publication is out the work is kind of done; you don't go back to the leak so often. So we really wanted to extract a list of all the companies present in this leak and publish some sort of offshore registry. Offshore jurisdictions don't have a public registry of companies; that's why they are used for tax evasion. We had a lot of companies in this leak, so we wanted to extract them and publish them in what we call the Offshore Leaks Database; you have the link again here.

Anyone, you or any researcher out there, can use our data to do their own research, and it works. Very often we get requests from journalists because they found an interesting name, one that was not interesting last year but is interesting now. Just imagine all the new deputies who might be in our database, who were not very famous last year but might be famous next week. Those journalists contact us and say: I found an interesting result about this person, can you give me the documents, or give me access to the Pandora Papers? This process works very well. ICIJ is committed to giving as many journalists as possible access to these kinds of documents, and because we created this corporate registry, we are really able to start new collaborations on the basis of this database.

To do so (I skipped that part a little), we extracted structured data using machine learning again. Basically, we trained our models to recognize the name of the company, the name of the officer (the person who holds a role in the company) and details like the address, and in the end we were able to publish all that automatically extracted data on this website.
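ICIJ trained its own models for this on its own data; purely as a generic illustration of the technique, here is what named entity extraction looks like with an off-the-shelf spaCy model. The sample text is invented.

```python
# Entity extraction with a stock spaCy model, to illustrate the technique;
# ICIJ trained custom models on its own labeled documents.
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

text = ("Acme Holdings Ltd was incorporated in the British Virgin Islands "
        "in 2012. John Doe is registered as its director at 1 Harbour Road.")

for ent in nlp(text).ents:
    if ent.label_ in {"ORG", "PERSON", "GPE", "DATE"}:
        print(f"{ent.label_:7} {ent.text}")
```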
As I said before, our secret sauce is fact-checking. We produce a lot of analyses from a lot of documents, but before we publish, we need to check that it's true. Because this process is very hard and very long, we decided to create a platform to make it a bit more playful; we kind of gamified the process of fact-checking data. We created an open source platform called Prophecies. It's very simple: it's based on Django, you can run it on your own server, and it basically offers you a way to upload a spreadsheet together with the list of values to verify. You say, for instance: I want to identify countries in this spreadsheet. The reporters are then able to say whether each value extracted using machine learning is correct or not. It's that simple: just verifying the data. And because we built this platform, the reporters in our team managed to verify all the records we extracted with machine learning pretty quickly. It was fun for us (it's funny to build some sort of Tinder, but for fact-checking), and it was also fun for them, because they had this nice interface they could use on their phones. I remember one of our data journalists working by the pool, just swiping on her phone to verify the machine learning results. We are still using Prophecies for many other projects, but we really created this nice and friendly interface for the Pandora Papers.
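Prophecies' real code is on ICIJ's GitHub; purely as a sketch of the underlying idea, and not its actual schema, each machine-extracted value becomes one small yes/no task a reporter can confirm or reject, something like this Django model.

```python
# Not Prophecies' actual schema, just a sketch of the idea: every extracted
# value becomes one small task a reporter can confirm or reject from a phone.
from django.db import models

class VerificationTask(models.Model):
    class Status(models.TextChoices):
        PENDING = "pending"
        CORRECT = "correct"
        INCORRECT = "incorrect"

    source_document = models.CharField(max_length=255)  # where the value was found
    field_name = models.CharField(max_length=100)       # e.g. "country", "officer_name"
    extracted_value = models.CharField(max_length=255)  # what the model extracted
    status = models.CharField(max_length=10, choices=Status.choices,
                              default=Status.PENDING)
    checked_by = models.CharField(max_length=100, blank=True)
```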
Last year we released another investigation into offshore activities, but this time in Europe, in Cyprus. It was called Cyprus Confidential, and we got our hands on a lot of documents that basically showed that Russia was using Cyprus as a gateway to Europe: buying European passports, setting up companies, whatever you need in Cyprus to access funds directly in Europe. The problem, again: a lot of documents were in Greek and Russian, so we needed to translate them. The difficulty this time was that, because of the way the documents were formatted, it was very hard to detect their language. We had a chicken-and-egg problem: we OCR a document to turn an image into text, but to do so we need to know the language of the document; and to know the language of the document, we need to extract the text. It's almost impossible. You can have a very efficient OCR technology, like the one we use, but in that case it just wasn't working.

So we created another open source technology, which we call PLD, for PDF Language Detector. It OCRs the documents in several languages (in our case Russian, Greek and English) and then uses the confidence level to decide what the language of each document is. Thanks to this process we were able to identify the language of every document correctly, so we were able to OCR them correctly (search results are much better when you recognize Cyrillic as Cyrillic), and we were able to translate them, obviously, using the technology I mentioned before. So again, machine learning helped us speed up a process that could have been done by hand; because we had so many documents, it let us do it in a much more realistic time frame.
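I won't vouch for PLD's exact internals here, but the confidence trick can be sketched with Tesseract like this: OCR the same page once per candidate language and keep the run with the highest mean word confidence.

```python
# A sketch of the confidence trick (not PLD's actual code): OCR the page once
# per candidate language, keep the run with the highest mean word confidence.
from statistics import mean
from PIL import Image
import pytesseract
from pytesseract import Output

def detect_language(image_path, candidates=("eng", "ell", "rus")):
    image = Image.open(image_path)
    best_lang, best_score = None, -1.0
    for lang in candidates:  # needs the matching Tesseract language packs installed
        data = pytesseract.image_to_data(image, lang=lang, output_type=Output.DICT)
        confidences = [float(c) for c in data["conf"] if float(c) >= 0]  # -1 = no text
        score = mean(confidences) if confidences else 0.0
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang, best_score

lang, confidence = detect_language("scanned_page.png")
print(f"best guess: {lang} (mean word confidence {confidence:.1f})")
```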
As I said, we have a lot of open source technology. I'm a strong believer in open source; it started many years ago, and I still think it's the best thing that has happened to the world since computers exist. ICIJ really wants to follow that philosophy and publish as much as possible of the source code of what we do. So, as I said, we created the Datashare search engine, which you can find at this address and download to your computer. Sometimes it works, sometimes it doesn't; please be patient. It's not that easy to set up this kind of search engine on a computer; keep in mind that when we use it, it usually runs on dozens of servers, so it's not easy to fit that kind of computational power onto a laptop. We follow a philosophy we call extreme scalability, meaning we designed this software so it can run on a small laptop with little memory, but it can also run on hundreds of servers. So when we get a leak like the Pandora Papers, we already have the tools to analyze all the documents. And if you ever have a leak (there are so many now; just spend 30 minutes on the dark web and you will probably find something to download), you can use Datashare to help you navigate the documents.

Our primary focus is unstructured documents: PDFs, Word files, images, things that are not so easy to read for most computers, but that Datashare is able to read pretty quickly and pretty efficiently. We support something like 2,000 different file formats, so when you have a leak with a jumble of different file formats, Datashare is able to read them. It's at the very center of what we do. It performs OCR using open source technology like Tesseract; it does named entity extraction using open source models, so you are able to identify names of people, organizations and places; and it basically lets you search through your documents with a bunch of filters. As I said before, we use our open source translator to translate the documents; this is also a technology you can run on your own computer. You have to be a bit familiar with the command line to use it, but in just a few clicks you will be able to run translation on your Elasticsearch index, and it will just work. I guess it's a pretty big plus for us.

So, what's next? There are so many leaks out there, so many different kinds of data that we still struggle to analyze. What we really want to do in the future is make Datashare a platform that you use not only to read your documents but also to perform your own analyses. You won't have to care about whether you can read this PDF, this image, this Word document, this huge mailbox you have access to: Datashare will extract everything correctly for you, as it already does, but then it will offer you an API and a sandbox so you can run your own analyses over the documents. Let's say you want to identify different clusters in your leak: Datashare will offer you an environment where you can run your own analysis. If you want to use Java (I don't think many people will use Java, but anyway), you will be able to do it; if you want to use Python, JavaScript, whatever, Datashare will offer a space for that. We also want to create some sort of services around Datashare, but that's not our priority right now.

Thank you very much, and if you have any questions... [Applause] And I forgot to say: we are an NGO, we are donor-based, so if you want to support our work, I encourage you to go to that link.

[Host] So, do we have some questions?

[Audience] I was wondering how you get the data you use for your investigations. Do people send it to you, and if so, do they use specific tools? Is Datashare usable to share data with you? Thank you.

[Pierre] It's a big topic. I spend a lot of my time meeting sources and traveling to pick up a hard drive, then just going back home and putting it into Datashare; that's mostly how we start an investigation. But as I said, ICIJ is also a network of members, so there are journalists around the world who have their own documents. Very often they come to ICIJ and say: hey, I have two million documents, I don't have the servers or the technical skills to analyze them. They just hand them over to ICIJ, and that's how an investigation starts. Of course, not every leak leads to an investigation; this is why we have so many reporters at ICIJ. But I think the best way to start an investigation is probably to share documents with us. We do some investigations that are not based on a leak; the Implant Files I mentioned before were based not on a leak but on public data and data we scraped from public websites. Most of the time, though, if you have a leak, that's probably the best way to work with us. We work only with news organizations (that's one of the reasons we are protected), but our sources can be anyone: insiders, hackers, whoever. And I think you asked about how we review the data. Basically it's very manual work. When you have so many reporters you can distribute the effort of exploring the documents, but that's it; there is no magic formula for reviewing the documents. You just have to go one by one and read them all.
[Audience] Hello, thank you for your presentation and for the work of ICIJ, which is very important for our society, so please continue. What I want to ask is about the findings the journalists make: I wonder whether some of the people targeted by these leaks want to target you in turn and steal the information you have found, and how you deal with that. I imagine you protect yourselves as much as any company does.

[Pierre] Every investigation starts with threat modeling. We have to know who our enemies are and what resources they have, and in most cases, for all the investigations I mentioned, they don't have that many resources, because our enemy is not the NSA. Yet. At the beginning of every investigation we try to assess what the risks are going to be and what security measures we are going to take. I didn't really talk about it, but in fact we are pretty strong on that point; I mean, we do the best we can. And every time we publish something concerning one of those actors we get DDoS attacks, we get intrusion attempts, so we are very careful. It's almost never possible to know where an attack comes from, so most of the time we just guess, but very often it's the same kind of actor. Like, you know, Russia.

[Audience] Hello, do you have a common language structure to exchange between journalists, like STIX in cyber security for example, or is it just the Datashare model that everyone uses and that has become the de facto standard?

[Pierre] I'm not sure I fully get it, but I'm going to try to answer anyway; tell me if I'm wrong. All the models we use are open source and published with Datashare, and the ones we use to analyze the documents actually come from other organizations; we don't have our own models. But as you can imagine, when you have so many documents you can train models on very unique datasets, and that's something we are also trying to do. Currently we are working with the OsloMet university to build a model that can detect passports and basically extract the name of the person, the country, the date of birth, the photo, so that when we have a leak, thanks to this model, we can quickly recognize passports. The problem is that we cannot really publish a model that is trained on confidential data, because there are ways to find out which documents were used to train a model; you can retrieve some of that information by analyzing the model. It's not trivial, but it's possible, and it's not a risk we are willing to take yet. That's why this kind of model is still not open source. The technology to train the model, however, is open source, so if someone has their own dataset, they can train the model themselves using our technology.

[Audience] Thanks. You mentioned you have a few dozen people as staff and a few hundred as members. Do you have a ratio of technical people in both of those groups? Because it seems like a huge amount of work.

[Pierre] In the member network it's close to zero technical people; it's just journalists, and some of them are not from the same generation as me, so they don't really use technology to do their work and don't have much interest in it. But at ICIJ, the organization, I think it's a third, no, even half of the organization that is technical staff, meaning developers, data analysts, data journalists, IT people. I think it's pretty unique to have a news organization where such a big part of the staff is technical.

[Audience] Countries can see you as a threat...
[Pierre] A bit louder, please?

[Audience] Countries can see you as a threat. Is a secret agency a partner or an enemy for you?

[Pierre] So the question is: countries can be our enemies, and intelligence agencies are either a threat or a partner for us, am I correct? I would say an enemy, without any doubt. We never share anything with governments; we don't share anything with intelligence agencies. We know that some intelligence agencies use our open source technologies, because they told us, but we don't want to work with them; that would create a very dangerous precedent for the journalists who work with us, and we will continue that way. One of the reasons we publish the Offshore Leaks Database is precisely that we don't want to work directly with government entities. But we know that when we published the database, the tax authorities, for instance, took the data and ran their own research. In France alone, it's estimated that they managed to recover 400 million from taxpayers just using our data, just using the data we published. Worldwide, we estimate it's between 1.5 and 2 billion, but we don't really know, because they don't always communicate; the US, for example, doesn't say how much it recovered using our data. But we will continue to publish data as much as we can, because we know it can have an impact even after the work of the journalists is done.

[Audience] I was wondering which kind of infrastructure you are using: is it on premise, or do you do both, with some cloud provider? And another question, related to something you said earlier: you said you have multiple sources, and I wanted to know how you assess your sources and whether you have a way to ensure that nobody is trying to poison your data.

[Pierre] Right. So we have cloud infrastructure, and other infrastructure in secret locations. Our cloud infrastructure is Amazon, mostly because it's the easiest option we have (not the cheapest, but the easiest), and also because it allows us to scale: if I wake up in the morning needing to analyze 10 million documents, I can do it in a day. But we also have other infrastructure for more sensitive documents. Everything is encrypted, obviously, but this is probably one of our biggest weaknesses: the fact that we have to use a server provider, whether it's AWS or Google Cloud or whatever, means we have to trust someone else. We know there are some friendly organizations, like FlokiNET in Iceland for instance, that offer great protection for journalists, and we try to use that kind of organization as much as possible; but when we do, we don't make it public, because we don't want to attract people to their servers.

To answer your second question: yes, some of the sources we get might have an agenda. An agenda is always an issue, and every time we have to assess it. But the question is: what's the public interest? When we receive data, for instance from hackers or ransomware organizations, they benefit from us working on the documents they share, because that helps their business; it creates some sort of pressure on their victims. But if the data is important for the public interest, we will still investigate it, and we will still try to find interesting stories for our readers. Sometimes we get leaks from different sources and we decide not to use them.
Of course we do some research first, but we decide not to use them because we realize that the public interest is too small, or not worth the risk. And in situations where we know there is potential interference, that's where the fact-checking part is very important. Very often people think that because we got an email from someone saying they want to create an offshore company, we are going to write it in the paper. That's not how it works. We get an email from someone who wants to do that, then we investigate, then we verify the information, and only then are we able to publish it. For instance, if you take Shakira, or any famous person in the leak, of course we find them much faster, because their names are known; but once we have this kind of information about this kind of person, we still need to run the same verification we would run for any other story. We have to verify that the company exists; we have to ask the provider that set it up whether they did it or not. There are many laborious steps before publishing the story.

[Host] Do we have other questions? Nope. Well, feel free to send all your data leaks to ICIJ. Thank you very much, Pierre.
