Okay, hi Danny. I'm very happy to be interviewing Danny Weitzner today, a long-time friend and colleague at MIT. Danny is a principal research scientist at MIT's CSAIL and the head of the Internet Policy Research Initiative there, with many other affiliations, which makes him the perfect colleague to also be on our Duality advisory board, and a world expert on policy issues that have to do with government, law, security, cryptography, and privacy. I think we're at a very interesting time in this respect in the world, so let's start our conversation. First of all, Danny, I wanted to ask you to tell us a little bit about yourself and your background.

So I have a checkered past. I studied philosophy in college; that was not very marketable, so I then spent four years doing technical work with a combination of small, rich investment banks and poor non-profits in New York City. I also had the distinction of having installed what I think was the first token ring network in sub-Saharan Africa, connected with a World Bank project. So I have a technical background, but I always felt there were these critical questions in computing and policy, so I went to law school. I didn't really know where that would take me, but where it took me was to the very beginning of what I think of as the internet revolution. I arrived in Washington in 1992 as the first lawyer in Washington, DC for the Electronic Frontier Foundation, and we worked on a fascinating range of issues: First Amendment issues, free speech on the internet, surveillance, how the Fourth Amendment would apply on the internet, and a whole bunch of other legal questions. That's where I first met a number of leaders in the cryptography community, Whit Diffie, Ron Rivest, all kinds of wonderful colleagues. We had to figure out very quickly this set of questions about surveillance and encryption, because of course the FBI and the NSA were very worried about what would happen if there was what we now call end-to-end encryption, and they said there should be this thing called key escrow, where someone else would hold the keys; you remember all that. That was a kind of honeymoon period of internet policy, because it was all upside and everything was very exciting. The internet was growing really rapidly, it created extraordinary opportunities for people to access information and to speak, and there also seemed to be a lot of exciting privacy opportunities: we had these very strong privacy technologies that appeared able to do a great job of protecting people's private communications against all kinds of threats. One thing led to another; I helped start a group called the Center for Democracy and Technology, and both EFF and CDT remain really important advocacy organizations. But in 1998 I had this naive feeling that we had done everything that was interesting about internet policy, so I came up to MIT to help Tim Berners-Lee at the World Wide Web Consortium, working on a whole number of privacy- and policy-related technical design questions in the web: privacy, security, standards, and a bunch of other things. I also had a stint working in the White House as Deputy CTO for Internet Policy, where I mostly worked on consumer privacy issues. There were a lot of difficult questions about the global internet, which had by that time gone from an exciting thing in a
Silicon Valley garage to something the whole world depended on, but governments hadn't really figured out how to be serious about the internet and internet policy. After that I came back to MIT and started the Internet Policy Research Initiative.

The final thing I'll say is that, looking back at what has changed about internet policy, technology, and computer science, I think that particularly in the area of security and cryptography there was always a sense that what was interesting was what could be provably secure, what properties you could guarantee with mathematical certainty. That has come headlong into a more challenging, nuanced set of questions about how to actually evaluate and manage the risk of using information technology, because what we know now is that nothing's perfect. The systems we put out in the world have a lot of complexity and nuance to them, and we want to be able to manage risk in those systems because we depend on them as a society. We're not going to get perfection, but neither do we want to just settle for best effort. So to me the really interesting challenge in the maturation of a lot of these technologies is that we now have to treat them as essential but also as subject to some risk, and the intersection between technology and policy, in many ways, is about being able to characterize and manage those risk levels so we can do the things we need to do with the information we have without incurring risks we think are not acceptable.

It's an age-old problem, right: do you strive for perfection, or do you just do a good enough job? My own opinion is that in cryptography it is a very risky thing to do a merely good enough job, because we never know what the guarantees are; somebody may be attacking me and I'm unaware of it. It's not like having a fast enough algorithm: whether something is secure enough will only be judged when it gets attacked, or doesn't. So having some standard to strive for is probably more important in cryptography than in fields that only have to do with performance, because cryptography really has to do with guarantees, guarantees given in advance. But we can debate this. Let me touch on a different topic that what you say raises, namely that we want security in order to do what we need to do. What do you think the potential is? Maybe it has to do with collaboration, maybe with computation in the cloud, maybe with making sure our civil liberties are maintained. What is the potential of privacy-enhancing technologies? Is it just making sure we're safe in doing what we were doing all along, or is there some brave new world of new things we could do?

It's such a good question. I think that in a lot of ways what has matured is our computing and networking capability: if you want to get information from one place to another in the world, wherever you are, you can almost always do it; if you want really fast computing for most imaginable computations, you can get it, you get it reliably, and it generally gets cheaper and cheaper as time goes on. Those are good. What is harder is data. The data that actually feeds that computation, I think we're discovering, has a much more complex set of properties. We have more and more data, whether it's data about people's health, or data about
transportation, or financial data, or data that might help detect terrorism; you name it. There's so much information out in the world, and I actually think we are mostly underutilizing it. There is this huge surplus of potential knowledge out there for which computer science has a lot of the tools to conduct analysis, but for a lot of reasons we have a hard time getting that data into the right place, into the right people's hands, to contribute to the analysis that we need, whether it's how to make roads safer, how to develop new drugs, or how to figure out exactly what kind of health care a particular person needs based on commonalities they may share with others, and those people may be spread widely around the world. So we know we want to reach into lots of this data, but we're worried about how to manage the risk, both the underlying security of that data and the potential abuse of that data. We've focused a lot in the privacy debate on things like the surveillance economy: does Google know too much about us, does Amazon know too much about us, does Facebook? Probably they do. But I think what's really interesting is all the institutions that have huge amounts of valuable data about us that we can't get at. The fact that we don't know enough about ourselves is, I actually think, the interesting privacy problem, and we don't know enough about our society to be able to make well-informed decisions. That's where, to come back to your question of what kinds of guarantees we can get from different computational and analytic environments, I think we are generally over-indexing on the privacy risks, because we don't have a reliable way to measure or manage them. Again, I don't want to say we've solved all the communications and computation problems, because there's always more work to do, but those are mature; our interaction with data is actually quite immature, and there's a huge amount of growth to be realized there.

So to summarize: the potential of privacy-enhancing technologies like homomorphic encryption or multi-party computation, beyond just communicating securely, is in harvesting data while adhering to some rules which we haven't fully defined yet. These days, in the last few weeks actually, there's a lot of interest in policy and regulation for this type of technology, and I was hoping you could talk about that. The first question is why there is so much interest all of a sudden. I think there's a federal bill coming, and it's bipartisan; there's also some UK-US interest in some sort of challenge around cybersecurity. Nobody is better than you to tell us what's going on and where you think it's going.

It's quite an exciting time in privacy and security regulation. We do have in the US, as you said, a new bipartisan privacy bill that's working its way through Congress. Why is that happening? I think it's happening partly because the public, the citizens, are getting increasingly nervous about who has their data and what's being done with it, and there have been high-profile cases where you see abuses from Facebook, from Twitter, from others. There's also the fact that, while the US really led on internet policy questions for the first couple
of decades, we're now, if you're keeping score, behind. Europe passed this very broad General Data Protection Regulation, which is a really state-of-the-art privacy law, and I think the US is now feeling like maybe we had better get our act together as well. One of the great things that has happened as a result of both the GDPR passing and the consideration of the US law is that it has really encouraged the computer science community, and the companies developing computing products and services, to look more carefully at the privacy arena. As an example, both in the European law and in the newly proposed US law, if you can say that you're not releasing personal data, you actually get out of a lot of the requirements of the law, or you may. So if there's a way to put together the right combination of multi-party computation protocols and homomorphic encryption algorithms, maybe that gives companies more leeway to perform more analysis on data without all of the regulatory restrictions. Now, whether that happens or not is really going to depend on a careful dialogue between the technical community and the regulatory communities around the world. Right now there are a lot of socially positive computing activities that don't happen because of worries about privacy risk. One example is financial fraud that probably isn't detected because the companies that might do that detection are nervous about whether they can use the personal data they have for that kind of fraud prevention. We talked about health information: hospitals and other organizations that hold health information are sitting on a lot of knowledge, partly because they're nervous about whether making it available to researchers will open them up to privacy liability. So these privacy laws that are either in place or being passed create the opportunity to have really focused discussions about exactly how we can use data for purposes everyone will recognize as positive, but to do so in a way that conforms with the rules. As you said, the problem we have had in a lot of cases, and still have, is that privacy is a kind of general quality, a general feeling, and it's hard to know exactly when a particular computation, a particular use of data, is or is not consistent with privacy rules. As the laws get clearer, I think the technology will also be able to get clearer, and in the middle of that process we'll have, I hope, a lot more innovation, a lot more scientific advance, and ultimately a lot more useful knowledge.

So this bill that's working its way through Congress, is it like an American version of GDPR?

It has a lot of features in common with GDPR. It has a lot of the same broad rights: a right to have your data used only consistent with the purposes for which it was collected, a right to have the data held securely, a right to access your data, to correct it, to delete it, all those kinds of things. So the broad rights are pretty similar; the mechanisms are different, just because the US legal
and regulatory system is different. And of course the big difference is probably that the GDPR is, as its title says, a general data protection law: it covers essentially all uses of personal data, with very few exceptions. The law being considered in the US would be one of many federal and state privacy laws. We already have privacy laws on financial privacy, on health information privacy, and many others: privacy of tax records, of voting records, all those kinds of things. The US landscape is always going to be a little more complex, but I actually think that's a good thing, because it allows, for example, if we want to really get into the details of how you could get better research access to health data, that you can go to the Department of Health and Human Services and say: let's really talk about how HIPAA, the US healthcare privacy law, applies
here. So I happen to like what people call this sectoral model of privacy rules. What the new law the US is considering does is fill in a big gap in the area of general commercial activity: not just the social media companies, but also retail companies, all the different organizations consumers interact with that aren't covered by one of those particular laws. Maybe it'll pass this year, maybe a little later, but sooner or later we'll have some law in that space.

Here is a more general question relating to policy. Whenever you hear a talk trying to motivate the whole question of security, cryptography, and privacy to a general audience, people will often quote "the right to be let alone," which is attributed to Justice Brandeis. And these days, in one article I was reading in preparation for our interview, there's another quote saying that privacy rights are civil rights. These are slogans; they're not really grounded in law, right? One thought that comes to my mind is: how far does that go, and what is it based on? And a second thought: is it just that what I say should not be found out? Is it my data that should be secret? Or, to go to another extreme, should I be able to deny things after the fact? Say I'm doing computation using FHE or some other form in the cloud, and then somebody comes to me, subpoenas me or maybe threatens me, and says: show me all your work, there is a public record of it. Maybe we have ways now to actually deny that: we have encryption schemes where we could convince them we did one thing when the truth is we did something completely different. That of course goes into this fake-reality territory, but it also says something about our rights. Any thoughts?

Well, let me start with this question of privacy rights as civil rights, which I think is really interesting and quite relevant. One of the things that has actually enlivened interest in privacy protection in the US is the discussion about race and policing, and about racially and ethnically based discrimination, which I think we're paying more attention to as a society. In this new privacy bill being debated in Congress now, there are specific rights having to do with assuring, for example, that individuals are not subject to algorithmic decision-making based on race, gender, or other protected categories. We already have laws on the books that say, for example, that if you're applying for a loan or a job you can't be subject to discriminatory decision-making. The challenge now is that we have these very complex computational techniques for categorizing people and predicting whether they'd be a good employee or a good credit risk, and we have to figure out how to seriously ask whether those techniques are discriminatory in some way, in violation of these rules; you know all the work that's going on on fairness. These are pretty complicated questions, but there's a reconnection of these questions of discrimination on the one hand and privacy on the other, and a recognition, for example in the case of surveillance, that a lot of the targets of surveillance tend to be people from racial or ethnic groups that are categorically mistreated; they're subject to more police
attention, and so the way we construct surveillance laws has a disproportionate effect on people who are more subject to police attention. The same goes for any other kind of law that involves the use of personal data. We know that a lot of decision-making algorithms are based on biased training sets to begin with, which means the bias we have in society carries over, and of course the mechanism through which it carries over is the use of personal data. That's the sense in which privacy really is a civil rights issue: it is a basis on which we are treated fairly or unfairly, and the vector, one way or the other, is the personal data. Your question about surveillance and deniability is really interesting. People who come into the US and go through immigration processes now often have to submit their social media identities. I think that's a really horrible practice. It's one thing to say that if you're coming into the country and you have a criminal record in some other country, then I understand why immigration authorities might want to pay attention to that. But to scrutinize people's opinions, people's exercise of their free expression rights online, as a basis for assessing immigration is, I think, a bridge too far. Whether it's a specific constitutional violation in the US is a little more complicated because of people's immigration and citizenship status, but clearly it's a violation of human rights norms, in my opinion. So yes, because so much of our interaction with the world is through the vehicle of our data, privacy becomes this almost universal rights litmus test: how our personal data is used really is a reflection of how we're treated as individuals, period. It's become a much bigger question.

That's perfect, and it actually leads me to my last question; you've been very generous with your time. It is under the understanding that in some situations, even where it seems paradoxical, we can actually derive utility without compromising privacy, that I decided to try to commercialize some of the technology we've been developing for so many years. What do you think needs to happen to accelerate adoption of these types of technologies? Is it scientific? Is it standardization? Is it just increasing people's comfort and trust in these fairly advanced things? With FHE you can encrypt, you can compute, and you can get the result; that's, I guess, what Duality does. With MPC, multiple people can exchange messages and at the end they get what they want without ever revealing anything about their data. How do you see it?

I think these kinds of secure computing technologies clearly have this very important role, I'll call it a privacy-managing role.

That is, for our audience: when we say the word privacy, people immediately think of different things, but here I think we mean cryptography, computing things on data without revealing anything except the results, right?

Right. So it allows us to understand where we have risk and where we don't, and when we can manage it and when we can't. To me, the first part of the process has already happened, thanks to your research and the research of others: we understand the basic science and math behind the techniques. I think we're now in an interesting second stage.
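The MPC idea described in the question above, that multiple parties exchange messages and learn only the agreed result, never each other's data, can be made concrete with a minimal sketch of additive secret sharing, one of the simplest MPC building blocks. This is an illustrative toy, not Duality's product or any particular library's API; the party names, the modulus, and the sum-only functionality are assumptions made for the example.

```python
import secrets

# Modulus for the shares; any modulus larger than the possible sum works for this toy.
Q = 2**61 - 1  # illustrative choice

def share(value, n_parties):
    """Split `value` into n additive shares modulo Q.

    Any n-1 shares look uniformly random; all n are needed to reconstruct.
    """
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % Q
    return shares + [last]

def reconstruct(shares):
    """Recombine shares into the hidden value (mod Q)."""
    return sum(shares) % Q

# Three parties with private inputs (illustrative numbers).
inputs = {"alice": 42, "bob": 17, "carol": 99}
n = len(inputs)

# Each party splits its input and sends one share to every party (including itself).
all_shares = {name: share(x, n) for name, x in inputs.items()}

# Each party i locally adds up the i-th share it received from everyone.
partial_sums = [sum(all_shares[name][i] for name in inputs) % Q for i in range(n)]

# Publishing only the partial sums reveals the total, not the individual inputs.
total = reconstruct(partial_sums)
assert total == sum(inputs.values()) % Q
print("joint sum:", total)  # 158
```

Production MPC protocols and FHE schemes support far richer computations and stronger threat models than this sum-only sketch, but the core property is the one described above: each party only ever sees values that look random, yet the agreed-upon result comes out correctly.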
What's happening in this stage is that people are gradually seeing that there's benefit from these approaches, that there's a practical way to realize the benefit. It's one thing to know in abstract terms what these kinds of tools can do; it's another thing to see a really concrete benefit: we reduced fraud risk in a financial transaction, we know more about cybersecurity, we know more about health factors, whatever it is. More and more practical examples, actual implementations, are going to help more and more people realize there's a benefit, and in some sense that's what policymakers need to see. The next step after this sort of piloting is to give confidence to, I'll say, more institutionally cautious users that it's safe to do this. Where is that going to come from? It's going to come from a kind of risk balancing that says: I can get some benefit from this (that's why the pilots are important), and also I'm not going to get in trouble for doing it. A lot of the companies that hold the kind of data we're talking about have regulatory obligations and legal risk; they need to know that the risk-benefit equation comes out positively for them. That's where governments and different regulatory agencies have a really important role to play, to say: if you use this kind of technology in this kind of setting, we will consider you to have been appropriately careful, appropriately compliant with whatever the privacy and security rules are for that data. Ultimately that kind of assurance has to come from governments, in the same way as the general regulatory category called safe harbors. As an example, people who use health data today know that if they remove certain data categories from patient health information, that data will no longer be subject to HIPAA rules or penalties. That's just one example of the kind of certainty government can provide: if you do X, Y, and Z, we're not going to worry about you; you will be considered to have satisfied your obligations. What we don't want is a situation where commercial users and government users say, well, let's try it and see if anyone complains. That's not a good approach, and it's not going to scale well. We need more pilots to illustrate concrete benefit, and we need government action to encourage usage and reduce the risk.
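As a toy illustration of the "remove certain data categories" step behind the HIPAA safe-harbor approach mentioned above, here is a short sketch. The field names and the identifier list are illustrative assumptions, not the actual regulatory enumeration of 18 identifier categories; real de-identification also involves generalizing dates and geography and assessing re-identification risk, so this is not a compliance tool.

```python
# Identifier categories stripped in this toy example (a subset, for illustration only).
DIRECT_IDENTIFIERS = {
    "name", "street_address", "phone", "email",
    "ssn", "medical_record_number", "ip_address", "photo",
}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifier fields removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

patient = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "email": "jane@example.com",
    "diagnosis": "type 2 diabetes",
    "age_range": "40-49",   # generalized instead of an exact birth date
    "state": "MA",          # kept at state level, not a full address
}

print(deidentify(patient))
# {'diagnosis': 'type 2 diabetes', 'age_range': '40-49', 'state': 'MA'}
```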
I'll give you one example of where this happened very effectively in the past. Around 2010, when the cloud computing industry was first getting started, there was a lot of uncertainty about the idea of handing all your code and all your data over to Google or Amazon or someone else and saying: you do everything and we'll trust you. In the end, one of the things that made cloud computing more commercially viable was that the federal government set out standards for procurement of cloud computing services and basically said to the cloud providers: if you can meet X, Y, and Z requirements (there were security requirements, cost requirements, availability requirements), then federal agencies can buy your service and use it, with a relatively easy path through the procurement process. That led to an explosion in the use of cloud computing services, both inside and outside the federal government, because some central organization that was trusted and authoritative had looked at a particular set of technical offerings and said: yes, if they meet these requirements, they're good to go. We need that kind of thing, over time, in this area of secure computation, so that anyone who wants to use these services obviously still has to decide they trust the vendor, but also has that extra layer of trust that some third party has looked at it and that they're not going to get into legal trouble for using it.

Fascinating. Danny, this has been really wonderful; I have a million thoughts going through my mind, and projects we should be doing together.

Yes indeed.

I love talking to you, and I'm so grateful. Thank you.