Tech Spotlight 2021 Full Event
Welcome to the 2021 Tech Spotlight virtual ceremony. The Tech Spotlight initiative recognizes the critical work of individuals, groups, and organizations that improve the lives of others through scientific and technological innovation. I started the Tech Spotlight in 2019 in order to recognize and to advance the good that many technologists are doing, even as the world, and certainly we at TAPP, come increasingly to recognize some of the bad that comes with technology. Our purpose in the Spotlight, and in the Technology and Public Purpose Program at Harvard's Kennedy School, is to try to bend the arc of technological change in the direction of overall human good. So to begin today's ceremony, let's take a look back at the last year.
2020 tested our strength. Our resilience. This year, we relied on advances in tech more than ever.
Join Tech Spotlight to recognize efforts that make our world safer, fairer, and more inclusive. >> Once again, hello. This is Ash Carter, the director of the program. Welcome to the Spotlight. You have heard the purpose of the Spotlight, which is to recognize people who have distinguished themselves and their work. Although technology is a wonderful thing, and we are all technologists, and it is generally doing wonderful and good things, it does not do that automatically. We need to work at, as I was paraphrasing Reverend Martin Luther King, bending the arc of technological change toward the overall good. I want to thank you for joining us, and also give special thanks to the selection committee, our student researchers, our special keynote speaker, moderator, and panelists, and the Kennedy School Belfer Center team for making this event possible. We have an incredible lineup of folks who are working at the intersection of technology and public purpose. Here is a little sneak preview of the six projects
that we will recognize today. First, in a year when we moved our lives online, Project Galileo is protecting civil society groups, like those working to advance racial justice and combat COVID, from debilitating cyberattacks. Recognizing the threats to privacy in our online world, the 2020 Census Disclosure Avoidance System is safeguarding the data of census respondents.
In a year when we centered conversations on racism, bias, and discrimination, the Dataset Nutrition Label is creating a solution to limit algorithmic bias, and Community Control Over Police Surveillance is involving communities in decisions regarding police surveillance technology. And, finally, in a year upended by the pandemic and health disparities, the COVID-19 Molecular Explorer is shortening the timeline to discover COVID-19 antivirals, and Mind the Gap is creating the resources doctors need to better assist patients of color. All of these groups are addressing the harm caused by tech while also leveraging tech for good, and they have innovated in an incredibly challenging year. I am glad we can use our platform at the Harvard Kennedy School to
recognize their incredible and uplifting work. Please enjoy this ceremony, be inspired, and, I hope, feel compelled to join in and do good with technology. With that, Laura, I will turn things over to you. >> Thank you, Ash. My name is Laura Manley and I am the director of the Technology and Public Purpose Project at the Harvard Kennedy School. I will be joining you today to emcee the Spotlight. First up, we are thrilled to
welcome our keynote speaker, Dr. Safiya Noble. She is a true advocate and pioneer in the field of technology and public purpose. Her research focuses on the design of digital media platforms on the Internet and their impact on society. Her work is sociological and interdisciplinary, marking the ways that digital media intersects with issues of race, gender, culture, and technology. She is an associate professor in the UCLA Department
of Information Studies, where she serves as the co-founder and co-director of the Center for Critical Internet Inquiry. She is also the author of the best-selling book on racist and sexist algorithmic bias in commercial search engines, Algorithms of Oppression: How Search Engines Reinforce Racism. We are honored to have you today. Over to you, Dr. Noble.
>> Thank you, Secretary Carter, for the invitation to give some brief congratulatory remarks. It is really my honor to offer some brief thoughts to celebrate the incredible research contributions made by important scholars and technologists working at the intersection of justice and equity in society. I really want to congratulate all of you for being recognized by the Spotlight at the Harvard Kennedy School's Belfer Center, and I am thrilled to congratulate you on behalf of all of my colleagues. Your work inspires us and it is one of the reasons why we get up every day and try to do our part with you. You know, public purpose is really essential to healthy innovation. We are living in a time when projects that support democracy, inclusion, and fairness really must move to the center of our national tech agenda. Now, more than ever, we need deep investments in technology that
prioritize rights, dismantle white supremacy, and foster gender equality. What we really need are people who can reimagine and reconstruct true social, political, economic, and environmental equity. The people, the communities, the companies, all of the people who have gone into making these projects that are being recognized today by the Belfer Center have really demonstrated a commitment to reduce societal harms and protect some of the most important values we need to consider. Of course, these values are on the minds of many of us, and these are concerns with privacy, safety, security, transparency, accountability, and inclusion. You know, it was just a decade ago that I began my own research journey into the racist and sexist harms that come from large multinational recommendation systems. It is hard to believe that only a decade ago, most technologists were arguing that computer code could not do any social harm, and that, you know, when there were digital technologies and digital media systems that ran amok or that created harm, these were likely the fault of users and not of the systems themselves. Many people suggested that
when these systems failed, it was really just a matter of consequence, you know, the price that we pay in order to have these large-scale tech systems, instead of facing the truth that these technical systems are often an expression of values: the values of their makers, the values of their founders, the values of their investors. Now, unfortunately, I am sorry to say that we have mountains of evidence, a decade later, to show how profound the harmful effects of many different types of computing technologies are, and these include emerging artificial intelligence systems. Every week, there is a new headline in the news about the danger that
comes from data sets that are profoundly discriminatory and are often used to train machine learning systems with almost no oversight and no possibility for legal protection or recourse. Today, I have to say that I am relieved, I am glad, to know that we are here to acknowledge the work of so many people who are working tirelessly to imagine a different future. The projects and teams being recognized today help us to imagine how technologies can be used to mitigate, reduce, and even eliminate harms in society. At a time when there is such an egregious and, dare I say, immoral and unfair distribution of resources in the world, we need creative, courageous, and audacious voices to speak truth to power. We need designers who are looking at the most pressing issues of structural
and systemic racism and sexism and trying to find ways to solve these legacy and contemporary challenges. We need thinkers who are not afraid to challenge the status quo. We need people of conscience who value people over profits to emerge and refuse to put their skills and labor in service of repression. We have an opportunity to see such communities exist in the tech space, communities interested in using their skills and knowledge to shine a light on the most pressing problems of the day so we can solve them. No matter the moment we inherit, the time or the era, we are responsible for living lives that make a difference. I am proud to be part of a community of people committed to making the world more fair and socially just, and to give you these words of encouragement today, because your work and our work, together, all of us, does make a difference. You are setting the agenda that the Silicon Valley
founders should follow. At a time when Wall Street has put a premium on extractive and damaging technologies that maximize profit at all costs, you, the recipients of today's recognition, are really showing the world that we can develop technologies that help us appreciate and reimagine a different and better pathway forward. This Spotlight has recognized projects and initiatives in technology that seek a more fair, inclusive, and safer future. It is my honor to participate in recognizing these amazing projects and initiatives today that demonstrate a commitment to public purpose in the areas of digital technology, biotech, and the future of work. I truly congratulate you for your brilliant
example. You are the change we want to see in these fields and your ideas will set a new bar of excellence for those who follow you. Thank you and congratulations.
>> Thank you so much, Dr. Noble. We look forward to watching your work. >> Thank you. >> Let's move on to announcing our 2021 finalists. In a year when tech was so central to our lives, there have been amazing innovations and clear benefits, but also significant societal harms. We were excited to review nearly
200 nominations from 14 countries, all projects and initiatives that were doing their part to make this world a safer, fairer, and more inclusive place for us all. Each nomination was evaluated by our team of researchers and our expert selection committee across three key areas: impact, innovation, and a demonstrated commitment to public purpose. I am pleased to present the finalists for 2021. >> Here are the Tech Spotlight finalists.
Community Control Over Police Surveillance, ACLU. COVID Symptom Study app, by the COVID Symptom Study. COVID-19 Molecular Explorer, IBM Research. The Dataset Nutrition Label, the Data Nutrition Project. Garbage In, Garbage Out: Face Recognition on Flawed Data, the Center on Privacy and Technology at Georgetown Law. Mind the Gap, Black and Brown Skin. Privacy Not Included, Mozilla Foundation.
Project Amelia, Probable Models. Project Galileo, Cloudflare. Project Lighthouse, Airbnb. Racial Disparities in Automated Speech Recognition, Stanford Computational Policy Lab. Safe house and shelter training program, Operation Safe Escape. SmartNoise, Microsoft and OpenDP. Student Privacy Project,
EPIC. Terms of Service ratings, the Terms of Service; Didn't Read Association. 2020 Census Disclosure Avoidance System, the U.S. Census Bureau. Upsolve app,
Upsolve. >> Congratulations to all of our finalists. We are now pleased to welcome Gideon Lichfield, the global editorial director of WIRED. Previously, Gideon was editor in chief at MIT Technology Review and the first science and tech writer at The Economist. Today, he will moderate
a conversation with several of this year's finalists, including Kasia Chmielinski, Michael Hawes from the U.S. Census Bureau, Chad Marlow from the ACLU, and Alissa Starzak from Cloudflare. >> Thank you. I should note that The Economist has been around for 178 years; I was not the first science and tech writer there, but I was part of a great tradition. I am pleased to be here recognizing these projects, which
are, as said before, a very broad range of projects tackling subjects that are fundamental to the challenges we face today. This was a year in which we lost so much, and yet people mobilized. They organized and collaborated, and they innovated in the face of multiple crises. These projects are being recognized for work ranging from the medical field to structural racism to data privacy and the use of data in general, and, as was said, many of them are using technology to try to correct some of the harm created by technology, because technology is always a double-edged tool. These projects tackle deep and fundamental problems, some of which have become even more urgent in the past year of the pandemic, while we have been in lockdown. What I will do in the next half hour or so is talk to some of the panelists, some of the people behind the projects, and try to tease out what they did, how they have worked during the pandemic, and some of the common themes that unite these very different projects. I am going to start by having some individual conversations and then bring everyone back for a group discussion. Let me start by welcoming
Kasia Chmielinski, who works on the Dataset Nutrition Label. Welcome. >> Hello, thank you for having me. >> Thank you for joining us.
>> Let's talk about this, because the issues we have seen in the past few years have been around the quality of data and how it is used, in particular as AI algorithms and other systems ingest that data and use it to make decisions. The quality of the data being ingested, and what biases it may contain, is an increasingly urgent problem for us to tackle. Talk to us a bit about the Dataset Nutrition Label and what you are trying to achieve. >> Great. I think when you talk about algorithmic bias, we have to consider how it is happening, and to your point, so much has to do with data. If you look at the model
development pipeline, it starts with, hopefully, a business question, though not always: I would like to answer some kind of question using AI. Then you say to your data scientist, go find some data to see if we can support that and build a model, and sometimes there are certain kinds of data, some of it public, and they go and Google a bunch of things online, find some data sets, and come back and say, okay, let me build a model. You train that model and it picks up whatever is in that data. For example, if you build a hiring
algorithm and you train it on historical hiring data, which is something Amazon did, actually, that historical hiring data is going to be filled with historical biases about who was hired and who was not. You might have a data set that is very comprehensive in terms of telling you what happened in the past, but that means women and people of color were not hired for positions where others were, not because they did not have the credentials but because of the biases of the time. That hiring model will pick up on the biases of the original data set and start to perpetuate those harms. The problem I saw as a practitioner is that often we find the issues with the algorithm at the end, and that is problematic for two reasons. One, it might be harming people already
, in particular communities already being harmed or underrepresented. The second is that it is very expensive at that point to go back to the beginning, find more data, retrain, and go through that process again. Our goal is to use the food nutrition label as an idea or an inspiration and ask: can we actually build something for a data set that will tell us at the outset whether that data is healthy for the model and the use we intend for it at the end? So, that is our focus.
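To make the idea concrete, here is a minimal sketch in Python of what a machine-readable label along these lines might contain; the field names, checks, and toy hiring data are illustrative assumptions, not the Data Nutrition Project's actual schema.

```python
# Illustrative sketch only: not the Data Nutrition Project's actual schema.
# It pairs qualitative metadata with simple quantitative checks computed
# directly from a dataset, so problems surface before any model is trained.
from dataclasses import dataclass, field
from typing import Optional
import pandas as pd

@dataclass
class DatasetLabel:
    name: str
    intended_uses: list   # qualitative: what the data is suited for
    known_gaps: list      # qualitative: known omissions or caveats
    stats: dict = field(default_factory=dict)  # quantitative checks

def build_label(df: pd.DataFrame, name: str, intended_uses: list,
                known_gaps: list,
                sensitive_column: Optional[str] = None) -> DatasetLabel:
    stats = {
        "rows": len(df),
        "columns": list(df.columns),
        "missing_rate": df.isna().mean().round(3).to_dict(),
    }
    # If a demographic column is present, report group representation so a
    # reviewer can spot obvious imbalance, like the skewed hiring data above.
    if sensitive_column and sensitive_column in df.columns:
        stats["representation"] = (
            df[sensitive_column].value_counts(normalize=True).round(3).to_dict()
        )
    return DatasetLabel(name, intended_uses, known_gaps, stats)

if __name__ == "__main__":
    df = pd.DataFrame({
        "years_experience": [1, 5, 3, 10, 2, 7],
        "gender": ["M", "M", "M", "M", "F", "M"],
        "hired": [0, 1, 0, 1, 0, 1],
    })
    label = build_label(df, "toy-hiring-data",
                        intended_uses=["teaching example only"],
                        known_gaps=["tiny sample", "skewed gender balance"],
                        sensitive_column="gender")
    print(label.stats)
```

Even this toy label makes the gender skew visible before anyone trains a hiring model on the data, which is the kind of early warning the project describes.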
We built what we call dataset nutrition labels, and these are ideally a way to standardize how to tell whether data is healthy. That is somewhat subjective, so it is qualitative and quantitative information that gets you to a place, hopefully, where you can compare data sets to each other and go deeper and say, is this really good for the intended use that I have? >> What would be the most important use of labels like these? In which sector do you think it is most urgent to have that kind of label? >> That is a good question. There are some industries and domains where you are seeing the most mature uses of AI, and I would say start there, especially when it comes to AI harms to communities. For example, if I have a bunch of data on trees in Central Park and I am making models about how I water those trees, it is less important to have a label for that data set, because it is not directly tied to people getting access to resources or health services, et cetera. You want to look at domains that are more mature in their use of AI, specifically around people and communities. To me, that is healthcare, also finance and lending, consumer credit and collections, things like that. There are a few others
you could add to that list, but I think you would want to start there, and that might be the criterion for identifying which domains to do some prototyping in. >> Can you give me any specific examples of how the algorithm or, basically, the process of using AI changes for the better after using something like the Dataset Nutrition Label? >> Definitely. It is rarely a case of malintent, where someone says, I want to build a terrible algorithm that will do terrible things to people. It is more often a case of resource constraints, where there are not a lot of data scientists and they do not, you know, have a lot of standards internally to check the data.
That is what is crazy: the field is moving so fast and changing so much, and there is so much data and so many kinds of data, that there are no standards. It is up to every single practitioner, and maybe their team or organization, to put standards in place, and that is why we are hoping to standardize across the field. I have seen in my own work in industry, in finance and healthcare, the implementation of some kinds of standards at an individual level, like an individual organization level. Just saying, we should look for distribution issues and proxy variables and these kinds of things. I have seen many cases where just doing that kind of investigation leads to people saying, hey, let's hold on and make sure we are adjusting this model so that we are not picking up on that particular bias. The other challenge is that a lot of these models and use cases are proprietary. You never really know what is happening inside of these companies
unless you are inside of them. You do not know what kind of adjustments have been made. That is why we are also hoping to put the label methodology out there, so organizations that want to use it but do not want to publish labels can still do so, or they can even use the label as a proxy for the data set, not publishing the data set itself but publishing the label to show some level of transparency. >> What do you think it would take for the label to become an industry standard in some form? >> That is an excellent question, one we ask ourselves a lot. We are very much a prototype. We have had some varied conversations, and I think it is a bottom-up
and a top-down type of thing. From the bottom up, what we produce is open, and the methodologies that we use and recommend are also very much working off of, and alongside, other projects, projects like [ Indiscernible ], model cards, and fact sheets. There is a lot happening; we are not the only ones. We recommend people use certain methodologies, take a flavor of them, and hopefully pick up the ideas we are driving: a culture of transparency and an appetite for transparency. In the same way that, because a can of Coke has a nutrition label on it, I start to expect one elsewhere: I go to a bakery and think, I wonder what the label would look like, and it does not have one. You want the same kind of deal, where people start to expect they will have certain information about a data set, and that is very ground-up. It is about openness and conversations like this and really driving the change from the ground up. I think there is also a top-down piece. There
are policies: we would love for every open data set that is published to come with something like a label, and we believe in the standardization of documentation. So that means driving from the policy level down, as well as working with organizations that are putting out certifications. For example, the Responsible AI Institute has a certification around responsible AI, and the data nutrition component is actually the data component of that certification. We are trying to become part of certifications that are also being used more from the top down. >> Great, well, thank you very much. Next, I will turn to Chad Marlow. He is helping communities draft local laws that can give them more of a say in whether and how police use surveillance technologies. Welcome, Chad.
>> Thank you. >> Obviously, connecting to what was just talked about, AI bias driven by biased data sets, policing and the use of technology is one of the more pernicious examples of how that can go wrong, as when data is used to inform algorithms that do things like predictive policing. Those data sets are informing biased policing practices in some cases. As a result, there has been this movement over the past year, in the wake of the Black Lives Matter protests, to ban the use of technologies like facial recognition in certain cities, and it feels like the national conversation around policing, surveillance, and the use of tech has really shifted in the past year. Tell us a bit about how
you have been involved in shaping legislation to help. >> Sure, so it really starts with a kind of revelation non-revelation that happened. I think America at large saw millions of people marching in the streets, protesting the death of George Floyd but protesting a lot more than that.
Protesting, really, the racially biased system of policing that we have in this country. When I say revelation non-revelation, what I mean is that this is not a revelation to people of color at all. You could go back to Rodney King in the '90s, or go back to the racist lantern laws in pre-Revolutionary War New York City, to see how policing and surveillance have unfortunately focused on communities of color. I do think, to some extent, we are looking at a national revelation, in that I think our country, as a whole, is beginning to understand that policing is experienced very differently and provided very differently depending on the group of people being policed. So I think, when it
comes to the use of surveillance technologies by police, when those technologies are used by an institution that has racial bias problems, you will find that those technologies have racial bias problems. And the problem with the way surveillance technologies are acquired and used by most police forces in this country is that the police are given the authority to decide to acquire and deploy those technologies unilaterally and in secret. So, when society rises up and begins resisting racial bias in policing, it cannot really do that with respect to surveillance technologies if the public, and not even elected officials, know that police are using them in the first place. So the idea is to shift the way that surveillance technology is acquired and used to a far more transparent process, in which the public has a chance to learn and organize either for or against the technology and how it is used, and the decisions are ultimately made by local city councils who are democratically accountable to the people whose opinions they go with or against. So that is kind of
the idea. At its core, it is a transparency and public engagement measure, but ultimately it provides the grounds on which the most impacted communities can actually push back and even block the use of surveillance technologies, which, again, are disproportionately used against communities of color and other vulnerable groups in this country. >> You said the goal is to give communities the visibility and understanding to decide if they are for or against the use of these technologies. Now, realistically, looking at what has happened in the past year, I would expect that people are going to use these laws to try to limit the use of these technologies. Do you think there is a way this may end up in better-informed use, in other words, using the technology but in a better-informed way, with more transparency about, for instance, where the data came from, or an attempt to apply these technologies in a way that is fair and allows the police to do the job of protection but without disproportionately harming communities of color?
>> The answer is I do not know, and that is actually the correct answer, because it is not for Chad Marlow or the ACLU to make those decisions. It is for locally impacted community groups to make the decision. So, you may find that a group in one of our cities, like Nashville, Tennessee, is in favor of these technologies, but the same community in Oakland, California, is against them, and that is fine. What you do find, as a result of the way we set up these laws, is that there is a series of many, many questions that have to be answered about the technology, what it can do, what uses you wish to put it to, and whether it will be applied equally to all communities, and those answers have to inform every part of the determination process. And there are certainly surveillance technologies that have good and bad uses. Take automatic license plate readers: provided
data retention is limited, they can be very good for Amber Alerts, for toll collection, and for things like that, which we have seen. Communities might decide to approve them for those purposes, but we have also seen, in Oakland, California, a plan to deploy them only in Black and brown communities and openly use them to tip off ICE to where undocumented people live, and that was unacceptable. I really do think this is intended to be an educative process, in which each surveillance technology and each use is understood and considered by impacted communities, and decisions are made in what they believe is kind of an overall cost-benefit analysis. Wherever that comes out, as long as it follows the will of those communities, I think we all have to be okay with it. >> The issues around this technology are really complicated, and the decisions around it are complicated; even federal lawmakers can easily get confused
about these issues. What do you think is a way we might get to some common standards, practices, and understandings, basically ways in which different communities can learn from each other about what the best applications are? I can see one outcome of the way this works: if every community has its own specific practices, that might be good because it corresponds to the needs and concerns of those communities, but how do we also get a national conversation going about which methods actually work and which ones are harmful? >> I think that happens by undergoing this process. If a community agrees to allow a surveillance technology to be used, it is not locked in stone, right? There is a forward-looking aspect, in which you provide information on how you will use it, but there is also an annual review, in which you look back, see how things went, and consider whether you want to stay on that path. So I think this constant kind of forward-looking and backward-looking analysis is going to start revealing certain truths about the way the technology is used. I do also think it is very important, and I think one of the risks here in
the moment we are at in our country: one of the things we are talking about in the wake of George Floyd is reducing the funding of police departments and shifting it to better uses that better promote public safety. One of the concerns we need to consider is that, as police departments are being asked to reduce their budgets, we do not want to create a situation where police departments say, we no longer have the money to operate the way we traditionally have, that is to say, expensive, human-driven, racially biased policing, but if we shift to surveillance technologies, we can do the same work of surveilling communities of color and other vulnerable groups for less money, and we end up with more efficient, surveillance-technology-driven policing of the same communities. I think this has to be a common and ongoing consideration, and the more people are engaged, the more they know, and the more they can look back and examine, the more, quote, unquote, truths they reveal about whether it is safe to use these technologies at all. >> Thank you very much, Chad. I will bring you back in a little bit. The next person I will call on is Alissa Starzak from Project Galileo, which was started in 2014 by Cloudflare and provides free protection to public interest groups and nonprofits that are at risk of online attacks and harassment. Hello, Alissa. Thank you for joining us.
>> Thank you for having me, Gideon. >> So far, we have heard that a lot of the groups involved in organizing around Black Lives Matter, around COVID-19, around democracy over the past year have been the target of cyber attacks, hacks, harassment, or DDoS attacks that try to take websites offline. Project Galileo has been running for several years. What changed for you during this pandemic year? How did you mobilize? >> That is right. As more people moved online, we also saw more attacks.
We had seen attacks for a number of years, and they are often an easy way to take someone offline, but the challenge during a pandemic year, when everyone is operating online, is that it looks completely different. We also saw it around the protests over George Floyd, for example: as the protests ramped up, the cyber attacks ramped up as well. At one point, we saw 20,000 requests a second, which is a crazy number by Internet standards, against a particular website, trying to take it down. It was a site organizing protests in the wake of George Floyd's death. I think that one of the things
that has been interesting over the course of the past year is that we really do have to think about how we protect vulnerable groups online. As the online world becomes more important, the protection we give them there is as important as the protection we give them in the real world, because it is critical to their work, critical to making sure they can communicate, and an important part of where we are right now, even outside of the pandemic. >> How have the threats these groups are facing been changing? >> We are seeing a couple of things. We see people trying to take websites down, but we also see attacks trying to get into systems, to potentially corrupt systems, or to change or delete data. We see ransomware, and we see all sorts of things that look more
at internal systems. What we have been doing over the course of the past year is thinking about other tools we can give to organizations, including social justice organizations, that will help them protect themselves in a number of different ways, not only protecting their external-facing site but also protecting their internal systems from attack. That is an area we have focused on because we saw the need during the pandemic. >> These are threats, and the kinds of tools, that go well beyond what Cloudflare normally provides to clients. You are focused on protecting websites, but here we are talking about a whole range of security threats. Do you see your work continuing to expand
under Galileo? >> I think we want to do everything we can in that area. We also want to think about how we partner with other organizations. What we have seen over the course of the past year is that there is a need for the sector to step up and make sure that groups are protected. For things that are outside the scope of what we can do, we
want to partner with other organizations that can help. We have lots of conversations with civil society groups about how we organize together and how we make sure people know what tools are available to them and what they can get for free. One of the things that is striking, particularly when you start thinking about the civil society space in general, is that it is a vulnerable population online as well as offline. You often have small groups that are underfunded and do not have a lot of staff, particularly if they are in the community organizing space, and they may not have technical staff. There is a lot you can do in the tech world and in the private sector to just help them bolster their defenses, make sure they have what they need, and make sure they have access to other people who might be able to provide tools, and that is ultimately what Project Galileo is about. >> It feels like if you work in this sector, the nonprofit space, in public interest organizations, the risks to everything you do are increasing in the online sphere as well as the offline one.
Friends of mine who work at these organizations tell me that is the thing they are increasingly thinking about and trying to figure out how to protect against. Do you expect we are going to see the need for this kind of protection increase? >> Absolutely, no question about that. Just among the people who are part of our program, over the course of the past year, we saw a fourfold increase in the number of cyber attacks, and the number is significant. We are talking about 13 billion cyber attacks over the course of eight months, which is crazy when you think about what that means on a normal day, even on a single website. And that does not cover all the other kinds of attacks you mentioned. I think we really do need to make sure we have a system in place that protects people operating online from a range of different attacks. I think there is a
lot of work to be done, but I am really excited we can be part of it, and we hope to continue thinking of ways of innovating and doing more. >> This might be beyond the scope of Project Galileo, but do you have any thoughts on legislation that could help protect groups like this better? >> You know, I think one of the things the federal government has been looking at a lot is cybersecurity for the federal government itself, including tools and best practices. That is not legislation exactly, but it is education it can give: something civil society groups can use to say, these are the best practices, these are what the government recommends for itself, these are things we should have, too. I think it is incumbent on the private sector to make sure
those tools are available to the same groups, so they are not something only available to entities willing to pay a lot of money for them. They really have to be available to the most vulnerable online as well. >> Thank you, Alissa. Please stick around. I am going to next ask Michael Hawes to join us. How are you? You have been working on the census disclosure avoidance system, which uses differential privacy to make it hard to identify people. This connects to one of the biggest concerns in what I was just discussing with Alissa: for example, for people who work at nonprofits doing social justice work, one of the risks is having their data laid out and put online so they can be attacked or doxxed. So you have
this need to protect people better. Talk a little bit about how the system works, the concept of differential privacy, and how you apply it. >> The Census Bureau takes great pride in being the leader in quality statistics about the nation's people and economy. Our ability to collect that information, as you pointed out, requires the American public's trust that we will safeguard the information against privacy threats. So, one of our flagship data products is the census, the constitutionally mandated once-every-10-years count of the U.S. population. The data products that we produce for the census, for the 2020 census, have incredible granularity of information. It is rich information down to very small
geographic areas. There are substantial risks inherent in publishing that much data. So, the 2020 Census Disclosure Avoidance System is a modernization of our disclosure avoidance techniques. It is an approach to protecting privacy and confidentiality in public data releases. It is based on the privacy risk accounting framework of differential privacy, which allows us to quantify the exact amount of privacy risk associated with each and every statistic that we publish. By quantifying that risk, it allows us to precisely calibrate the amount of statistical disclosure avoidance safeguarding we need to perform: the amount of statistical noise or uncertainty that we have to infuse into the data in order to protect it from reconstruction and re-identification attacks, or the ability to pick specific individuals out of those data products.
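To sketch the core mechanism Hawes describes, here is a minimal Python illustration of calibrated noise infusion using the textbook Laplace mechanism; the epsilon values and the hypothetical block count are illustrative assumptions, and this is not the Census Bureau's production system or its parameters.

```python
# Minimal illustration of differentially private noise infusion.
# This is the textbook Laplace mechanism, not the Census Bureau's actual
# disclosure avoidance system or its parameter choices.
import numpy as np

def laplace_count(true_count: int, epsilon: float, rng=None) -> float:
    """Return a noisy version of a count query.

    For a counting query, one person can change the result by at most 1
    (sensitivity = 1), so adding Laplace noise with scale 1/epsilon
    satisfies epsilon-differential privacy for that query.
    """
    rng = rng or np.random.default_rng()
    scale = 1.0 / epsilon  # smaller epsilon -> more noise, more privacy
    return true_count + rng.laplace(loc=0.0, scale=scale)

if __name__ == "__main__":
    true_block_population = 37  # hypothetical small-area count
    for eps in (0.1, 1.0, 10.0):
        noisy = laplace_count(true_block_population, eps)
        print(f"epsilon={eps:>4}: published count ~ {noisy:.1f}")
```

The epsilon parameter is exactly the dial the next exchange is about: more noise gives stronger protection but less accurate published statistics.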
>> Did you have to strike a balance between making it sufficiently noisy that you can protect people but still sufficiently accurate that it is useful? >> That is a fantastic question, and it is a challenging one. The only way to completely eliminate all risk of re-identification would be to never publish any usable data at all. Clearly, that is not an option for us. Our reason for being, our mission, is to
produce quality statistics. So, we have to find a balancing point where the data is sufficiently accurate to meet the intended uses, to meet the needs of allocating $675 billion in federal funds and of redistricting. The data need to be accurate enough to meet those needs while also being sufficiently protected, sufficiently noisy, to prevent bad actors from being able to identify individuals and learn their sensitive information.
>> One of the problems is that the more data is available about people online, or that can be obtained from hacks, the easier it is to correlate that with census data and re-identify people. You are fighting a constant uphill battle. How do you see that playing out? >> The privacy landscape is constantly changing. We have seen that landscape
transformed over the last decade, not just in the proliferation of third-party data that could be used to link to census data and pick out certain individuals, but also on the technology front. Computing power has increased substantially, and there are new, sophisticated optimization algorithms that can leverage that third-party data in new types of attacks against public data releases. So, over the last decade, the landscape has fundamentally transformed and changed the degree of privacy risk inherent in publishing official statistics, and our own internal research showed us just how real this threat was and the growing insufficiency of traditional or legacy approaches to disclosure avoidance. That is what prompted our transition, our modernization, to this differentially private approach to performing disclosure avoidance. >> Do you think this privacy approach could be applied to other cases where data about people is being stored? >> Absolutely. It is already in use at a number of technology companies, and we have been using differentially private solutions at the Census Bureau since about 2008 for some of our smaller data products. This is the first time we have applied it to one of our flagship products, like the census. But the principles of privacy, and of differential privacy, have broad applicability, and I think a number of statistical agencies and other national statistics offices around the world will be looking very closely at how we are implementing our solutions for the 2020 census and learning lessons from that. Plus, we are making all of our privacy tools available in the public space, so that other agencies can capitalize on the research and investment we have done.
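Hawes mentions linking third-party data to published statistics to pick individuals out. Here is a small, purely hypothetical sketch of that kind of linkage re-identification, with invented names and columns, showing why the noise infusion described above matters.

```python
# Hypothetical illustration of a linkage (re-identification) attack.
# The data and column names are invented; the point is that rows with no
# direct identifiers can still be matched on quasi-identifiers.
import pandas as pd

# An "anonymized" release: no names, but age, ZIP, and sex remain.
released = pd.DataFrame({
    "age": [34, 34, 61],
    "zip": ["02138", "02139", "02138"],
    "sex": ["F", "F", "M"],
    "sensitive_attribute": ["A", "B", "C"],
})

# Outside data an attacker might hold (e.g., from a public or leaked source).
outside = pd.DataFrame({
    "name": ["Alice Example"],
    "age": [34],
    "zip": ["02138"],
    "sex": ["F"],
})

# Joining on the quasi-identifiers links the name to the sensitive value.
reidentified = outside.merge(released, on=["age", "zip", "sex"])
print(reidentified)
# If the combination (34, 02138, F) is unique in the release, the attacker
# now knows Alice's sensitive_attribute. Noise infusion and coarsening are
# meant to break exactly this kind of unique match.
```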
>> It is great to see you leading the way in this technology, and I really hope other agencies do so, too. Thank you, Michael. For the final quick talk, I will bring in Payel Das. Even before the pandemic, there was increasing interest in using AI to explore the space of chemical compounds and potential drugs without having to synthesize them in a lab, so you can do virtual testing, effectively. You developed a tool to address coronaviruses specifically.
Can you talk about what it has achieved and whether it has helped researchers identify any drugs? >> Yes, sure. Since the launch of the Explorer, we have found a huge amount of interest from diverse stakeholders in the public and private sectors. We have collaborations and partnerships helping us to tackle discovery problems. One example: most recently, we have been working with Oxford University and others, and we have already identified a molecule that exists, as it is, which is an effective [ Indiscernible ]. It is a novel one. We had to test only four molecules to get to this, which is, you know, a game changer. We are using AI to create the novel molecules we start with, and then, by experimenting with and synthesizing only four or five of them, we are identifying effective candidates. >> Normally, you would have to synthesize hundreds of them, maybe.
>> Or even more, sometimes. That is where these tools come into play. >> Wonderful. Obviously, the biggest excitement has been about vaccines. So, do
you think there is potential for developing therapeutic drugs as an important tool in the fight against COVID? >> We do think so, because the vaccines are not fully effective and, as we are already seeing, it is not going to be the whole population getting vaccinated. There will still be a need for drugs, for those who are vaccinated and for people who are not vaccinated, and then for other pandemics. The same technology we are building, we are also taking to developing vaccines, and we have already started working with some partners out there. >> Yes, and beyond COVID-19, what other diseases or problems do you think a tool like this could be applied to? >> The tool has [ Indiscernible ]. This is what allows us to create hypotheses by learning from smaller amounts of data. We all know that data creation in
scientific domains is expensive, and the data can be noisy and imbalanced, and so on. The technology we have created, [ Indiscernible ], we are now taking to several different applications, coming up with [ Indiscernible ] that are safe to use. In fact, last month we published work showing how the technology can help us discover novel and safe [ Indiscernible ] that also do not induce resistance in bacteria. These are some of the real applications we have created; this is not just prediction, these are real results, and there is huge potential in coming up with new antimicrobials that are safe to use, that do not induce resistance in bacteria, and that can treat some of the hard-to-treat Gram-negative bacteria, in less than two months compared to the several years it used to take. We are also developing new materials and capturing data, but the potential is really huge and endless here.
Another aspect is promoting openness for discovery. We are participating in the Global Partnership on AI, a global initiative that has [ Indiscernible ], and the goal there is to come up with the infrastructure and conditions for the better use and development of AI, grounded in human rights and human values. >> Great, thank you very much. I will ask all of the panelists to join us now. Can I see Chad, Michael, and Alissa? Same for Kasia. We have two or three minutes left, but I wanted to try a quick lightning round: what is one lesson, if you like, from the pandemic, how it changed the way you think about the way you work, or gave you some sort of inspiration for how you might work differently in the future? I know it is a very general question, but anything about this past year, working under these extreme conditions, that might have given you some thoughts about how you might do things differently in the future. I will start with you, Kasia.
>> Oh, man. That is a difficult question. I mean, I think a lot of, you know, prioritization sometimes happens without a feeling of urgency. You say, what should I work on next, and that seems like fun, so we will do that. I think, especially around data sets and the cleanliness and quality of data, the pandemic really clarified quickly what was important, what we needed to do first, and what we needed to get out there first. I would say urgency is a great method of prioritization. The second thing I would call out is that a lot of things we try to build or do as a team, we used to try to do in person, and we had to adapt to doing everything online, and the realization is that you can get a lot done and have more meetings in a day with more partners around the world if you just move to a totally remote model, which I think is not a surprise to anyone here. Those are the things that stood out to me. >> That sounds like a good conclusion. Michael, what about you? >> I think the biggest lesson from my perspective has been kind of rethinking how we communicate. When we were chatting a moment ago, you brought up the fact
that disclosure avoidance is about balancing privacy and accuracy, and you cannot make decisions about how to properly balance them without talking to the people who are going to be using the data. So, engaging with stakeholders, the actual users of census data, has been critical, and in the past most of those engagement activities would have been face-to-face. I think the pandemic has taught us about the value of alternate approaches to how we communicate, mechanisms for doing that kind of engagement on a larger scale when you cannot be face-to-face, and I think we have learned a lot from that process. >> What about you, Payel Das?
>> The one lesson I have learned, that we have learned, is how to work as a community and how to grow the community whenever we are trying to address urgent science. It is not just one person or one team; it is not one company or one organization that can do everything. To tackle these challenges, we really need to work as a community, in a trusted and responsible manner. There is huge potential, and a lot of work to do, on how we build such communities. I think going forward, we will work on the infrastructure and policies and how to do that [ Indiscernible ] for addressing future needs.
>> What about you, Chad? >> I might say the more things change, the more they stay the same. That is to say, you know, before the pandemic we were talking about the risks of surveillance to persons of color and poor persons in the real world, and then the world suddenly stopped, people went online virtually, and the same risks shifted to the same populations on a new platform. So, to change the world, sometimes you need to stay focused on the underlying fundamental problems that need to change through these pivotal moments, and as much as this last year did not seem like it went quickly at all, in the scope of history it blew by, and we are still left with a great deal of work to do. >> You get the last word, Alissa.
>> I would say the thing that has been striking over the course of the last year is the need to be connected. I think the level of urgency of that, in a forward-looking way, has never been as stark as it has been over the last year. Just thinking about access to things like broadband, basic things that many communities do not have, we have to make sure that is not where people get left behind looking forward. On a more positive side, I think one of the things that has been useful about the pandemic is the realization that you can connect globally with people all over the world and with lots of different communities, if you have those connections. There is a positive angle to it, which is that if we can connect to the world that way, you can actually have increased connections and build your community. But there is still a lot of work to do looking forward, and, again, we have seen that the work is urgent and needs to be done now, and people cannot get left behind. >> Thank you to all of our panelists. Back to you, Laura.
>> That is all the time we have. Thank you again to all of today's speakers, to the TAPP and Belfer Center teams doing all the really hard work behind the scenes, and to all the 2021 Tech Spotlight finalists and those continuing to build and innovate for the public good. To learn more about our 2021 finalists, visit BelferCenter.org/TechSpotlight. And if you know of any projects and initiatives that should be recognized next year, please stay tuned for the nomination portal to open next fall. Thanks again for joining, and see you all soon. Bye-bye.