Future of Work and Disability: Machine Learning Models on Candidate Selection, a Policy Perspective
VERA ROBERTS: Welcome to the second Future of Work webinar, and the We Count Digging DEEPer series. I think all of you can see that captions are available today. We also have an ASL interpreter for people who would like to use that.
I would like to begin with our land acknowledgment. OCAD University acknowledges the ancestral and traditional territories of the Haudenosaunee, the Anishinaabe and the Huron-Wendat, who are the original owners and custodians of the land on which we stand and create. I would also like to acknowledge that you are joining us today from many places near and far, and acknowledge the traditional caretakers of those lands.
Hello everyone, I am Vera Roberts. I work at the Inclusive Design Research Center, and I have the honor of moderating today's session. Welcome again. I am going to tell you about our agenda. So, today we will have presentations from our panelists, followed by discussion. Then, we'll have an opportunity for questions from the audience. Now, I would like to take a moment to introduce our guest speakers.
We are very fortunate today to have Alexandra Reeve Givens and Julia Stoyanovich with us. Alexandra is the CEO of the Center for Democracy and Technology. It is a think tank that focuses on protecting democracy and individual rights in the digital age. In her spare time, Alexandra serves on the board of the Christopher and Dana Reeve Foundation.
She is also a Mayoral Appointee on Washington D.C.'s Innovation and Technology Inclusion Council. Julia is an Assistant Professor of computer science and engineering, and data science, at New York University. She researches responsible data management and analysis. Through her research, she works to ensure fairness, diversity, transparency and data protection in all stages of the data science lifecycle.
She is the founding director of the Center for Responsible AI at New York University. This is a comprehensive laboratory building a future in which responsible AI will be the only kind accepted by society. Thank you for joining us today. Alexandra, I am going to let you begin with your presentation.
ALEXANDRA REEVE GIVENS: Wonderful, hello everybody. It's very nice to join you from down here in Washington D.C. I am going to share my screen, because I have some slides. They were shared ahead of time. For those of you who are not able to see the slides directly, know that the content is going to be reflected exactly in my words. So, you can follow along that way if you would like to. I am going to talk today about two areas of focus, two particular pieces of this question about bias in the selection of candidates.
My focus is going to be on what some of the legal frameworks are, what some potential avenues of argument against bias are, and the types of content that we in my organization are pushing companies to think about when they engage in these issues, and the tools that we hope advocates will have. I am going to start by doing a quick overview of some of the types of tools that we are talking about, to help ground the conversation. Some of these may well be very familiar to those of you in the working group. Just in case, I think it is useful to pause and talk about some of the tools being used, and some of the concerns that they raise.
So, one of the classic examples of the use of AI in hiring is resume screening: tools that help sort vast stacks of resumes into a smaller number that may be tailored to a particular job description. The type of concern that we may be aware of here, from a disability and general inclusion perspective, is that the tools are often trained to look for traits or characteristics in an existing employee pool. We can think of the risk of that perpetuating existing patterns of inequality in the workforce. One famous example was a tech company that used one of these tools for their internal hiring processes. They ended up realizing the samples of people coming through were skewing overwhelmingly towards men.
The reason was the algorithm had been trained on the resumes of successful existing employees, which had really poor gender representation within them. From a disability perspective, we can think of gaps in a person's resume, where they may have gone to a place of higher education, participation on a sports team, you name it.
Think of all of the different types of traits that might inadvertently be factored into the screening here, and how that can end up disadvantaging people whose resumes may differ from those within the established selection pool. Another example of a tool is the use of video interviews that then purport to conduct sentiment analysis or facial recognition analysis on a person's conduct during that interview. Another example is the increasing use of games and logic tests as hiring tools. There is a sample up here on the slide for those of you who are able to see it. Here we have, for example, a test for numerical and logical reasoning, which invites the candidate to click on an image where it asks, which side of these two images has a larger proportion of yellow dots? And what they are measuring there is the speed and accuracy of your return of that answer.
Obviously, that is a specific example. You can imagine that if someone is colorblind, choosing a number of dots of a particular color on a screen may be troubling. If you think of someone with a mobility impairment, speed of response time being seen as a gauge of your intelligence or quickness in response is actually not a fair or accurate measure of your abilities. On the right side of the slide, we see some examples that describe what this tool is purporting to measure, what some of the games from this particular company are measuring.
The factors include things like attention, effort, fairness, decision-making, focus, generosity. Some of those, too, we can be worried about being coded words or measures that will create a significant disadvantage for people who have a particular disability. I'm happy to talk more about these in the Q&A. We are always working to surface examples of these types of tools and the potential exclusionary effects that they may have, even exclusionary effects that are not remotely intended by the designers but may well end up having those consequences for people that are being measured through these tools. So, what does the legal framework have to say about some of these? The good news is that the law does actually say quite a bit.
I am pulling up here the text of some key arguments in the Americans with Disabilities Act. I am going to jump right to that even though many analyses also look at Title VII in the United States, which prohibits discrimination on the basis of race, gender, sexual orientation and ethnicity. Within the Americans with Disabilities Act, there is actually some very clear and specific language that raises very real concerns about the legality of some of these tests and their potential discriminatory effects. The ADA requires that testing formats be accessible.
We need to make sure that candidates can request reasonable accommodations. So that is one clear angle here: take the test that I began by showing, which involved discerning the color of dots, for example. As an employer, we need to be able to offer a reasonable accommodation to someone who is unable to interface with the testing format. Then, the ADA says more. It has very useful language about prohibiting discrimination, including the prohibition of employment tests that screen out or tend to screen out a person with a disability. One of the obligations on employers is that they may be liable if they fail to select and administer employment tests in the most effective manner to ensure that the tests accurately reflect the skills that they purport to measure, rather than reflecting an applicant's or employee's impairment.
So there, what the law is getting at is asking, is this test actually measuring what it is trying to measure, the skills that are required for this job, as opposed to evaluating someone's ability or disability? Another legal argument, under the Americans with Disabilities Act, is that there is a prohibition on pre-employment medical examinations. This comes up in settings that are not exactly like the tests that I showed you in my opening remarks, but in other settings where there is an increasing use of personality quizzes and personality tests.
There has actually been some good case law establishing that if the test draws on, for example, psychiatric evaluation measures, or if it is administered by a psychologist, it crosses the line from being a hiring tool to being a medical examination that is prohibited under the ADA. So, the good news is that there is some pretty strong language. The bad news is, it's very hard to bring these claims. They rely on individuals knowing that they are the victims of discrimination, knowing how they were screened out, and then being able to effectively pursue their rights.
As we know, that can be enormously challenging. One of the things that we are very focused on is not only educating people about their rights, but also going directly to employers so that they are thinking about the risks of exclusion pre-emptively and taking conscious efforts to avoid them. We have recently partnered with a group of over 20 civil rights organizations in the United States to publish the Civil Rights Principles for Hiring Assessment Technologies. This is intended as a tool for policymakers, advocates and employers to know about the potential factors of discrimination and take conscious steps to be able to address them. The first thing we emphasize in that is a principle of non-discrimination, which should run through all of these tools.
The second one was a really important point for me and for my colleagues as disability advocates: this focus on job relatedness. Here, what we are worried about is the overwhelming tendency, a very common habit amongst employers, to think of tests and not actually go through that rigorous exercise of figuring out whether what they are assessing is essential for the job in question. That is actually a requirement under the Americans with Disabilities Act, as I mentioned. It is an easy step to overlook, particularly when being sold a product off the shelf that purports to help you screen through candidates.
What we are saying is that an essential principle needs to be the check of asking, what are the attributes that are actually required to do the job in question? How do I measure somebody's ability to perform those particular skills required and nothing else? So, that is an essential piece that we are pushing for very strongly. The third is the goal of notice and explanation. If somebody doesn't know what they are being evaluated for or how they are being evaluated, it may actually be very hard for them to know whether their disability may adversely affect how they do on the test. Then, it may be hard for the employer to know. For example, think about an employment test that is purporting to analyze how someone presents on a video.
Lack of eye contact may have a significant impact on somebody's success or failure under that metric. But an autistic person may not know that they need to ask for an accommodation or to flag that disability as a reason why they should have an alternative mode of testing. The notice piece here is essential. Again, I should pause to say I have fundamental concerns with the testing technology to begin with. This is not to imply that I support that as a valid method of measuring someone's effectiveness. But if we are going down the path of an employer using this tool, they need to very accurately, fully, and fairly describe the process that they are using, so employees know how they may be affected in the course of that engagement.
The fourth principle that we are pushing for is rigorous auditing. It is very hard for people to know whether or not their platforms are discriminating against people, or whether they're having an adverse impact on particular groups, without rigorous and frequent testing of how people are doing and how they are coming through. For reasons I will talk about in a minute, I actually think the auditing is very hard in the disability context.
Nevertheless, it is an essential principle: we need employers to be frequently checking and looking for the impact of the tools that they're trying to use. Then, the last point, number five, is about oversight and accountability. Here is the notion that we have to have employers understand their legal obligations, and their ethical and moral obligations, if they are considering using these types of tools.
We need real oversight and accountability from lawmakers and from regulatory bodies as well. In the United States, one area of focus is the Equal Employment Opportunity Commission. Another is looking at how federal contractors, which are a large percentage of the United States employer base, are using these tools and what accountability we can insist on there. And the third is thinking about how legislators on Capitol Hill are engaging on this topic as well. Recently in the United States, a bill was introduced called the Algorithmic Accountability Act. What it mandates is that anybody using an algorithmic decision tool has to audit for potential bias and, if they uncover it, has to take proactive steps to mitigate the bias.
What is interesting about that regulatory approach is that it doesn't say exactly what the answer is. It doesn't come down exactly on whether or not these tools should be used, but it says if you are making the choice, you need to have a certified impact assessment methodology that accompanies the use of the tool. That is an important piece of the puzzle, as well. When I think about where the points of intervention are, a lot of my work at the Center for Democracy and Technology is focusing, as I mentioned, on empowering advocates to know what potential areas of concern are, and helping to raise general awareness. There is also a lot of pressure to be put, as I said, on employers and on their in-house counsel. We are spending a lot of time talking to people at the American Bar Association and to other lawyers about what legal vulnerabilities look like in this space.
The third piece is thinking about how employers can be empowered to put pressure on vendors, and to ask the right questions of vendors that are generating these tools. As I mentioned earlier in my remarks, it can sometimes be tempting to take a product off the shelf and to not think specifically about whether or not it's right to use that product, how it should be tailored for your organization, and whether it is asking the right questions and actually screening for the things that matter for the role you are trying to fill.
We want to empower employers and other decision-makers to be much smarter, more informed consumers of these products and make sure that they are engaging in that self-examination as to whether or not they are appropriate for use and, if they do use them, how to mitigate against potential concerns. If you'll indulge me, I'll spend one more minute just talking about some of the challenges in the space, and then I would love to pass it over to Julia and then to our broader discussion. On the challenges piece, there are a couple of specific areas that come up in the disability space in particular when we're thinking about the use of AI tools in hiring or in the workforce. One is that even if a company, a vendor of these types of products, is aware of the concerns, aware of the potential bias, fixing that at the design phase is actually quite challenging. Some of them are doing thoughtful outreach to groups to think about how we get more voices at the table to think about potential negative impacts of these tools.
One of the challenges, and Julia may talk about this, is that the way these tools train, the way that AI learns, is by taking an existing body of data, studying the patterns within it and, from that, making inferences to inform decisions in the future. Consider the sheer variety of ways in which disabilities may present and the sheer range of disabilities, coupled with the challenge of getting data about people's disabilities, because many of us feel very concerned about sharing them publicly, let alone in a database that is used to train algorithms. The lack of training data makes it very hard for people that are trying to have their tools understand the full range in which a person's disability may present during the course of an interview. A separate piece, which may well undermine the validity of these tools in general, is when you think about basing decisions about suitability for a job on trends you have seen in the existing workforce or in a selective training data set. By definition, what you're doing is making assumptions, drawing from a general crowd, about what that means for a specific person.
That is exactly what our civil rights and disability movement has been fighting to stop for decades. A lot of those decisions and inferences are based on stereotypes and assumptions about people's general abilities, as opposed to a decision that is based on specific consideration of the individual candidate in front of you. I think that is a critical element that we need to think about when we are looking at the use of AI in hiring and thinking about the disability lens in particular. My next point is a little technical; I hope you'll come on this journey with me. It is the limitations of auditing for bias on the basis of disability. Some people in the AI space, when we talk about the concerns of bias in hiring, say one of the ways to mitigate this is to come in and do statistical auditing to see how people are doing in this test and see if we are inadvertently screening out women at a higher rate than men, for example.
That's not just a scientific approach; it actually is endorsed by the Equal Employment Opportunity Commission, which has it on the books in its guidelines on employee selection procedures, which date from 1978. The idea is that the best way to look for discrimination is to do a statistical analysis and follow what they call the four-fifths rule, which asks whether a group is making it through at less than four-fifths the rate of the dominant group. So if women are making it through at a rate of less than 80% compared to how men are making it through, that is a good indication that there is something discriminatory happening in your tool and you need to go back and re-evaluate it. That model, that idea that you can screen for bias in a statistical way, and go in and use that to fix a tool, is very dominant amongst the vendors of AI hiring tools right now. Many of them do follow that approach and, as a result, they market their tools as all being audited for bias, and as something that companies can comfortably use without concern.
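As a rough illustration of the four-fifths check described above, here is a minimal sketch in Python. The function names and the applicant counts are hypothetical, chosen only to show the arithmetic; they are not taken from the guidance itself or from any vendor's tool.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants in a group who make it through a screening step."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratio(group_rate: float, reference_rate: float) -> float:
    """Selection rate of a group relative to the highest-rate (reference) group."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return group_rate / reference_rate


def passes_four_fifths(group_rate: float, reference_rate: float,
                       threshold: float = 0.8) -> bool:
    """True when the group's rate is at least four-fifths of the reference rate."""
    return impact_ratio(group_rate, reference_rate) >= threshold


# Hypothetical screening results, for illustration only.
men_rate = selection_rate(selected=60, applicants=100)    # 60% make it through
women_rate = selection_rate(selected=42, applicants=100)  # 42% make it through

ratio = impact_ratio(women_rate, men_rate)
print(f"impact ratio: {ratio:.2f}")  # about 0.70, below the 0.8 threshold
print("flag for review:", not passes_four_fifths(women_rate, men_rate))
```

With these made-up numbers the ratio is about 0.70, so the tool would be flagged for re-evaluation. The check is only a screening heuristic, and it presumes you can count the members of each group, which is the assumption that becomes difficult for disability.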
The problem is that that type of statistical auditing for bias is enormously hard when you think about disability. What data set would you be running through that in any statistical way would show how disabled people are being screened out versus non-disabled people? When disabilities manifest in so many different ways, you often will not have statistical significance in the number of candidates coming through. It is much easier to do that type of statistical analysis for gender, for example, or for race. It is equally problematic, I should say, for non-binary individuals or transgender individuals, and for people of mixed race. At least in the gender and racial diversity categories we have more established categories, and typically larger sample sizes that are easier to study.
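To make the sample-size point concrete, here is a small sketch under simplifying assumptions: the reference group's pass rate is treated as a known 60%, and each candidate is treated as an independent draw. Both are assumptions made for illustration. The helper name `binomial_lower_tail` and all counts are hypothetical.

```python
from math import comb


def binomial_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p): the chance of seeing k or fewer
    candidates pass out of n if each passes independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


REFERENCE_RATE = 0.60  # hypothetical pass rate for the comparison group

# Both groups below pass at 40%, an impact ratio of about 0.67 (under 0.8).
small_group = binomial_lower_tail(k=2, n=5, p=REFERENCE_RATE)
large_group = binomial_lower_tail(k=200, n=500, p=REFERENCE_RATE)

print(f"5 candidates, 2 pass:     probability by chance = {small_group:.3f}")
print(f"500 candidates, 200 pass: probability by chance = {large_group:.2e}")
```

With five candidates, a 40% pass rate would arise by chance roughly a third of the time even if the tool treated the group exactly like the reference group, so the audit cannot distinguish bias from noise. With five hundred candidates the same 40% rate is vanishingly unlikely by chance. Groups defined by a specific disability will often look like the first case.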
So, there is this real problem around auditing for bias on disability. The reason why that matters for all of us is, as I said, that a lot of vendors are out there at conferences (I go to these employment conferences) marketing their tools and saying that they have screened for bias, but they're not thinking about disability when they make those pronouncements. That is hugely problematic from an inclusion perspective. I will end on this last big point, which ties to that as a general matter. I am so thrilled that this conversation is happening in this workshop, because disabilities are too often excluded from discussions of algorithmic bias.
We have seen this wonderful emergence of a conversation around algorithmic fairness. It is still easier for people to jump to think about racial diversity and gender diversity, both very, very important, but without adding the additional ways in which people could be marginalized, particularly when people intersect within those categories and are multiply marginalized. It is an essential part of our program that we need to think about those crossover issues. The final point is I think we need a lot more effort to think about how we move outside the bubble of policy advocates and academics that are talking about these issues,
to reach the decision-makers that are driving the market to create these tools, and to deploy them. We need to be out talking to employers, talking to vendors, and trying to make smarter decision-makers throughout that process, underscoring the ethical, business and legal imperatives for change. Those are some of my opening thoughts. I'm thrilled to be here with you. Julia, I will pass it over to you.
JULIA STOYANOVICH: Wonderful, thank you so much, Alexandra. This is a perfect segue. So, let's get started. I am thrilled to be here. My name is Julia Stoyanovich. I'm a Professor of computer science and engineering, and data science, at New York University.
I also co-direct the newly established Center for Responsible AI at NYU. Today, it is my pleasure to speak with you about the responsible design, development, and use of algorithmic systems, particularly as they pertain to hiring, employment and the future of work, and particularly, of course, as they pertain to the community of individuals with disabilities. I have to say that I am humbled to be speaking in front of this audience, both because of the amazing strength, creativity and resilience that members of this community have been exhibiting, and because this is my first time speaking to a group of individuals that includes individuals with disabilities.
I ask you in advance to please forgive me if I use incorrect terminology or my speed is inappropriate. I would love for you to correct me and give me feedback so I can become one of the advocates for this community in a thoughtful way. So, our topic today is of course the future of work. Automated hiring systems do make up a prominent portion of that topic, as Alexandra already discussed with all of us.
Part of the discussion focuses on the use of technology for hiring. I want to be sure that towards the end of my presentation, and also during questions and answers, we get to other topics that involve technology, this community, and employment in the future of work. Things do not stop at hiring. So, to recap some of the things that Alexandra already said.
In a recent report from Upturn, the hiring process is described as a funnel. It is a sequence of steps in which a series of decisions lead to job offers to some individuals and rejections to others. What I am showing on this slide is a depiction of the funnel, with the different stages shown pictorially.
Before I dive in, I guess I should say that many of the images here, the majority of the images, come from a scientific comic that I created together with the amazing Falaah Arif Khan. This comic is available online and is accessible through a screen reader.
It's currently released in English. It will also be released in Spanish, and also be accessible in Spanish, this week. So if you cannot see some of these images today, you can look at them later with annotations in the comic book. So, this funnel that I am depicting is made up of a sequence of steps. The first stage is sourcing, where the employer sources candidates by ads or job postings.
The next stage is typically called screening, where employers assess candidates by analyzing their experience, skills and characteristics. Next, through interviewing, employers continue their assessment more directly. After that, background checks may follow. Then, during the selection step, employers make final hiring and compensation determinations. Importantly, data and predictive analytics are used during all of these stages. As stated by Jenny Yang, former Commissioner of the US Equal Employment Opportunity Commission, or EEOC, automated hiring systems act as modern gatekeepers to economic opportunity.
Of course we have some bad news, as Alexandra already spoke in quite some detail about things that can go wrong. So, I will recap very quickly. Here, we have been seeing cases of discrimination based on gender, race, and disability status at all stages of this pipeline.
The kinds of concerns we are faced with in the use of automated hiring tools pertain to unfairness in the decisions that are made by these systems, which we would denote as discrimination, or to unfairness in the process by which these decisions are made. These I would call due process violations or due process concerns. Very often, discrimination and due process violations are linked to the term bias. This is a term that some of you already started discussing in chat and that we will revisit momentarily.
Before we dive into bias, there is also concern about whether these tools actually work. What I'm showing here on the slide is a depiction of Arvind Narayanan. He is a computer science professor at Princeton. He has a wonderful talk about spotting AI snake oil. Are these tools actually working? Are they picking up useful signal from the data or are they an elaborate coin flip at best? As Arvind Narayanan puts it, are these tools AI snake oil? In the complex ecosystem in which automated hiring tools are commissioned, developed and used, we must ask ourselves who is responsible for ensuring that these tools are built and used appropriately.
Who is responsible for catching and mitigating discrimination and due process violations and for controlling the proliferation of AI snake oil under the fancy label of data science and AI? What I am showing here in support of this narrative is a culpability lineup. The point that I am making is that everybody is responsible: scientists, members of the public, platforms, software developers. All of us are responsible for making sure that these tools are used to benefit society and not harm individuals or particular groups such as individuals with disabilities. The hiring funnel, as well as each component of the funnel, are examples of automated decision systems, or ADS. These systems process data about people, some of which may be sensitive or proprietary. They help make decisions that are consequential to people's lives and livelihoods. They involve a combination of human and automated decision-making. They are designed with the stated goals of improving efficiency and promoting, or at least not hindering, equitable access to opportunity. And finally, they are subject to auditing for legal compliance and, at least potentially, to public disclosure.
ADS, automated decision systems, may or may not use AI, although most of them are billed as AI, because AI sells. They may or may not have autonomy, meaning they may not be making these decisions entirely on their own. Usually, there is also a human decision-maker in the mix, but they all rely heavily on data. So, what I would like for us to focus on is the role of data in this environment. In response to the question about responsibility, we have been seeing attempts to regulate the use of data-driven algorithmic tools, such as those that may be part of the hiring funnel.
This activity is broader in scope than algorithmic hiring, of course, and pertains to automated decision systems more generally, the systems that I described previously. The big question here is, how might we go about regulating these systems? How might we be going about regulating ADS? Should we even attempt to do this? The predominant sentiment in the industry is still that regulation will stifle innovation. I am showing here a reckless child on a bicycle not hitting the brakes. But industry alone does not get to decide.
Even in Silicon Valley, the need for meaningful regulation to ease compliance and limit liability is starting to be more and more broadly recognized. There is much debate on the specific regulatory framework that we should adopt. Should we use precautionary principles that can be summarized as better safe than sorry? Here I am showing an image of a child on a bicycle in protective gear.
Or, more likely, should we attempt a more agile, risk-based method, such as algorithmic impact assessment? Here I am showing a picture of a child who is riding a bicycle carefully with a helmet. All this, and more, is the subject of intense debate,
some of which I had a chance to witness firsthand and in which I am still actively participating. New York City, where I live, recently made a very public commitment to opening the black box of New York City government's use of technology. In May 2018, an automated decision systems task force was convened,
the first such in the United States, charged with providing recommendations to New York City's agencies about becoming transparent and accountable in their use of ADS. I was a member of this task force by appointment of the New York City Mayor.
The task force issued a report that included a set of principles. These principles say that we should be using ADS only if they promote innovation and efficiency in service delivery, not simply because they are available, and not simply because they are being sold to us as this AI snake oil. We should also be promoting fairness, equity, accountability and transparency in the use of ADS. Finally, we need to be thinking about how to reduce potential harm across the entire lifespan of ADS,
starting from the design all the way through deployment. While making important points in this report, the ADS task force unfortunately did not go very far in terms of concrete recommendations. I am happy to discuss this, but I could spend hours talking about what went right and what went wrong during our deliberation.
But we also have an immediate opportunity to make things a lot more concrete in the context of specifically regulating hiring systems in New York. There is currently a proposed law, a bill that is being considered by the New York City Council Committee on Technology. There was a hearing on this bill just this past Friday. This bill would regulate the use of automated employment decision tools with the help of bias audits as well as public disclosure.
Individuals being evaluated with the help of algorithmic tools would have the right to know when an algorithm rather than a person was used to screen them. Most importantly, in my mind, individuals would also be told what job qualifications or characteristics were used by the tool. Ideally, we would also want to make sure that the job relevance of these qualifications is substantial. So, let me now switch gears and talk about the things I was asked to discuss. And that is, what is this bias in the data, and what are some of the technical solutions that we may want to use here? To start, I want us to step back and think carefully about the role of technological interventions, such as those for data and model bias that you may have heard about.
The discussion is necessary to help us find a pragmatic middle ground between two harmful extremes. What I'm depicting here is an image of a woman with dark sunglasses.
In the sunglasses, there is a reflection. On the left, I am reflecting one harmful extreme, and that is techno-optimism: the belief that technology can single-handedly fix deep-seated societal problems, like structural discrimination in hiring. On the right, I am showing an image that illustrates techno-bashing. That is a belief that any attempt to operationalize legal compliance and ethics in technology will amount to fairwashing, and so should be dismissed outright.
Our job, I think, is to really find a way to navigate between these extremes to create a nuanced understanding of the role of technology in society. So, let's get back to bias, a term that is used very often these days to explain what is wrong with automated decision systems, but that remains poorly understood. What do we mean by bias? Our meaning of this term is not the traditional sense that is used by statisticians,
who say that a model may be biased if it does not summarize the data correctly. Instead, what we are seeing here are examples of societal bias exhibiting itself in the data. Let's unpack that further. Data is an image of the world.
Its mirror reflection. When we think about societal bias in the data, we interrogate the reflection. What I'm depicting here is the world, on the left, that we don't quite know. There is a person who is looking at this world through a mirror, through a lens. One interpretation of bias in the data is that this reflection is distorted. We may systematically oversample or undersample, overrepresent or underrepresent, particular parts of the world.
Or we may otherwise distort the readings. It is important to keep in mind that the reflection cannot know whether it is distorted. In other words, data alone cannot tell us whether it is a distorted reflection of a perfect world, a perfect reflection of a distorted world, or whether these distortions compound.
The assumed or externally verified nature of the distortion has to be explicitly stated. There is another interpretation of bias in the data that I'm showing on the right of the slide. Even if we were able to reflect the world perfectly in the data, even if we were able to take a perfect measurement of the world such as it is, it would still be a measurement of the world such as it is, or such as it has been historically. Not as it could or should be. Once again, importantly, it is not up to data or algorithms, but rather up to people,
individuals, groups, and society at large, to come to consensus about whether the world is how it should be or whether it needs to be improved and, if so, how we should go about improving it. My final point on this metaphor is that changing the reflection does not necessarily change the world. If the reflection itself is used to make important decisions, for example, whom to hire or what salary to offer to an individual being hired, then compensating for the distortion is worthwhile. But the mirror metaphor only takes us so far. We have to work much harder, usually going far beyond purely technological solutions, to propagate the changes back into the world.
Not merely brush up the reflection. One way to conceptualize automated decision systems is simply as so-called predictive analytics. The system takes input, a nice, clean rectangular data set that I'm depicting here. It crunches it. Then it produces a result.
This result could, for example, be a prediction of how likely somebody is to do well on the job and, therefore, whether we should be hiring them. You may then notice that the result is such that no women are shown high-paying jobs, for example,
or that no individuals with disabilities pass an online job interview. Then, we have three choices if this is our worldview. First, we could tweak the input data.
For example, we could upsample or downsample subgroups. Second, we could tweak the algorithmic box that crunches this data. Third, we could change the result. For example, we can reassign outcomes. However, an issue here is that this particular view is a frog's eye view.
I argue that we need to expand this scope. We also need to think about what specifically happens inside this box that crunches the data. Think about how the results are being used. Whom are they impacting? Who benefits and who is harmed? We also need to ask ourselves, where did the data come from? In other words, we will gain much more power to incorporate responsibility into the development and use of automated decision systems if we see these systems through the lens of their design, development, and deployment lifecycle, and their data lifecycle. In their seminal 1996 paper, Friedman and Nissenbaum identified three types of bias that can arise in computer systems, taking this broader view of the systems.
This bias is represented here as a three-headed dragon: pre-existing, technical, and emergent. Pre-existing bias exists independently of an algorithm itself.
It has its origins in society. This is the societal bias that we have been discussing with the mirror metaphor. Technical bias can be introduced at any stage of the system's lifecycle. It may exacerbate pre-existing bias. I will give a couple of examples next. Finally, emergent bias arises in the context of use.
It may be present if a system was designed with different users in mind, or when societal concepts shift over time. In hiring, a prominent example of this is the rich get richer. Emergent bias arises because decision-makers, hiring managers in this case, tend to trust the algorithmic system to indeed select the most suitable candidates, for example, by placing them at the top positions of a ranked list. This, in turn, is going to shape the hiring manager's idea of what a suitable candidate looks like.
So, to discuss technical bias a bit more, let us look, for about a minute, at models and assumptions. Technical bias often has its origin in incorrect modelling assumptions or in technical choices that follow from these assumptions. Here are some concrete examples. What I'm depicting here is an art gallery in which there are four paintings of apples. One is realistic and three are abstract, at different levels of abstraction.
Suppose that a job applicant applies through an online form. This form allows applicants to leave their age unspecified. This data will travel through multiple data processing steps before it is given as input to a predictive analytic: a classifier that will decide whom to invite for a job interview.
Further, to make a prediction, the classifier will need to know what the value of age is. So, if that value was unspecified, it will try to guess it to fill in the blank.
Now, the question is how the system should guess someone's age. The most common method for this is the simplest one; it is called mode imputation: replacing a missing value with the most frequent value for the feature.
If age is missing at random, meaning that everyone is equally likely to not specify their age, then this method is appropriate. If age is missing more frequently for older individuals, for example, because they may fear age discrimination in employment, then mode imputation will impute age for them incorrectly, and systematically so. Next, consider an online form that gives job applicants... JULIA STOYANOVICH: I can also give a written transcript if that is helpful. I suppose I should have done that ahead of time. The models and assumptions: this is something that people don't think about very often.
JULIA STOYANOVICH: Starting with the apple gallery, we talked about modeling assumptions actually being extremely important. We said that if you guess a person's age, or the age of a demographic group, incorrectly, based on some assumption you are making about your data, then this is problematic. The next example I was going to give is a form that gives job applicants a binary choice of gender but also allows them to leave gender unspecified. Suppose that about half of the applicants identify as men and half as women, but the women are more likely to omit gender because they fear discrimination, perhaps. Then, if mode imputation is used,
replacing a missing value with the most common value for that feature, then all of the predominantly female unspecified gender values will be set to male. More generally, multiclass classification for guessing values, that is, missing value imputation, typically only uses the most frequent classes, leading to a distortion for small population groups because membership in these groups will never be imputed. Next, suppose that some individuals identify as non-binary. Because the system only supports male, female, and unspecified as options, these individuals will likely leave their gender unspecified.
Then, once again, the system will use mode imputation. It will set their gender to some value, probably male, because this is a predominantly male data set. More sophisticated imputation methods exist. One could be used, and it could do better, but it will still use values from the active domain of the feature,
meaning that it will still set the value to either male or female. This example illustrates a technical bias that can arise from an incomplete or incorrect representation. Finally, consider a form that has home address as a field. A homeless person may leave this value unspecified.
It would be incorrect to impute it. We would lose information. Dealing with blank values is known to be difficult, and it is already considered in data cleaning, a technical discipline that thinks about these questions. But the needs of responsible data science introduce new problems here, with much higher stakes. Importantly, it has been documented that data quality issues, including missing values, often disproportionately affect members of historically disadvantaged groups, such as individuals with disabilities. The risk that we see here is technical bias exacerbating pre-existing bias for such groups.
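As an illustration of the mode imputation problem described above, here is a minimal Python sketch. It is not code from the talk: the `mode_impute` helper and the applicant counts are hypothetical, chosen only to mirror the example of a pool that is half men and half women, where women omit gender more often.

```python
from collections import Counter


def mode_impute(values):
    """Replace each missing value (None) with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


# Hypothetical pool: 50 men and 50 women (assumed numbers, for illustration).
true_gender = ["male"] * 50 + ["female"] * 50

# Missingness is NOT random: 5 of the men but 25 of the women leave the field blank.
reported = ["male"] * 45 + [None] * 5 + ["female"] * 25 + [None] * 25

imputed = mode_impute(reported)

# Count how many applicants end up with a wrong gender value, per true group.
errors = Counter(t for t, i in zip(true_gender, imputed) if t != i)
print("imputed counts:", Counter(imputed))
print("errors by true gender:", dict(errors))
```

Because "male" is the most frequent observed value, every blank is filled with "male", so all of the errors fall on the women who left the field blank. A non-binary applicant would fare no better, since the helper can only return values that already appear in the data.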
There are of course immediate parallels here for individuals with disabilities, with their data being missing, or missing in a way that is non-random, and imputation potentially being harmful if it is not done thoughtfully. I'm going to skip this. This is a framing that I like, which talks about all of these problems under the lens of data equity: representation, feature, access.
What I want to stress is that, rather than focusing on the use of AI, artificial intelligence, and automated decision-making for hiring and screening, we should shift our focus to how we can use AI to solve our problems and create opportunities for all of us. Rather than developing platforms that encode inhospitality, to quote Fleet, we should be thinking about how to use AI to improve accessibility. Rather than ghostwriting code, to use her term, that enables accessibility, what about making that code part of a toolkit that is a priority, to which we would pay particular attention? What about developing AI methods that are specifically working for individuals with disabilities? I am showing here a picture of Chancey Fleet and a quote from one of her talks.
I am going to skip this; this is a picture that illustrates how individuals with disabilities can be seen with even more errors by facial recognition software, where frankly we are in more trouble if it works than if it doesn't. My takeaways are that we should be thinking about how to build technological systems that are rooted in the needs of people. This means that we need to expose the responsibility to people. We must work together to create meaningful regulatory systems and also engage in educational activities. These are the goals of the Center for Responsible AI at NYU that I am starting together with colleagues and friends.
An example of an ongoing educational activity is a series of workshops that will culminate in a course for members of the public on the use of algorithms in general and, specifically, their use in hiring and employment. We are doing this together with the Queens Public Library in New York City as a pilot. Then we'll scale with libraries in the US. Another example is a comic series, of which we have released the first volume, called "Mirror, Mirror", together with the very talented Falaah Arif Khan, who participated in an earlier workshop as part of this series.
So, please take a look at the comic. Let us know what you think. The final anecdote with which I will leave you is that, as I already mentioned, the comic is accessible. You can read it with a screen reader. The only way that we were able to make this work is by getting feedback from Chancey Fleet. She helped us debug it.
That method does not scale. Another opportunity for using AI is to help us make educational materials accessible, including those on AI itself. I will stop here. I am happy to continue the discussion.
VERA ROBERTS: Thank you very much. Thank you, Julia. Thank you, Alexandra. Those were two really, really terrific presentations. I really felt that I was learning so much. I was keeping an eye on the chat through both of your talks, and I can see that people were quite engaged.
They had some really important questions to ask. One of the difficult challenges with AI is the notion of the black box: understanding how we can get into it and get out, and the policies related to it. I thought that you both really helped to open that up for me a little bit today. Alexandra, when you were talking about policy, in particular with regard to the ADA, I was trying to think about how that is playing out in Canada and other jurisdictions. I wondered, have you had the opportunity yet to focus on any other jurisdiction, in terms of regulatory structures or protections around the use of automated decision-making systems in hiring processes or other areas?
I thought it was well laid out in Title I of the ADA. We have similar human rights codes here in Canada, with protection against discrimination. But I wondered, have you seen any other areas where they have some novel approaches to policy? ALEXANDRA REEVE GIVENS: Sure. I felt self-conscious joining this workshop when I do not have familiarity with Canadian law. You have to excuse my US focus.
We are spending a lot of time in Europe, as well. The European Commission is focused on questions of AI and equity and bias. The European Union moves slowly but steadily when they decide to regulate a space. We saw that with their General Data Protection Regulation, which was a slow-moving glacier but did end up having a significant impact on privacy policies around the world. They have some provisions on AI in the bill.
They're considering a new initiative around AI, as well. Part of the problem, though, is that a lot of the regulatory regimes are still focused only on transparency, putting the burden on the user.
There are some comments about this in the chat. In my mind, that is asking way too much of all of us as individuals: to know how we may be discriminated against, and to take individual action. We focus so much on trying to get directly to employers so that they know: here is the legal framework.
Here's how you may be violating the law. But also, from a moral and ethical perspective, here is why you need to focus on these things. The last piece I will say is that Christopher has excellent comments in the chat about how, even in the US, the language in the Americans with Disabilities Act is really, really good.
I would argue it is basically undiscovered so far when it comes to AI. Nobody is actually litigating these cases. The Bazelon Center for Mental Health Law did bring a series of challenges in the early 2000s around personality tests. They filed complaints with the EEOC and fought for years. And even there, they had a hard time moving the needle,
despite the language on the books. So, that tells me a couple of things. One, we need to keep stepping up our advocacy and our pressure. Two, we do need to look at the legal framework and see what else we could be doing.
There is a new administration coming in the United States. Could that new administration issue new guidance from our Equal Employment Opportunity Commission to say that this is indeed a violation of federal law, and lean in far more aggressively to help set expectations on what is acceptable testing and what is not? VERA ROBERTS: Thank you. It is so interesting for us to hear about what is happening elsewhere. Really, don't feel self-conscious about having an American focus. We were well aware of that when we invited you to be here.
We are interested in what is happening in other jurisdictions and in their approaches. We want to consider what might be some possible approaches within our study group here today. Getting a variety of information is really, really quite helpful to us. JULIA STOYANOVICH: If I may add to this, we are looking to Canada also for some help and guidance in this.
I pasted in the chat the link to the Canadian federal Directive on Automated Decision-Making. We do not have such a directive in the United States. It takes an approach to regulation that covers the use of these tools within the Canadian government, not more broadly.
Regulating the hiring ecosystem is out of scope. But the approach that they take is based on impact assessment, which in my mind is the right way to go: understanding the potential harms and benefits, and to whom these harms and benefits accrue. One thing that is still lacking, even with the best regulation, is that you must have an informed public. We all need to think very carefully about this. None of us, not in Europe, not
in New Zealand, not in the US, knows how to actually create even a basic level of understanding of data and algorithms, and of what we should be asking of algorithms and data. So, this is my two cents as a technologist. Technology is the easy part here, relatively speaking. Educating is the hard part.
VERA ROBERTS: It was fascinating when you were talking about the notion of whether it is a distortion of a perfect world or a reflection of a distorted world. There are different ways of understanding the data, and I think there was a lot that was very interesting. It gave me new ways to think about this whole process. I will say, it struck me, and I think, Alexandra, you were the one who was talking about this idea: really, when the systems are being used, for example, in the hiring process, there should be disclosure
that we are being put to the test or put through certain systems. I know even for myself, when I have looked at some processes, I have wondered, are there hidden buzzwords I should be hiding in white font in my resume or my CV? Or if I'm helping someone, I'm wondering if that is something I should do. There are these hurdles that many of us are not aware of, no matter who is looking for the job. I can certainly see how the idea that this should be disclosed might be novel to some employers who sort of like the shield of these systems between them and the hiring process, between them and the candidate.
Of course, some people may, without intending to, think that because it is a data system it will not have bias. That it is going to be unbiased because there is no human in it to bias it. But of course, one of the things we are learning and exploring is the fact that bias is inherent in the systems because they are built by biased people and with biased data. So, addressing that, the solution that you were suggesting, Julia, was fascinating to me. I think I'm going to try to talk less.
You had quite a number of suggestions. Why don't you choose one of your favorites to start with? And we can come back to you again, after a couple of others, if we have the opportunity. AUDIENCE MEMBER: Great. I have one question and then a couple of points to address. The question is around the disclosure piece. How do we monitor for discrimination based on disability when disclosure should be optional, and litigation of discrimination is predicated on a required disclosure? That is the primary question.
I'm also curious to know if there are easy flags for us to watch for in discriminatory algorithms. For example, gaps in job history could be a bad thing that screeners are used to look for.
And are there good flags, like whether they are looking for the benefits of lived experience? Are there easy ways to be able to tell who is likely to be a real problem versus somebody who is going to be flagged? ALEXANDRA REEVE GIVENS: Those are really good ones. Ones that we have been spending a lot of time thinking about. Because again, in theory, the Americans with Disabilities Act helps on this. Your request for accommodation cannot be used against you.
But we have to dovetail the reality of the legal framework with people's actual experience and their level of comfort asking for an accommodation. It really shows just how challenging that is. A system that requires disclosure in order to be fair is really an unfortunate one, and one that makes it quite challenging, for the reasons I mentioned in my talk. It also makes the data collection piece really hard.
Even if you wanted to come in as a good-faith researcher and audit, after the fact, how a sample of employees did, you could maybe make inferences about gender or race based on what you know about a person. You can do a loose assessment, at least, of how people tend to be trending in terms of the risk of discrimination on those vectors.
Disability is much harder. There really does need to be self-disclosure for people to know, for auditing purposes, how you were treated. We have been having some conversations on whether there are creative solutions here. Could we have a co-op of disabled people who voluntarily agree that they are going to share their data with a trusted third-party non-profit and then go out and apply for jobs and see how they are doing? The mechanism of that would take a lot of working out.
There are concerns about that too, as a privacy lawyer, in addition to thinking about fairness. I do wonder about some of those things. I almost wish that LinkedIn had a way of you opting into a voluntary program where you could reveal your protected characteristics, just for purposes of trusted auditing, so that the employer would not see them but a trusted auditor would. That's one of the
thoughts I had at night thinking about that question. But I think we need smart minds really coming together on that point. As for the flags, this is why we need to know better what people are being evaluated on, so that you can look for what the flags are going to be. Gaps in a resume might be one obvious one. There may be many other ways that a system may discriminate against a particular person with a disability based on their individual circumstances.
Data scientists can't even think about or contemplate every potential risk. I flagged in my talk some of the most obvious ones that jump out to me. I have tried to spend a lot of time cataloging these, from talking to folks who are experiencing some of these tools. It needs to be an ongoing project because of the sheer diversity of people's experiences when interacting with these tools. JULIA STOYANOVICH: If I may add to this from a technological point of view. I absolutely agree with everything you said, Alexandra. There are technical methods that can help us support these types of auditing as well as these types of public disclosure.
When we talk about auditing, it does not have to be us measuring the performance of a system on the live data on which it is deployed. We have methods to generate data synthetically, for the kind of population on which we want to interrogate the behaviour of this system. So, that technology is there. It can give us privacy protection, and it can allow us to work with these machines to try and test them until we are satisfied with their performance along all of these dimensions. I also wanted to add that, in addition to biased data and biased people,
an important dimension is that the criteria of goodness by which we evaluate these systems can be biased, as well. Because there is no such thing as absolute, perfect accuracy. If you are accurate for one group, you may be less accurate for another. Who decides on that trade-off? Who decides what fairness means, technically? But that is another dimension. In terms of job relevance, that is crucial. It's not enough for us to say that the reason why you were selected for this job is because your name is Jared and you played lacrosse. This may in fact be an anecdote that is well known:
that was the signal that the system picked up. But if we know that this is the signal the system picked up, we should immediately challenge it. So, for auditing we have synthetic data sets. For the questions of which features are job relevant, and which features we think are okay to use, what I think is a promising metaphor, technically, is this metaphor of a nutritional label that I've been running with.
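To make the idea of auditing on synthetic data concrete, here is a minimal Python sketch under stated assumptions. Nothing in it comes from the talk: the `screen` function stands in for an opaque tool under audit, the groups "A" and "B", the score distributions, and the 600 ms cutoff are all invented for illustration. The point is only that an auditor can generate a population with known group membership, run it through the tool, and compare outcomes by group without touching real applicants' data.

```python
import random


def screen(applicant):
    """Hypothetical stand-in for the tool under audit: passes fast responders."""
    return applicant["reaction_ms"] < 600


def synthetic_applicants(n, seed=0):
    """Generate a synthetic pool with known group labels (assumed distributions)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        group = "B" if rng.random() < 0.2 else "A"
        # Assumption for illustration only: the test design is slower for group B.
        mean_ms = 650 if group == "B" else 500
        pool.append({"group": group, "reaction_ms": rng.gauss(mean_ms, 80)})
    return pool


def selection_rates(pool):
    """Fraction of each group that the screening tool lets through."""
    rates = {}
    for group in ("A", "B"):
        members = [a for a in pool if a["group"] == group]
        rates[group] = sum(screen(a) for a in members) / len(members)
    return rates


rates = selection_rates(synthetic_applicants(10_000))
impact_ratio = rates["B"] / rates["A"]
print("selection rate by group:", rates)
print("ratio of group B rate to group A rate:", round(impact_ratio, 2))
```

A large gap between the two rates does not by itself say whether the feature is job relevant; it only tells the auditor where to ask that question.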
When you explain to a particular job applicant how a decision was made that affects them, you can tell them it is because your speed of discriminating between the red and the green circles was lower than what I expected. Then the individual would know that this is actually because of a particular disability that they are living with.
Then they can challenge the decision. So, I think that, in addition to this auditing that talks about the behavior of an algorithm in bulk, over an entire population, we need very powerful mechanisms to explain to individuals what happened, so that they can then challenge these decisions. ALEXANDRA REEVE GIVENS: I love that. One point I would add that complicates things: for many of the tools, the value they are selling is that they may not actually know what characteristics they are evaluating for.
They are studying trends in an existing pool of employees who are doing well. Then they say, they play the game this way, so we are going to find people who play the game the same way. Then they will not be able to articulate the specific job characteristics that are making these people look like winners. Our message, what we're saying, is: let's challenge that and say, no, you have to be able to articulate what attributes you are evaluating and at what rate you are evaluating them,
so you can see whether those are tied to the essential functions of the job. The problem is that it actually cuts right into the business proposition of why we need this magic AI, the snake oil, to be doing that inference for us. That will be a struggle with the industry that we need to keep pushing on. JULIA STOYANOVICH: But we must. We absolutely must. If we don't do this now, it may already be too late, in fact.
The longer we wait, the closer we will be to the state that we are in with our inability to regulate social networks, for example. With the kinds of effects that we're seeing there, the only way to cancel these effects is really to dismantle the business models of advertising targeting, in that case. That applies here as well, as targeting for jobs is part of this funnel for hiring. VERA ROBERTS: Right. Any other questions? I am just looking for any raised
hands. I want to give people a moment to find that. This is the time to disrupt the system. Absolutely. I could not agree more. Janet has put the question up. JANET (AUDIENCE MEMBER): Yes. I think others have heard me speak about this before.
In one way or another, we've had education at some level. VERA ROBERTS: There's a little bit of distortion but we are hearing you. JANET (AUDIENCE MEMBER): I will speak slowly. How about people with disabilities who, by the very nature of the disability or their socioeconomic identities, their location, poverty even, may not necessarily qualify to apply for the jobs that show up in highly technical apps or systems or platforms? I am thinking of folks I know with disabilities, with Down syndrome.
These are folks who are not as educated, who are a lot less advantaged than us here in this very privileged circle. ALEXANDRA REEVE GIVENS: Thank you so much for that point. I have two thoughts to add, and I'm curious if other workshop members have thoughts. One, just as a matter of course in advocacy materials, we are now creating plain language versions of everything we write, and Know Your Rights versions of documents, to help empower people to better understand these issues.
Julia, I love that you have a comic book; that is another way of making these issues accessible. Ours are literally a single piece of paper that tries to spell out very simply how these systems may affect you and what you can do about it. The other point I heard in your question is realizing that a lot of hiring tools right now are being used more for white collar jobs, we would say. What do we think about other aspects of the workforce? One is that the personality tests that I mentioned are being used very widely. They are being used in many retail jobs, in particular. Walmart, CVS, Best Buy, and Target are using those types of tests and have actually taken some steps in response to accusations of bias that come from those personality tests.
That is one area where we do need to be vigilant. The other is that we need to think about the use of AI not just in hiring, but in what it means to be a worker who is subject to surveillance, whose performance is being measured by tools. There is very good research being done about what this means for truckers, for example: long distance truckers, who are now some of the most surveilled workers in the country, with machines looking at their eye contact and where their eyes are looking. The same goes for people who work in Amazon warehouses, for example, and for Uber drivers.
There are many examples of the quantified worker, and of what it means to have your movements tracked. Not only tracked, but evaluated against peers. You rise or fall based on how you're doing compared to the large data pool.
What that means for human dignity, fairness, and how you're evaluated and judged is another big piece of the puzzle that I think the conversation needs to go to next. VERA ROBERTS: Yes, in fact, I was thinking that we tend to look at the hiring process often, but there is the whole employment cycle, which both of you spoke about.
It is also an area that we need to consider and focus on. There is the surveillance aspect that you talk about, which is creating other data sets. Information is being collected about one aspect of your performance, or what you are doing, but then may be used in other ways, which is usually an ethical no-no: to collect data for one purpose and use it for another purpose.
It is not usually okay. EEOC, that is the Employment Equity... someone wants to know what that means. JULIA STOYANOVICH: It's the Equal Employment Opportunity Commission, where Jenny Yang was Commissioner under Obama. We hope that she will now become active again in some capacity. VERA ROBERTS: We are waiting with bated breath to see how things will resolve in your country for your next administration. JULIA STOYANOVICH: If I may add one more quick point to the discussion before.
That is, the final point with my mirror metaphor is that you cannot just touch up the reflection. It does help to hire people who are qualified for the job, despite maybe not looking as good on paper. But it is also not enough for us to be mitigating at the various steps of the hiring funnel. Working only at the level of this hiring funnel is not enough. What we need to do is to use AI and our own human thinking to figure out how to make it so that people from the disability community, including those who are severely disabled, learn the skills that are needed to enter the job market.
For this, AI can and should help. I think that the future of work in this AI context is not, how do we make it so that we can use AI to hire fairly? Rather, it's, how do we make AI serve our purposes and create new jobs for us? What are these jobs that AI can enable? How can we make AI make education more equitable? VERA ROBERTS: I love the fact that you're looking at the opportunity aspect.
I think that we try to catch ourselves as well, and not assume that all AI is somehow problematic. For sure there are lots of challenges. If we can harness some of the benefits, that would be really terrific. We have come to the end of our slot of time. I know there were still some questions.
Really, I think I speak for everyone here when I say we really, really enjoyed hearing your presentations and having the opportunity to have some questions with you. Thank you so much for joining us today. Thank you to all the study group members and other participants who have joined us today. We will continue our discussions in Canvas and elsewhere. We always hope you'll have an opportunity to engage with our panelists again.
Julia, Alexandra, hopefully, we'll have more op