All right, hello everyone, and welcome to today's event. It's great to have you all here. A few people are still filing in, so before we dive in with formal introductions I'm going to run through a few bits of housekeeping. First, you'll find the chat and Q&A functions at the bottom of your screen, for those of you who have never used Zoom before, which I would anticipate is not many. If you would like to ask a question during the Q&A, please use the Q&A function below, and if you'd like to introduce yourself to the room and share a bit about why you're here today, please do so in the chat; we'd love to get the conversation going there as well. You can also engage in the conversation on Twitter: our handle is @AdaLovelaceInst, and the Office for Statistics Regulation's handle is @StatsRegulation. Closed captions are available; you can turn on the subtitles by clicking the CC button at the bottom of your screen, or you can use fully adjustable subtitles via StreamText, a link to which will be dropped in the chat shortly.

Great, I feel like most of us are here, so welcome everyone to today's conversation on building public confidence in data-driven systems, with the Ada Lovelace Institute and the Office for Statistics Regulation. My name is Andrew Strait, and I'm the Associate Director of Research Partnerships here at the Ada Lovelace Institute. For those of you who don't know Ada, we are an independent research and deliberative body based in London, with a mission to ensure that data and AI work for society.

In March 2020, education ministers in England, Scotland, Wales and Northern Ireland closed schools as part of the UK's response to the COVID-19 outbreak. Government announcements confirmed that public examinations would not take place in summer 2020, and the UK's four qualification regulators (Ofqual, the Scottish Qualifications Authority, Qualifications Wales, and the Council for the Curriculum, Examinations and Assessment in Northern Ireland) were directed to develop an approach to awarding grades in the absence of exams. While each qualification body adopted a different approach, all of them involved the use of statistical algorithms to assess and award grades. Now, I personally happened to be in Westminster on the morning of 6 August 2020, a day when students from across the UK marched through Parliament Square to 10 Downing Street chanting a now-infamous cry about those algorithms, which I will not repeat here. In the end the algorithmically determined scores were dropped, and the grades in all four countries were reissued based on the grades schools and colleges had originally submitted. The Ofqual A-level exam results algorithm prompted wide societal debate about the conditions that would engender public trust and confidence in data-driven systems.

In this event today, the Office for Statistics Regulation will share findings from their UK-wide review of the exam models deployed in 2020, focusing on the importance of confidence in models, the specific factors that impacted on confidence in the A-level algorithm, and lessons for the future. The Office for Statistics Regulation is a key UK regulator, charged with increasing public confidence in the trustworthiness, quality and value of statistics by articulating standards and shaping good practice. The Ada Lovelace Institute will also share some insights today from the diverse portfolio of public engagement and deliberation work that we undertook during 2020 on the conditions that engender public confidence in data and related technologies, at times of public health emergency and beyond.
I'm very excited to say that we're joined by four wonderful speakers today. First, Gail Rankin, who leads OSR's programme of systemic reviews for the UK and also heads up its office in Edinburgh. Gail's background is in engineering and project management, and she joined OSR from a career in local government data and performance. My colleague Reema Patel is also with us here today; she is Ada's Associate Director of Engagement. Reema has worked for the organisation from its establishment, as part of its founding team, and she leads Ada's public attitudes and public deliberation research and its broader engagement work on justice and equalities. Ed Humpherson is the Director General for Regulation at the Office for Statistics Regulation, and along with Michael Hodge, the head of data and automation at the Office for Statistics Regulation, he'll be joining us for a Q&A panel at the end of the day. We'll start with some presentations from Gail and Reema on the findings from the research into Ofqual, so I'll hand it now to Gail Rankin to kick us off. I think we may be having a few difficulties connecting Gail; she should be with us shortly.

Would it help, Andrew, if I, it's Ed Humpherson here, just said a few words whilst Gail tries to reconnect, rather than have everybody sitting around awkwardly? So first of all, let me introduce us, the OSR. We are the regulatory arm of the UK Statistics Authority, and our formal statutory role is to promote and safeguard the production and publication of official statistics. One crucial point that I always make in our presentations is that we don't actually produce any statistics, and that we're quite separate from the ONS; a big chunk of our work is overseeing, regulating if you will, the work of the ONS. Those of you who are interested in demography will know that we published quite a critical report about the ONS on Monday of this week, about how they compile population estimates for some parts of the country. So we're separate from the ONS, and we essentially hold all parts of government to account for the way they produce and publish statistics and data. You can see on the right-hand side of the screen some of the tools we use. The main one is the Code of Practice for Statistics, which has a foundational role for us: an absolutely central set of principles that we return to. And then we do a range of other things: we confer the National Statistics designation, we do reviews across whole systems such as adult social care, and we are known for stepping into the public domain when we have concerns about the use of data; there's a screenshot there with a headline, "UK statistics watchdog warns government over the use of COVID data". So that's us. I can see from my screen that we've got Gail back. Gail, are you back with us?

I am. I'm very sorry, I was having connection problems there, Ed; thanks for doing that introduction for us. Just to take over: today we are going to talk a bit more about the review of the exams process that we published in March of this year. With this review we explored the approaches taken to awarding grades last summer, and we did this to identify the wider lessons for those working with statistical models and algorithms. I should say at the start as well that we really enjoyed working with the qualification regulators, most of whom we hadn't worked closely with before, and we found that they all acted with honesty and integrity in what were very difficult circumstances.
Can we move to the next slide, please? Thank you. So, as you'll all probably be aware, last year, because of coronavirus, UK exams were cancelled, and instead grades were calculated using a mixture of teacher-assessed grades and, as we all know, statistical models, which were used to standardise the results. Come August we saw headlines in the papers such as "statistics have ruined my life", and we started to hear phrases like "mutant algorithm" being used as a way to apportion blame. Basically, the statistical modelling decisions that were made last summer had a very fundamental impact, and for some children a lasting one. It's quite hard, particularly for us as a regulator, to think of another time when statistical models have had such public impact; most people, us included, had never really thought about them before, or felt that they had been affected by them. What's also important to note is that our review covered all four countries of the UK. While there has been a focus on Ofqual's model, there wasn't just one statistical model, and the approaches adopted in each of the countries had similarities but were also different. The thing we want to talk to you more about today is that none were able to succeed in securing public confidence.

If we just go back one slide, Reema, we've jumped ahead a little; there's one before this slide which explains a little of the context of why we decided to undertake this review. All of you working in this area will know that statistical models and algorithms are increasingly becoming part of public life, and we really believe, as technology and the availability of data increase, that there can be very big benefits in the public sector using these kinds of models. But it is quite clear to us that a negative rhetoric has developed around algorithms. The reason we got involved is that our role at OSR is to uphold confidence in statistics, and it is very clear to us that public confidence in statistical models and algorithms has been damaged by the experiences of last year. Our worry is that people will be less likely to use these kinds of models for fear of a public backlash. That is the main reason we looked in detail at the approaches: not to pull apart or judge what others have done, but to get a much better understanding of what caused things to go wrong, so that everybody can learn from it.

It's worth flagging the last part of this slide: at the heart of what we do is the Code of Practice, and the Code of Practice is all about ensuring that statistics serve the public good. What this means is that when we do a regulatory review, it is a balanced one: our reviews focus on the technical issues, but, quite importantly, we also consider the broader context, so the public dialogue, public understanding and public acceptance. It's this kind of socio-technical approach that you'll see across a lot of the work we do. It's built into the Code of Practice, and we feel it has allowed us to contribute quite a unique position to this debate.
So, next slide please, Reema. I just wanted to spend a little bit of time exploring the key factors that we found through our review that impacted on public confidence.

The first is around confidence in the model. All models have limitations and uncertainty, and if we take ourselves back to last year, it was a unique time: there were resource constraints and a lot of challenges. But when we looked, we felt there was very high confidence placed in the model: confidence that a model could deliver a single grade to all students on a single day, whilst also not disadvantaging any groups.

The next is around transparency of the model and its limitations. The regulators made a great deal of communication effort, but there was what we felt to be limited transparency: full details of the model, as we all know from reading the press, weren't published in advance. Part of the reasoning was a desire not to cause uncertainty for students who were going through this new process, but in our view the general limitations of statistical models, and the uncertainty in their results, weren't fully communicated to the public. Had there been more public discussion of the limitations, and, most importantly, of the mechanisms being used to overcome them, such as the appeals process, it may have helped to support public confidence in the results.

Then, technical challenge. We saw lots of collaboration on technical challenge, but this was largely with qualifications and education experts. There was more limited statistical challenge from the wider field, and this is where we heard some dissenting voices come through: the methods weren't exposed to the widest possible audience. We do fully acknowledge in the report that there were time constraints to doing this in this particular case.

The other factor to consider is the impact of historical data. All four countries had different models, but all four used the previous grades at schools and centres as an input, and within this historic attainment there are well-known patterns in attainment between different groups. Because these were used in the model, they fed through to the results, and this created a public perception that the model had created bias. There was limited public discussion, and limited public awareness in advance, of these underlying patterns in the data and how they might impact on the results.

The next factor was quality assurance. We saw lots of really good examples of quality assurance of input data and output data, but what we saw was largely at an aggregate level, with limited quality assurance of individual results. This meant that when the results were published, the media focus was on individual cases, particularly on individuals whose grades were substantially different from their teachers' assessments, and we felt that this could have been predicted earlier in the model's development.
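To make the feed-through and quality-assurance points concrete, here is a minimal, hypothetical sketch in Python. It is not any regulator's actual model: the grade scale, the `toy_standardise` function and the centre data are all invented for illustration. It shows how anchoring awards to a centre's historical grade distribution can pass an aggregate-level check while moving many individual students well away from their teacher assessments.

```python
from statistics import mean

def toy_standardise(teacher_grades, historical_grades):
    """Rank students on teacher-assessed grades, then re-award grades so the
    centre's 2020 distribution mirrors its historical one. A deliberately
    crude stand-in for the 2020 standardisation approaches, for illustration only."""
    target = sorted(historical_grades, reverse=True)   # historical distribution, best first
    ranked = sorted(range(len(teacher_grades)),
                    key=lambda i: teacher_grades[i], reverse=True)
    awarded = [0] * len(teacher_grades)
    for rank, student in enumerate(ranked):
        awarded[student] = target[rank % len(target)]  # wrap if cohort sizes differ
    return awarded

# Invented centre: historically modest attainment, generous 2020 teacher grades.
historical = [3, 4, 4, 5, 5, 5, 6, 6]   # previous years' grades at this centre
teacher    = [7, 7, 8, 5, 6, 8, 9, 6]   # this year's teacher assessments

awarded = toy_standardise(teacher, historical)

# Aggregate-level QA: the means match, so a distribution-level check "passes".
print(f"historical mean {mean(historical):.2f} vs awarded mean {mean(awarded):.2f}")

# Individual-level QA: flag awards two or more grades below the teacher view,
# the kind of case the media focused on once results were published.
for i, (t, a) in enumerate(zip(teacher, awarded)):
    if t - a >= 2:
        print(f"student {i}: teacher assessed {t}, model awarded {a}")
```

In this toy example the aggregate check passes exactly (both means are 4.75) while every student in the cohort is pulled two or three grades below their teacher assessment, which is why the review's distinction between aggregate-level and individual-level quality assurance matters.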
Next is public engagement and testing. There was a wide range of public engagement and testing, but what we found when we looked into it was that the testing largely focused on the processes of calculating the grades, rather than on the impact those grades would have. One thing that came out, for instance, is that Scotland's results came first, and it was only after Scotland issued its results that people felt they were able to see themselves in the process, and what might happen to them or their child.

The next one is broad understanding. In any normal year, statistical evidence and expert judgement are used to set grades; pass rates go up and down. But there was limited understanding of what happens in a normal year, and this resulted in a perception that things were more novel than they were, when actually there are statistics underlying the process normally.

And the last one is human in the loop. "Human in the loop" can mean lots of things to lots of people. What we saw was clear human involvement in setting up the model, the parameters and the coding, and human involvement in checking the results and feeding back into the parameters. But what we found was that the models tended to make the decisions rather than to support decisions. It could be that, had there been more involvement of humans in the final grade, this may have improved public confidence in the output.
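One way to read the "support rather than make decisions" point is as a routing rule: the model proposes a grade, and any case outside an agreed tolerance goes to a human awarding panel rather than being auto-awarded. A minimal sketch of that pattern follows; the `triage` function, threshold and data are invented for illustration, not a description of any 2020 process.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GradeProposal:
    student_id: str
    teacher_grade: int
    model_grade: int
    final_grade: Optional[int] = None
    needs_review: bool = False

def triage(proposal: GradeProposal, tolerance: int = 1) -> GradeProposal:
    """Treat the model's output as advice: auto-confirm only when it is
    close to the teacher assessment, otherwise route to a human panel."""
    if abs(proposal.model_grade - proposal.teacher_grade) <= tolerance:
        proposal.final_grade = proposal.model_grade
    else:
        proposal.needs_review = True  # a person decides, not the model
    return proposal

proposals = [
    GradeProposal("s-001", teacher_grade=7, model_grade=7),
    GradeProposal("s-002", teacher_grade=8, model_grade=5),  # large gap: escalate
    GradeProposal("s-003", teacher_grade=6, model_grade=5),
]

for p in map(triage, proposals):
    status = f"confirmed {p.final_grade}" if not p.needs_review else "sent to awarding panel"
    print(p.student_id, status)
```

The design choice this illustrates is that the human involvement sits at the point of the final individual decision, not only at the points of parameter-setting and aggregate checking that the review observed.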
So those were the key factors we found from looking specifically at the exams. Can we come to the next slide, please? What's really important, and the main point of doing our review, is what this actually means for those who are developing models. We found three principles that support public confidence, and in our report, underneath each of these three principles, there are some forty learning points for others looking to work in this way. The first principle is to be open and trustworthy: ensuring transparency about the aims of the model and about the model itself, which touches on its limitations; being open to, and acting on, feedback; and ensuring the use of models is ethical and legal. The second is being rigorous and ensuring quality throughout: ensuring clear governance and accountability, bringing in subject-matter experts and technical experts when developing the model, and making sure the data and the outputs of the model are quality-assured at the level at which they are going to be used. And the last is around meeting the need and providing public value: engaging with the commissioners of the model, considering whether a model is actually the right approach, and testing the acceptability of the model. For those of you who know the Code of Practice well, you might recognise TQV, trustworthiness, quality and value, in these. What is interesting is that we went in with a bit of an open book; this was a bottom-up approach, and it just goes to show how well the Code stands up in cases like this.

I'm just going to move on, Reema. As well as looking at the lessons for others developing models, we also need to consider the big picture, so we found some lessons for the centre of government. We found that it's not always clear, when you're working in this area, what guidance is relevant and where you can go for support. As we all know, there are a lot of different organisations in the space, and this means it can sometimes be confusing to work in, particularly if you're new to model development. There is also a level of inconsistency with regard to terminology, and it's sometimes quite hard to find out who's doing what in the space. We also found that there was quite limited guidance, and limited practical case studies, around the public acceptability and transparency of models. And we feel that while there is a lot of really great support available, it should be clearer and easier to access.

So I'm just going to close by talking through the changes we'd like to see happen. Next slide, please. What we have recommended, at the highest level, is that there should be clearer leadership from government. We're calling on the Analysis Function and the Digital Function, working with the administrations of Scotland, Wales and Northern Ireland, to ensure they provide consistent, joined-up leadership on these models. To support this, we recommend that those working in this area collaborate, and I know a lot of collaboration goes on already, particularly around guidance, and we see the CDEI having a role in bringing together the guidance that has developed and looking at the gaps, possibly in the public acceptability space. We also recommend that any public body developing advanced statistical models which have a high public value should consult the National Statistician and the resources within that structure, such as the Data Science Campus and the data ethics support available there as well.

Can I just move on to the last slide, please, Reema? I'm going to finish up by signposting some of the other work we're doing in this area. The review of the exams supports our wider work at OSR as part of our automation and technology programme, and we've got Michael here to answer questions on that at the end. As part of this programme we are going to be clarifying OSR's role, and the role of the Code, when statistical models are used, and we're currently finalising some guidance about models and AI in the context of the Code. Most importantly, we are really looking forward to working with the other organisations in this space and embedding the recommendations of our review. So that's it from me today; thank you, Reema. I'm also joined today by Ed Humpherson, who is Director General of OSR, and Michael Hodge, who leads our automation and technology programme, and we'll be very happy to take questions on anything at the end of the session. Thank you.

Thank you so much, Gail; that was an excellent introduction to the report's findings and recommendations. I want to hand it over to my colleague Reema Patel, who will walk through some of the research we've been doing at the Ada Lovelace Institute on participatory forms of enhancing public confidence.

Hi everyone. I'm incredibly interested in the findings that have come out of this report, and found it incredibly thoughtful and stimulating, so I just wanted to congratulate and compliment Gail on her presentation. What I'm doing in this presentation is taking a bird's-eye view of the landscape, particularly last year, and building on a lot of our public engagement and public deliberation work, in order to answer this question: what helps us build public confidence in data-driven systems? There's a real synergy here. Next slide, please. So, just a bit about the Ada Lovelace Institute.
We use three core approaches. We build the evidence base to think about the impact of data and technologies on people in society; we also convene diverse voices, and it's that piece of the jigsaw that I'll be speaking to; and these two approaches help change and shape policy and practice. So it's great to be part of this conversation with many of you here today on that matter.

How do we convene diverse voices? There's a really broad range of approaches that we use and deploy, and we think of our approaches as both participatory and mixed-method. From the left-hand side, you can see that we undertake a range of attitude surveys to understand people's attitudes quantitatively; our work on facial recognition technology, and on the impact of the pandemic on different groups of people in society, are two key examples of how we've done that. Last year we obviously had to respond to the constraints and the questions of a pandemic, and we prototyped a new model of deliberating with the public through rapid online public deliberation. Here you've got an example of a convened dialogue that we pulled together, agile, rapid and online, thinking about the uses of data-driven technologies such as contact tracing and immunity certificates, as they were called at the time, now vaccine passports; all of that work has been written up and published. There is also a range of deliberative approaches that we have used, pre-pandemic and in hybrid form during the pandemic, such as citizens' juries, citizens' councils and public dialogues. We think of these as longer-term public deliberation that brings the whole system, including policymakers and practitioners, into a room to have a conversation with people about technologies: what would or would not be okay when it comes to the governance of technology. A key example is something we published quite recently, the Citizens' Biometrics Council. And last, but absolutely not least, there is a really live conversation about the role that technologies can play in exacerbating inequalities, but also in addressing them, and as part of that we're working very closely with a range of underrepresented perspectives to understand the impact that data-driven systems have on them. Through the Biometrics Council, for instance, we ran workshops on gender identity and the impact of these systems on gender identity, and with racialised and minoritised groups.

Where do we see public trust and confidence being part of this? The way we think about it is as a virtuous circle. When we have data systems underpinned by a strong sense of trust and confidence from everyone, we know that we've got certain things right beforehand: representative, inclusive and proportionate data and governance, and, underpinning that, a mandate for support and active participation in governing data. The reason I have represented this as a feedback loop is that it is incredibly important for that to happen, because it then has a knock-on impact on our being able to engender fairer, more equitable outcomes from the uses of data systems; that feeds into beneficial impact, which then also impacts in turn on more representative, inclusive and proportionate approaches to data and governance.
What this feedback loop highlights is that quality is not disconnected from this notion of trust and confidence, and neither is effectiveness; they are really connected.

I wanted to offer a few reflections on the pandemic. The source of the data here is the Edelman Trust Barometer, a longitudinal trust barometer that has been running for at least ten years. We can see that globally the pandemic did impact on trust levels in technology, and the UK was not exempt from that: between May 2020 and January 2021 we saw a 12-point drop, compared with a range of other countries, so this is part of a global trend that we're wrestling with. The same Trust Barometer indicated that willingness to share personal data in response to the pandemic declined over that period: globally there was a six-point decline in the percentage of people who said they were willing to give up more of their personal health and location-tracking information to government than would ordinarily be the case.

Of course, this is to some extent about technology, but it isn't just about technology. We are beginning to see reports, and I'm referencing an Ofqual study here, that trust in GCSEs, and increasingly trust in the way teachers award marks, has been affected by the exam situation and the furore around it: you have a situation where 27% of respondents agree that GCSEs can be trusted this year, compared with 75% in a normal year. What this highlights is a socio-technical effect: trust in technology is certainly an important thing to consider, but trust in the technology also impacts upon the wider system and how the wider system is received.

I really loved, and pulled out of the study that Gail was speaking about, this particular quotation, which illustrates that public confidence is not just about the key technical aspects of the model, and neither is it just about the quality of the comms strategy; it is about considering public confidence as part of an end-to-end process, from designing the statistical model through to deploying it.

Just a few points that have come from a range of our public deliberation initiatives. Four key points emerged through the lockdown debate initiative, which was convened at a time of crisis; we were asking what would build confidence in a crisis. The first was about the evidence base: how do these technologies work, what impact do they have, and do we know enough about their evaluation? All of that is really important. The second was processes to offer independent review and assessment of the technology; my colleague Andrew is leading a programme of work here at the Ada Lovelace Institute on how we ensure we're more anticipatory, assessing the impact technologies have before they're used and implemented: what does participatory impact assessment look like in this context, and what does impact assessment that looks at affected groups and communities look like? The third was around boundaries when it comes to data sharing and data use, rights and responsibilities, and clarity about all of that; people felt this was very important in the context of a crisis, where there might be less clarity than there ordinarily would be.
And last but not least, a lot of people were really concerned about the risk of adverse or disproportionate impacts on vulnerable groups and on minoritised and racialised groups, largely because of questions they had around the quality of the data sets held about different groups and different people in society. So those were the key themes that emerged.

There is also another report we highlighted quite recently that reinforces a lot of the key themes Gail has spoken about: that apps and technologies are judged as part of the system they're embedded into; neutrality, so always being aware that technologies aren't viewed as neutral, because they're always part of that broader socio-technical system; the point that trust isn't just about data or privacy; that technology needs to be effective and respond to particular needs, which I feel came out very strongly and echoes a lot of the work Gail has spoken about; recognising that data is often about people's experiences and expressions of identity, which are quite complex and may be fluid, so data systems may not always comprehensively capture or respond to them, and being aware of that; and the proactive role that technologies can play: when well and thoughtfully designed, they can help us proactively address, and protect against, bias. Those were some of the key themes that emerged. That's all from us; I just wanted to thank you so much for taking the time to listen. I'm really interested to hear your questions, and I'm looking forward to sharing more shortly.

Thank you so much, Reema. I think there are some really interesting points to connect between the two different threads, one being this notion of human in the loop, and a question there being: which humans, and which loop? I think there's a very valuable insight and connection there between participation and the types of decisions that are made by these systems. We've got a few questions in the chat already, but if you'd like to add any more, please use the Q&A function at the bottom of the screen. I'd like to welcome Ed Humpherson and Michael Hodge into the conversation to help answer some of these questions. And if I may, I'd like to start with one question that came up: what measures are necessary to put in place to ensure more accountable and transparent uses of algorithms? There was a point made by Gail in your presentation about transparency being very key, and I'm curious whether there are specific learnings or practices that other public service organisations can put in place to ensure they meet those kinds of requirements in future.

Thanks so much, Andrew. I'll give an initial answer and then invite Gail to link what I say back into the report findings, if that's okay. I fear, Andrew, that our answer to lots of these questions is going to revolve around the same core proposition, which is this: if you break the problem of deploying sophisticated statistical or algorithmic models down into component parts, pick out one part and say "that's the bit we've got to fix", you're probably going to find that you run into difficulties. If you just say "let's fix the transparency of the way the algorithm works", and take that out on its own, you could come up with some fixes.
But I think you might be missing other things, like the acceptability of the whole process, or the understanding of the preceding environment, pre the use of the model: what is the model replacing? It's for that reason that we'd say the way to think about this is end to end. When you're moving from one system to another, and the new system has a great component of statistical modelling or reliance on statistics, you need to think about public engagement with that throughout. So there are some specific things on transparency, but I wouldn't privilege transparency separately from all of the other things as an answer. Gail, is there anything you'd like to add?

Yes, just to add: I agree with you, Ed, and probably quite a lot of our answers will come back to the fact that this is not about doing one or two things brilliantly. I hope that came out clearly in the talk: it is very much about looking at the full end-to-end process. I was actually just having a little look at our forty learning points, and transparency runs across a lot of them, in a lot of the areas. So yes, it's one of those things that just needs to be thought about at a lot of steps, and in a lot of processes, along the way.

Very interesting points. I want to take a question now from the audience: how do you think the wider use of data and models during the pandemic has impacted on public perceptions around modelling, for example modelling COVID deaths, or different epidemiological models? Do you think this has had any impact on the exam results situation? I'll pass that to you, Ed, and then Reema, I'd love you to come in as well.

So, I suppose my starting point in answering that, Andrew, is this. Let me put it this way: I have spent the last few years going around a lot of audiences saying, you know what, all of this statistical analysis is not just for elite decision-makers; it's a public asset; it's for the public; the public are both the subject and the beneficiary of lots of these models, so we shouldn't forget that these things are for the public good, not just for efficient decision-making. And I always sense, when I give this rather impassioned advocacy, that my audience's eyes glaze over, or they slightly think "that's a fantasy; in the real world, what matters is policymakers making policy decisions". What I think we've seen in the pandemic is that we're right, actually: that on certain profound issues the public deeply care, not just about what's happening to them, but about how what's happening to them relates to what's happening to their broader community. And that, of course, is the job of statistics: to give a picture of the broader community, the aggregate picture. So you see extraordinary levels of engagement, for example with the coronavirus dashboard; I think I heard from Public Health England that they're up to well over 20 million hits on that dashboard through the course of the pandemic. And you see in the media, in outlets which would not normally feature graphs and statistics, very prominent presentations of statistics and data, because there is an intense public interest in this.
And I think with the algorithm story, and as I say "algorithm", we always try to say "statistical models", to rescue this story from the pejorative sense that "the algorithm" has; when we talk about the exams using statistical models, I think we see this again: intense public interest, not just in "what's happening to me" but in what's happening to the community. That's the generic picture. So I think that, in a way, the pandemic has taught us many things about public health, about how we live our lives, and about our exposure to exogenous risk; and I think it has also taught us both the importance of data and statistics in our civic life, and the need for humility in the face of the data. Those two things are really crucial lessons. And I suppose the really interesting question for us, and I'm not sure this even answers the question, but I'm on a roll now so I'm going to say it, is this: is this going to be a one-off moment in societal history, when suddenly data became important, and then everything went back to the scenario where people think data are for elite decision-makers and not for the general public? Or is this a secular shift in interest? I believe it's the latter: a secular shift. In some ways, part of our work is to sustain that shift, and to get producers and presenters of data to recognise that it is a sustained shift, and to serve it. So there's a lot in that; as you can see, I've been quite carried away by the question of linking this all into the broader pandemic.

Yes. Reema, I'm curious how you'd respond to that. A lot of the work you've done, I know, has been around health inequalities in light of the pandemic; how do you feel about this question, around how the pandemic might change the context for the understanding of these challenges?

Well, the reason I presented those slides earlier on the Edelman Trust Barometer is that I think it's so interesting to see that at the very time that tech has accelerated in use and deployment, used by more or less everybody, or at least a rather large proportion of the population, you are also seeing this challenge around trust. That really illustrates the extent to which trust is important, but there seems to be a dynamic at play in the pandemic that suggests trust has declined, and that is really worth interrogating further, to understand what is contributing to that decline. The point about legitimacy is really important as well: it's not just that the technology needs to work well for us, but that people need to feel it works well for us, for it to work well for us. Examples during the pandemic illustrate that quite well: if the contact-tracing app is going to be effective, then there needs to be a sense that we feel it is effective. And the "we" is a really interesting point to interrogate: who is it working for? This is where the work on the data divide comes into play. A lot of our public attitudes work recently illustrated a big disconnect between the different demographic groups who were benefiting from, or downloading, apps such as symptom-tracking apps.
So that's a really interesting thing: it is a collective responsibility, not just the responsibility of one person, and the ONS is doing a really interesting piece of work through the Inclusive Data Taskforce, which we're feeding into, and working to influence, as well.

Thank you very much; a very good point. I want to take another question from the audience; I feel like I'm just picking the anonymous questions for now, but I will get to the rest, I can assure you. I think this one is quite interesting. There have been a lot of op-eds claiming there was no fair way of fully automating the exam process, and that building an algorithm in the first place was just wrong. I'm curious what the panel's opinions are about this: would there have been a right way to build this algorithmic model, or was the decision to use a model in the first place inherently flawed? I know it's a bit of a tricky one. Ed, Gail, Michael, would you like to start us off?

I'll give an initial view and then see if Gail and Michael want to come in. As we were doing this work, debate raged internally, and with a very eminent expert oversight group featuring, I think, three past presidents of the Royal Statistical Society. The debate was: were the four bodies across the UK set an impossible task, or were they set merely an incredibly difficult task? That was really the question. At times we thought that this was an impossible task, because there would always be the outlier which would trigger a degree of public sense of unfairness, and that sense of unfairness would generate sufficient loss of public confidence that it would always, inevitably, unravel; never mind, of course, the limitations in the data, and the fact that nobody had ever tried this before. And then the other school of thought was: not impossible, merely very difficult; there was a pathway through this that might have got to a more acceptable result. Our report lands in the latter, without at all downplaying the enormous difficulties the organisations faced, and I should say that they produced really good, technically well-thought-through approaches. There were ways through that might have produced a different outcome, but we're not downplaying the challenges they faced. I'd be really interested in my colleagues' thoughts on this, because the debate did rage about it. Gail, what are your thoughts?

I don't have a huge amount to add. I think what should be very clear from what Ed has said is that we didn't go into this review with any set expectations of what we were going to find, or with any sort of judgement; I'd hope we came across with a balanced view. We just looked at what was there; it's not within our remit to look at the policy decisions. It was very much a look at the processes involved, and, as Ed said, we were very mindful that this was done in a very short space of time, in difficult circumstances. But yes, we did have those debates, as Ed was outlining. Oh, sorry, Michael, please; I'd love to hear your perspective on this.
All right. For me, it would be a shame if this were something that shut the door on the use of algorithms to try to get to a fairer society. I do believe that we can develop our algorithmic systems to be fairer, but I think we have to look at fairness from a whole-system perspective. Fairness here was thought about in terms of the impact, which was bad; but fairness also has to be thought about in terms of the design of the algorithm, and in terms of what happens afterwards. If there is a fair system in place where people can appeal after these things have happened, then I believe that is another step towards fairness. So it's not just "is that initial result fair", but "are the steps we took before it, and the steps we're going to take afterwards, fair as well?"

A very good point, Michael, and I think it raises a challenging question of "fair to whom". But I think that point around the focus on a systems approach to fairness, the socio-technical concept that this is a complex system integrating into a complex environment, which raises all kinds of bias issues along the pathways, is a very important one to flag.

Yes, and just to flag, towards the end, when I was summing up what we'd like to see happen: this notion of public acceptability, which fairness is tied up in, is an area that we see needs some work. We do need to put some flesh on the bones, and I know people, probably people on this call, are looking at these issues, but it is an area that we think needs to be explored further.

Thank you. Reema, would you like to come in at this point?

Yes, please; just very briefly. I think there are certainly lessons to be learned from the way last year's conversations have gone; that's why we're having this call. I'd be reluctant to say that an algorithm should never be used, because, going back to that point about socio-technical systems, it's actually about its use and its deployment: what it's designed to do, the nature of the human interactions with it, and, most importantly, the way in which people are involved in the development, use and design of the system. It's fair to say, certainly from our perspective, that as a society we're still working these things out. I think this report, the review, takes us a step forward in that direction, and the crucial point is how we use 2020 as a learning opportunity to build on, so that we can do what we're here to do, which is to ensure that we have just and fair socio-technical systems.

That's a great point, Reema, and it actually flows into the next question, which was submitted by Peter Kemp from KCL: thinking of this year, which is fairer on students, using this year's teacher-allocated grades, or last year's CAG plus algorithm? I'm actually going to pass that directly on to Gail, because I think she has probably thought about this a little more than I have, other than to say that it's definitely a good question to ask. I suppose the one little nuance I'd place on the question is that I'm not sure "fair" is quite the way I would think about it; I'd ask which is more likely to land in a way that is accepted and acceptable by the public; I'd focus a little more on that public confidence space. But Gail, what are your reflections?
I think one of the main reflections I have, between last year and this year, and this isn't a fudge answer, by the way, is that all of this has brought questions of fairness to the fore. There have been discussions in the past around the fairness of the existing system, and some of those have now come out. I think that's actually a really good thing, because it brings everything out into the open, so that we, and the education experts, can have these kinds of discussions. Because at the heart of it, what we are very concerned about is the modelling and the use of algorithms, how that has come about, and the fact that "algorithm" has become a dirty word. Models and statistical models have a part to play in fairness, and we will go back to a situation, albeit this year we're not using algorithms, and that has been quite clear in some of the messaging, we will go back to a normal situation, where algorithms and modelling are used, but used behind the scenes. Will that have changed the public perception of fairness, and will there be different processes in place? I'm not sure. So for us as a regulator, bringing these issues out and having these kinds of discussions is excellent.

A very good point. I do wonder if there's a distinction to draw between fairness and justice, where when we think of fairness we think of a particular outcome, or a particular rubric or metric in the system, but when we think of justice, it's perhaps a slightly different outcome we're looking at. I'm curious whether you have any thoughts on where fairness ends and questions of justice pick up.

Yes, it's a really interesting question. I really don't want to do a crash course in political philosophy, but fairness is a concept that has been critiqued for a few different reasons, and I think that's a valid point being made by Ed. In terms of this question of justice, the way we think about it is: who is it that is benefiting from the use of data-driven systems and technology? And thinking about this means not just making assumptions about which group benefits, but recognising that there might be differential levels of benefit from the way a technology has been designed, and being aware of the asymmetries of power that can exist when we're designing or developing an algorithmic system. With the exam model, that seemed quite stark, in a way: the people who were out on the street versus the people who were involved in designing and developing the systems. What we're really interested in is how you begin that deliberation, or that dialogue, and what that looks like in a system, in order to enable a clear sense of what societal consensus would look like in designing these sorts of systems. That's why we use approaches like the Citizens' Biometrics Council or the lockdown debate: that notion that when you create a context in which you bring the whole system together in the room, you're much more likely to design AI systems that work for people and society, and to design around different standpoints and perspectives.
So that's a really long answer, but that's how we think about justice and fairness; it might be quite different to the way other people think about justice.

Yes, I think it's a very good point, and I can see the comment in the chat about accountability being another consideration here, really getting to the notion of not just how you hold these systems to account, but also how you prevent harm from occurring in the first place: thinking about impact prior to the harm occurring. I want to take another question from the audience, from Hannah Spiro, and Reema, I think we'll start with you, as it touches on the Edelman survey you mentioned in your presentation. To what extent do you think we are actually able to measure and track changes in public trust towards public-sector use of algorithms? Are surveys and focus groups a good way to do that, or are there other methods we should consider?

It's a great question. It's quite difficult to track and measure trust, but I think there's something really interesting about taking a range of mixed-methods approaches to understanding it. Some of the longitudinal surveys are really useful and helpful, and some of the more qualitative pieces of work are really useful and helpful as well. The Edelman Trust Barometer is incredibly useful because it looks at trends across the international landscape and aims to understand global trends, and I think that's quite helpful when you're thinking about levels of trust. The other dimension that is really important to think about is: trust in what, and trust in whom? What I find quite interesting in that survey is the focus on the technology, or on the sector, or on the institution; through that, you can become much more granular, much more specific, about what it is you're trying to measure. If I tried to poll people and simply asked them about their levels of trust, I think that would give us very limited information; but if I were much more contextual about it, in this case about particular uses and applications of data, I think there's more to be said. I'm also much more drawn to that mixed-methods approach: the public deliberation work that we do aims to understand not just the "what", so what it is that people are feeling or thinking, but the "why", the conditions that are leading to that, and to get into a much more advisory space: what would you suggest, or expect, policymakers and decision-makers to do, their four principles or conditions, which is a very different thing to the sort of result you get from a poll or an attitude survey. But I'm very happy to talk more about that outside the scope of this discussion as well.

A very good point. I think it raises this tension we're talking about between quantitative data that feeds a model and the kind of rich contextual information that helps explain the gaps in that data.
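As one concrete illustration of the measurement point: a longitudinal survey tracks a proportion across waves, and any headline change, such as a 12-point drop, carries sampling uncertainty of its own. The sketch below uses invented wave data and sample sizes to show one standard way of putting confidence intervals around a trust estimate and its change; it says nothing about the qualitative "why" that deliberative work surfaces, which is exactly the gap the mixed-methods point addresses.

```python
from math import sqrt

def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Normal-approximation 95% confidence interval for a survey proportion."""
    p = successes / n
    half = z * sqrt(p * (1 - p) / n)
    return p, p - half, p + half

# Invented wave data: respondents agreeing they trust a given use of algorithms.
waves = {"May 2020": (620, 1000), "Jan 2021": (500, 1000)}

for wave, (agree, n) in waves.items():
    p, lo, hi = proportion_ci(agree, n)
    print(f"{wave}: {p:.1%} trust (95% CI {lo:.1%} to {hi:.1%})")

# The headline "12-point drop" style figure, with its own uncertainty attached.
p1, n1 = 0.62, 1000
p2, n2 = 0.50, 1000
diff = p2 - p1
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
print(f"change: {diff:+.1%} (95% CI {diff - 1.96*se:+.1%} to {diff + 1.96*se:+.1%})")
```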
I want to take one last question before we have to wrap up, and this is one for everyone on the panel; I'd love to just go around, rapid fire, in the last few minutes. This question takes us back to the original report. Reflecting on the discussion today and the report's findings, what measures do you think are necessary to make sure these systems are more accountable, transparent and just going forward? What would you say are your main learnings, and what is one key thing you'd like to see organisations like Ofqual put in place in the future?

Should I go first, Andrew? So, in a one-minute quickfire version, I'd say, and it's not just organisations like Ofqual, there are three other regulators in other parts of the UK and of course many other public sector organisations: they should think trustworthiness, quality and value. They should apply the Code of Practice to anything they're doing which relates to the way something is going to impact on public discourse or public acceptability. Think TQV: don't just think technical, don't just think quality; think trustworthiness, quality and value. That's my one-sentence remedy.

Thank you so much, Ed. Gail, would you like to go next?

Thanks very much, Andrew. For me, I just want to draw out one of the key things we raised at the end, which was around support. There are lots of great people in this space, and an emerging body of best practice. It's worth saying that we had an internal discussion about whether we were going to write a best-practice report for this exam review, and we decided no, that's for others to do. If I were an analyst sitting in this space, I would want it to be easy and clear, and I would want that support structure to be working for me, to make the system work.

Thanks, Gail. Michael?

For me, I think we've got a lot better in recent years about the collection of data and data usage, and being transparent about that, with GDPR, and we've been good at the outputs of statistics recently, showing that end of the spectrum. But we're missing the middle bit, and I think that was apparent with this work: we didn't really get to see the inner workings of the algorithm, the model. We couldn't scrutinise it; we couldn't learn from it as much as we probably should have been able to. So I think being more transparent about sharing our practices, our coding, our algorithms and so on will allow us to be better producers of algorithms going forward.

Thanks so much, Michael. Reema?

I was also about to build on the point about transparency, but also transparency for what, and for what purpose. It's not just that the algorithm needs to be open; we should also think about what level of openness, and about what the subject to whom a decision relates could do about that decision: would there be an accountability process or procedure there? That seems like a really live and interesting thing to think about during development. So it's certainly about transparency, but there's some really interesting work on what transparency looks like: we're about to publish a report, and will convene an event very shortly, on participatory data governance, which argues that transparency is the fundamental building block that enables a more inclusive and dialogic approach to the design of these data systems.

Amazing; thank you so much, Reema. I want to say thank you again to all of our guests today, and to all of our attendees for joining: a massive thank you to Ed Humpherson, Michael Hodge, Gail Rankin and Reema Patel for joining us.
If you'd like to continue the conversation, you can on Twitter: we are @AdaLovelaceInst and @StatsRegulation, and we'll see you there. Have a lovely rest of your Thursday; take care, everyone. Thank you.
2021-05-19