Leveraging Data-Driven Advanced Analytics and Artificial Intelligence Technologies


DR. CHRISTINE HUNTER: Welcome to today’s OBSSR Director’s Webinar, titled “Leveraging Data-Driven Advanced Analytics and Artificial Intelligence Technologies to Address Social and Behavioral Determinants for Health Equity.” I’m Christine Hunter, the Acting Director of the Office of Behavioral and Social Sciences Research at the National Institutes of Health. I apologize that I’m not on camera; my web camera stopped working about 30 minutes ago. So, before I introduce today’s speaker, I have a few housekeeping items to cover. First: Today’s webinar is being recorded, and the recording will be available in about 1 month on the OBSSR website at obssr.od.nih.gov.

Today’s presentation will be followed by a question-and-answer session. Throughout the talk, all attendees are muted, and the chat feature is disabled. So, questions and comments will be taken via the Zoom Q&A feature. To ask a question or send a comment, click on Q&A at the bottom of your Zoom screen, type in your question, and send. You have the option to “like” other questions to avoid duplicate posts. The most-liked questions will move to the top. Feel free to send a question at any time during the webinar. Following the presentation,

OBSSR’s Dr. Beth Jaworski will facilitate the Q&A  session and ask your questions to the presenter.   So, with that, I’m pleased to introduce  today’s presenter, Dr. Irene Dankwa-Mullan. Dr.   Dankwa-Mullan is a nationally recognized industry  physician and scientist, a health equity thought   leader, scholar, and author, with more than  20 years of diverse local, regional, national,   and global leadership experience in health  care systems, businesses, and the community.  

She is currently the Chief Health Equity Officer and the Deputy Chief Health Officer at Merative, formerly IBM Watson Health. Her current research strives to develop and evaluate data sets, real-world data, and algorithms as inclusive technology—so, artificial intelligence and machine learning–driven technologies—to empower health providers, patients, and their families. A priority is advancing technologies to promote social good and equity. She supports inclusive and participatory engagement with communities

and stakeholders, and she also helps teams with modeling complex decisions associated with health equity and social determinants of health. Dr. Dankwa-Mullan has engaged in the implementation and evaluation of data and evidence studies, including the social, legal, and ethical implications of the use of these emerging technologies. She was formerly Deputy Director of Extramural Scientific Programs at the National Institute on Minority Health and Health Disparities and played a key role in promoting strategic trans-NIH and Federal efforts. Dr. Dankwa-Mullan has published widely on health disparities and the evaluation of artificial intelligence and machine learning technologies, including the integration of health equity, ethical AI, and social justice principles into the AI/ML development lifecycle.

It is my pleasure to welcome Dr. Dankwa-Mullan to present today on leveraging data-driven advanced analytics and artificial intelligence technologies to address social and behavioral determinants for health equity. And with that, Dr. Dankwa-Mullan, I will turn it over to you. Thank you. DR. IRENE DANKWA-MULLAN: Thank you so much. Thank you, Dr. Hunter, for the introduction and for the kind invitation to present at this webinar and to share my insights on a topic that I am so passionate about: how we’re leveraging data and technology. So, I’m really delighted to be here. I’m grateful for this opportunity to provide some perspective in this space, as well as on some of the current and emerging research. And I want to emphasize that our new company, Merative,

is an extension of our ongoing efforts, and so it’s an exciting time for all of us, you know, to build on these technologies. And so, the next slide is the outline of my presentation today. I’m going to provide an overview of the role of data—so, big data, advanced analytics, AI/machine learning—in the domain of social and behavioral health. So, I’ll talk about our efforts

to advance health equity and racial justice and to build inclusive technologies. I’m going to provide an overview of AI and machine learning and some of the terms and concepts that we use in this space, to provide a foundation for my presentation and to help with the discussion. I will highlight some recent innovative efforts to demonstrate how we can integrate behavioral and social determinants data. And I’ll briefly discuss a general framework around addressing bias, which is a huge effort; we’re always thinking about bias in AI algorithms. And I’ll conclude with some thoughts for the future, including community and stakeholder engagement.

So, before I talk about this: I actually joined IBM Watson Health, now Merative, about 6 years ago as part of a clinical and health care experts team to lead and promote clinical evidence and evaluation. In my role as Deputy Chief Health Officer, I provide subject-matter and clinical expertise for the scientific evidence to prove the effectiveness and value of our technology and solutions. And in my role as Chief Health Equity Officer, I also provide strategic leadership support and subject-matter expertise for how we can think about health equity to ensure inclusive technologies and data diversity. And so I think often about how we design better solutions

in health IT for health equity. I think about the patients and populations, the entire health ecosystem, and how we can optimize or leverage the data assets that leaders, communities, and researchers have entrusted to us in an ethical manner, with privacy, with transparency, and in a way that builds trustworthiness. For example, we’re working with our partners and collaborators to design robust data representations, and my goal is to make sure that we’re capturing complete life experiences and understanding these data points and how they impact health outcomes. So, I mentioned that this is an extension of our efforts, and one of the programs or initiatives that we have implemented in the health equity space, for example, is building inclusive technologies and promoting inclusive language. This program included identifying discriminatory terms in our technology. So, terms such as “master” and

“blacklist” that were used in the industry have been removed from use and replaced by inclusive words. We think about bias in machine learning algorithms and have developed code audits on how to mitigate those biases. We published a framework called TechQuity, which is promoted across the continuum of the technology design and development lifecycle. This concept really calls for technology and businesses to be accountable for the active promotion of health equity. We’re working with partners to build health equity dashboards by enhancing their own data or existing health solutions, and we think about integrating socio-demographic factors into health equity metrics. And so, we’re really promoting technologies that may include AI

or machine learning as a strategic lever to make health care more efficient and more equitable. One of the things that I was also excited about is design justice to promote racial equity in design, thinking about representation, which really matters. So, a whole host of great, exciting work around leveraging technologies and promoting health equity. And then I do want to touch on AI ethics, because there are social, economic, legal, and ethical implications of leveraging data and machine learning solutions in AI. We really take AI ethics seriously. AI ethics is a multidisciplinary field of study, and the goal is to understand and optimize AI and machine learning’s beneficial impact while reducing risk and adverse outcomes for all stakeholders in a way that prioritizes our well-being and our human agency. Examples of AI ethics are illustrated in this slide: it’s data responsibility,

it’s explainability, it’s robustness in our data, transparency, moral agency, and aligning our values with those of the communities. And I’m pleased to say that we led the development of a framework for integrating health equity and racial justice principles into the development lifecycle, with experts and partners in the field. It’s a framework that can be used by researchers, developers, and stakeholders to assess the impact of their AI tools and technologies, to ensure that health equity, racial equity, and social justice are prioritized. So, on the next slide, I just want to say a little bit about Merative. It’s really an extension of

what we’ve been doing at IBM Watson Health. We support the health care industry and clients who deliver health and human services, working with them and leveraging technologies and their data to improve health, not only in cost and quality but, most importantly, in innovation and outcomes. And our mission is the same, as I mentioned, but we realized that the need is so much greater, given the ongoing effects of the pandemic and the global recession and the overall economic hardships that have been seen, you know, across communities. And

our community responsibility also reflects our expansive footprint in the health care space, and so it includes our social responsibility to our employees, our workforce, and our clients. And so, how can we collaborate for innovation to address these critical societal needs in the communities in which we operate, with a culture of ethics and integrity, promoting trust, and really placing people at the center of health? And so, this slide shows some of our product family. We have Health Insights, an end-to-end analytics and data solution that’s designed to manage population health and health care program performance. We have Social Program Management, which helps with health and social program administration at the point of care, including benefits management, family support programs, child welfare, and all those programs that are used within the health and government social program sector. Micromedex is a clinical decision support tool that integrates evidence-based drug and disease content. We have our MarketScan—some of you may have heard

about it. It’s integrated, patient-level data reflecting the real-world continuum and cost of health care, and it’s one of the largest proprietary collections of de-identified U.S. patient data available for health care—over 270 million lives are in there. Then there’s Clinical Development, which helps with clinical trials. And Merge is our enterprise imaging and AI-enabled solution for radiology and cardiology and for managing imaging data from a centralized platform.

So, data is really transforming every aspect of our world, every aspect of our lives. We use data to make decisions in so many ways. And in the health care space, data is always being leveraged to allocate resources, to target interventions, to identify populations at risk, and so much more. Data points are used in just about every domain of health care, especially behavioral and mental health care—for example, in clinical decision making, in understanding treatment pathways for optimal outcomes, for self-care and management of comorbidities, and more. And so, you know, the question is: Are we using the right data? And I want to start, on my next slide, to really level set with some definitions for AI and machine learning technologies. So, what do we really mean when we talk about AI, machine learning, deep learning, natural language processing—all of which are being used? How are they being used, and how are we integrating the data, including relevant behavioral and social determinants of health, to surface meaningful insights for improved health care?

And so, as you can see on this slide, an AI system is a system that can make predictions, recommendations, or decisions that influence our physical or virtual environment. AI systems are typically trained with huge quantities of structured or unstructured data, and they may be designed to operate with varying levels of autonomy—or none—to achieve defined objectives. So, as you can see, AI is basically composed of machine learning and natural language processing. Machine learning is a subdomain of AI,

and it refers to a family of algorithms that can identify patterns in data; machine learning can make predictions based on patterns in huge amounts of data. Natural language processing is another system of algorithms that can understand language and syntax and interpret written language. Deep learning is a subdomain of machine learning, and natural language understanding is a subset of natural language processing that deals with machine comprehension; it’s defined by how machines understand human language and behavior. A lot of this is being used especially in the mental health space. In my talk today, I’m going to focus on the AI, machine learning, and deep learning domains, but this is just to give you a sense of what AI is and the various terms that are being used.
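To make the machine learning definition concrete, here is a minimal, hypothetical sketch in Python: a model is fit to a handful of labeled examples and then applies the learned pattern to a new case. The features, data, and labels are invented for illustration only.

```python
# Minimal sketch: a machine learning model learns a pattern from labeled examples
# rather than from hand-written rules. Data and feature names are hypothetical.
from sklearn.linear_model import LogisticRegression

# Each row is [age, number of prior visits]; label 1 = developed the condition.
X = [[45, 2], [60, 8], [35, 1], [70, 10], [50, 3], [65, 7]]
y = [0, 1, 0, 1, 0, 1]

model = LogisticRegression().fit(X, y)     # identify the pattern in the data
print(model.predict([[55, 6]]))            # prediction for a new, unseen case
print(model.predict_proba([[55, 6]]))      # and its estimated probability
```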

Next slide. And so, I really apologize for a lot of text in this graphic, but I want this figure to help illustrate the spectrum of advanced analytic applications and technologies that are used to generate insights from big data. The methods can range from less complex but still advanced analytics, such as descriptive analytics, that do not involve artificial intelligence, to more complex analytic methods that do involve deep machine learning. The AI, machine learning, and computational modeling approaches are an extension of traditional statistical modeling approaches and are able to provide much greater specificity. So, for example, whereas traditional

statistical methods will provide information about a change in X associated with a change in Y, these advanced analytics approaches provide insight not only that a change in X is associated with a change in Y but also into the magnitude and the timing of that change. So, it allows for a greater understanding of those complex and dynamic systems that influence health and health outcomes.
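To make that contrast concrete, here is a minimal, hypothetical sketch: an ordinary least squares fit summarizes the X-to-Y association with a single average slope, while a more flexible machine learning model can also recover the magnitude and timing of the change. The variables and synthetic data are invented for illustration.

```python
# Minimal sketch contrasting a traditional statistical summary with a flexible model
# that captures magnitude and timing. Variables and data are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
months = rng.uniform(0, 24, 500)                       # X: time since an exposure
outcome = np.where(months < 6, 2.0 * months, 12.0)     # effect grows, then plateaus
outcome = outcome + rng.normal(0, 1, 500)              # measurement noise

X = months.reshape(-1, 1)

ols = LinearRegression().fit(X, outcome)               # one averaged coefficient
print("average slope:", round(ols.coef_[0], 2))

gbm = GradientBoostingRegressor().fit(X, outcome)      # flexible, data-driven fit
for t in (3, 6, 12, 24):                               # magnitude AND timing of change
    print(f"predicted outcome at month {t}:", round(gbm.predict([[t]])[0], 2))
```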

Next slide. So, this next slide is another way of looking at it. This is the spectrum of advanced analytics, and it’s a summary of the various types of research or health questions that prompt the different types of advanced analytics: from descriptive analytics that inform questions such as, “What has happened to a similar population?” or “What are the trends?”; to similarity analytics that inform questions about how to identify the best or most promising interventions in similar patients, or what outcomes similar patients have had with a particular intervention; to predictive analytics that inform questions around what will happen; and finally, prescriptive analytics that inform questions about what should happen. So, again, these advanced analytics approaches are increasingly dynamic, meaning they leverage the dynamic interplay of interactions, observations, and health outcomes over time. And so, interventions, social interactions, the choices that we make (including treatment and behavioral choices), and what we observe from the data can all inform decisions, future choices, and these analytics. They are data points that help us have better insights that can complement our decisions and interventions in communities and environments. But I want to mention that we can improve the accuracy of these technologies if we have complete data, right? So, data that’s diverse and comprehensive, that includes the social factors that we know influence health and health outcomes, especially outside of clinical care. Next slide.

So, these are examples. We’re able to answer all kinds of questions, and these are real questions that have come from the literature or have been modeled with machine learning and AI, such as, “How do I optimize care interventions for different populations?” or “What are some triggers for disease onset?” So, really, there’s a lot of potential for AI and machine learning research to provide insights into issues that are relevant in a real-world context of understanding, identifying, and promoting better health, clinical care, and public health management. On the next slide, I want to illustrate some examples of how we’re using it. So, this is an example: say you want to ask, “What is a patient’s risk of developing condition X—diabetes or whatever?” This question can be approached with a combination of several methods.

So, there are feature engineering or feature selection methods that will extract the salient, important, and relevant features from real-world health data. There is predictive modeling that can provide information from cohorts or population groups to address some of the challenges of data messiness or specificity in our health care data. We can use personalized predictive modeling methods that do not focus on a single global model trained using all the data but rather on patient-specific local models, and multitask learning methods that can simultaneously predict multiple risks and also leverage information and associations across risks. So, the figure on the right illustrates the study setup for a retrospective prediction task using longitudinal patient data.

You can see that the diagnosis date is the time when the patient was diagnosed with the disease, and then you define a prediction window, which specifies how far before the diagnosis you want to compute the risk of the disease. It could be 6 months, 1 year, or 2 years, depending on the application use case. And then the observation window defines the period of time before the index date.

And the data is typically aggregated, using those feature engineering methods, into a feature vector or matrix representation for downstream modeling. This is where it’s really important to acquire expert knowledge about the process being modeled; to collect the appropriate data to answer the desired question; to understand the inherent variation in responses between different population cohorts and take steps, if possible, to minimize variation that may not be apparent; to collect the right predictors—social, clinical, behavioral—that are relevant for that disease condition; and to utilize a range of model types so that you have the best chance of uncovering the relationships among the predictors and the response.
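As a minimal sketch of this study setup, assuming a long-format table of dated observations with hypothetical column names, one way to apply the prediction and observation windows and aggregate the remaining history into a per-patient feature vector looks like this:

```python
# Minimal sketch of a retrospective prediction setup with an observation window and a
# prediction window. Column names (patient_id, event_date, diagnosis_date, lab_value)
# are hypothetical placeholders, not a real schema.
import pandas as pd

records = pd.DataFrame({
    "patient_id":     [1, 1, 1, 2, 2],
    "event_date":     pd.to_datetime(["2019-01-10", "2019-06-01", "2019-11-20",
                                      "2019-03-05", "2019-09-15"]),
    "diagnosis_date": pd.to_datetime(["2020-03-01"] * 3 + ["2020-01-01"] * 2),
    "lab_value":      [5.1, 5.6, 6.2, 4.8, 5.0],
})

prediction_window = pd.Timedelta(days=180)   # predict risk 6 months before diagnosis
observation_window = pd.Timedelta(days=365)  # use 1 year of history before the index date

# Index date = diagnosis date minus the prediction window.
records["index_date"] = records["diagnosis_date"] - prediction_window

# Keep only events that fall inside the observation window before the index date.
in_window = records[
    (records["event_date"] <= records["index_date"])
    & (records["event_date"] >= records["index_date"] - observation_window)
]

# Feature engineering: aggregate each patient's windowed history into one feature vector.
features = in_window.groupby("patient_id")["lab_value"].agg(["mean", "max", "count"])
print(features)  # one row per patient, ready for downstream risk modeling
```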

Next slide. You can also look at what has happened to similar patients, right? So, for example, you have a treatment pathway, and you want to determine precision cohorts of patients that are similar to a target patient. This helps drive research in patient similarity methods; in temporal event sequence mining methods, so that you can identify salient patterns in the data; in disease progression modeling, to better understand and characterize how a disease evolves over time and how observations in the data are associated with some of these changes; and in pathway visualization methods, to represent and survey some of these underlying insights. The figure on the right shows a visualization of a disease progression model trained on patient EMR data: the different states of the disease, the duration statistics of each stage, the observations associated with each state, and the transition frequencies between states.
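A minimal sketch of one simple patient-similarity approach, a nearest-neighbors search over standardized feature vectors; the cohort and target values are invented, and real patient-similarity methods are typically much richer (temporal sequences, learned distance metrics):

```python
# Minimal sketch: retrieve the patients most similar to a target patient using a
# nearest-neighbors search. The feature matrix here is invented for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Rows = patients, columns = engineered features (e.g., age, a lab value, visit count).
cohort = np.array([
    [54, 6.1, 3],
    [61, 7.2, 8],
    [47, 5.9, 2],
    [66, 7.0, 9],
    [52, 6.3, 4],
])
target = np.array([[55, 6.2, 4]])

scaler = StandardScaler().fit(cohort)            # put features on a comparable scale
nn = NearestNeighbors(n_neighbors=3).fit(scaler.transform(cohort))
distances, indices = nn.kneighbors(scaler.transform(target))
print("most similar patients (row indices):", indices[0])
```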

Next slide. To illustrate, this is a case example that I really like using: Crohn’s disease. This is a treatment pathway, and it represents different cohorts of patients that were put on different medications. We know from this analysis that significantly fewer patients had treatment pathways that included biologic therapies compared with non-biologic therapies. Very few patients were ever even initiated on biologic therapy, yet we know that there has been significant progress made in treatments and that biologics are our most effective medication. But biologics are only being used in a small proportion of patients, suggesting that there are barriers that prevent optimized patient management. The next example I pulled, again from our MarketScan database, is really interesting: it’s a treatment pathway for depression. These are called sunbursts of treatment patterns, and they start with first-line therapy, which is the innermost donut, out to the fourth line, the outer slices, right? Each color represents a distinct treatment class, and each layer represents a new treatment line. This cohort included patients who had a clinical diagnosis of depression more than twice and had an inpatient visit for depression. And you can see those with commercial insurance,

Medicare insurance, and Medicaid insurance. The proportion of patients who did not receive any pharmacotherapy during follow-up ranged from 29 percent to 52 percent. As you can see in this analysis, SSRIs were the most common first-line treatment; however, in the Medicaid cohort, many patients received a sedative or anxiolytic prior to any antidepressant treatment, even though they had a clinical diagnosis. Females accounted for 62 percent of the patient population, and there were comorbid conditions.
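For readers curious how a sunburst of treatment patterns like this can be drawn, here is a minimal sketch with plotly, using an invented summary table of first- and second-line therapy counts rather than the actual MarketScan cohort:

```python
# Minimal sketch: a sunburst of treatment patterns, where the innermost ring is the
# first-line therapy and each outer ring is the next line. Counts are invented.
import pandas as pd
import plotly.express as px

pathways = pd.DataFrame({
    "first_line":  ["SSRI", "SSRI", "SNRI", "No pharmacotherapy"],
    "second_line": ["SNRI", "Augmentation", "SSRI", "No pharmacotherapy"],
    "patients":    [420, 180, 95, 310],
})

fig = px.sunburst(pathways, path=["first_line", "second_line"], values="patients")
fig.show()
```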

But the study showed that while general trends across these populations were relatively similar, with some important differences, patients who were covered by Medicaid tended to have treatment patterns that were different from those of the other three groups. More than half of those patients were untreated, and first-line SSRI use was lower. So, it represents a population with a high burden of disease, but it appears that they’re getting different care when it comes to depression compared with other patient populations. And you could slice this by race, ethnicity, or geography to really provide those insights. On the next slide, I just wanted to share another visual about how patient similarity

networks are being used for precision medicine. The goal is leveraging all data sources, including rich genomics and biologically interpretable -omics data, and this all requires computational methods that can support these heterogeneous data and improve predictive performance. So, those were just a few examples. The next slide talks about current applications that hospitals, health systems, and public health are all using to some extent. These are various applications of both AI and machine learning. There are applications in preclinical research,

in population health and public health for tracking epidemics, and in clinical pathways for informing treatment protocols. AI is also being widely used in the interpretation of medical images—for example, in diabetic retinopathy and screening mammography. There are patient-facing applications, such as virtual intelligent agents, and it’s also being used to optimize health care and cancer care delivery processes, such as the procurement of cancer drugs, making sure chemotherapy drugs are delivered to the right place at the right time, and logistics like that. So, really widely used applications. Now I want to transition and talk a little bit about the opportunities for social and behavioral determinants of health. We know that there are multiple determinants that shape our health—environmental, economic, education—and these socioeconomic and demographic characteristics have a complex and integrated relationship with patient risk, vulnerabilities, and health outcomes. These social determinants also have a greater influence on a patient’s health, regardless of age, gender, or ethnicity. So, we have a huge opportunity here to think about social determinants of health data.

Next slide. But we also know that these relationships interact across complex and dynamic pathways to produce the health outcomes and disparities that we see at the population level. In most instances, exposures at the environmental or neighborhood level may have a greater influence on population health than individual vulnerabilities, although we know that, at the individual level, there may be personal characteristics, including genetic predisposition, that interact with the environment to produce disease. But as a result of this complexity,

there are limitations, as I mentioned earlier, to our traditional epidemiological and statistical adjustment models in handling the computational complexity of these multiple social, demographic, and environmental interactions that are involved in disease risk, in progression, and in maintaining health and wellness and clinical outcomes. Next slide. We also know place matters. We know that where we live can determine how well we live, and it’s a significant factor in healthy life and healthy life expectancy. We know food insecurity is a risk factor, and so having knowledge about place, and about community social capital as well, is also important. On the next slide, I just wanted to share some of the work we’ve done around leveraging large social determinants of health data for hospital systems. I picked this one, which may be relevant: this was a study that we conducted to understand the influence of population-level demographics and social determinants of health on mortality from COVID-19. We introduced predictive algorithmic modeling and machine learning approaches to study the interactions of those complex, multiple determinants of health at the population level.

So, on the next slide, there were several phases of the methodology. These included identifying all the publicly available data and features for the study, looking at select variables that had complete data across all counties (we were working at the county level); classification and selection of relevant variables that had strong associations with mortality; and then a correlation analysis of the variables within the county clusters from an algorithmic clustering process, because we wanted to group counties with similar geographic, demographic, and health-prevalence status, compare their COVID-19 mortality, and see which social determinants of health had the highest correlation. Next slide. As part of the methods, we looked at a range of social, economic, environmental, disease-risk-prevalence, and health care determinants that are known to influence susceptibility to disease and health outcomes. These came from U.S. Census population estimates, the Area Deprivation Index, and the CDC Social Vulnerability Index, and COVID-19-related death counts were retrieved from the CDC’s data reporting from the beginning of the pandemic.
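A minimal sketch of this kind of clustering-plus-correlation step, with synthetic county-level data and hypothetical column names standing in for the Census, Area Deprivation Index, and Social Vulnerability Index variables:

```python
# Minimal sketch: cluster counties on standardized social-determinant features, then
# rank which feature correlates most with COVID-19 mortality within each cluster.
# The synthetic data and column names are placeholders for the real county-level sources.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 300
counties = pd.DataFrame({
    "median_household_income": rng.normal(55_000, 12_000, n),
    "residential_segregation_index": rng.uniform(0, 1, n),
    "population_density": rng.lognormal(4, 1, n),
    "preventable_hospitalization_rate": rng.normal(50, 15, n),
})
counties["covid19_mortality_rate"] = (
    0.5 * counties["residential_segregation_index"]
    - 0.00001 * counties["median_household_income"]
    + rng.normal(0, 0.2, n)
)

feature_cols = [c for c in counties.columns if c != "covid19_mortality_rate"]
X = StandardScaler().fit_transform(counties[feature_cols])
counties["cluster"] = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

# Within each cluster, rank features by correlation with the mortality rate.
for cluster_id, group in counties.groupby("cluster"):
    corr = group[feature_cols].corrwith(group["covid19_mortality_rate"])
    print(f"cluster {cluster_id}: strongest feature =",
          corr.abs().sort_values(ascending=False).index[0])
```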

We initially looked at the accelerated phase of the pandemic to see which social determinants of health were most strongly influencing mortality. So, the next slide is actually a map of the U.S. showing the clusters, that is, which county clusters are similar. Twenty-eight of the 34 variables we looked at were used as features for clustering. And based on those, we did

a selection step to help identify the most variant features that impacted mortality out of all the variables, and the features are listed here. Based on these features, we could see six optimal, demographically and socioeconomically distinct county-level clusters. The next slide (sorry if it’s a lot of text) just demonstrates that there were six clusters. For example, cluster 4 had 356 counties, and most of these were in the Southern Black Belt, the North Slope counties in Alaska, and the Pine Ridge and Rosebud Reservations; cluster 5 was composed of about 223 counties that included New York and Queens. And you can see the prominent social features, which included residential segregation, preventable hospitalization rates, median household income, and home ownership rates; these prominent sociodemographic features had a strong positive relationship with COVID-19 mortality. Next slide. This just illustrates the mortality comparison

across those distinct county clusters. And you can see here that cluster 4 and cluster 5 had significantly higher rates of COVID-19 mortality compared with the other clusters. In the next slide, these ranked-variable plot figures illustrate the key sociodemographic and related determinants in terms of the strength of their associations, either positive or negative, with COVID-19 mortality rates. Of course,

we know population density is consistent across all clusters, but when you look at cluster 4, which includes the Southern Black Belt, the Black or African American population share, HIV prevalence, and the employment rate were significantly more correlated with mortality than population density was, which was an eye-opener, and not surprising, but this is what we found. And in the next slide, you can see a heat map showing the associations and correlations. Again, each county cluster showed distinct associations with various demographic or social risk factors. But I do want to point out that, in this clustering, correlation is not causation; correlation merely quantifies the strength of a linear relationship between two variables. So, a weak correlation or lack of correlation with COVID-19 mortality may not necessarily mean that the selected factor is not related to the mortality rate.
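A heat map like the one described can be drawn directly from a cluster-by-feature correlation table; here is a minimal sketch with seaborn, using invented correlation values rather than the study’s results:

```python
# Minimal sketch: heat map of feature-to-mortality correlations by county cluster.
# The correlation values and labels here are invented for illustration.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

corr_by_cluster = pd.DataFrame(
    {
        "cluster_1": [0.12, -0.35, 0.08],
        "cluster_4": [0.61, -0.28, 0.15],
        "cluster_5": [0.44, -0.40, 0.52],
    },
    index=["residential_segregation", "median_household_income", "hiv_prevalence"],
)

sns.heatmap(corr_by_cluster, annot=True, cmap="coolwarm", center=0)
plt.title("Correlation with COVID-19 mortality (illustrative values)")
plt.tight_layout()
plt.show()
```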

There might be a nonlinear relationship that requires further investigation, but at least you could identify the factors that needed intervention or investment. So, next slide: basically, you can see what machine learning algorithms can help capture. Using this combination, we can cluster counties by similar demographics, social determinants of health data, and disease prevalence and risk, and, as I said, this cannot be done with traditional epidemiological modeling, because that is hypothesis and theory driven, whereas predictive modeling is much more data driven. Before I conclude, on the next slide I want to spend some time on AI bias mitigation in our efforts, because I know this is where there are really several opportunities for social

and behavioral health determinants in this domain. We often talk about bias, but we look mostly at algorithmic bias. AI bias is a general concept that refers to the fact that an AI system has been designed, whether intentionally or not, in a way that may make that system’s decisions unfair. Sometimes we talk about labeling, modeling, and measurement bias, or missingness, but what I wanted to share, on the next slide, are the “Five E’s,” broad aspects of bias that can be linked with AI, and how we can understand that bias is everywhere, that we’re all part of addressing bias, and how we can really help to mitigate it.
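One small, concrete way to begin surfacing such bias is to audit a trained model’s performance by subgroup. The sketch below assumes you already have predictions, true labels, and a group attribute; the data and names are hypothetical:

```python
# Minimal sketch of a subgroup audit: compare a model's true positive rate across groups.
# The data and group labels are hypothetical; real audits examine many more metrics.
import pandas as pd

results = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 1, 1, 0, 1],
    "y_pred": [1, 0, 0, 0, 1, 0, 0],
})

# True positive rate per group: among truly positive cases, how often the model says 1.
tpr_by_group = results[results["y_true"] == 1].groupby("group")["y_pred"].mean()
print(tpr_by_group)   # a large gap is a prompt to inspect data, labels, and features
```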

As I explained earlier, the data sources for informing clinical evidence, for developing guidelines, for building and training algorithms, and for clinical decisions largely come from research. They largely come from research trials, from EHR data, and from administrative claims data. And we all know that data never speaks for itself and that there’s always a human being deciding how data is funded, how data is generated, what’s collected, and how the evidence is translated into practice. There’s always a human being deciding, interpreting, and prioritizing data. So, bias introduced into data and subsequently into

health care, management, or decision-making is really broader than just the algorithms. One of the aspects I talk about, the first one, is an evidence bias, right, in the research we search for, and a translation bias. You know, clinical decisions are tied to

clinical trials, rigorous scientific randomized controlled trials, or real-world evidence data. I think we need to look at inclusivity and diversity in our research, and we know that the current state of our clinical trials and scientific studies may not always match the demographics of the patient populations who are at risk or suffer from the disease condition. So, promoting diversity, equity, inclusion, and standards in our science, especially around behavioral and social determinants, is really key. The next is experience, right? The health care provider’s experience is an integral part of translating patient data into improved health outcomes. I always think about provider bias as the elephant in the room, because a patient may arrive at a medical facility with symptoms, and it’s the primary care provider or clinician who will determine the course of action.

Their action or inaction, based on an examination of the patient, listening to their story, and capturing their beliefs, is what’s translated into the EHR data, and so that’s really important. Exclusion is the third “E,” and it’s critical: missingness. Environment is the fourth one; this is about life-course exposures and environmental determinants that need to be integrated. And the fifth aspect is data empathy. This refers to how much empathy,

how much patient values, preferences, or patient-reported outcomes are integrated into our decision-making. These are five aspects of bias that we need to think about, because they feed into the algorithms; they feed into clinical decision-making. So, I want to conclude with the human part of AI. In health care, I think, empathy is a reflection of our compassionate care, a reflection of patient-centered care that takes into account the patient’s perspective and circumstances. In order to provide precision and patient-centered care, we need to have an understanding of why different belief systems, cultural biases, language, family structures, and a host of other culturally determined factors influence the manner in which people experience illness, why they adhere to medical advice or not, and how they respond to treatment.

And these differences are real and translate into real differences in outcomes and care. So, without humanity, our AI tools and solutions can exaggerate existing racial inequities and other forms of bias. I want to mention that our humanity, our empathy, and our compassion are really important aspects of patient-centered care and important aspects of health equity. And that’s critical, because it all feeds into the data that’s used for these AI models and algorithms. My final slide is just an extension of that. I think we have an opportunity in the social and behavioral science space to really look at our data—our data sources are currently siloed—and to improve and capture the relevant social and behavioral factors that influence health. Machines have endless capacity to study and identify patterns, streamline data, and predict patterns, but I think the interaction between us, researchers and health professionals, remains critical, because we can provide that empathy and that care for all our patients. So,

people are the recipients of care and of health, and so they really are at the center of health care. Our high-tech and AI solutions are part of the solution and really function in the service of humanity, rather than the other way around. So, thank you so much for your time and for listening to this presentation. I’m happy to take any questions that you have. Thank you. DR. BETH JAWORSKI:

Thank you so much, Dr. Dankwa-Mullan. That was a phenomenal talk. I really enjoyed it. We had over 300 attendees who were also able to listen in. We have a number of questions. I don’t think that we’ll be able to get to all of them, but in the remaining time, I would love to be able to ask you just a few. One of them being: Over the past 6 years, you’ve been working at the intersection of health, health equity, and technology. How would you say that working at a technology company has added to your already deep knowledge of health care and public health? DR. IRENE DANKWA-MULLAN: Great question. Thank you. I feel extremely lucky to be working at this intersection, to help make health care more efficient and to leverage

these data to surface more intelligent insights. And I do work with a multidisciplinary team, but in terms of insights, I think technology and data—big data—have tremendous potential. I don’t think we’re using it to the level of its potential. There is so much promise and so much potential to be realized. The health care sector has generated huge amounts of health data, driven by accumulated biomedical research data, public health data, and hospital data. They’re all

meaningful, but they’re too large and too complex to be handled by a traditional software system. And so, we really need to get this as a foundation, right, across all of our partners and stakeholders, to bring such technology and advanced analytics to the forefront so that we’re able to sufficiently address the complexity of health care. So, I think our science needs to advance in this space. We’ve reached the limits of our human capacity with manual tasks and with analytics, and we have emerging data now that is too numerous to compute manually. I’m hoping to see more done in this space for the future of health care. DR. BETH JAWORSKI: Thank you. Thank you so much. I think a related question that also touches on some

of the questions that were posed by the audience is: How do you work to ensure that diverse perspectives and underrepresented communities and their data are included in conversations around AI and ML technology innovations? There were a number of questions submitted about that issue of data “missingness” and how it impacts everything across the pipeline of the work that you’re doing. Would you be able to say just a little bit about that? DR. IRENE DANKWA-MULLAN: Yes, that’s a great question, and I do agree, it’s a huge challenge. It’s an issue where we really need to make concerted efforts and a commitment to work with communities, work with the leaders in those communities, bring them to the table, make a conscious effort to be inclusive of their perspectives, and respect their data. I mean, part of AI ethics is that the data is not for the technology company; it’s for the owners or those who generate it. And we need some standards or principles around how we can work with them on their data. And so, I think we need to start

building relationships on trust and transparency, listening, and understanding their needs and their values. There’s a whole lot of trust-building that needs to be done, but it’s really about including them at the table in a participatory manner and working so that the technologies benefit them. I think that’s what we need to really work on: including them in conversations. I know that at NIH, there is a lot being done around research and diversity and including communities. We have a long way to go, but at least we’re making headway with that.

DR. BETH JAWORSKI: And a follow-up question to what you just said, related to some questions that came in and to something you mentioned early on in your talk with respect to language and, I think, the power of language. I’m wondering if you could say a little bit about some of the strategies or the ways that you’ve been able to actually change the language that’s used. I know you referenced some language earlier in software development that used to be very commonly used, and it sounds like you had some success in being able to have folks stop using that discriminatory language. I’m wondering if you could speak a little bit to strategies for changing language and, I think, in turn, building trust among… DR. IRENE DANKWA-MULLAN: Yeah. Thank you for that question. I mean, it’s an ongoing process. Sometimes people may not realize

certain language will be offensive, so there’s a lot of education and training, and you look at the terms that we’re using and ask, “Why is this being used?” and what the meaning is around it. And so, we started—you know, this was at IBM—to collect a huge database of terms used in technology that may not have been appropriate or right. There may be a group of people who already knew these were offensive, not inclusive, or stereotypes, right? So, there’s a whole lot of language out there, but you start with education and with building that database. And then you look at these words and say, “Maybe we could replace this with something more inclusive.” You know, instead of “master” and “slave,” there are other terms that can be used. And so, I think it can be done with collaboration. DR. BETH JAWORSKI: Thank you. And I think we may have time for one more question.

I would love to end with a question about advice. I know that this is a very popular area. Certainly, the pandemic has ushered in a wave of digital health technologies. What advice would you give to researchers—particularly folks who’ve been trained in the social and behavioral sciences—at various career stages who are interested in pursuing a career that’s specifically at the intersection of AI and ML and health equity? What advice would you have for folks who would like to pursue this career path? DR. IRENE DANKWA-MULLAN: Yes, I think it’s great. I always say that every training curriculum needs to have AI and machine learning or data science as a foundation, as part of the core requirements, because it’s really important. The field is becoming more multidisciplinary, so it’s not just math or computer engineering or software or biomedical engineering students who are going into these fields. Our team has included anthropologists and epidemiologists,

public health experts. And there are psychologists and linguists going into the field. So, I will say, take courses, take classes; it’s very important in your science career to look for some of these classes. There are Coursera classes, and IBM has AI and machine learning foundational courses that you could use. It’s really a discipline that will benefit from rich perspectives, and there are really excellent career opportunities, especially in this space. DR. BETH JAWORSKI: Fantastic.

DR. IRENE DANKWA-MULLAN: So, I would say go for it. You know, don’t be afraid. Everyone needs to take a class in this, and I will be cheering them on. We need more early investigators and researchers in this space. DR. BETH JAWORSKI: Wonderful. Thank you so much, Dr. Dankwa-Mullan, for that fantastic presentation. Dr. Hunter, I would like to turn it back over to you for closing remarks.

DR. CHRISTINE HUNTER: Thank you. So, thank you, Dr. Jaworski, for moderating the question-and-answer session, and a big thank you to you, Dr. Dankwa-Mullan, for your excellent presentation. And also, thank you to the large crowd that joined us online today. If you have colleagues who were unable to join and may be interested in this topic, please remind them that a recording of today’s webinar will be available in about 1 month on the OBSSR website at obssr.od.nih.gov. The next OBSSR Director’s Webinar will be held on September 27 at 2 p.m. Eastern and will feature Dr. Emily Falk, who will present on how health messages

can affect behaviors, such as alcohol and tobacco  use. Please also subscribe to OBSSR’s listserv to   receive updates on upcoming events. Again, you  can sign up at obssr.od.nih.gov. And with that,   we conclude today’s webinar. Thank you for  attending, and I hope you all have a great day.

2022-08-16
