AI/ML Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD)

DR. LAWRENCE TABAK: Well, good afternoon, everybody, and thank you for joining us today. It’s my great privilege to kick off this stakeholder forum for the Artificial Intelligence/Machine Learning (AI/ML) Consortium to Advance Health Equity and Researcher Diversity: the AIM-AHEAD Initiative. We look forward to hearing from all of you in today’s highly interactive meeting, which will provide valuable feedback as we develop this initiative to use AI/ML to address disparities in health and in the workforce.

Now, let me give you a quick background of this initiative. The idea for this consortium was born from the many needs and opportunities identified by a second Advisory Committee to the NIH Director Working Group on AI and ML. Now, as we’ve all seen over the past decade, there’s potentially transformative opportunity in using AI in a variety of biomedical research and clinical applications, and in just the last year, with acknowledged disparities in health outcomes and research highlighted by the COVID-19 pandemic, even more avenues have opened for artificial intelligence to have a tremendous impact. The Advisory Committee to the NIH Director group identified several unique opportunities for NIH to apply resources in a very practical way to address these needs, starting with supporting widespread use and deployment of AI/ML capabilities with an initial focus on electronic health records; however, to meet the full potential of artificial intelligence, we cannot do business as usual. To meet the needs of underserved and often marginalized communities, we must take a different approach when designing and implementing the AIM-AHEAD Initiative.

Now, towards that aim, it has been my privilege to help coordinate the development of this initiative with three extraordinary co-leads: Dr. Dina Paltoo, the assistant director, Scientific Strategy and Innovation, in the Immediate Office of the Director of NHLBI; Dr. Nicole Redmond, medical officer in the Clinical Applications and Prevention Branch, Prevention and Population Sciences Program, Division of Cardiovascular Sciences, within the NHLBI; and then last, Dr. Laura Biven, Data Science Technical Lead in the Office of Data Science Strategy located in the Office of the Director at NIH. Together with a team of about 25 incredibly enthusiastic staff from almost every Institute and Center at NIH, we are all united in our dedication to this effort.

Now, the purpose of this initiative is to address the needs of marginalized communities through three main goals: first, to enhance inclusion of groups underrepresented in the AI/ML research workforce; second, to increase the capacity for AI/ML research through training and building of infrastructure in institutions that serve underserved communities; and then third, to support development of AI/ML models in research, including questions that address biases and disparities, develop representative predictive models, and incorporate community-engaged research. These efforts will start by focusing on the significant errors, gaps, and racial and gender inequities in electronic health records, because using problematic data for models amplifies disparities. The foundation that the models are being built on must be addressed, and community-enabled real-world efforts are needed to expand the reach and transformative abilities of these technologies for all. And we can’t go to tertiary and quaternary care centers and engage at the community level.

We need to work with different groups of people who heretofore have been left out. We cannot continue to let these communities remain as afterthoughts, or else the same inequities and disparities in health outcomes will be perpetuated. For AIM-AHEAD to be successful, we need robust collaboration and strategic partnerships. For this meeting, the Request for Information, and the upcoming Research Opportunity Announcement, our goal will be to identify partners in the artificial intelligence industry, computing partners, academic partners, and nontraditional partners, all of whom are essential, and particularly those nontraditional partners at the community level who serve the underserved individuals and people who are marginalized in our society. We hope this consortium will build a very strong network of different groups united in this effort to expand the capacity of AI and ML for all and use it to address the challenges of health disparities, health inequality, and minority health. In addition, if implemented with community insight and cooperation, the consortium can complement and synergize with other ongoing NIH activities, like the NIH UNITE Initiative, the Common Fund Bridge2AI Program that was recently announced, and the many ongoing AI/ML activities throughout the Institutes and Centers at the agency.

Together we have gathered roughly 600 participants from across the United States who are experts in different fields and come from different institutions and different sectors. They come from academia; from federal, state, and local governments; from industry; and from health care systems and centers, all coming with their own unique perspectives. They all come today with a common goal of leveraging AI/ML for research, expanding AI/ML capacity, and mitigating disparities in health and in the AI/ML workforce. This initiative will require participation and input from all of you in order to be successful, and so we’ve designed today’s meeting to be very interactive.

We’ll begin with a brief overview of the initiative from Dr. Paltoo, who is one of the AIM-AHEAD co-leads. Then we will hear from Dr. Irene Dankwa-Mullan, deputy chief health officer and chief health equity officer of IBM Watson Health, who will discuss the potential of AI/ML for health disparities research. And following that, we will have remarks on enhancing the diversity of the AI/ML workforce by Dr. Talitha Washington, director of the Data Science Initiative at the Atlanta University Center Consortium. Then it will be time for all of you to shine by letting us know your thoughts and your ideas during the breakout groups, where we will have a more manageable number to collect feedback, and then we will reconvene for a report-out in a large group discussion before closing out the meeting. Please give us your honest opinion and insight. We welcome your voices in helping us to use this initiative to break the status quo, to inspire real change, and to lead us to tangible progress. And with that, it is my great pleasure to pass it on to Dr. Paltoo for an overview of the initiative. Dr. Paltoo.

DR. DINA PALTOO: Thank you, Dr. Tabak. I’m very happy to be here with all of you today to really hear your feedback. The co-leads of this initiative are very excited, and we are just anticipating your feedback and thoughts on how we can broaden the benefit of AI/ML technologies to reduce health inequities and enhance diversity of the AI/ML workforce.

As you know, there’s been a rapid increase in the volume of data that’s generated through EHRs, as well as other biomedical research, such as -omics or social determinants of health, and this presents exciting opportunities for developing and applying data science approaches, such as AI/ML, to improve research in health care, but we also recognize that there are many challenges that hinder the widespread use of AI/ML. Without more diversity of both data and researchers, we run the risk of perpetuating harmful biases in practice, algorithms, applications, as well as outcomes. Biomedical studies and data sets often lack diverse representation, and those that do can lead to an inadequate understanding of continued health disparities and inequities. AI/ML capabilities can also be costly, difficult to learn and understand, and time consuming, and many communities actually have the potential to contribute data, diverse recruitment, as well as cutting-edge science and technologies, but they may lack the financial, infrastructural, and training support. Also, with EHR data, we know that we can begin to build capacity and know-how in using these data, but we also need other data to complement the EHR data. We need social determinants of health, genomic data, imaging, and other data types, and we can put all of this together to understand disparities and inequities.

NIH is committed to leveraging the potential of AI/ML to accelerate the pace of biomedical innovation but also to prioritizing and addressing health disparities and inequities. NIH envisions a multiyear program that will begin this fiscal year with our appropriated funds, and this program will foster and support mutually beneficial partnerships that form a network of highly diverse institutions and organizations to build capacity and capability to contribute data (diverse data), to leverage AI/ML infrastructure, training, and know-how, but also to use the technologies and these data to conduct research that’s most important to those who would be a part of the consortium but that’s also aligned with the NIH mission.

AIM-AHEAD is an innovative and transdisciplinary framework that transcends scientific and organizational silos. It’s being put in place to tackle the complex drivers of health disparities and inequities. Through this initiative, we hope to establish these mutually beneficial partnerships that can increase participation from underrepresented researchers and communities to enhance the capabilities of AI/ML for these communities. We want to build capacity and capability through a coordinated data and computing infrastructure, through training, and also through access to resources. We also aim to support research questions that can use EHR data and connect these data with other data to address biases and lack of data, develop predictive models, and incorporate community-engaged research. AIM-AHEAD is really being put in place to redress the challenges of health disparities and health inequities using novel technologies, such as AI/ML. We envision four components of AIM-AHEAD: partnerships, research, infrastructure, and data science training.

Through partnerships, we envision regional, multidisciplinary partnerships to form a network of networks. These partnerships will establish trusted collaborations with shared goals in research or training, and they will integrate data science, community engagement, and clinical research. Through multidisciplinary research partnerships, NIH envisions that these partnerships will build data sets but will also leverage existing data and resources to use AI/ML and develop novel algorithms and approaches. Data sets (such as EHR data, imaging, -omics, social determinants of health, and even other data types) and other resources can be leveraged, as Dr. Tabak mentioned: Bridge2AI, other initiatives, and other data sets and platforms that we have. These research questions will help to ensure equitable access and sharing of best practices across a consortium to redress health disparities and advance health equity. Potential research areas are vast, but they can be used to improve health care, prevention, diagnoses, and treatment. Examples of potential research areas include detection and mitigation of biases, understanding the criteria for AI/ML success, the role and impact of social determinants of health and other factors, metrics to measure health disparities and inequities, and also predictive models that can be applied to prevent, treat, and implement health care strategies.

For infrastructure, we envision a coordinated, federated data network, which could enhance data interoperability and data sharing, while also understanding that we want the consortium members to locally maintain the data, to govern it, and to also be able to share it. Potential infrastructure components include data management and a repository, clinical informatics, technical assistance, as well as project management.

For data science training, we envision facilitating the integration of underrepresented researchers across the career pipeline. We hope that this initiative will help to build on existing knowledge of each stakeholder to pivot to new areas, such that data scientists can learn from health disparities researchers and vice versa. We hope to develop a community engagement program that can invigorate the pipeline of diverse data scientist trainees, and we also envision our training component to target critical skills and capabilities in data preparation, data management, AI/ML, and health disparities research.

So, what you see here is just a potential organizational structure of AIM-AHEAD; we do not have anything set in stone, but we did want to give all of you something to react to. On the left, you’ll see example partners. We can have partners that are forming from a minority-serving institution, clinical research networks, practice-based research networks, community-engaged networks, or even data science or information networks. They can form these partnerships under a coordinating center or coordinated partnership for additional partnerships, research infrastructure, and training. On the right, we can see that we have other data science programs and other efforts that are not funded by NIH but that could be funded by NIH that can join in with this group to then form a larger AI/ML research collaborative.

We also envision the funding for AIM-AHEAD to come through the Other Transaction Authority mechanism, and the overall goal of a consortium is to establish a sustainable infrastructure to leverage partnerships, data, and resources for future research. We’re thinking of, for Phase 1, a consortium or coordinating center/partnership that could hold additional…and seek additional stakeholder engagement; think about organization of the consortium; and also identify the needs and priorities, such as research, resources, pilot studies, and use cases. And for future phases, we’re looking at additional pilot studies and use cases, establishment of additional components of the consortium, as well as additional health disparities and inequities research that’s of interest to the consortium members.

Dr. Tabak mentioned that the concept was approved
by the NIH Advisory Committee to the Director, or ACD, and after that, after all of the deliberations by the co-leads and NIH staff, we have released our RFI, or Request for Information, so we’re also hoping that you provide your comments through that venue, as well. The comments are due July 9. Here we are today with our stakeholder forum, so as part of our stakeholder engagement process, we want to hear from you through this forum, as well. We aim to release a Research Opportunity Announcement later this summer through the Other Transaction Authority mechanism and fund the awards by the end of the year, but we did want to point out that there will be ongoing stakeholder engagement throughout this program.

Ways that you can provide feedback as of right now are to reply to the Request for Information, as well as to provide us your feedback through the breakout groups during the stakeholder forum. For the latest news on AIM-AHEAD, please visit the NIH data science website, and you can also post general questions to the AIM-AHEAD email address or questions specific to the RFI to the RFI email address.

At this time, I’d like to thank all of the NIH staff on the AIM-AHEAD team and also volunteers who have helped to put this together and who are also volunteering to help today with this stakeholder forum. But we really want to hear from you; that’s the purpose of the stakeholder engagement effort and the RFI. It’s very important for us to hear your thoughts, your feedback, and how we can shape this initiative and consortium moving forward. We take all of your comments very seriously and really look forward to hearing your thoughts.

It is now my pleasure to introduce Dr. Dankwa-Mullan. She is the chief health equity officer and deputy chief health officer for IBM Watson Health. She also leads initiatives to ensure that diverse perspectives in underrepresented communities are included in conversations for ensuring AI technologies and innovation. She has over 2 decades of experience in clinical research, public health, health disparities, and population health, and she joined Watson Health about 8 years ago after being at the National Institute on Minority Health and Health Disparities at NIH. It’s my pleasure to introduce you to Dr. Dankwa-Mullan.

DR. IRENE DANKWA-MULLAN: Thank you for the introduction. Okay. So, thank you again for the kind introduction and for inviting me to speak today. Are you seeing my screen? I just want to check to make sure.

DR. DINA PALTOO: Yes, we can see you.

DR. IRENE DANKWA-MULLAN: Okay, and I’m really pleased to be here. It’s an honor to be here with you virtually. So, IBM Watson Health is part of IBM Corporation, and we are a 108-year-old company helping to lead the transformation of health around the world by applying the next-generation technologies: AI and machine learning and Hybrid Cloud. We work with health professionals and researchers, academic institutions, and other organizations around the world to translate data and knowledge into insights for more informed decisions about health and health care, so it’s all about augmented intelligence for human expertise in AI. Research informs the evidence base upon which new and emerging technologies, including AI and machine learning, are used, and that evidence base can support health decision making and inform policy and policy resources. But we may not realize that data really drives the AI and machine learning technologies and can tremendously impact the scientific progress we want to pursue because that data, as I mentioned, is built on evidence and provides insights for decision making. So, there’s never been a more important and [inaudible] moment than now to advance our sciences better by really centering health equity, centering social justice, and ethical values, like fairness, transparency, and trust in our AI and machine learning. So, if you’re able to take away any message from this presentation, it’s that there’s really a huge potential for technologies to improve efficiency in our research efforts; augment our clinical decision making; improve the quality, safety, and health care of medicine; and promote health equity.

But it also has the potential to worsen existing disparities in health without the adequate methods, approaches, or frameworks in place, which also includes a real thoughtful, deliberate, and inclusive process. So, why is that? Because we know that the determinants of health and the process of diseases are complex and depend on multiple biological, environmental, social, and economic factors, and so our efforts to bring technology and these advanced analytics and machine learning to the forefront of science should be able to sufficiently address this complexity.

There’s a lot of potential, so much, in machine learning and artificial intelligence, and I am sharing on this slide some of the current and potential applications. So, for example, in preclinical research, these technologies are informing drug discovery and genomic medicine to advance the science of precision medicine. AI/machine learning insights on data can also inform clinical pathways that include predictions and identify high-risk populations or stratifying, for example, for triage in care. In public health and population health, AI/machine learning methods can be used to help mitigate epidemics and pandemics, such as COVID-19, and also understand transmission of communicable and chronic disease, but I would really like to share also a couple of potential, critical opportunities for research.

The first opportunity is to actually identify explicit data biases. We all know that data in general, but in particular health data, is subject in multiple ways to bias, and the increasing quantity and types of data that are available today make it hard to identify where the bias can emerge. The algorithms and models that are developed rely on the data, including our research data, and so if the data is not complete or is missing, it impacts our results and outcomes and can have really negative consequences. And as you can see, our biases can be exclusive or inclusive, but overall, bias is all our responsibility. We are all accountable for these biases, and we have a responsibility to promote progress on health disparities research and identify new measures, to set benchmarks or standards that will reduce bias in AI and machine learning.

But it’s important to note that the outcomes are most critical, right? The outcomes are not about the methods or algorithmic performance. We all know that the best outcomes may not be about having the most robust methods, because all methods may have limitations or inherent flaws. It’s about the research questions that are asked. It’s about the diversity of the team. What really matters also is your social mission: what we scientists, all of us, are aiming to achieve in the service of a social mission; your ethics, your values.

So, we also have an opportunity to reevaluate our AI design tool technologies and the developmental cycle. The developmental cycle of any AI application or machine learning should really employ a strategic approach that considers health equity and ethical principles in managing the data, the model building, and streamlining the deployment from conception to implementation. And so, important considerations include working particularly with communities to inform and develop a framework for ethical and equitable machine learning technologies. We need to respect the dignity of all individuals that will [inaudible] in research. We need to connect with each other sincerely, openly, and inclusively in this research enterprise; protect priorities of social values, racial justice, social justice, and public interest; and above all, really promote racial equity and social justice as a moral and ethical imperative in our research.

So, in conclusion, I think there’s a huge potential for AI and machine learning to inform and advance study science and really drive better interventions in our programs, our policies, and practice, ultimately to improve the health of socially disadvantaged populations. And in this era of COVID-19 and beyond, it really should be a necessary extension of our efforts around big data, AI/machine learning, and digital transformation that we are hoping really integrates a complete etiological context of each patient’s health. The research that you all conduct today or what you do today in this era, I would say, in terms of contributions, is going to be critically important because in the year 2025, the world or our society is going to be building on that basic knowledge from those abstract ideas and those discoveries that you and all of us come up with today. We should know that the innovations and science advances that we take for granted in the world right now were really knowledge and ideas that came up in the 1990s and 2000s. That’s the foundational substrate that we’re exploring today, whether, you know, it’s the genetic engineering, genomic sequencing, or understanding the actual influence or determinants of health and disease mechanism. So, most of these, without question, are really based on ideas from past years.

So, our vision for AI and machine learning in health disparities research really should be merged with a desire for social justice to make this world a better place for everyone, to be inclusive in our technologies, and to promote health equity and racial equity. And I know there are a lot of you involved in advocating for change, and for those of you who are already involved, I really want to applaud and congratulate you, and to say a huge congratulations to the AIM-AHEAD team and to everyone who is involved in this very important initiative. Thank you.

DR. NICOLE REDMOND: Thank you so much for your remarks, and now we’ll transition to our next speaker, Dr. Talitha Washington, the director of the Data Science Initiative of the Atlanta University Center Consortium.

DR. TALITHA WASHINGTON: Thank you, and thank you for the invitation to be here. As she said, my name is Talitha Washington. I’m the director of the Atlanta University Center Data Science Initiative, and I’m going to just tell a little bit about the work that we’re doing, how we’re approaching health disparities through the AI/ML workforce lens. So, the Data Science Initiative works across four HBCUs. It works across Clark Atlanta University, Morehouse School of Medicine, Morehouse College, and Spelman College, and we also work with the Atlanta University Center Library.

So, one of the things when I was preparing this is…you know, the question of: Why should we even do this in the first place? What’s our rationale? Why should we even care about diversity when we’re talking about the AI/ML workforce? So, there was a report, an article back in January of 2019, that asked some key questions. How can diversity be accomplished? So, these are some of the things, I think, that we’ll be thinking about today. How can we bring about diversity? How could this knowledge become more widespread? And as we know, our culture really informs the health care that we provide, and so diversity in health care is definitely not underrated there. And also, if we look at pay in health care, minorities are paid on average 12 percent less than their White counterparts, so there’s some work to do on the pay disparities.

But really, when we think about health disparities, it’s the right to health; it’s just not solely a health care issue. This really is a civil rights and a racial justice issue, and artificial intelligence and machine learning are rapidly becoming powerful tools for addressing these concerns in both medical and health inequities. So, I’ll tell you a little bit about the work we’re doing here in Atlanta.

So, Clark Atlanta University has a Center for Cancer Research and Therapeutic Development, and so for this center, they do cancer research, and they say that the incidence of prostate cancer is 1.6 times higher, and the mortality rate is 2.5 times higher, in African American men than in Caucasian men, so there’s that health disparity difference. And even while Black men living in urban communities face many well-reported risks, those living in rural areas face similar risks and are those that are most often overlooked, including restricted access to quality health care. So, even though we’re in the year, what, 2021, we still have these access issues to health care. We still have these disparities in who gets done, so the Cancer Center is looking at and actively engaged in providing opportunities in developing research, training scientists in cancer research, and they also lead community outreach for prevention. Just to plant a seed, they’ll be hiring a director soon, so if you want to come to Atlanta, please reach out.

At Morehouse School of Medicine, we have a Center for Maternal Health Equity, and as we know, maternal health is very important. The pregnancy-related mortality ratios for Black women are about three times those for non-Hispanic White women, and that’s an issue; that’s a problem. I mean, it’s a problem for anyone, but when we see these disparities, we as Black women have a higher mortality ratio, and so we really are pushed, as part of our work and as part of the human nature of who we are, to really give back and develop these new strategies to understand these disparity ratios and also to develop new approaches.

So, to date there’s been very little training on artificial intelligence/machine learning among African American communities, so we really, in the Atlanta University Center Data Science Initiative, have a unique opportunity to intertwine data science with workforce training, looking at these health inequities and bringing to light some new ways where we can really move the needle, so to speak. So, as I said, we work across four HBCUs, or historically Black colleges and universities. Now, HBCUs are institutions established prior to 1964 whose principal mission was and is the education of Black Americans. And even though we make up only 3 percent of institutions of higher education and enroll 10 percent of all Black students, we actually disproportionately prepare a large number or a percentage of those who earn STEM degrees (18 percent are in STEM degrees), so we really have a unique niche in this HBCU space to really make a difference on how we can diversify the workforce. And so, there was a consensus report by the National Academy of Sciences that really talked about MSIs being America’s underutilized resource, so as I go along with the Data Science Initiative, I’m thinking about literal ways that we can leverage what we have, our expertise, so we can bring about some societal changes. I came on as director in August of 2020 and have been kind of hitting the ground running ever since. I guess I’m not new anymore (I’m coming up on a 1-year anniversary), and we’ve had a good time building this initiative.

So, with this initiative, we facilitate and coordinate data science research and activity across all the institutions, with an emphasis on actively engaging in increasing the number of African Americans with expertise in data science. We are HBCUs, and so a lot of our research and teaching really has an eye to address social justice for Black lives in all that we do. So, the goals of our initiative are to develop talent and also to create new knowledge, and the way we do that is through our curriculum, through our research, and also through engagement.

So, we’ve been busy, like I said. I’ve been here since August building up the program, developing a minor for undergraduate students, developing a course, Data and the African Diaspora, that is up and running now. We have a postbac program in open-source software development that’s happening now, funded by the Sloan Foundation, and programs for incoming students at Clark, so we’re in different facets of providing opportunities to engage, to develop students and also faculty in all things data science.

Just a little bit about how we do things. We do things in a collaborative fashion, so our data science minor wasn’t created by one; it was really created by a consortium, a cross-section of people from across the Atlanta University Center who came together and said: What is important for our students to learn? And they said math and stats, programming, modeling, data curation, ethics, and communication, and we’re looking for the minor to take place in January of 2022. To support the learning of our students and also our workshops and researchers, we have just recently launched in the Atlanta University Center a virtual computer lab that really provides a virtual format where people can engage and do data science work.

We also had a seminar series that occurred over the spring semester (the talks are posted on YouTube and our Facebook) that really was a way to engage with other folks on what’s going on in data science. So, for example, the topics included Dr. Monica Jackson talking about whether data can predict who gets COVID-19, data and the Black vote, and building degree programs in data science. So, it’s really a way of having different topics available to the broader general public to really stimulate thought there.

This past spring, we held our inaugural W.E.B. Du Bois Data Science Symposium. Du Bois was a professor at Atlanta University Center back in the day, and he’s also known as a data visual innovator. He told the story of Black folk in the United States through his data visualization work. In that same vein, we are also doing that with the Data Science Initiative, telling our story, bringing new light and bringing some innovations. We also funded mini-grant awards this summer, and we’re looking to spur research and curriculum in data science, so we’re really excited to see what will come of these awardees.

You know, our workforce isn’t just restricted to our students. It also is our faculty, building them and providing resources to them to think about data science in different ways, so we’re hosting a series of eight summer workshops (all of them full, it’s amazing) to really provide different avenues for the faculty, staff, and graduate students to come into data science, with the intent that that will then percolate into their research or into their classroom, into their training of their students. And this summer, we also have a pre-freshman summer experience for students coming in to explore data science, and as I said, we also have a postbac program on open-source software development. So, we’re looking at developing the AI/ML workforce in different ways and with a real, blatant aim to diversify, and so in our imagery, we really aim to show our students that this is them and they are the expectation to be the data scientist, to do machine learning, and all the rest of that.

We are also supporting summer research programs at Clark Atlanta and Morehouse School of Medicine. At Clark Atlanta, they have a promo summer virtual data science training, research, and education initiative that has two pillars. One is an undergraduate research program, and then another one is for faculty to engage in training in data science and also to do research and expand their research portfolio. At Morehouse School of Medicine, they have a postbac program, a health data science summer bridge program in biotechnology and bioentrepreneurship.

Morehouse School of Medicine—I have to say this to my colleagues—was both rated number one in their master’s program in biotechnology, and they’re really focused in making a lot of headway and having…and moving health data science here in the AUC. So, for the Data Science Initiative, we are led by the Council of Chief Academic Officers, and we work very closely with the faculty advisory board, which consists of members from across all of the institutions, and they advise us and make sure that we’re really moving in the direction that best serves the Atlanta University Center but also serves other entities and interacts with others, as well. So, my team…we’re a team of four, and my deputy director is Jerry Volcy, and my administrative director, Bettina Gardner, and Tommy Taylor is the communications specialist, and we work really hard…and we’re…as a team, we’re committed to advancing data science, artificial intelligence, machine learning, and cybersecurity techniques that address ethics and bias, with a focus on topics that impact Black America. So, I kind of want to end with this because at the beginning, the title was, you know, A Seat at the Table—A Seat at the Data Science Table. This is one of my favorite quotes from

Congresswoman Shirley Chisholm: “If they don’t give you a seat at the table, bring in a folding chair.” So, at the Atlanta University Center, we aim to really help push and get people to think about data science, AI/ML/DL, all the rest of that, in a way that resonates with all people so that no one group or demographic gets disenfranchised, whether it’s in health sciences or other areas. So, with that, thank you for letting me present today and share a little bit about what’s happening down here in Atlanta.

DR. NICOLE REDMOND: Thank you so much, Dr. Washington. So, with that, we’ll be transitioning to the next part of our program. My name, again, is Nicole Redmond. I’m a physician-scientist with the National Heart, Lung, and Blood Institute, and I’m a part of the AIM-AHEAD planning team, and so in your materials, you should have received a second link to log in to your assigned breakout topic. We had such a robust response to this forum, and we did ask your preferences, but there were a number that wanted the research group, and so you may have received the link to your second choice, but we anticipate that there will be great discussion during that time. So, you will leave this link, use your breakout room link, and then you’ll get split into smaller breakout groups. After about an hour of discussion, you’ll come back to the main breakout room to consolidate the feedback. We’ll be looking

for a volunteer from among you to serve as the reporter, and then you’ll come back here using the main link that you used to join this session for the report-back. I see that we’re running a few minutes behind, so I think we’ll actually add about 10 minutes to the start times, so we’ll get these started and look for you to consolidate your feedback at about 2:45, and then come back to this room at about 3:15. So, thank you, and we’ll see you in your breakouts.

DR. NICOLE REDMOND: Okay, it looks like the attendee number has stabilized, so I’m going to stop sharing, and we’re just going to have a round robin and have a representative from each of the three topics just give a very brief high-level report-out of the discussions. I want to emphasize that…I’ll turn on my camera here so you can see me. I’ll emphasize that, you know, we’re not expecting anyone to go through the full explanation and into details; that’s why we had our notetakers, who took a lot of notes, so that we can have a more comprehensive summary of the discussion that we’ll consolidate and have on our website, but we thought it would be good for everyone to have an opportunity to hear some of the very high-level themes and topics. So, first, we’ll hear from the infrastructure group, so Dr. Thomson, you are the lead for that group.

MR. ALASTAIR THOMSON: Hello. So, we have a lot of input, so I’m kind of synthesizing on the fly here. So, one of our first questions was: Don’t start building infrastructure

until you understand my concerns about it or my interest in it. There was a lot of input here, a lot around concerns about privacy and security and that those need to be dealt with before we get there. How do we share data in a timely manner, given privacy concerns around the transmission, storage, and analytics for it? There were comments that there’s some data that institutions are not going to be comfortable sharing widely and that sending compute to the data is going to be important, as is figuring out how to incorporate all of the data types (EMR, imaging, and everything else) into the clinical environment. There was a very interesting comment from Google, I think it was, about the quality of the EHR data in disadvantaged areas, where it is much lower quality because they don’t have the processes and things in place to clean it, which introduces biases. It’s easier for an AI researcher to go and grab data from better-equipped areas, and that’s introducing bias, and we need to be able to address that and resource the institutions to be able to get the data where it needs to be to be truly AI-ready. There were so many others. Everyone was concerned about the capacity. They specifically mentioned GPUs and Cloud capacity for that…that it needs a lot of capacity for complex machine learning model training and that they just don’t have access to that kind of capacity at reasonable costs, and we need to figure out those strategies. Comments about needing strategies before we start building infrastructure

to make sure that we reduce redundancy and foster collaborations. The collaboration theme came up pretty frequently, and the need for multidisciplinary collaboration, and that the expertise to deal with this does not exist in single teams, that we have to be able to bring people together across institutions, across disciplines in order for this to be effective. There’s a variety of examples they gave in terms of what they’re doing today, from predicting stroke in preterm infants to tracking COVID-19, but the comment was that, for probably all of them, data is rate-limiting. The EMR is there, but it needs mapping to a consistent data model, such as OMOP, the kind of thing that has been happening in N3C. Mark [indiscernible] mentioned their partnering with Children’s Hospital in a program with domain experts, designed to optimize the data sets. This partnering was a fairly common comment from the industry people, that they’re working with academia in order to enable them to do the things that they really want to do, and they’re bringing in a different kind of expertise. And so to the collaboration, it’s not just collaboration with academia; it’s collaboration with hospital systems, with industry…the [indiscernible] industry and others are going to be important. There was a lot of diversity in how much the various participants

understood about AI. We had some people who were deeply experienced with it from NVIDIA, from Google, from Microsoft, but there’s also a group of people who just heard about it, and I think our conclusion was that diversity of experience is going to be critical for this program to be successful, that we’ve got to be able to address things at all levels in order to make it effective and to really address equity. There were a lot of barriers around costs, access to GPUs, a lack of understanding of the tools. There was a comment about a lack of interest in the biases because it takes a lot of time and effort to address them, and when you’ve got data sets that are just available and don’t need much data preparation, you know, you can publish faster, so “I’m going to take the simplest approach possible” is an attitude that we’re going to have to address. Big, big concerns about getting the right expertise in the right place: the AI, the clinical, and getting it together. In terms of…there was one of the questions about adopting AI and moving to the Cloud: how long would it take you to build the capacity? Many of them said they’ve got some capacity now and that most institutions are heading in the direction of Cloud, but there were more things that needed to be done. There was discussion about a centralized, deidentified database for use by everyone, for those who are willing to share their data; the ability to access multiple modalities of data in one place was a theme that came through in all the groups; and in the same way, centralized code sharing and problem solving. That’s…one of the issues that

was observed was that, you know, models are produced, algorithms are developed, but then they don’t go anywhere because there’s nowhere to put them. You know, they might exist in GitHub somewhere, but there’s no centralized place to go and index them and find them, and that if we really want to address that, we have to deal with the code and the algorithms and things, as well. There were comments about tools that non-programmers can use…and this has been my experience, too. You know, there’s a certain group of people who are fine using, you know, Python and all of the frameworks with it, but there’s a lot who have interesting and valid AI research to do and want to be able to take a model, retrain it, and then apply it without having to go in and actually write a lot of code, and that’s something that will help democratize the availability of AI. In terms of actually seeing EHR data, we asked about what data they had access to, and comments were that they need partnerships to be able to get at the data. Health systems only include facets that they generate, and given that they’re generating the data in a lot of respects for financial purposes, there’s a degree of not being fit for purpose. In particular, there were comments about lack of social determinants of health data. The hospital systems don’t exist to generate data for research; they deliver the health care, so they can send in bills. We’ve got to recognize those limitations in what we’re doing. Talked a little bit about the kinds of organizations that need

to be involved, and what really came through was the diversity of understanding that’s needed: not just computer science, not just bioinformatics, not just AI, but bringing together some of, I think, the more conventional statistics or whatever, as well. There were a lot of examples that were given of working with various things across the board. So, there’s a lot going on, and I think there’s things that can be brought together that are already happening, and we can facilitate them, including the exchange of knowledge, if we put in place some of the things they’re talking about around code repositories and other things. In terms of that kind of experience, there’s a desire to be able to engage with the Cloud vendors, the data providers, but also with legal and ethics groups, patient advocacy groups. This was another theme that came through in multiple of the small groups: the patients need to be involved in the discussion about what’s going on so they understand what is there. One particular comment, in terms of the barriers to forming these kinds of partnerships and making use of it, was that AI’s sort of seen as a not-ready-for-prime-time thing, almost like science fiction, and so there’s less investment in it because, you know, it might make an interesting sort of lab experiment, but it’s…you know, a nifty tool, I think was the phrase that…“nifty science projects” was the phrase, and part of what we need to do is give researchers exposure to real current use cases in the health care space to help them build an understanding of what it is, and that it’s not just future technology, that it’s there and available and developing now. And with that, I think I’ve just about run out of words, so I’ll turn it back to you, Nicole.

DR. NICOLE REDMOND: All right. Thank you.

That was a great job on the fly. So, next we will hear from the research group. I think Dr. Schaumberg is presenting, and I know that you sent some slides, and so if you want to go ahead and start, I’ll try to get those up for you shortly.

DR. ANDREW SCHAUMBERG: Yeah, I’m afraid I don’t have access to the slides, unfortunately, but there’s…

DR. NICOLE REDMOND: Give me one second. Let’s get this going.

DR. ANDREW SCHAUMBERG: Elizabeth did an excellent job in getting that.

DR. NICOLE REDMOND: Okay, is this what you’re looking for here?

DR. ANDREW SCHAUMBERG: Sure. Yes. Elizabeth Walsh was taking notes. I took some notes in the background, but I can do my best to fill in some of those.

DR. NICOLE REDMOND: All right. So…

DR. ANDREW SCHAUMBERG: So, great. We’d probably all benefit if Elizabeth were here, but there was a lot of discussion about minority populations or underserved populations, and so on, and what we could do and need to do to engage with them better and incentivize sharing of their cases, because they didn’t feel represented, and building trust on a number of levels (if it’s for environmental reasons, for Indigenous communities), but we need to get out there and engage with these communities. There’s been discussion on commercialization. For my part, I studied some of the social media aspects of patient case sharing with oncologists, and what does that mean if companies have access to some of these data? Thank you for bringing up the research teams, that’s very helpful. A core question or a core observation early on was: We need to be careful to formulate problems from given pain points. We have to engage with communities, see what those pain points are, and then put our science around what serves those communities best rather than sort of coming up with a hypothesis formed in isolation and then trying to shop around to see whether that would be suitable to deploy. So, putting these communities first and making sure we’re focused on serving them is important. There was also a lot of interest in Cloud and centralized data repositories.

A lot of discussion on federated learning and how we could do that better, so I’m not sure if I’m adding extra value by discussing that further. I mentioned trust building, data sovereignty, and (again, it’s been mentioned earlier) patients need to understand what it means for their participation in these research programs, how much control they will have and will not have, but engaging with them is very important. A lot of extensive discussion, actually, while we were taking those on…on biases: which communities are overrepresented or underrepresented. There’s a lot of great literature in AI about dealing with biased data sets, but it’s important for us to collect data for these communities to serve them best, as best we can, for my part. It can be challenging to establish all those connections, so we have biases not only in our data but also biases just specific for myself, for the colleagues that we partner with. It’s important for me, for instance, to do better at that and be as equitable as we can. There was a discussion also about what it means to involve these historically Black colleges and universities (HBCUs). I thought this was an excellent commentary. I thought, you know, there was some history on sort of an asymmetric relationship where an organization would come with research studies or come with money and then the HBCUs would be expected to do a lot of the work. There are some asymmetries there, and addressing some of those asymmetries would let us collaborate on more equal footing. For me, I thought that was great to hear. This is not my expertise, but I’m here to learn, so I appreciate just learning about some of that. I feel the pain

also with data sharing culture. Sometimes we’re not incentivized enough to share our data. For me, you know, if it’s an international data set and patients are involved, what does it mean to the IRBs at the institutions, NIH funding agencies, Department of State, when we start branching out internationally to reach underserved communities worldwide? So, we should be mindful of what those collaborations look like but also collaborations that, you know, we would have with Cloud companies, social media companies, wherever the data are. It’s important to be mindful of all of these things. Discussions about the rich data sets, and I think biases and the data structures themselves as a source of bias, which I thought was insightful to remark on. Things like gender. What does it mean if the

only thing that’s represented in data structure is male or female? In any number of these fields, you know, we’re imposing structure on people, and people aren’t that simple, so what does it mean for us to open up in a way, which is still, you know, amenable to analysis? I think there were some great comments earlier, Mr. Thomson, about…geez, what was it …biases and, yeah. So, what are our data structures that we’re using, and how does that contribute to the biases that we have, the biostatistics, right? Involving biostatistics in a rigorous way to do these analyses with more flexible data representations…excuse me for that. Education—we had some great speakers on education earlier on AI, and now

it’s important to have diverse expertise and diverse ideas. I think that’s, you know, as important for me, that diverse expertise in who’s doing the analysis, and who’s engaged in the study? It’s also important to address, of course, the biases in the data, but everybody should be involved; it benefits everyone. We’re partnering with other research institutions, I think I mentioned that a little bit, and the diversity of research and collaborators, I think I touched on that already, so thanks for giving me that slide to be…to click that, so I really appreciate that. I think that’s the summary of some of the things

we discussed. If there are other slides that you’d like me to share, I’d be happy to go through those, but I think we focused most of our time on this slide, if that serves us.

DR. NICOLE REDMOND: No, I think that is fantastic.

DR. ANDREW SCHAUMBERG: Oh, okay.

DR. NICOLE REDMOND: You did a great job. Great job, because I know there was a lot, and this particular group was the largest of our breakouts, so excellent job. We appreciate you doing this for us. So, next, I will call to the virtual podium Dr. Idris to speak on behalf of the training group. I’m going to stop sharing, and I think we did not have a consolidated slide, given the time. There you go. All right.

DR. MUHAMMED IDRIS: Should I go ahead and just jump on in?

DR. NICOLE REDMOND: Yes, just go ahead and jump in.

DR. MUHAMMED IDRIS: All right. Thank you so much, everyone. So, in our breakout group,

we touched upon a variety of different questions ranging from what successful collaborations might look like around training in AI/machine learning, best practices for scaling of existing programs, and also barriers and facilitators towards kind of mutually beneficial partnerships. In particular, I’m thinking about kind of diversifying data science training and maybe what some specific needs might be for historically Black colleges and universities, as well as tribal colleges, nations, and safety-net health systems. And so, at a high level, a common theme that was brought up amongst various groups is this idea of train-the-trainer data science programs. Of course, you know, we see this a lot where you might have folks who might not consider themselves to be data scientists but have very specific expertise, whether it be in math and stats or whether it be some aspect of programming, and maybe even some domain expertise, right? And so, having kind of the ability or a set of programs that allow folks to kind of train kind of across these foundational areas, and then take that back to their respective institutions, would definitely be something of interest.

There was a huge focus also on the interdisciplinary nature of training programs and the need for folks with that domain expertise—practicing physicians, in particular, providing them opportunities. This could be providing, again, more background in statistics, more background in programming. That would be an ideal opportunity for a crossover, and so having those sorts of interdisciplinary programs would be ideal. There’s actually an excellent example, I believe, at the University of Pittsburgh, where we actually have folks who are in the process of completing their residency or fellowship, and we’re also able to get funding and complete some graduate training in AI and machine learning, and that was kind of an excellent example of what that might look like. Another aspect of what we spoke about was, really, if our ultimate goal here is to diversify the pipeline and eventually the workforce, the need to be culturally aware and to really take a step back and appreciate the structural barriers that underrepresented groups face in entering and completing a successful career within the field, and so, you know, one example that was brought up is that there are often the case where you have students who might…may have…may not have …well, let me rephrase that…who definitely have the interest and the ability to participate in these programs but might have circumstances outside of the program that might limit their ability to participate fully—so, being kind of fully aware of the entire person’s circumstance and really doing our best in order to provide them with kind of the mentorship and resources and the opportunity to complete training programs and then go on from there and succeed in the field. One of the areas where we spent a little bit of time talking about is kind of: What

is the ultimate goal of these training programs? And often within the context of the NIH, it’s ultimately to develop independent early-stage investigators…independent investigators from diverse backgrounds, and so this idea of, throughout the training programs, into internships, whatever it might be, making sure folks have the appropriate mentorship and those relationships that will allow them to succeed, ultimately. We also spent a little bit of time talking about the kind of importance of understanding kind of the “why” behind the…kind of our interest and motivation in really making kind of the idea of leveraging data science and machine learning and AI for health disparities a personal one. This, you know, definitely involves building trust within the communities within which we’re working and providing opportunities for folks from within those communities to get the training that’s necessary but, you know, not in this brain-drain sort of way where we’re kind of attracting people to research institutions, working on kind of other problems, but ultimately with the end goal of coming back and serving kind of within the community. So, this idea of what happens when you finish training was something that was brought up, and the need for incentives to keep people in the field; we need to think a little bit more deeply about that. The final two, I think, kind of go hand in hand. One is, you know, we often have these kind of nice, big funding opportunities and have rather ambitious goals around training, but how do we ensure sustainability for programs established from this initiative? What does that look like? And in particular, there might be opportunities to find resources and support from within industry to help make these programs more sustainable. And so, with that, I hope I’ve done our group justice, so forgive me if I haven’t, Nicole, but it was a very lively discussion, and again, we touched upon a lot of what was described in the other breakout groups: the interdisciplinary nature of training and the need to establish, kind of, trust and to develop and sustain trusted relationships within the communities which we’re serving.

DR. NICOLE REDMOND: That was absolutely fantastic, and so, I really appreciate you getting volun-told. You did an excellent job, and I think it, you know, definitely reflected the very robust discussion that we had in our groups.

So, with that, I am the only thing standing between everyone and the rest of the weekend, and so we wanted to just have a brief summary of what we’ve discussed, and I think some of the key words—if I could just pull some words out—interdisciplinary. I think there was a lot of discussion about trust and how that plays into the quality of data in security, privacy, and even just engagement in the program overall. And then, also, sustainability—and so, wanting to be appropriately resourced and supported so that this program would go through. So, I just want to leave you with a few final reminders. First of all, thank

you, thank you, thank you, thank you, especially those of you who have been able to be with us for the entire afternoon on a Friday in the summer. We’re just indebted and grateful to you for your participation. We want to remind you that we have a live…published a Request for Information. The Notice number is here. It’s an RFI on Inviting Input to

Broaden the Benefits of AI/ML Technologies to Reduce Health Disparities and Inequities, Enhance the Diversity of AI/ML Workforce. So, this will be an opportunity for you as you continue to think about some important considerations, the issues, concerns, potential barriers that would need to be addressed, suggestions for how to make this program the best that it can be. We really want to hear from you and get that feedback, so the link to the RFI is located on our website here, and then you’ll be able to use the submission portal to respond to the RFI. I also want to remind you that other program information will be posted here on this website, including the meeting summary, which hopefully we’ll, as we’ve gathered all the recordings, the notes, the slides, and we’ll consolidate that to have a more detailed summary available to you on the website. Again, once we have all of your responses in the RFI, those responses will also be consolidated, and we will then link a summary of the RFI responses on this website, as well, so that we can be sure that we have all the feedback that we need to be sure that this program succeeds. Lastly, on

the website, we also will publish any future opportunities (so, any opportunities for additional webinars, meetings, engagements, information, and, most importantly, the funding opportunity announcement that will be forthcoming). So, this will be the place to come for all information about this forum, as well as everything that’s forthcoming with the program, and again, I’d like to…if you have any questions, just general questions, you can also send the team an email, and we’ll be sure to try to address those as best we can. I think I…we’re ending a little early, actually, so that’s great, especially on a Friday afternoon. I’m going to check in on my co-panelists to see if they’ve got any other questions or comments that they want to make. [brief pause] Go ahead.

DR. DINA PALTOO: No, I was just going to say thank you to everyone for joining us, and just echoing what Nicole said: Please, please, please submit your comments to the RFI. We’d really appreciate it, and thank you all for taking the time to spend with us today to share your feedback.

DR. NICOLE REDMOND: And I’d also like to reiterate my thanks to my colleagues on the AIM-AHEAD Working Group, and a special, special thanks to the volunteers who led…assisted with the facilitation and notetaking of the workshops, especially on short notice, and so I am deeply grateful for your help and participation with that. Great. So, thanks, everyone. Have a wonderful weekend, and we look forward to hearing more from you. Again, the RFI is open and ready to hear your feedback, and thanks again for spending your time with us this afternoon.

2021-07-16
