HLS Library Book Talk: "Big Data, Health Law, and Bioethics"
Well, welcome. It's so nice to have you here. The weather could be a little bit better, but maybe if it were a little bit better you'd be playing outside, so maybe we should be thankful for the weather. I'm Glenn Cohen. I'm a professor here at the law school and the faculty director of the Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics, a.k.a. the longest-named center you've ever heard of. I want to thank the Harvard Law School Library, who helped put this together, with special thanks to June Casey, who organizes all these great book talks, with so many others coming up, and also our co-sponsor, the Berkman Klein Center for Internet & Society at Harvard University. And I want to mention that immediately after this book talk we'll go to the pub, where we'll have food and libation for you. You can hear more about health law at Harvard and have a free beer, free wine, I think free food. A great place to celebrate this event.

The book we're going to talk about today is edited by myself; Professor Gasser, who you'll hear about and hear from in a moment; Holly Lynch, who was formerly at Petrie-Flom as its executive director and is now a professor at Harvard, at Penn medical school (that was a Freudian slip, because I wish she was still at Harvard); and Effy Vayena, who's at ETH Zurich. We also received amazing support from Crissy Hutchison-Jones, in the back, who also helps run these amazing events, and from my RAs Ethan Stevenson, Brian Yass, Colin Hurd, and Wilfred Bay, who helped do the line editing.

Okay, those introductions aside, let me start with a claim I've made in a few other places that I think is really important, and that is that people talk about big data the way adolescent boys talk about sex. Everybody tells each other they're doing it. Many of them are not quite sure exactly what it is. If they are doing it, they're probably not doing it very well, and certainly
they're probably not doing it in a way that both parties to the interaction enjoy, right?

So what do we mean when we say big data? Typically we refer to the three V's. Lots of hype here, but let's try to get down some facts. There are many possible definitions, but a common denominator is the three V's. Volume: there's a vast amount of data. Variety: there's significant heterogeneity in the type of data available. And velocity: the speed at which the data can be processed or analyzed is extremely fast. Now, some would add a fourth V, value, but in many cases I would say that's more aspirational than actual.

Defined as such, healthcare has become one of the key emerging use cases for big data. For example, Fitbit and Apple's ResearchKit can provide researchers with access to vast stores of biometric data on users, from which to test hypotheses on nutrition, fitness, disease progression, treatment success, and the like. EHRs are quickly being analyzed, and in some instances there are attempts to monetize them. Countries around the world are trying to use big data to improve public health.

But for all of these wonderful promises, this book (we are lawyers and ethicists, after all) focuses a little bit also on the gloom, right? The really big questions raised by big data from a legal and ethical perspective. It has, I would say, at least two cross-cutting themes. One is the question of regulation: for medical record data, non-medical-record healthcare data, and non-healthcare data that permits inferences about health, will a single regulatory regime be good for all of that, or do we instead need multiple regulatory regimes? And that's important to emphasize because today, I would say, HIPAA, our American system of healthcare privacy, is focused on the doctor-patient interaction, the healthcare record. That's an extremely important source of health data, but it's far from the only source. Your social media posts are an incredibly
important source. There's a startling finding that you could actually do fairly good diagnosis of depression from Instagram filter choice, for example. That could revolutionize the way a lot of people think about this. But whether it's social media or Fitbit, whether it's health records, whether it's pharmacy records, whether it's purchasing records, right?

The famous story (it's true, I looked it up, it's not apocryphal): somebody comes into Target. In my imagination he's a Southern person. I don't know if he has to be a Southern person, but it makes the story sound better, right? So this gentleman comes into Target and says, "I want to speak to the manager." The manager comes forward and says, "What do you want to speak to me about?" And he says, "What business do you have sending my daughter all this information about pregnancy? She's 16 years old." The manager apologizes profusely. Well, about a month later the man comes back to Target and says, "I'm here to apologize." "Why are you apologizing?" He said, "My daughter is pregnant." Turns out Target knew his daughter was pregnant before he did, right?

Imagine now it's not just the Target data set, but it's everything you've purchased on Amazon, it's your electronic health records, it's your Facebook, it's your social media, it's the Nest in your home. How do we think all of these can be put together, or should not be put together?

What will be the effect of big data (this is the second theme) on not only the healthcare professions, but medical education, society, the way we interact with each other?

There are 22 chapters in this book, plus an introduction, plus an epilogue, to discuss these things. They're written by leading scholars and practitioners and cover a vast terrain of knowledge. I'm just going to mention a few of the topics covered. Is inferring knowledge about people from data the same as a privacy violation? To what extent can existing regulatory regimes
be arbitraged for the world of big health data? What about pharmacovigilance and the False Claims Act: can we use big data for that? The Americans with Disabilities Act: will we think of discrimination (if you want to use that word; maybe that's pejorative), that is, determinations made on the basis of data that touch on disability or predictions about health, as violations of the Americans with Disabilities Act? If not, is that a problem? How do you deal with cases where big data undersamples minority populations
in terms of making predictive algorithms and the like for health? What about medical malpractice or other forms of liability in these settings? Big data and research ethics: are there duties for us to share data, and under what terms? And how do we think about the intellectual property regime, in particular patent but also trade secrecy, as interfacing with questions of big data?

Big data, it's a big topic, but we've got some big people here to talk about it. I'm now going to introduce them. The first one who is going to speak to us is Professor Urs Gasser. These people all have mile-long CVs, so I'm just picking a couple of highlights. He's the executive director of the Berkman Klein Center and a professor of practice here at Harvard Law School. He's published more than 100 articles in professional journals, and his books include Born Digital, Interop, and a forthcoming book on the future of digital privacy.

We'll then hear from Ameet Sarpatwari. I always hesitate a little bit to see if I got the name exactly right. Ameet is the assistant director of the Program On Regulation, Therapeutics, And Law (PORTAL, best acronym ever) at the Brigham and Women's Hospital. He's an instructor in medicine at Harvard Medical School and an associate epidemiologist at the Brigham. He also teaches public health law at the University. He studied epidemiology at the University of Cambridge and studied law at the University of Maryland. He is a well-known scholar; you'll often catch him either on TV or testifying before legislators, or all of the above.

Finally, Carmel Shachar, the executive director of the Petrie-Flom Center and a lecturer on law at Harvard Law School. She's a cum laude graduate of Harvard Law School but also holds an MPH from the public health school. She teaches here, and she is really running a huge number of projects at the Petrie-Flom Center that, if you stay for
the free booze (I was just told to mention this as much as possible, the free booze and the free food), you can hear more about: everything from law and neuroscience on the one hand, to investment questions on the other, to even things we've recently finished doing, like working on the health of NFL players.

So with that kind of short introduction, I'm going to turn it over to Professor Gasser, who's going to go first.

Hello everyone, good afternoon. I'm delighted to be here. First of all, a little bit of a warning: I'm not an expert at all when it comes to this intersection (actually, the previous slide, anyway) of health law, big data, and bioethics. I'm the digital guy, right? You've heard it in the introduction. I'm working at the Berkman Klein Center. We look more broadly at the impact of digital technologies on society, and so I'm relatively new to this field, to what's happening when you introduce tech to health. Therefore my comments will be a little bit more of a reflection on what I have learned as I had the pleasure of helping edit this book.

Before I share a few observations, I really want to say thanks to Professor Cohen and the team at Petrie-Flom for inviting us to this party. I learned a ton, and here are a few things that I've learned. Essentially, what I'd like to do is share with you six observations, six takeaways, that fall roughly into three categories. The first category is understanding the phenomenon: what's going on as we introduce advanced technologies such as big data analytics, AI, the Internet of Things, you name it, to health? What is happening on the ground? The second cluster of observations is around what the normative issues are, and Professor Cohen already alluded to some of the big normative questions that arise. And then the third cluster is about, okay, where do we go from here if
we understand the opportunities as well as the challenges, and want to make sure we embrace the opportunities but also avoid some of the pitfalls?

Okay, so, first observation, understanding what's happening. We've already heard it in the introduction. The big lesson learned, number one: it's complicated. A lot of things are happening. For those who are not working in health or in healthcare, the health system itself is of course of incredible complexity in many ways, and now you layer on top of that the technological complexity. It really makes it an exercise in managing through a complex system. And that management of complexity, I think, is a real challenge for a number of reasons. First of all, take the example of electronic health records and this long journey that we are on to increase interoperability among health records, which would then enable us to analyze the data from many, many different sources. The reasons why we haven't achieved more interoperability are really complex and have to do with economics,
fear of liability, cultures within health systems, and the like. So in other words, as you introduce technology to this already complicated and complex system, it gets really hard to understand, even for experts. One of the big lessons learned from the book and from the conversations around it is really how many different experts you need to get together in one room to better understand what is happening right now, what the promises are, and what some of the pitfalls are. So in that sense it's a deeply interdisciplinary endeavor to talk about big data, health law, and bioethics.

With that interdisciplinarity, I think, a number of complications arise. The first one that I observe is one of semantic interoperability. If you're a statistician or a computer scientist who understands the AI part of whatever diagnostic system is developed, you may have a very different language and use a very different terminology from the physician who ultimately will work with this particular system, or from the patient who will get a treatment based on what the physician and/or the AI system determined. So how do we deal with these different vocabularies? A simple example: computer scientists and engineers have a different notion of privacy than lawyers or many of the people in my space. It takes a lot of time and energy to define these vocabularies, which is important if you want to understand what's actually going on, how we label things, and what we are going to do about it.

I would even take it a step further, and Professor Cohen alluded to it already: within each discipline, too, the vocabularies are challenged. What is health data in this new environment, as you described it before? It's no longer clear what basic concepts mean. The same would be true of privacy in
the legal discourse. So we have real language challenges as we even try to triangulate the phenomenon here.

The last observation within that cluster is that many of the technologies that are now added are particularly hard to understand. Think about AI systems that reach a degree of complexity and are really often black-box technologies, where even experts can't explain what exactly the algorithm was doing that led to a particular outcome. That makes it very hard to understand, even as a descriptive matter, what is actually happening as we inject these
advanced technologies into health, healthcare, and the health discourse.

The second cluster of observations, as I said, is about normative issues. The book talks a lot about these technological advancements and the promises and pitfalls, but ultimately, as you browse through it, I feel it's really more a story about big normative questions, about human questions, about societal questions, value issues. So yes, they are amplified by technology, but ultimately it boils down to big questions that we as societies, we as citizens, we as patients, we as professionals have to answer.

This concerns both new normative challenges, which I will get to in a second, and existing norm sets. Professor Cohen already mentioned it by way of example: do we need to update certain laws and regulations that have in the past established some sort of equilibrium in terms of balancing the different interests of the stakeholders involved? Take the example of informed consent. If informed consent doesn't really work anymore as a mechanism, or not as it did in past technological environments, what does that mean? Do we need to have something new, or do we need to double down on informed consent, or should we focus more on, okay, forget about consent, but we put restrictions on how data is used? These are, at their core, normative questions.

But it's not only about how existing norm sets are challenged by the disruptive technology and the dynamics that flow from there. It's also about some of the hard new societal questions that we are confronted with. Professor Cohen mentioned, for instance, the issue of individual privacy versus public health benefits, right? A big data set may be challenging or infringing on a person's privacy, because you could re-identify a person who appears in a big data set, but
the value you can get out of the data set and the public health benefits may be enormous. So how do we think about these value trade-offs: individual privacy versus the use of technology, here big data, for the greater public good? That's not something where a technologist should or can give the answer. That's really a conversation we as a society need.

A second example in the same category is the issue of inclusion. It turns out, and one chapter describes it in quite some detail, that many of the big datasets we're talking about are not really inclusive datasets. They don't necessarily represent the entirety of a population. Here in the US, they don't necessarily have data in them from immigrants, let's say. Now what does that mean if we use that data later on to develop treatment strategies, to make recommendations of all sorts, to discover some of the patterns that Glenn was mentioning? Are we again excluding populations that would potentially benefit most from
these next-generation technologies as they are applied in health? That's a big societal question, a human question, that again goes far beyond the Internet of Things or big data. This is something that we as a society need to sort out and figure out.

The last cluster, the third set, is really about design. We have a long list of opportunities, which are, I think, well represented in the book, as well as some of these challenges, whether it's privacy threats, whether it's fear of discriminatory practices, whether it's bias, whether it's cybersecurity threats if you think about the Internet of Things. So it's a long list, right? What can we do to embrace the opportunities on the one hand, but then also manage some of the challenges as we enter this new world of AI, IoT, data analytics, and health? The big takeaway from the book is that there is no silver-bullet solution (otherwise it would be a shorter book, I guess), but rather that we will need a lot of creativity and to embrace essentially the full set of instruments available in the toolbox.

Professor Cohen, of course, for obvious reasons put emphasis on law. I will also return to that as my last observation. But it's worth highlighting that in the book there are also other tools described. For instance, how could we work with market-based mechanisms to address some of the challenges? There is a chapter arguing we should be innovative and create some sort of clearinghouses that would address some of the problems that arise with the Internet of Things as applied in the healthcare sector. We should think about standards that are competing in a marketplace, for instance standards for privacy. So there are market-based approaches. Some chapters highlight other instruments of governance, including technology. To what extent can we deal with some of the, let's say, privacy problems by looking
at privacy-enhancing technologies, or at new frontiers in how advanced algorithms can help to address privacy challenges? The keyword here is differential privacy; I could go more into that. So you get a sense that it's not only about the law; there are also other tools available, and the challenge is how we can find a mix to enable some of the good uses of the technology while addressing some of the risks and challenges.

Now, the last observation is really going back to law. One of the big learnings for me from this chapter, and from looking at health as one big, huge case study, is that health as an industry or sector is not that different from other sectors we've studied before, whether it's entertainment (remember the copyright wars 15, 20 years ago) or what's happening in transportation or any other industry, in that the reactions by the legal system and how lawyers think about disruptive
technology are actually quite similar across industries. Within health, I think you see quite clearly that there are roughly three response modes. One is that we try to apply the old rules we already have, whether it's HIPAA or any other norm set, to the new technology that enters health, and quite often that gets us quite far. But of course we also know there are limits to how far that goes, and then the question arises: do we need to change the law? We already heard an example, the False Claims Act. Does that need to be upgraded because you can now start to manipulate data to commit fraud, healthcare fraud? So you can make gradual changes; that's another response mode that you see in many different sectors. Or you can innovate more dramatically, and maybe an example of innovating more dramatically is the GDPR in Europe, but also some of the proposals coming out of the book, including: well, maybe we have to rethink privacy more dramatically in this age of big data. Maybe we don't think of privacy anymore as an individual right but as a group right, because in many cases where big data becomes relevant, it is about categorizing certain populations into groups, and that is what we may be concerned about, for instance when it comes to discriminatory practices. So do we need to have a group privacy right, as opposed to the traditional right to privacy that we think of as an individual right? These are some of the bigger paradigm changes that are now being discussed and that I think are symptomatic also of conversations in other areas.

The last point, just quickly: going through the book, you see another recurring theme emerging about the law. Often we think of the law as just a constraint on behavior. The law basically tells you what not to do: you're not allowed to share this data, you're not allowed to collect it, or you have all sorts of banned
behavior. That's sort of the common notion of law. But in the book there are several instances where it becomes clear that the law can also play a leveling function or an enabling function. The leveling function, I think, is more nascent. We see a lot of discussion outside health now, looking at big tech companies that have accumulated so much data that their algorithms get better and better. But what about new entrants into the market, small companies that try to compete? If they don't have access to this huge volume of data, as we heard from Professor Cohen, can they ever compete? Law, of course, through competition law and other mechanisms, can address this problem. We may see some of that also unfolding in health. But certainly what we see in the chapters, and particularly the one that Professor Cohen wrote, is that maybe the law in some instances in this context takes it too far in terms of locking things down, instead of creating an environment where we have more information flow to the benefit of health and healthcare and to the benefit of public health. Privacy is of course one of these areas. So Professor Cohen introduced the idea: could we flip it around and think about a duty
to share information, given the benefits of information sharing in a world where much of the progress in medicine and in health will likely come from data? A really provocative thought, and I think a great example of how law could not only constrain but also unlock and open up some of the promises of the technology.

So these are a few reflections, again from an outsider. You will now hear much more detailed, substantive analysis of some of these issues that I may have mentioned. What I hope is that this at least connects the discussion a little bit to other conversations we are having, be it on autonomous vehicles or AI ethics and governance. So thank you so much. Thanks again.

Well, good afternoon everyone. I want to say thanks so much to Glenn and to Carmel for inviting me to talk about this, and in general for hosting these types of great events, where I really think it is essential to bring together stakeholders from different backgrounds. Here I stand a little bit at the intersection: with epidemiologists, who are really pushing for as much data sharing as possible, but also a little bit in the world of law, where the sort of risk-averse nature of lawyers comes out as well. So what I'm going to be talking about is the promise of big data (I love Glenn's analogy to sex: everybody is doing it) in post-approval research for drugs and devices, how HIPAA and specifically HITECH constrain and allow that research, and what the ethical considerations of that are, and then end with a few recommendations in terms of how this can be facilitated in a responsible way.

To begin, I guess: why the increasing need for post-approval research for drugs and devices? We
need to first take a look and say what we do know when a drug or device is approved. There are limits to pre-approval studies, and that entails frequent exclusion of key segments of a population (women, children, ethnic minorities), and also the fact that these studies are conducted on short timelines with small populations, so there's the inability to detect rare but serious adverse events. An example: natalizumab and progressive multifocal leukoencephalopathy, a rare but fatal event that would never have been caught in a pre-approval clinical trial. It's only caught once the drug is on the market.

Further, we need to stand back and say what is going on in general with drug policy. Regardless of whether you're in Europe or in the United States, there's increasing pressure to get drugs out onto the market faster, on the basis of surrogate outcomes, which after the fact are on occasion shown to be poorly predictive, and increasing use of accelerated review pathways. So there is an urgent need here to be able to conduct these studies with a fast process and a robust method.

And what do we have to do that? We've got a considerable amount of raw ingredients. We've got insurance claims. Consumer purchases. Electronic medical records: there you see a graph just of the uptake in the usage of electronic medical records. Wearable sensors. Social media: interestingly, people saying on Twitter, "Well, I've taken this and I have this side effect," has actually been mined; the FDA has explored those opportunities to say, can we catch early adverse events through that system? And, increasingly, biological registries: there you just see the decrease in the cost of whole-genome sequencing, eventually getting to a point where everybody can have this done if they want it.

So this is a beautiful slide by Isaac Kohane in JAMA, and
unfortunately I'm not going to be able to do it justice, but here you can sort of see the types of data that exist, the structured and unstructured nature of it, and how all of these pieces sort of fit together. But really the key is that these are all isolated pieces of data. Their magic is in putting them together, and the question is what the systems in law are that allow us to do this or that hinder it, and whether or not that balance, from an ethical perspective, is appropriate.

So, in terms of data sharing to aid post-approval research, what do we have? In terms of the facilitation of observational study, what does this data sharing do? It enables enhanced capture. Let's just go through a couple of examples here. First, people with relevant exposures and outcomes. We know that in the United States there's considerable churn in health insurance, so I might have access to certain people who start a medication in one claims database, but if I can't follow them through over time, I don't know if they actually suffer an adverse event. The only way I'd be able to do that is by linking those claims databases together. Second, in terms of variables of interest, my claims data isn't ever going to have information
on things like BMI or smoking, because that doesn't affect whether or not I get reimbursed by the insurer. But the EHR, the electronic health record, might have that information, so if I'm able to link those together, now I have more variables to study. Again, one problem with claims data is that if I'm looking just at claims data, once people go into the hospital it becomes a black box. I'd like to be able to know what happens inside the hospital but also outside of the hospital. How does that happen? By sharing this information together.

So by enabling this data sharing, I'll be able to have improved statistical power, more rigorous adjustment for confounding, and more detailed subgroup analyses. That is particularly relevant in the context of precision medicine, where I need robust statistical power to be able to say which people are particularly benefiting from this treatment and which people are more prone to suffer adverse events. In this era, with the amount of data that's out there, there's the ability to do this, but the question is how HIPAA constrains this or enables it.

One thing I want to say is that I think HIPAA gets a little bit of a bad rap, in the sense that I think there is considerable flexibility built into HIPAA. Now, it remains an open question whether that flexibility is sufficient, or whether it is proper in the realm that it covers, because again, as Glenn alluded to, we're only talking about a certain stream of information that it's covering.

So here's a little bit of background. The Security Rule covers the maintenance and transmission of electronic protected health information, and the Privacy Rule covers conditions for the disclosure of PHI. Patient authorization is not required, among other things, for treatment, payment, and healthcare operations; for public health activities; and, what
we're talking about here, for research. So you can get an IRB waiver if there's not more than minimal risk, and in the context of observational data, observational studies, you can say the risk to the participant is probably minimal, and we can debate that. The problem with this process is that there have been reviews showing that we're worried about inconsistent or ambiguous interpretation of federal regulations. We're worried about changes required by certain IRBs and not by others, and what that does to the overall research protocol.

So with that in mind, what does HIPAA enable in this context in terms of data sharing independent of IRB authorization? It permits covered entities and their business associates to disclose PHI without patient authorization or an IRB waiver of authorization through two general paths. One is a limited data set, which removes 16 specific identifiers, and I'll come back to this in just a second. The other path is de-identification. De-identification involves turning PHI into non-PHI, and there are two paths within that. One is a safe harbor mechanism, which again is just saying that as long as you don't share 18 identifiers, you're fine. The other one is a bit more ambiguous. It says expert determination,
so I can share any data as long as an expert says that the risk is very small that the information could be used, alone or in combination with other reasonably available information, to identify an individual. What does that mean? We'll get to that in just a second.

So now, what about those cookie-cutter pathways, the 16 identifiers or the 18 identifiers? Is this going to allow us to do the research we want to do? Let's back up even further for just a second and say, well, why don't we just get patient authorization? We've talked about why IRB authorization might be problematic. With patient authorization, here we're talking about millions and millions of patients, and the process of doing that would also be constrained. So assume that this is a pathway we want to explore, that we find public health value in using this.

Safe harbor says these are all the identifiers you cannot share, and as long as you don't share this information, what you're sharing is de-identified information. In the context of post-approval drug research and surveillance, there are some things that are flagged as potentially problematic. A key element in any good observational research is an element of date, right? We need to have some sort of timing here as far as when a drug was taken and when an adverse event happened. Device identifiers, if we're doing drug and device research. Also addresses, because addresses are oftentimes the only pieces of information we have to understand socioeconomic status, which could be an important confounder in the type of research we want to do.

So, that being said, there are some problems with just these limited data set or safe harbor pathways. With the limited data set, there is this issue with what I am sharing: it's only 16 of those identifiers that can't be shared, and I can share elements of dates, but under that pathway the data remains PHI, and thus there is still liability
associated with the disclosing party. And then with the safe harbor pathway, here perhaps is the hidden flexibility. If you're a lawyer, you look at what it says: what can I not share? All elements of dates. Well, maybe I don't need to share an element of a date; maybe I just need to share time-to-event information. So maybe I just say that, from an arbitrary reference point, somebody took a drug 50 days later and then suffered a myocardial infarction 75 days after that. Could that be sufficient? Could that perhaps get us out of this conundrum? Think about what I said about the need to conduct these studies faster and in as close to real time as possible. As those studies become quicker and quicker in duration, we are implicitly going to violate that rule about sharing elements of dates. Here's just one example. This is an example of a study that was conducted in Italy, and we can assume, let's say, it was conducted in the United States. This involved a vaccination and whether or not there was an adverse event associated with the vaccination. This study was conducted from October 2009 to January 2010, inclusive. And so now I say, well, what happens if I actually say that 200 days after some arbitrary reference point the person had a vaccine, and then 40 days after that they had an adverse event? Just by knowing when this study took place, I would know that that person could not have been vaccinated in 2010; it had to be in the second half of 2009. As a result, have I shared an element of a date more specific, or more granular, than a year? I have. So again, I'm showing you there's flexibility, but that flexibility still hits a little bit of a roadblock in terms of how much information I can share. And there is this expert determination pathway, and that expert determination pathway says,
well, maybe anything goes. Maybe I can craft this specifically in terms of what information I share; I just have to have an expert say that there is a low risk of re-identification here. Now the problem is, well, really, what does a low risk of re-identification mean? Who is an expert? And are there certification standards around this? The answer to all of those is that these are still developing questions.
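The time-to-event example from a moment ago can be made concrete with a small sketch. It is only an illustration under stated assumptions: the study window is taken to run from 1 October 2009 to 31 January 2010 (the talk gives only the months), both the vaccination and the adverse event are assumed to fall inside that window, and the function name `feasible_dates` is hypothetical. The 200-day offset from the arbitrary reference point carries no information by itself; it is the 40-day gap combined with the known window that narrows the date.

```python
from datetime import date, timedelta


def feasible_dates(window_start, window_end, days_between_events):
    """Return (earliest, latest) calendar dates on which the first event
    could have occurred, assuming both events fall inside the study window."""
    latest_first_event = window_end - timedelta(days=days_between_events)
    if latest_first_event < window_start:
        raise ValueError("gap between events is longer than the study window")
    return window_start, latest_first_event


# Assumed bounds for "October 2009 to January 2010, inclusive".
study_start = date(2009, 10, 1)
study_end = date(2010, 1, 31)

# Shared "de-identified" fact: adverse event 40 days after vaccination.
earliest, latest = feasible_dates(study_start, study_end, days_between_events=40)
print(earliest, latest)
```

Under these assumptions every feasible vaccination date falls in 2009, so the offsets plus the publicly known study period reveal date information more granular than a year, which is the roadblock described above.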
And because these are developing questions, and because there is potentially a risk of liability for certifying something that is actually unsafe, there is the problematic situation that this pathway exists but is not being used in a way that would be conducive to the data sharing that HIPAA actually permits. So, to back up one second and again ask, what are the ethical considerations at play? We've heard this: there is this inherent tension here between wanting to put forward as much data sharing as possible, because that will be beneficial from a public health standpoint (How much time? Okay, thank you.), because that will be beneficial from a public health standpoint, and that tension between saying we want to value people's privacy and their right to make determinations about their data, which theoretically they should own, or at least have some say in how that data is used. So here is where epidemiologists and computer scientists come to the situation with a little bit of a different perspective than lawyers. A canonical piece now in privacy law is the myth of de-identification. I don't know if you guys have read it, but if you haven't, you definitely should; it's by Paul Ohm. He cites a couple of examples. Most notably, one that I particularly enjoy is by a professor here, Latanya Sweeney, who as a graduate student proved that the information that Governor Weld at the time wanted to release into the public domain was actually not confidential, was not de-identified. This related to the hospitalization records of state employees, and what she did is she basically combined that data with census data, and she showed that she could actually identify, from that de-identified database, which records were the governor's, and mailed them to him. So these examples are shown in the law literature as the myth of de-identification. But
when you talk to computer scientists who do data privacy, they'll say, look, first of all, that's not a case of HIPAA de-identified data, and if you actually take a look and try to model attack scenarios for how you can re-identify what is HIPAA de-identified data under the safe harbor, it is very, very difficult. Even if you can identify a small subset of individuals, then the question ethically is what risk we are willing to accept as a society as okay in terms of being able to facilitate this data sharing. The real question is not whether there is a case of no risk; there's never really a case of no risk. What we need to have more public discussion of is what is an acceptable level of risk. And so this slide just shows examples of how difficult re-identifying HIPAA de-identified information actually is. So here I just want to say, with that in context, what we put forward is a couple of recommendations in terms of how you can potentially start the ball rolling in developing these certification standards. We just come up with a few general principles. In terms of setting the risk threshold, if we are saying what an expert should be involved in, first we should say that not all risks are created equal. And so, even though the expert has a considerable degree of discretion, given regulations and given guidance, on what an acceptable level of risk is, or what a very small level of risk is, the expert should select a risk threshold proportional to the potential harm of the re-identification. Not all re-identifications are the same: if I re-identify someone who's taking a statin versus
re-identifying someone who is taking pre-exposure prophylaxis for HIV and AIDS, the potential danger I could cause to that person is different in each context. So let's consider that as one. Two, let's model different types of attack scenarios. We don't really know who the attacker is going to be. It could be a prosecutor, it could be somebody who's looking for a specific known person, it could be someone who's trying to identify any person in the database, or it could be a marketer, someone who's trying to re-identify as many records as possible. If we are trying to say what the risk is, we need to explore all of these different options. And then, in terms of risk mitigation, we shouldn't be overly conservative in terms of the level of risk that we're willing to assume. And finally, in terms of the experts, there are other ways we can go about doing this in terms of furthering the likelihood that there won't be risk, and one of those is, why don't we routinely use data sharing agreements that condition the way in which the recipient party will use that information? So hopefully this gives you a little bit of a subset of some of the samplings in the book. Thanks.
All right. Hello everybody, my name is Carmel Shachar, and as Glenn mentioned, I'm the executive director of the Petrie-Flom Center. So before I start talking on the substance, a few completely shameless plugs. First of all, working on this chapter and then joining the Petrie-Flom Center was not my first encounter with the center. I was a student fellow here, and for all of the students sitting in the audience kind of checking us out, I would highly recommend that you go that route, because who knows, maybe one day you might steal my job from me. The second thing is, if you are perhaps a little bit past the age of students, and I certainly don't see anybody here who appears to be a day over 23, but
there are many ways for you to get involved in the center that are different from being a student fellow. I would highly recommend that you sign up for our newsletter. It comes out only every other Friday (we strongly believe in not spamming) and it'll let you know when all of these great events happen. So if you enjoyed this, there are many more events to come. I believe that we have some information in the back that you can take if you're curious, or please feel free to talk to me. All right, now that I've finished the shameless-plug part of my section, let's talk about the chapter that I contributed to the big data book. My chapter really looked at a fairly recent Supreme Court case, Gobeille v. Liberty Mutual Insurance Company, to look at what happens when law impacts the collection of big data, because that's really step one: you can't do all of these cool things that you claim to do with big data if you can't collect it in the first place. And here I'm going to walk you through, and this is going to be a short twenty minutes, because I stand between you and the questions and, more importantly, the free alcohol, I understand.
I'm going to introduce you to two concepts that are not particularly sexy, all-payer claims databases as well as ERISA, and explain to you how their intersection led to actually a very interesting case that had even more interesting public policy ramifications for how we try to get information about our health care system and build a better health care system for everybody. And then I'm going to talk to you about potential solutions. So first, I said all-payer claims databases, or APCDs, and I mentioned ERISA. What are these two concepts? The case I'm talking about was based on the Vermont all-payer claims database, and that is actually a great sample for us to use. All-payer claims databases contain medical claims, pharmacy claims, and dental claims, as well as additional information about patient and provider demographics in an area. They include public payer information, which is relatively easy for us to get, your Medicare, your Medicaid, but they also include private payers. Most states, until recently, when they wanted to aggregate this information so they could really get a sense of all the claims happening in a state, required all employer-sponsored health insurance, that's the type of insurance I'm willing to bet most of you guys have, to enter their claims information into their APCD. One thing to note is that the end user of APCDs is usually not the patient. You're not going to the Vermont APCD data to look up and remind yourself, wait, when did I get that prescription for antibiotics, or when did I have that burn that I got treated? The end users are policymakers, are researchers, are people who are really trying to understand what is going on with our health care system. This is a tool that, especially in a fragmented system like ours, is really invaluable for getting these overarching data sets. I also
mentioned ERISA, and when I gave this presentation at our annual conference a couple of years ago, I used the joke that ERISA is so boring even lawyers find it boring. I have not learned anything to disprove the truth of that joke since then, so I will do this very quickly. ERISA
came of age in the 1970s, and it's meant to regulate employee benefit plans that employers sponsor. The idea is that these plans should be consistent across the country, they shouldn't be overburdened by states, and there should be some accountability so that employees know what kind of benefit plans they have. One of the most important and interesting concepts there is the concept of preemption, saying that the federal government's laws trump the state laws. ERISA has one of the broadest preemption clauses of any modern federal statute, which means that there have been a whole host of cases trying to figure out at what point ERISA stops and state law begins. I often liken ERISA to the fat woman singing: when she sings, everything's over. Before Gobeille, the case I'm going to talk about, the standard was that a state law was preempted if it had a connection with or reference to such an employee benefit plan, which was still fairly broad but had this idea of a strong connection: you're really touching something that ERISA should be regulating. One thing to note is that ERISA does have a reporting requirement, but its reporting requirement is really limited to the financial health of the employee benefit plans, including health insurance plans, rather than the APCD's reporting requirements, which look at health care data explicitly. So now that I have gone through those two concepts, let's talk about the Supreme Court case that inspired me to write this chapter. The central issue in this case is whether ERISA preempts the Vermont state law requiring all health insurers, including those that are self-funded by employers, to report claims and health care service data to the state in order to flesh out the data set for the APCD. Specifically, the question was whether ERISA preempted
the Vermont statute in establishing a unified health care database and requiring health insurers to report health care claims data that may not necessarily be financial reporting data. Again, I emphasize, ERISA wants you to report financial data to look at, overall, is this plan healthy, is it going to be around to provide the benefits employees were promised, whereas the APCD could in some ways care less if your health insurance plan was around tomorrow; it just wants to know what your health insurance plan has already paid for. So the Supreme Court concluded that ERISA did indeed preempt Vermont's APCD. It said the Vermont law had a connection with the ERISA plan, that it tried to govern a central matter of plan administration, in this case reporting, disclosure, and record-keeping, and that it interfered with ERISA's goal of having nationally uniform plan administration.
One thing I want to highlight, and we're going to talk about it under potential solutions, is that Justice Breyer wrote a concurrence in which he noted that this decision would cause serious administrative problems. So he said this is good law but perhaps not good public policy, and he suggested that the states work with the Department of Labor, which is the agency charged with issuing regulations around ERISA, as well as the Department of Health and Human Services, to remedy this issue. Justice Ginsburg wrote the dissent, saying that Vermont's law should not be preempted by ERISA, and that was because she felt the record collections were so different that having the APCD and requiring plans to report this information did not impose a substantial burden on ERISA, because they elicited different information and served distinct purposes. So now I've introduced you to APCDs and ERISA and explained the collision between the two. But you might be asking, okay, why is this a concern? Why should I care about this? How does this relate to big data? Let's go back to the sexy metaphor about teenage boys. Well, first let's talk about the overall impact. This is a map of what APCD land looked like in 2016, right before the case came out. We had about 18 states with APCDs and about 12 interested in developing them, and a lot of the reason that the states wanted to do this work is because everybody's talking about the runaway cost of health care, right? You read a million articles on that, and states felt, if we have an APCD, we can know what is being spent, and that transparency is the first step to controlling costs. Now, in 2018, the map looks a lot grayer, and the gray is for states that have decided, I'm not really interested in doing anything with APCDs. So already you have a lot of states that were going to try to do these big, uniform data sets that would have been really interesting to play around with, and walked
it back. I also want to say this map is a little misleading, I think, because the blue states are states that have APCDs, but after the Supreme Court case they don't have the complete data; they're so far from being a universal data set. To give you a sense of it, in 2017 about 151 million people had employer-sponsored insurance, of which 60% were self-funded by their employer. So these are the plans that ERISA protects and that, after Gobeille, we couldn't collect data on. In Vermont alone, we know that the ruling eliminated about 20% of the total population from the data set. That's a really big chunk. Furthermore, the people eliminated are not a consistent group. In Vermont, we know that a higher percentage of non-elderly women, 59%, as compared to non-elderly men, 55%, are covered in these employer-sponsored plans. So we're losing more women, and our data set is getting less representative, and the less representative our data set is, the worse the conclusions that we can draw from it are. We also have a problem when we have data sets now that are composed largely of public payer data, because Medicaid and Medicare do cover a ton of people. You can get some really interesting data out of that population alone, but
Medicare obviously skews very old, and Medicaid skews towards lower socioeconomic status, and so that's very different health care data than when you include the employer-sponsored people. So here I'm going to give you the pitch on why APCDs were really useful and exciting. Here are some of the types of things that states were trying to do before the Supreme Court case. Colorado was trying to use its APCD for price transparency, to try to increase competition for maternity services and hip and knee replacements, to improve those prices and bring those costs down. In New England, a researcher used an APCD to compare outcomes of services for children enrolled in Medicaid and children in commercially insured plans, which now you can't necessarily do because you're missing that chunk, and they found that there were serious problems and differences in outcomes between these two populations, which might be common sense but is also very important to study. Lastly, the opioid epidemic is obviously a big issue in our country; it was in 2016 and unfortunately continues in 2018. In Maine, people were using data from APCDs to try to see what sort of patterns could indicate that somebody either had a substance abuse issue or was at risk of developing one, so that they could flag those people for earlier and earlier interventions. These are all worthy topics. Some other issues I want to raise: when you try to do big data work with APCDs when they're incomplete data sets, you lose the ability to catch a lot of things. Ameet mentioned this drug earlier, but in a large enough data set you can see these really rare but important side effects that can't be caught as you reduce the number of data points you have in your data set. So just from a sheer numbers standpoint, we're losing something. There's also the issue of the validity of the data after all
of these employers say, no, I want to keep my information proprietary, especially when you're trying to do research on the impact of socioeconomic status on health and health outcomes. Lastly, we're losing a key resource for gauging health status and needs. For example, researchers in Tennessee tried to see what the outcome of using a particular antibiotic would be, and they found that there were some serious health outcomes, enough to say you should not be using this antibiotic to treat this condition. Danish researchers tried to recreate this study using their equivalent of an APCD, and there, because of their health care structure, it truly was universal across all economic classes, and they didn't find this outcome. They theorized it was because the Tennessee researchers had used Medicaid data only. Because of the confluence of socioeconomic status and health and health outcomes, they were only seeing the more vulnerable as opposed to the general population. So hopefully at this point I've convinced you guys that having incomplete data sets is actually a big problem for everybody, and not just for the kind of people who like to write chapters in books titled Big Data in Health Care. Here I want to suggest a couple of potential solutions. Now, I referenced giving this talk in 2016 at the big data conference, and when I went back to my slides to update them in 2018, I found that I needed to do virtually no updating of potential solutions, because nothing really had happened. I think this is again the issue that sometimes an important development happens in the law, but for whatever reason the public policy community isn't as invested in it, and so not much happens. The biggest potential solution, and the one that Justice Breyer noted, was this: he said the Department of Labor is responsible for ERISA, and they should issue regulations requiring the collection of this data. This did not happen. The
Department of Labor essentially did what I do when a friend that I don't really like texts me to ask if I want to go out to dinner: they kind of said yes and let it die. The Department of Labor in 2016 issued proposed regulations to collect this data, and there were problems with those regulations. The data they wanted was not as complete as what the states were collecting, and they said, let's collect it on an annual basis as opposed to a monthly basis. A lot of people wrote in; there were about 300 comments, which is pretty good for something that is a little bit on the drier side. And then they did absolutely nothing with it. They still have done nothing with it. It is not on their list of priorities, and I think they rightly say, we don't really know health care, we don't want to get in the business of collecting health care data, and there just isn't the political appetite to do it. There
are also some other issues. If the Department of Labor tries to collect this data, are they able to waive preemption and say, states, just collect it as you want? Would that open them up to another round of follow-up litigation? There's the issue that if they collect the types of data that fall under their purview, which is just the self-funded plans, okay, then they have an incomplete data set and the states have an incomplete data set, and can we necessarily hook them up to create one whole one that is going to be as good as if one entity had collected that data in the first place? And then again, there's just the political appetite for it; it's not their priority. Of course, private industry has tried to step in here, and some health care payers are volunteering their data. Aetna, Humana, and UnitedHealthcare have put together a database that some researchers have used, but Blue Cross Blue Shield, which is a major insurer, has not volunteered their data. There's also the issue that when you volunteer, you know, you volunteer what you want, and it might not necessarily be what's needed. There's a group called the APCD Council, which is trying to spearhead a common layout, to say all states should use this layout in order to collect data, to make it more uniform and to ease the administrative burden. That's a step in the right direction, but without regulations or a real big push for people to adopt that data sharing, that template remains just a template. So I want to highlight that saying voluntary contributions are the solution kind of stops before you tweak it to make voluntary contributions appealing enough to do on a widespread basis. Some things that we need to think about are tax incentives to get health insurers to do this; incentives to third-party administrators who administer these plans, again to say we really should be sharing our data; and offering incentives to employers. But
then we also need to talk about what sort of changes to data privacy regulations and liability regulations we need to have to allow these entities to disclose information that
might otherwise be tied up in fears of violating HIPAA or be tied up in non-disclosure agreements. And so I wish I could end on a slide that said, things we have accomplished since 2016 to make sure that we have good-quality health care data. We haven't. On the other hand, that leaves an opportunity to really improve this and make sure that we continue to have the kind of big data sets that we need in order to do some of the interesting work described in our book.
Hopefully you got something interesting out of the exchange. If you want to share what you got after that exchange, that'll be great too, but we have the first question, I think, over here.
Adrian Gropper, Patient Privacy Rights. I feel to some extent that I'm listening to a bunch of astronomers before Copernicus who are just wondering why the planets, these things in the sky, move the way they do, in the sense that both the law and everything around it that we're talking about here is focused on institutions being where all the data processing, all the information technology, rests. And the individuals, the patients, the people, do not have agency, do not control technology, do not have ways of automating the authorization for how data is used for all these wonderful purposes. That, to me, seems like it's rapidly becoming obsolete as a concept, because as Facebook and everybody else aggregate these data sets in various ways, we realize how valuable they are. Roche paid two thousand dollars per patient for cancer data on a million patients earlier this year, and yet Moore's law is driving the technology down to where, compared to two thousand dollars, controlling this data yourself is trivial. How do we get to this next phase of technology and law?
Who wants to take the first step?
So I will say, I think Facebook is a really good example, not only of the dangers of what happens when we think that companies
and entities own people's data, as opposed to people themselves, but also in terms of the market forces, right? We all know Facebook has some really terrible privacy practices, but how many of you are pulling your Facebook and Instagram profiles and joining some social network that advertises itself as having better privacy practices? I think I've seen studies indicating that people say they value it, but they have yet to act on it. And I think some of that next stage of the data revolution you talk about will really only happen when people start saying, I will only see providers that give me control and ownership of the data, or I will only use apps that explicitly give me ownership of the data.
And I think you raise a great point, in the sense that I was focusing in on a discrete situation of research, but when we're talking about HIPAA, we're talking about a constrained
area in terms of a universe in which a lot of commercial data sharing is taking place and a lot of, frankly, exploitation is taking place. There are definitely solutions that can happen there, and I think one framework that Urs mentioned that is helpful to think about is, how do we operate within the existing law, but how do we also now learn from the European experience in terms of this wider realm of data privacy and agency that we are going to give individuals in terms of how their information is used?
Okay, we've got a question over here, and I've got to say, before you ask your question, I'm sure it's going to be excellent, but that outfit is outstanding. Okay, go ahead.
Thank you. I have a question for Ameet. You showed a very interesting slide where there were several streams of data: medical records, wearables, claims, all of that. One of the things that I was actually wondering about that slide is, where is medical error data? In the U.S., medical error is the third cause of patient death. In 1999, the National Academy of Sciences published a report noting that around 98,000 people died in the U.S. because of medical error, and two years ago a researcher from Johns Hopkins University School of Medicine actually updated the report, showing that the actual count is higher, so it's not 98,000 but actually between 250,000 and 400,000 deaths because of medical error. So, I mean, those numbers are very big, and they really remind us how inefficient health care in the U.S. is, and also how patient death due to medical error is an under-recognized epidemic. So how do we integrate that within that slide?
Excellent question. I think that one of the things to recognize about that 98,000, that Institute of Medicine report, and then that more recent Hopkins study, is that they took a lot of heat among the medical community, because a lot
of that is based on extrapolations from very small samples. The problem of medical error is huge; we don't honestly know specifically how huge it is. And part of the problem with the ability to study that question systematically, using representative data, is access to these single streams of information. Some of that information could come from EHR data, but you need to have EHR data from hospitals that are academic medical centers that have great EHR systems, plus rural medical centers that don't. And how do you start doing this? As we're getting into a more digital age, we need to be able to combine those data streams in an effective process, and part of the barrier is not necessarily what HIPAA says but the way that people interpret HIPAA and the risk-averse nature that a lot of people have. So I think it's an important recognition that medical error is one of those types of things that could benefit greatly from greater data sharing.
If I could just add a point on this: one error you did not make, but that people sometimes do make, is conflating medical error with medical malpractice. The best studies we have, by my old colleagues Michelle Mello and David Studdert, show that actually the medical malpractice system does a terrible job in both directions: it claims things are medical error that are not medical error, and it misses many things that are medical error by labeling them no recourse available. And the other thing I'll say, maybe, is that even if you could do this well and measure it through big data, there'd be a question about what you do with that data. For people who think that patients care about that data, I've got to say, the results from what we see about care quality and attempts to communicate this report. C