Overview
The goal of the research roundtables is to foster discussion between the participants and senior researchers in the field on several topics of high relevance in the ML4H community. Each roundtable will have a small group of senior researchers and practitioners who are experts in the selected topics and a few junior chairs who will lead the discussion on open problems in ML4H which were crowdsourced from the community.
How to participate
We will host virtual and in-person roundtables, both of which will take place at 13:00 CST on November 28th, 2022. The virtual roundtables will take place on Zoom (links will be provided in Gather.Town). The session will be split into two parts, with a 5-minute break in the middle during which participants can move to a different roundtable. Participants are encouraged to pick at most two roundtables to join, engage with the discussion points proposed by the junior chairs, and ask the senior chairs further questions. To avoid disrupting conversations, participants should move between roundtables only during the break.
Virtual Research Roundtables
1. How can machine learning help prevent and respond to infectious disease outbreaks? Where are we now?
We can trace the origins of infectious disease modeling to Bernoulli's smallpox models in 1760. Modeling became increasingly sophisticated in the 20th century with the mass action law, the Kermack-McKendrick epidemic model, stochastic models, and other advances.
In the 21st century, we have seen the rise of machine learning techniques, powered by a surge in data collection and increasingly cheap computing resources.
Besides modeling the spread of diseases, machine learning has been applied to other areas, such as diagnosing, triaging, and predicting outcomes at the patient level, accelerating the study of protein structures for vaccine development, reducing the time to develop tests, detecting and suppressing misinformation, and predicting future outbreaks caused by complex environmental and climate changes.
In this roundtable we will discuss "how can machine learning help prevent and respond to infectious disease outbreaks?"
- Senior chairs: Megan Coffee
- Junior chairs: Christian Garbin
2. Are our ML models really making an impact in the hospital? What do care-givers and clinicians want and what is still missing?
Machine learning models have shown tremendous promise in advancing clinical care. However, adoption in clinical practice has been slow due to the numerous barriers to implementation. ML models can impact healthcare in different ways, such as relieving workload through medical note summarization, augmenting clinical workflows through intensive care unit alert systems, and providing high-performing diagnostics and risk stratification. Despite the promise of these methods, concerns about their clinical utility hinder widespread adoption in hospital systems. Further, despite FDA approval of many clinical ML models, few randomized controlled trials have been conducted to evaluate the utility of these interventions. In this roundtable, we will discuss the current landscape of the use of ML models in hospitals and the impact that they have had. We will also discuss the remaining opportunities and challenges from the perspective of a clinician.
- Senior chairs: Collin Stultz
- Junior chairs: Elizabeth Healey
3. How to effectively integrate multiple data sources for machine learning applications in healthcare?
Healthcare data are inherently multimodal and come from multiple sources. For example, medical datasets include electronic health records, medical imaging, lab tests, genetics, and patient demographics gathered from various sources, while Internet of Things (IoT) devices such as watches collect fitness and health statistics. Consolidating such datasets is an essential step for data analysis. This research roundtable will discuss the challenges, opportunities, and possible solutions of healthcare data integration for machine learning applications.
- Senior chairs: Mert R. Sabuncu
- Junior chairs: Heejong Kim
4. How to evaluate the quality of healthcare data and labels prior to applying machine learning?
Understanding the quality and relevance of a dataset used to make predictions in healthcare is a vital step in the research or algorithm development process. In this session we'll discuss two issues that commonly cause pitfalls before model building: picking the right label, and understanding whether selection is present in the training dataset.
We often use machine learning to predict important but hard-to-define concepts like health or healthcare needs. When empirical definitions are hard to determine, an obvious next step is to use a proxy label. One example from healthcare is using patient cost as a label when we want to understand future healthcare needs. Ziad Obermeyer's work has shown that not only is cost a bad label for health, but it can also be biased. His research has shown how an algorithm that was deployed by a large healthcare organization systematically predicted lower future health needs among Black patients compared to White patients, in part because the Black patients had systematically lower past spending. Low past spending, in this case, was correlated with higher health needs, not lower.
In addition to selecting the right label, it is also crucial to evaluate whether the chosen dataset has selective measurement of the outcome. Selection is rampant in healthcare, where much of our data exists because a human made a decision. Suppose we try to predict a patient's risk for heart attack: if the patient was tested for heart attack, or had a heart attack, this is straightforward. However, we do not observe test outcomes among patients who were not tested for heart attack, and the decision to test is made by a human who may be biased in various ways.
- Senior chairs: Ziad Obermeyer
- Junior chairs: Claire Boone
In-Person Research Roundtables
1. Are our ML models really making an impact in the hospital? What do care-givers and clinicians want and what is still missing?
Machine Learning (ML) has found numerous applications in healthcare - inspiring conferences, degree programs, research programs, and entire institutions at the intersection of healthcare and ML. On one hand, there is great promise for improving healthcare outcomes and reshaping health systems. On the other, there are equal, if not greater, potential harms that can arise if ethics, fairness, and privacy considerations are ignored. We draw attention to the translation gap between theoretical/proof-of-concept ML research and clinical integration in the hospital, posing the following questions: What are the common problematic simplifying assumptions? How can we ensure various stakeholders are engaged in this pipeline? Who are these technologies helping, and more importantly, who are they hurting? Recent work in explainability, data visualization, and human-computer interaction has made some progress towards what care-givers and clinicians want. However, many voices in the healthcare system may not be prioritized - such as nurses, allied health care workers, and especially patients. In the discussion that follows, we explore how ML fits into the healthcare pipeline, who gets to contribute, and what the key areas of focus are moving forward.
- Senior chairs: Roxana Daneshjou, Siyu Shi
- Junior chairs: Jennifer Chien, Sujay Nagaraj
2. Evaluation of healthcare data prior to applying ML, e.g., representation analysis, annotation quality, OOD, clusters of IDs
Healthcare datasets often exhibit many challenging properties, including non-random missingness processes, label noise, and high dimensionality. These datasets also often reflect societal biases, and algorithms trained using this data may exacerbate disparities. Each of these properties is further complicated when considering scenarios where there may be dataset shift. Given these challenges, it is important to systematically evaluate datasets for the existence and severity of such pathologies prior to applying any machine learning methods. A related task is interrogating the robustness of methods to dataset shifts that affect these pathologies. Several types of approaches exist for this kind of data exploration, including dimensionality reduction techniques for visualizing clusters of data, evaluating and improving label quality using active learning, and tests for missingness assumptions. The purpose of this session is to discuss existing methods used by the community for exploration of healthcare data, to explore which pathologies are rarely addressed or difficult to interrogate using existing methods, and to consider how we can mitigate common pitfalls in the analysis of large healthcare datasets.
- Senior chairs: Nicola Pezzotti, Pin-Yu Chen
- Junior chairs: Neha Hulkund, Shreyas Bhave
3. How to ensure generalizability of ML in healthcare?
Despite being deployed widely in practice, predictive models in healthcare can fail to perform well across clinics, patient populations, and time. Two recent examples illustrate the challenge: the University of Michigan Hospital deactivated its sepsis prediction model in April of 2020 due to COVID-related changes in the distribution of patients, and the widely deployed Epic Sepsis Model was found (in an external validation study) to dramatically underperform relative to the claims of the developer. Meanwhile, a recent study found that only 23% of ML-based healthcare papers used multiple datasets. One definition of generalizability is the ability of a model to perform well on data from a cohort of patients independent from its training data, whether from a different hospital, a different demographic subpopulation, or a different point in time. In this roundtable, we will discuss different aspects of what it means for a model to generalize well, the challenges inherent in building generalizable models, and potential paths forward.
- Senior chairs: Emmanuel Candès, Stephen R. Pfohl, Edwin Fong
- Junior chairs: Michael Oberst, Amruta Pai
4. How do we inject domain knowledge into DL models, in particular when not much data is available?
When sampling is difficult or expensive, as is often the case in medicine, biology, language, and the social sciences, we typically obtain only a small dataset relating to any particular modeling or prediction task. In these situations, we can provide domain knowledge as an inductive bias to inform learning tasks and restrict the possible solution spaces of models, increasing our sample efficiency in these already sample-sparse domains. Knowledge graphs are a common example, but this representation is often unavailable, and knowledge in some domains may not align well with a graphical representation. In this session, we will focus on why domain knowledge is important (and when it might not be), and discuss the many ways of injecting domain knowledge when it is relevant. Our panelists represent diverse backgrounds in natural language processing, bioinformatics, and computational medicine, and each has unique experience utilizing domain knowledge. We aim to provide a comprehensive overview of existing approaches and a lively discussion about future research on injecting domain knowledge into deep learning models.
- Senior chairs: Aakanksha Naik, Ben Lengerich, Ying Xu, Eran Halperin
- Junior chairs: Caleb Ellington, Wisdom Ikezogwo
5. How to effectively integrate multiple data sources (e.g., EHR, images, genomics) for ML applications in healthcare?
Multimodal data sources provide unparalleled opportunities to design and learn models with intelligent capabilities. Often, individual decision points situated in complex healthcare decision processes are modeled with the aim of improving and augmenting single outcomes. However, from an engineering perspective, attaching intelligent solutions to broader decision systems can increase their complexity, which in turn increases the risk of fragmented user experiences. This may even negatively impact the cognitive load of the domain expert consuming the output. Instead, a potential avenue for research is to move the AI solution upstream with the aim of modeling entire decision processes end-to-end. A side effect of this is that the AI may need to be able to (1) consume multiple data modalities and model multiple tasks jointly as the scope is broadened or (2) translate one data modality into another. Successful implementations will ultimately lead to a significant reduction in engineering complexity and more unified user experiences.
- Senior chairs: Jonathan Bidwell, Dominik Dahlem, Mark Sendak
- Junior chairs: Jason Dou
6. How can we utilize foundation models (very large pre-trained models) for healthcare?
A recent paradigm in machine learning is the concept of foundation models: very large pre-trained models that can be used as a base model for applications. Such models have shown particularly impressive performance on zero- and few-shot generation tasks in text and vision. Given the 'emergent abilities' of such models, they hold remarkable promise for transforming the way we practice medicine and deliver care. For example, large language models (LLMs) have recently shown potential on tasks including clinical information extraction and medical exam question answering. However, there are practical questions and challenges that arise due to the high stakes of the clinical setting. In this roundtable session, we will discuss "How can we utilize foundation models (very large pre-trained models) for healthcare?"
- Senior chairs: Payel Das, Zachary C. Lipton, Byung-Hak Kim
- Junior chairs: Monica Agrawal, Changye Li
7. Using ML for population health
Population health, in general, can be defined as the health outcome distributions within and across populations. Research in this area requires taking into account various cultural, social, and environmental factors and investigating their effects on the health of communities. To this end, machine learning is emerging as a possible way to automate complex tasks in population health that would otherwise require substantial human labor. Machine learning uses these health outcomes to understand patterns of health determinants in order to identify high-risk groups and predict future disease burdens. These models can be of particular interest to policy-makers recommending public health policies and interventions. In this roundtable, we will discuss the challenges in generalizability and possible solutions. We will talk about the design of evaluation metrics that push for clinically effective and cost-effective models, and how to ensure benefits are distributed more uniformly across socioeconomically and geographically diverse groups.
- Senior chairs: Nathaniel Hendrix, Dimitrios Spathis
- Junior chairs: Peniel Argaw, Arpita Biswas
8. How to incentivize creation or publication of new data collections and facilitate international collaboration?
While the potential for the application of machine learning in health is enormous, so are the challenges. Crucially, ML models require large amounts of training data, yet making this data available while respecting patients' privacy has proven to be an immense conundrum. Furthermore, while we are constantly reminded that health issues know no borders, whether it be outbreaks of infectious diseases or rare genomic disorders, the jurisdictions governing health data most certainly do. As such, the global sharing of health data for the benefit of the citizens of the world remains elusive.
This roundtable will distil and articulate the exact bottlenecks that currently block progress towards greater, and responsible, sharing of health data, both in general and internationally. Building on the precisely articulated bottlenecks, the roundtable will discuss and propose potential ways forward.
International collaboration in collecting and sharing large medical datasets is key to improving the clinical impact of machine learning (ML) in the healthcare domain. However, barriers such as privacy concerns remain. This roundtable will discuss the challenges and suggestions to incentivize creation or publication of new data collections and facilitate international collaboration.
- Senior chairs: Jun Seita, Bastiaan Quast
- Junior chairs: Mehak Gupta, Xinhui Li
9. Post-approval monitoring and validation of AI systems in health care
The rapid development and application of AI systems in healthcare have raised a wide range of concerns about their reliability. Even if an AI system is approved, it can become brittle when deployed in real-world scenarios, for instance, due to distribution shifts. Therefore, post-approval monitoring and validation is a critical quality assurance process to ensure that these systems deliver accurate medical predictions in diverse real-world scenarios. In this roundtable, we will discuss some of the challenges regarding the post-approval monitoring and validation process and how to address these issues from a policy, technical, and data perspective. In addition, we will discuss the best way to establish collaboration between clinicians, ML experts, and the systems themselves to communicate these issues.
- Senior chairs: Berkman Sahiner, Anna Decker, Harvineet Singh
- Junior chairs: Marta Lemanczyk, Yuhui Zhang