ML4H 2021

Specialized Healthsheet for Healthcare Datasets

Negar Rostamzadeh, Subhrajit Roy, Diana Mincu, Andrew Smart, Lauren Wilcox, Mahima Pushkarna, Razvan Amironesei, Jessica Schrouff, Madeleine Elish, Nyalleng Moorosi, Berk Ustun, Noah Broestl, Katherine Heller

Abstract: Machine learning (ML) approaches have shown promising results in a variety of healthcare applications. Data plays a vital role in the development of ML-based healthcare systems that directly impact human lives. Many of the ethical issues with healthcare applications of ML can be traced back to structural inequalities that are reflected in the way we collect and process data. Developing a guideline for improving documentation practices in the creation, use, and maintenance of ML healthcare datasets is therefore of critical importance. In this work, we introduce Healthsheet, which adapts and expands the original datasheet questionnaire for healthcare-specific applications. We address the collection and use of sensitive attributes, dataset versioning and maintenance, privacy, data collection context, and health-related devices. As part of the development process of Healthsheet, we worked with three publicly available healthcare datasets as our case studies, each with a different type of structured data: Electronic Health Records (EHR), multiple sclerosis (MS) clinical trial data, and smartphone-based performance outcome measures.


© 2021 ML4H Organization Committee