Data Science for Medical Decision Making

2 credits - Spring Term

Have you ever Googled a health-related question and been dumbfounded by the hits? Gotten a lab test result and wondered whether it applies to a person like you? Wondered what the “odds ratio” for a genetic variant you inherited actually means? Explanations of why we are who we are, what diseases we might get, and why some of us are at risk are often unsatisfactory. It’s time to become an active consumer: in this course, we will develop skills in querying large health data streams to make informed decisions through the lens of data science. This course will survey current data sources and methodological approaches for conducting integrative high-throughput investigations that merge genomic, exposomic, and phenomic datasets to discover new associations with disease and health. Students will be introduced to statistical decision theory and to how modern data science and machine learning approaches can help improve rational medical decision making. Students will be encouraged to find publicly available data (e.g., CDC, NIH dbGaP data) and formulate an original research project for submission to a journal or as a proceedings article.

Learning Goals:

1. Develop basic skills in biomedical data science, including R/RStudio, Python, and cloud-based infrastructure.

2. Understand how decision theory and machine learning can enhance clinical care.

3. Develop your own prediction algorithms that integrate exposomic, genomic, and phenomic data.

4. Execute data-driven methods on current-day computing clusters.

5. Interpret statistical estimates and biomedical findings in the published literature (and the lay press).

Open to all Harvard and MIT students. To enroll, email a CV and a brief statement of interest to Rebecca Fitzhugh.
