Data Science for Medical Decision Making
2 credits - Spring Term
Have you ever Googled a health-related question and been dumbfounded by the hits? Gotten a lab test result and wondered whether it applies to a person like you? Wondered what the “odds ratio” for a genetic variant you inherited actually means? Explanations of why we are who we are, what diseases we might get, and why some of us are at risk are often unsatisfactory. It’s time to become an active consumer: in this course, we will develop skills in querying large health data streams to make informed decisions through the lens of data science. The course surveys current data sources and methodological approaches for conducting integrative, high-throughput investigations that merge genomic, exposomic, and phenomic datasets to discover new associations with disease and health. Students will be introduced to statistical decision theory and to how modern data science and machine learning approaches can help improve rational medical decision making. Students will be encouraged to find publicly available data (e.g., from the CDC or NIH dbGaP) and to formulate an original research project for submission to a journal or as a proceedings article.
Course objectives:
1. Develop basic skills in biomedical data science, including R/RStudio, Python, and cloud-based infrastructure.
2. Understand how decision theory and machine learning can enhance clinical care.
3. Develop your own prediction algorithms that integrate exposomic, genomic, and phenomic data.
4. Execute data-driven methods on modern computing clusters.
5. Interpret statistical estimates and biomedical findings in the published literature (and the lay press).
Prerequisites: This course requires familiarity with programming (e.g., experience with R or Python, or the equivalent of Harvard CS50) and with working in a Linux environment (i.e., using the command line).