Biostatistics – Biomedical Informatics – Big Data (B3D)

Co-organized by the Department of Biostatistics at the Harvard T.H. Chan School of Public Health and the Department of Biomedical Informatics at Harvard Medical School, the Biostatistics - Biomedical Informatics - Big Data (B3D) Seminar is a series of research talks on statistical, computational, and machine learning methods for analyzing large complex data sets, with a focus on applications in biomedical science and public health, including:

  • Genetics and genomics
  • Epidemiological and environmental health science
  • Comparative effectiveness research
  • Electronic medical records
  • Digital health
  • Neuroscience
  • Social networks

The goal of the seminar is to provide a forum for brainstorming and exchanging ideas and to promote interdisciplinary collaboration among researchers from a variety of disciplines, such as biostatistics/statistics, biomedical informatics, computer science, computational biology, biomedicine, public health, social sciences, and other related areas. The seminar will feature local, national, and international speakers who are leaders in their fields.

Selected Mondays

Minot Room / Lahey Room 
5th Floor
HMS Countway Library (scan ID or sign in at desk if no ID, take elevator to fifth floor)

For complete details, visit

Recordings of talks will be made available on our YouTube channel.

B3D Mailing List – sign up to receive emails on the B3D Seminar Series and other news and events on big data and data science.

#HarvardB3Data - join the conversation via Twitter


Monday, February 10 | LAHEY ROOM | *2:15-3:15pm* 
David Van Valen, MD, PhD
Assistant Professor of Biology and Biological Engineering 
California Institute of Technology (Caltech)

Single Cell Biology in a Software 2.0 World
The study of living systems is challenging because of their high dimensionality, spatial and temporal heterogeneity, and high degree of variability in the fundamental unit of life – the living cell. Recent advances in genomics, imaging, and machine learning are enabling researchers to tackle all of these challenges. In this talk, I describe my research group’s efforts to use machine learning to connect imaging and genomics measurements, enabling high-dimensional measurements of living systems. We show how deep learning-based image segmentation enables the quantification of dozens of protein markers in spatial proteomics measurements of breast cancer and describe a new method for deep learning-based cell tracking which will enable information-theoretic measurements of cell signaling. We also relay our efforts in deploying deep learning models in the cloud for large-scale deep learning-enabled image analysis. Lastly, by using single-cell imaging as the readout for a genetic screen, we show how we can identify deep connections between host cell energetics and viral decision making in a model system of viral infections.