Biostatistics – Biomedical Informatics – Big Data (B3D)

Co-organized by the Department of Biostatistics at the Harvard T.H. Chan School of Public Health and the Department of Biomedical Informatics at Harvard Medical School, the Biostatistics – Biomedical Informatics – Big Data (B3D) Seminar is a series of research talks on statistical, computational, and machine learning methods for analyzing large, complex data sets, with a focus on applications in biomedical science and public health, including:

  • Genetics and genomics
  • Epidemiological and environmental health science
  • Comparative effectiveness research
  • Electronic medical records
  • Digital health
  • Neuroscience
  • Social networks

The goal of the seminar is to provide a forum for brainstorming and exchanging ideas, and to promote interdisciplinary collaboration among researchers from a variety of disciplines such as biostatistics/statistics, biomedical informatics, computer science, computational biology, biomedicine, public health, social sciences, and other related areas. The seminar will feature local, national, and international speakers who are leaders in their fields.

Selected Mondays
*All lectures will be virtual until further notice*


HMS Countway Library (scan ID or sign in at desk if no ID, take elevator to fifth floor)

For complete details, visit https://www.hsph.harvard.edu/biostatistics/b3d-seminar/

Recordings of talks will be made available on our YouTube channel.

B3D Mailing List – sign up to receive emails on the B3D Seminar Series and other news and events on big data and data science.

#HarvardB3Data - join the conversation via Twitter

Upcoming Virtual Lectures

September 30
1:00-2:00 pm
Eran Segal, PhD
Computational Biologist
Weizmann Institute of Science 

Personalizing Treatments using Microbiome and Clinical Data
Accumulating evidence supports a causal role for the human gut microbiome in obesity, diabetes, metabolic disorders, cardiovascular disease, and numerous other conditions. I will present our research on the role of the human microbiome in health and disease, ultimately aimed at developing personalized medicine approaches that combine human genetics, microbiome, and nutrition.

In one project, we tackled the personalization of human nutrition, using a cohort of over 1,000 people in which we measured blood glucose responses to >50,000 meals, as well as lifestyle, medical and food frequency questionnaires, blood tests, genetics, and gut microbiome. We showed that blood glucose responses to meals vary greatly between people even when they consume identical foods; devised the first algorithm for accurately predicting personalized glucose responses to food based on clinical and microbiome data; and showed that personalized diets based on our algorithm successfully balanced blood glucose levels in prediabetic individuals. Using the same cohort, we also studied the set of metabolites circulating in the human blood, termed the serum metabolome, which contains a plethora of biomarkers and causative agents. With the goal of identifying factors that determine levels of these metabolites, we devised machine learning algorithms that predict metabolite levels in held-out subjects. We show that a large number of these metabolites are significantly predicted by the microbiome and unravel specific bacteria that likely modulate particular metabolites. These findings pave the way towards microbiome-based therapeutics aimed at manipulating circulating metabolite levels for improved health.

Finally, I will present an algorithm that we devised for identifying variability in microbial sub-genomic regions. We find that such Sub-Genomic Variations (SGVs) are prevalent in the microbiome across multiple microbial phyla, that they are associated with bacterial fitness, and that their member genes are enriched for CRISPR-associated and antibiotic-producing functions and depleted of housekeeping genes. We find over 100 novel associations between SGVs and host disease risk factors and uncover possible mechanistic links between the microbiome and its host, demonstrating that SGVs constitute a new layer of metagenomic information.