DBMI Faculty on Path to Unify US Epidemiological Surveys and More
The NIH Office of Data Science recently hosted an all-hands grantee meeting for the country's BD2K Centers of Excellence. Among the year-one highlights presented by Zak Kohane, PI for PIC-SURE, was an accessible and user-friendly representation of data collected by the CDC through its National Health and Nutrition Examination Survey (NHANES).
NHANES provides health metrics, collected from 1999 to 2012, on a large and representative sample of the United States population. These include physiological and clinical parameters (e.g., serum cholesterol, glucose), personal environmental biomarkers (e.g., blood mercury and infectious agents), physical characteristics (e.g., head circumference), and self-reported behaviors (e.g., diet, physical activity).
These data have proven valuable in demonstrating the effect of removing lead from gasoline (a steep decline in blood lead levels since the legislation) and in providing reference intervals for child growth. They have also enabled the first “environment-wide association studies” (EWAS), linking over 250 environmental biomarkers with disease phenotypes such as diabetes. Unfortunately, the data have been difficult to download, query, analyze, and integrate with other datasets.
Using the i2b2/tranSMART platform, however, the PIC-SURE team has created an application programming interface (API) that makes the NHANES data easier to access through a user-friendly web app.
“As a prelude to some of our joining of social media, with environmental data, with clinical data,” said Kohane, “we’ve taken an amazing public resource—namely, the NHANES study—and created an online instance of this that you can now at your leisure explore, slice and dice, and download the pieces that are relevant to you.”
Video of the BD2K all-hands meeting is available for viewing.