Business Title

Senior Data Engineer

Auto req ID

47698BR

School/Unit

Harvard Medical School

Location

USA - MA - Boston

Job Function

Information Technology

Time Status

Full-time

Department

Biomedical Informatics

Salary Grade

058

Union

00 - Non Union, Exempt or Temporary

Duties & Responsibilities

At Harvard Medical School, the goal of our efforts is to transform healthcare and research to usher in a new age of individualized data-driven medicine. We work in a constantly evolving environment, iterating through prototypes with our experts to create research tools and methods. In this rapid and open space, individuals will be able to exercise creativity in developing solutions to novel problems.

We are looking for a Senior Data Engineer to join the Avillach Lab team and help us manage data flow to our cutting-edge research systems. We seek an individual to work on our multidisciplinary team of data scientists, medical doctors, and developers. This individual’s primary focus will be working on the database team to design the data pipelines that facilitate ETL, and occasionally working with the application team, data stakeholders, and researchers to understand their data. When required, the individual will make changes to an existing data loading pipeline and database schema. This is a hands-on role, and development and coding skills are required.

Principal Duties and Responsibilities:

  • Work as a member of the database team to manage databases and big data clusters
  • Develop and debug scripts and procedures to process data from different clinical and computational sources
  • Expand the existing schema design to handle new data formats
  • Work with application developers to test the data accuracy and quality
  • Work independently and self-manage different aspects of the project lifecycle
  • Document any software written, both to inform end users of its functionality and to support the developers who maintain it
  • When needed, work closely with researchers to understand data derived from both healthcare providers and computational methods
  • Assist in developing an ontology of medical terminology to represent the data within our application

Basic Qualifications

  • 5 years' experience working on a data team (preferably an ETL/database team)
  • Experience working with large datasets
  • Programming experience in Java, Python or SQL
  • An advanced degree may count towards experience

Additional Qualifications

  • Bachelor's degree in computer science or engineering preferred
  • Experience automating ETL data pipelines is a strong plus
  • Experience with SQL and RDBMS (Oracle, Postgres, MySQL)
  • Understanding of dimensional data modeling (star schema)
  • Knowledge of NoSQL databases (DynamoDB, MongoDB) is a plus
  • Familiarity with cloud stacks (AWS, GCP, Azure, etc.)
  • Code debugging and performance troubleshooting skills
  • Experience with languages such as Python, Java, or R, and with shell scripting, preferred
  • Experience with source code management systems such as Git or SVN
  • Knowledge of statistical tools and research methods
  • Team player with a positive attitude
  • Ability to work in a constantly evolving environment and change priorities
  • Ability to communicate effectively both orally and in writing
  • Ability to manage multiple projects in a self-starting, nearly autonomous fashion
  • Ability to work with scientists and researchers in order to understand the data structures

Additional Information

This is a one-year term position with strong potential for renewal.

This posting is grade 58 and requires a higher level of competence and experience than the requirements listed on Requisition #47696BR, Data Engineer, grade 57. The grade will depend on the incumbent's skills and experience. Only one of these requisitions will be filled.

Pre-Employment Screening

Criminal, Identity

EEO Statement

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, gender identity, sexual orientation, pregnancy and pregnancy-related conditions, or any other characteristic protected by law.
