Job description

We are partnered with a global biopharmaceutical organisation dedicated to transforming the lives of patients and their families. Committed to delivering life-changing medicines for serious conditions, often where few or no therapeutic options exist, the company combines a diverse portfolio of marketed therapies with a growing pipeline in oncology and neuroscience. Headquartered in Europe, with operations, laboratories, and manufacturing facilities worldwide, it takes a patient-focused, science-led approach.

This role will offer you:

A leading position in driving data engineering initiatives across multiple Research & Development areas, including Clinical, Pre-Clinical, Non-Clinical, Omics, Real World Data, and more.
The opportunity to design and optimise advanced data pipelines, models, and repositories using cutting-edge AWS technologies.
Cross-functional collaboration with international teams of scientists, researchers, and stakeholders.
A high-impact role supporting innovation in both neuroscience and oncology.

Responsibilities:

Lead the design, development, and maintenance of data pipelines for diverse R&D data sources.
Create and optimise ETL/ELT processes for structured and unstructured data using Python, R, SQL, and AWS services.
Build and maintain repositories and data warehousing solutions.
Develop and maintain data quality frameworks, validation processes, and KPIs.
Implement data versioning, lineage tracking, and regulatory compliance measures.
Document data processes, architectures, and workflows in line with DevOps best practices.
Collaborate with R&D researchers, data scientists, and stakeholders to deliver tailored solutions.
Ensure compliance with global data privacy regulations such as GDPR and HIPAA.

You will bring:

Expert proficiency in Python, R, and SQL for data processing.
Advanced knowledge of AWS services, particularly S3, Redshift, FSx, Glue, and Lambda.
Strong skills in relational database design and modelling, with experience in NoSQL and Graph databases.
Experience with containerisation (Docker, Kubernetes/EKS).
Knowledge of healthcare data standards such as CDISC, HL7, FHIR, SNOMED CT, OMOP, and DICOM.
Familiarity with big data technologies, MLOps, and model deployment.
Bachelor’s degree in a relevant field (Master’s preferred) and 5–7 years’ experience in data engineering, including work with healthcare, research, or clinical data.

Senior Principal, Data Engineering

Consultant

Natasha Cole

Business Manager - KAM
