Responsibilities

Design, develop, and maintain robust data pipelines using ETL and ELT methodologies to process and integrate data from various sources into a data lake, a central data warehouse, operational data stores, analytical data marts and various application interfaces
Implement and manage real-time data streaming solutions using Kafka, Debezium and Kafka Connect
Build, schedule and maintain custom workflows using Apache Airflow to ensure timely and accurate data processing and delivery
Work with a variety of database technologies, including relational databases (MySQL, PostgreSQL), NoSQL databases (MongoDB) and analytical/big data systems (Redshift, BigQuery, SingleStore)
Employ tools like Terraform, Kubernetes, and Helm to manage and provision infrastructure efficiently
Develop and maintain continuous integration and deployment pipelines to streamline development processes
Conduct unit and integration testing to ensure high code quality, data integrity and system reliability
Engage with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data needs and deliver solutions
Maintain clear and comprehensive documentation of data processes, workflows and systems
Monitor system performance and address faults and failures in production systems as part of an on-call rotation

Requirements

Bachelor's degree in Computer Science, Engineering or a related field
Proficiency in Python is essential
Minimum of 3 years of experience in data engineering roles, with a focus on building and managing data pipelines
Minimum of 2 years of experience in software and/or application development roles (can be concurrent with data engineering experience)
Hands-on experience with Kafka, Debezium, and Kafka Connect
Proficiency in a data pipeline or workflow orchestration tool such as Apache Airflow (preferred), Databricks, Dagster or Airbyte
Strong understanding and hands-on experience working with various database technologies, including MySQL, PostgreSQL, MongoDB and Redshift (BigQuery and SingleStore advantageous)
Experience with Terraform, Kubernetes, and Helm for infrastructure management
Solid knowledge of cloud computing concepts, with experience in AWS services being advantageous
Ability to write complex SQL queries across different dialects
Familiarity with unit and integration testing methodologies
Experience in setting up and maintaining CI/CD pipelines
Good verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders
Demonstrated ability to work collaboratively within a team and across departments
Comfortable working in a fast-paced environment with changing priorities, technologies and tooling
Strong analytical and problem-solving skills
JavaScript and Scala development experience is advantageous
Exposure to analytical systems and basic data science tooling is advantageous
Familiarity with basic machine learning and analytical modelling concepts is advantageous
Exposure to self-service reporting tools like Tableau, Looker and DOMO is advantageous

Senior Data Engineer
