About the role

As an Intermediate Data Engineer, you will design and build data pipelines for travel industry data management, collaborating across teams to ensure reliable data for analytics and reporting.

Responsibilities

Design, develop, and maintain robust ETL/ELT pipelines to integrate data from multiple sources into a centralized cloud-based data platform
Build scalable data ingestion, transformation, and enrichment processes using Python, SQL, and PySpark
Optimize data workflows for performance, scalability, and cost efficiency in the cloud
Implement data quality and validation checks to ensure trust in reporting, analytics, and data-driven products
Collaborate with cross-functional teams to translate business requirements into technical data solutions
Support large-scale transformations using distributed processing frameworks
Troubleshoot and resolve issues in data pipelines, ensuring reliability and uptime
Participate in code reviews and contribute to engineering standards and best practices
Document data processes, pipelines, and schemas to improve transparency and reusability
Stay current with modern data engineering tools, practices, and cloud technologies, with a passion for continual learning and knowledge sharing
Build data solutions with stakeholders in mind, rather than just raw pipelines

Requirements

3+ years of experience in data engineering, data development, or data management
Strong hands-on experience with Snowflake and modern data architecture concepts (data lakes, lakehouses, streaming)
Proficiency in Python and SQL for building and optimizing data pipelines
Hands-on experience with AWS services such as S3, Glue, Lambda, and Redshift, and with data platforms such as Snowflake
Experience with ETL/ELT, data modeling, and data warehousing concepts
Experience with workflow orchestration tools (e.g., Airflow, Dagster)
Hands-on experience with PySpark and distributed data processing frameworks (e.g., AWS EMR)
Knowledge of pipeline performance optimization and debugging
Strong problem-solving, analytical, and collaboration skills
Experience with version control (Git) and CI/CD workflows