MLOps Data Engineer bridging data science and production systems at Triton Digital. Designing CI/CD pipelines and optimizing data processing with Apache Spark for advertising systems.
Responsibilities
Design, implement, and maintain CI/CD pipelines for machine learning workflows using tools like GitHub Actions, Azure DevOps, or Jenkins.
Build and optimize data processing pipelines in Apache Spark (PySpark and Scala) for large-scale, distributed listener datasets.
Deploy and manage Databricks environments, ensuring efficient cluster usage, job scheduling, and cost optimization.
Collaborate with data scientists to productionize ML models, integrating them into scalable APIs or batch processing systems that feed real-time, machine-readable audience signals.
Implement automated testing, monitoring, and alerting for ML pipelines to ensure the reliability and reproducibility that certified buyers require.
Champion best practices in version control, model registry management, and environment reproducibility.
Help evolve our listener data infrastructure toward agent-compatible supply — live, structured, queryable data feeds that autonomous buying systems can discover and act on without human mediation.
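The Spark and testing responsibilities above hinge on keeping transformation logic unit-testable outside a cluster. A minimal sketch of that pattern, in pure Python so it runs without a Spark runtime; the record schema and field names are illustrative, not Triton Digital's actual data model:

```python
# Core dedup logic for listener events, kept as a pure function so it can be
# unit-tested in CI without a Spark cluster. In production, the same logic
# would typically run as a Spark window function or groupBy over a DataFrame.
# All field names ('listener_id', 'station', 'ts') are hypothetical.

def dedupe_listener_events(events):
    """Keep only the latest event per (listener_id, station) pair.

    `events` is an iterable of dicts with 'listener_id', 'station',
    and 'ts' (monotonic timestamp) keys.
    """
    latest = {}
    for event in events:
        key = (event["listener_id"], event["station"])
        # Retain the event with the highest timestamp for each key.
        if key not in latest or event["ts"] > latest[key]["ts"]:
            latest[key] = event
    # Deterministic ordering makes assertions in automated tests stable.
    return sorted(latest.values(),
                  key=lambda e: (e["listener_id"], e["station"]))
```

Factoring pipeline steps this way lets the CI/CD workflow (GitHub Actions, Jenkins, etc.) run fast assertions against the logic on every commit, while the Spark job itself only wires the function into the distributed execution plan.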
Requirements
Proven experience in Data Engineering, MLOps, and DevOps roles with a focus on automation and scalability.
Strong programming skills in Python, with hands-on experience in Apache Spark.
Scala is a huge plus.
Advanced expertise in Databricks, including Delta Lake, structured streaming, and feature engineering.
Solid understanding of CI/CD principles and tools (e.g., GitHub Actions, Jenkins, Azure DevOps, GitLab CI, ArgoCD).
Familiarity with cloud platforms (AWS, Azure, or GCP) for data and ML workloads.
A problem-solving mindset and the ability to work closely with cross-functional teams.
Strong architectural mindset, capable of evaluating trade-offs across cost, performance, scalability, and maintainability when selecting tools and designing systems.
Experience working with containerized and orchestrated environments (Kubernetes / OpenShift), including deployment, scaling, and fault tolerance of data and ML workloads.
Advanced English required.
French is an asset.
Familiarity with IAB data standards, programmatic advertising infrastructure, or AdTech data pipelines is a strong asset.
Benefits
Fully remote position (must be based in Ontario or Quebec)
4 weeks of vacation + 5 paid personal days annually
Group insurance programs as of your first day, including access to telemedicine and an EAP
Engineer building scalable data platforms at a commercial trucking telematics startup, tackling complex backend challenges. Designing and optimizing the API data platform and managing large-scale data processing.
Principal Data Engineer at RAVL designing secure and scalable data and AI platforms. Leading architectural guidance and building capabilities across the organization to enhance data operations.
Data Engineer at RAVL designing, building, and operating data pipelines for decision-making. Transforming raw data into reliable assets for analytics and machine learning.
Data Engineer role for Alberta Securities Commission, focusing on data pipelines and infrastructure. Involves designing, building, and securing data assets for organizational decision-making.
Sr. Data Engineer at Luxury Presence building AI growth platform for real estate. Shaping platform architecture and driving AI-powered product delivery for innovative solutions.
Senior Data Engineer optimizing data pipelines for AI-driven insights at Homebase. Involves collaboration with product and analytics teams to drive data architecture transformation.
Lead Data Engineer / Snowflake Engineer at Brillio in Montreal focusing on Snowflake and Generative AI data solutions. Responsibilities include architecting data platforms and collaborating with AI teams.
Data Engineer at The Fedcap Group architecting and leading enterprise data warehouse solutions. Focused on enabling scalable growth and operational excellence across the organization.