Senior Databricks Lead/Data Architect needed for a large-scale data and AI modernization program in the Greater Toronto Area. Must be hands-on with lakehouse architecture, legacy ETL migration, and team leadership.
Responsibilities
We are hiring a senior, deeply hands-on Databricks Lead / Data Architect to drive the Databricks workstream of a large-scale data and AI modernization program for a major Canadian enterprise retail client. This is a build-and-lead role: you will own the technical direction of Databricks-based solutions end to end (architecture, lakehouse design, data engineering, migration of legacy ETL workloads, and production operations) while remaining personally hands-on in code and design. You will work side by side with the client’s VP of Data & AI and the AVP of Data Platforms & Integration and their teams, acting as the senior technical authority who turns strategy into delivered, production-grade outcomes. The immediate focus is modernizing a large on-premise ETL estate (IBM DataStage) to an Azure-native lakehouse on Azure Data Factory and Databricks, and then scaling the platform to power enterprise analytics and AI use cases.
Key Responsibilities:
Architect the lakehouse: Design and own scalable, secure Databricks Lakehouse architecture on Azure (Delta Lake, Unity Catalog, medallion bronze/silver/gold, ADLS Gen2) aligned to enterprise standards.
Stay hands-on: Personally build and review PySpark / Spark SQL pipelines, Delta Live Tables, notebooks, and orchestration, setting the engineering bar, not just directing it.
Lead legacy migration: Drive conversion of complex legacy ETL (DataStage) workloads to Databricks/PySpark and ADF, including patterns, accelerators, and reusable frameworks for code conversion and validation.
Own performance & cost: Optimize cluster configuration, job performance, partitioning, and cost; establish FinOps and right-sizing practices on Databricks.
Embed governance: Implement data governance, lineage, quality, and access control through Unity Catalog and Purview; ensure security, privacy, and compliance by design.
Enable analytics & AI: Design Gold-layer semantic models and feature pipelines that serve BI (Power BI), advanced analytics, and ML/GenAI use cases (MLflow, Azure ML).
Lead the squad: Provide technical leadership and mentoring to data engineers; define best practices, coding standards, CI/CD (Azure DevOps), and review processes.
Partner with the client: Work closely with the client’s VP (Data & AI), AVP (Data Platforms & Integration), platform architects, and business stakeholders to translate requirements into delivery roadmaps and measurable outcomes.
Requirements
12+ years in data engineering / data platform architecture, with 4+ years of deep, hands-on Databricks delivery.
Expert-level Databricks: Spark (PySpark & Spark SQL), Delta Lake, Delta Live Tables, Unity Catalog, Workflows, performance tuning, and cluster/cost optimization.
Strong Azure data stack: Azure Data Factory, ADLS Gen2, Azure Key Vault, Azure DevOps (CI/CD), and Azure networking/security fundamentals.
Proven migration track record: Led at least one large-scale migration from legacy ETL (e.g., DataStage, Informatica, Teradata) to a cloud lakehouse, including complex transformation logic.
Lakehouse design depth: Medallion architecture, dimensional & semantic modelling, SCD handling, surrogate keys, and data quality / reconciliation frameworks.
Engineering rigor: CI/CD, version control (Git), automated testing/validation, observability, and production support of mission-critical pipelines.
Leadership with hands-on credibility: Demonstrated ability to lead engineers and engage senior client stakeholders while still contributing code and designs directly.
Preferred:
Databricks certifications (e.g., Databricks Certified Data Engineer Professional / Solutions Architect) and relevant Azure certifications (DP-203, AZ-305).
Experience in retail, supply chain, merchandising, or financial-services data domains.
Familiarity with IBM DataStage, DB2, Oracle, and legacy on-prem ETL estates.
Exposure to agentic AI / GenAI patterns, MLOps/LLMOps, and AI-assisted code migration tooling.
Experience operating a warm-standby DR and high-availability data platform.
Technology Designer at iA Financial Group, designing and evolving robust data platforms and DataOps practices. Collaborating with teams to implement cloud infrastructures and CI/CD pipelines.
Senior Data Engineer developing a trustworthy analytics data layer for commonsku's platform. Leading projects and mentoring team members in a remote-first work environment.
Data Engineer III developing scalable data infrastructure and systems for Ad Hoc, a tech company transforming public-sector digital services. Collaborating with teams to drive data engineering improvements.
Senior Software Engineer on Data Platform developing and maintaining infrastructure for data ingestion and processing. Collaborating on projects that drive AI and analytics initiatives at Samsara.
Senior Data Engineer at Sumsub, developing scalable data integration solutions and maintaining data quality. Collaborating with teams to optimize data processes in a remote-first environment.
Operations Data Engineer at Dotmatics focused on managing cloud architecture and optimizing costs. Collaborating with engineering and leadership to improve operational health.
Senior Data Engineer at Confluence responsible for designing and delivering scalable data solutions. Collaborating with teams and mentoring engineers in modern data engineering practices.
Associate Data Engineer building reliable data pipelines and contributing to data solutions at Confluence. Collaborating closely with experienced engineers and analysts in a modern data environment.
Senior Data Engineer designing and delivering scalable data solutions at Confluence. Collaborating cross-functionally and mentoring engineers while managing key components of the data platform.
Associate Data Engineer joining Confluence to build data pipelines and collaborate in a modern data environment. Solve real data challenges and contribute to meaningful projects supporting business decisions.