Senior Engineer leading design and operation of GitLab's Kubernetes platform and developer tooling. Driving strategic initiatives for reliability and developer experience in a remote team.
Responsibilities
Lead the operation and evolution of production-grade Kubernetes clusters across cloud environments, making architectural decisions on upgrades, scaling, disaster recovery, and reliability improvements that impact the entire organization.
Define and drive the organization's GitOps strategy and standards, owning ArgoCD-based workflows (ApplicationSets, sync policies, and deployment standards) and mentoring teams on GitOps best practices.
Architect Terraform-based infrastructure-as-code standards across teams, building reusable modules and practices that enable safe, scalable cloud infrastructure provisioning, with clear patterns for state management and drift detection.
Lead platform observability strategy and incident response processes, set standards for monitoring and post-incident reviews, and drive organization-wide improvements to availability, performance, and resilience.
Partner with and mentor application teams to onboard services onto the platform, establishing patterns for documentation, runbooks, and self-service tooling that scale across the organization and improve developer productivity.
Design and establish security control standards such as role-based access control (RBAC), network policies, and secrets management (for example, Vault, Sealed Secrets, or External Secrets Operator) that meet compliance requirements and scale across the organization.
Drive integration of platform capabilities with continuous integration pipelines (for example, GitHub Actions, GitLab CI, or Tekton) to establish end-to-end delivery workflows that set standards across the organization.
Requirements
Experience operating and evolving production Kubernetes clusters (upgrades, scaling, disaster recovery, reliability) across one or more cloud environments (for example, Amazon EKS, Google GKE, or Azure AKS).
Experience designing and running GitOps-based continuous delivery workflows with ArgoCD, Flux, or similar tools; able to establish and maintain deployment standards across environments.
Experience with infrastructure as code (Terraform or equivalent), including reusable modules, state management, and drift detection practices for safe infrastructure provisioning.
Ability to write and maintain automation in a scripting or programming language (for example, Python, Bash, or Go) and guide others on best practices.
Working knowledge of networking fundamentals (DNS, load balancing, ingress) and related platform patterns (for example, service mesh) to design reliable network architectures.
Strong written and verbal communication skills, including mentoring, writing clear system documentation, and establishing runbooks and best practices across teams.
Benefits
Benefits to support your health, finances, and well-being
Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan