Senior DevOps Engineer designing and operating cloud-native infrastructure for distributed systems at ELITS. Collaborating with teams to ensure reliable streaming and high availability in production.
Responsibilities
Design, deploy and operate containerized microservices and distributed systems in production Kubernetes environments.
Build and maintain CI/CD pipelines to enable frequent, reliable releases and automated testing.
Implement and manage real‑time streaming data platforms (for example, Kafka or similar technologies) for low‑latency, high‑throughput workloads.
Design and operate infrastructure with a strong focus on reliability, performance and cost‑efficiency across cloud and on‑prem/hybrid environments.
Own infrastructure as code (IaC) using tools such as Terraform and Helm for repeatable, auditable environments.
Monitor, troubleshoot and optimize Linux‑based systems, containers and services, including performance tuning and incident response.
Collaborate with development teams to improve operability, observability and resilience of services (SRE mindset).
Document architectures, runbooks and operational procedures, and contribute to continuous improvement of processes and tooling.
Requirements
10+ years of experience in DevOps, SRE, Platform Engineering or similar roles.
Strong hands‑on experience with streaming technologies and real‑time data processing (for example, Apache Kafka, Kinesis, Pulsar or equivalent).
Solid background in distributed systems: microservices, event‑driven architectures, scalability and fault tolerance.
Strong understanding of hardware and infrastructure concepts (servers, networking, storage) and experience with on‑prem or hybrid environments.
Deep knowledge of Linux/Unix operating systems, system internals, performance and troubleshooting.
Extensive experience with cloud‑native technologies:
• Containers and orchestration: Docker, Kubernetes (AKS/EKS/GKE or similar)
• Infrastructure as Code: Terraform, Helm (and/or similar tools)
• CI/CD pipelines: GitHub Actions, Jenkins, Argo CD or equivalent
• Observability: monitoring, logging and alerting (for example, ELK/EFK, Prometheus, Grafana)
Experience with at least one major cloud provider (Azure, AWS or GCP); Azure experience is a strong asset.
Good understanding of networking (VPN, IPsec, load balancing, DNS, certificates).
Experience with agile ways of working and tools such as JIRA and Git.
Strong debugging and troubleshooting abilities across multiple layers (application, infrastructure, network).
Ability to understand users’ technical issues and provide clear, pragmatic recommendations.
Principal Site Reliability Engineer responsible for AWS infrastructure and reliability engineering. Collaborating across teams to enhance platform performance and security practices.
Junior/Intermediate DevOps Engineer role in Toronto (Hybrid). Build CI/CD pipelines with GitHub Actions, deploy Java/Spring Boot apps on OpenShift, and collaborate with DevOps teams.
Platform DevOps managing the Enterprise Data and AI Platform across AWS and Kubernetes. Implementing Infrastructure as Code with Terraform and maintaining CI/CD pipelines for secure solutions.
Lead DevOps specialized in AWS/GCP cloud solutions for the FinOps team. Driving cross-functional activation and managing cloud environments, data integrations, and automation strategies.
Skilled DevOps Engineer providing expertise in deployment automation for TD's technology solutions team. Engaging in improving development and release processes while ensuring security and system integrity.
Infrastructure Reliability Engineer supporting critical SaaS services. Collaborate, innovate, and optimize the reliability and performance of cloud systems on AWS and Kubernetes.
DevOps Engineer to help scale cloud and on-prem environments, automating deployments and enhancing security posture for energy-intelligent compute applications.
Reliability Engineering Architect at Carbon60 managing a team to deliver AWS cloud solutions. Focus on mentoring engineers and integrating AI tools into automated systems.
DevOps Specialist taking over build, release, and environments for Sparrow’s product team. Leading DevOps practices while collaborating with CTO and senior developers in an agile setting.