Senior DevOps Engineer operating AWS infrastructure and Kubernetes for the BlueCat Cloud SaaS platform. Focused on automation and operational stability while collaborating with cross-functional teams.
Responsibilities
Own the day-to-day operation, reliability, and performance of production services running on AWS.
Operate and support containerized workloads across ECS and Kubernetes (EKS) environments.
Maintain and evolve an EKS-based platform, including cluster upgrades, add-ons, and operational tooling.
Manage Kubernetes workloads using Helm and standard deployment and release practices.
Build, maintain, and improve CI/CD pipelines to support safe, repeatable, and efficient deployments.
Automate infrastructure and operational workflows using Infrastructure as Code (Terraform preferred).
Participate in an on-call rotation, respond to customer-impacting production incidents, and lead troubleshooting efforts.
Drive incidents through resolution, perform root cause analysis (RCA), and implement preventative improvements.
Troubleshoot Kubernetes networking, ingress, service discovery, and workload-level issues.
Implement and maintain monitoring, alerting, and logging solutions (CloudWatch, Prometheus, Grafana, InfluxDB, etc.).
Partner with application teams to ensure services are production-ready and operationally supportable.
Work closely with engineers on the Toronto and Serbia teams to support production systems.
Provide technical guidance and informal mentorship to junior DevOps and SRE engineers.
Requirements
5–8+ years of experience in DevOps, cloud infrastructure, or production operations roles.