Senior Site Reliability Engineer ensuring reliability and performance of Vantage’s services while collaborating across teams. Engaging in incident response and driving infrastructure improvements.
Responsibilities
Collaborate with a diverse team of software engineers, engaging in iterative processes and effective task planning to drive our projects forward.
Take ownership of the availability, scalability, and performance of our services, proactively identifying issues and implementing automation to prevent problems from recurring.
Participate in the on-call rotation, responding to incidents and working with the team to restore service and prevent recurrence.
Contribute to automating infrastructure provisioning, configuration, and management using IaC principles with tools like Terragrunt and Ansible.
Help design and enhance monitoring, logging, and alerting systems to improve observability and ensure system health.
Participate in blameless post-mortems, documenting issues and following up on action items to foster a culture of learning and continuous improvement.
Foster collaboration with other engineering teams, promoting the reuse of existing frameworks and gaining insights into their operation.
Stay current with industry trends, emerging technologies, and best practices in SRE, DevOps, and automation.
Requirements
6+ years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role working with software and infrastructure.
Proficiency with either Python or Bash.
Hands-on experience with Azure or AWS.
Familiarity with CI/CD pipelines and infrastructure as code (IaC) tooling such as Terraform and Ansible.
Demonstrated ability to triage and prioritize effectively when troubleshooting incidents.
History of engaging effectively with cross-functional teams during events such as incident response and post-mortems.
Track record of proactively tailoring infrastructure to meet the unique needs of the product it supports.
Hands-on Senior DevOps Developer designing, building, and operating secure cloud infrastructure. Enabling engineering teams to deploy mission-critical digital solutions into the nuclear industry.
DevSecOps Engineer responsible for building CI/CD pipelines and collaborating with security and operations teams at Aviso Wealth. Contributes to a culture of continuous improvement by implementing best practices.
DevOps Engineer developing functional systems that improve customer experience for S&P Global's applications. Responsibilities include automation, monitoring, and maintaining infrastructure using cutting-edge technologies.
DevOps Manager leading engineering operations for a global translation company. Overseeing cloud infrastructure, deployment pipelines, and enhancing operational reliability while working remotely.
Build & Release Engineer at Parallel Domain improving CI/CD for simulation and Physical AI systems. Leading infrastructure initiatives ensuring efficient build processes.
Integrator role in Azure DevSecOps at Desjardins focusing on the stability of Azure infrastructure and supporting developer teams. Involves cloud platform management and automation for optimal service delivery.
Reliability Engineer focusing on developing and improving maintenance strategies for rotating equipment in Orica's Manufacturing Centre. Ensuring safety, efficiency, and compliance in operations.
Senior DevOps Developer managing a large monorepo and mentoring teammates at Caseware. Join us in building advanced AI software by automating scalable pipelines for AI applications.