Site Reliability Engineer at BMO focusing on code deployment, IT operations, and system reliability through automation and monitoring. Acts as a bridge between development and operations teams to improve service health.
Responsibilities
Designs how code is deployed, configured, and monitored
Helps teams decide on new features by using service-level agreements (SLAs) and service-level objectives (SLOs)
Applies software engineering to automate IT operations tasks
Acts as a link between the development and operations teams
Conducts chaos tests and performance tests for critical business requirements
Debugs production issues across services and levels of the technology stack
Computes the cost of SLA breaches and assists management in calculating the impact of system reliability
Improves service health visibility by recording metrics, logs, and traces across all services
Requirements
Typically 4-6 years of relevant experience
Foundational level of proficiency: DevOps, Cybersecurity and privacy concepts
Emotional agility, IT Infrastructure Library (ITIL), Robotic Process Automation, Cloud Computing, Configuration Management, Container Orchestration, System Design and Implementation, Incident management, Learning Agility, Building and managing relationships
Intermediate level of proficiency: API Management, Automation and Automation Pipelines, Automated Testing, Quality Assurance and Control, Verbal & written communication skills, Collaboration & team skills, Analytical and problem solving skills, Data driven decision making
Post-secondary degree in related field of study or equivalent combination of education and experience
Principal Site Reliability Engineer responsible for AWS infrastructure and reliability engineering. Collaborating across teams to enhance platform performance and security practices.
Junior/Intermediate DevOps Engineer role in Toronto (Hybrid). Build CI/CD pipelines with GitHub Actions, deploy Java/Spring Boot apps on OpenShift, and collaborate with DevOps teams.
Platform DevOps managing the Enterprise Data and AI Platform across AWS and Kubernetes. Implementing Infrastructure as Code with Terraform and maintaining CI/CD pipelines for secure solutions.
Lead DevOps specialized in AWS/GCP Cloud solutions for FinOps team. Driving cross-functional activation and managing cloud environments, data integrations, and automation strategies.
Skilled DevOps Engineer providing expertise in deployment automation for TD's technology solutions team. Engaging in improving development and release processes while ensuring security and system integrity.
Infrastructure Reliability Engineer supporting critical SaaS services. Collaborate, innovate, and optimize the reliability and performance of cloud systems on AWS and Kubernetes.
DevOps Engineer to help scale cloud and on-prem environments, automating deployments and enhancing security posture for energy-intelligent compute applications.
Reliability Engineering Architect at Carbon60 managing a team to deliver AWS cloud solutions. Focus on mentoring engineers and integrating AI tools into automated systems.
DevOps Specialist taking over build, release, and environments for Sparrow’s product team. Leading DevOps practices while collaborating with CTO and senior developers in an agile setting.