Production Support Engineer at Miratech ensuring reliability for mission-critical contact center environments through proactive monitoring and troubleshooting. Join a global IT services company focused on digital transformation.
Responsibilities
Ensure reliability, stability, and operational excellence for mission-critical contact center environments through proactive monitoring, live troubleshooting, and automation
Provide incident response within defined SLAs, troubleshoot production issues, and perform root cause analysis
Monitor and maintain observability using Splunk, CloudWatch, Zabbix, and similar tools
Investigate issues across AWS services, networking, APIs, and integrations
Manage Amazon Connect configurations, contact flows, bots (Lex), and integrations with Lambda, S3, QuickSight, and DynamoDB
Develop visual process flows, standardized troubleshooting playbooks, and how-to guides for support teams
Document alert resolution steps and maintain runbooks, knowledge repositories, and playbooks
Analyze ServiceNow incidents, RCAs, and historical events to extract actionable insights for documentation
Collaborate with platform and operations teams for incident triage, mock troubleshooting sessions, and continuous improvement
Requirements
4+ years of experience in Production Support, NOC, or Site Reliability Engineering roles
Minimum 3 years of hands-on experience with Amazon Connect (CCaaS)
Strong knowledge of AWS services including Amazon Connect, Lambda, S3, DynamoDB, and CloudWatch
Proficiency with monitoring and logging tools such as Splunk, CloudWatch, Dynatrace, Zabbix, and Grafana
Solid understanding of SLIs, SLOs, and core reliability engineering practices
Excellent communication abilities and strong documentation skills
Nice to have
AWS certifications
Experience with Docker/Kubernetes
Knowledge of CCaaS workflows and compliance frameworks (HIPAA, SOC 2)