Site Reliability Engineer (SRE) role focused on automation, resilience, and scale across cloud-native platforms. Responsibilities include monitoring, Kubernetes, AWS, disaster recovery, and mentoring teams.
Responsibilities
Drive automation, resilience, and scale across cloud-native platforms. The work covers monitoring and observability (alerts, dashboards); automation and infrastructure as code (IaC) with Python, Terraform, and CloudFormation; Kubernetes (K8s) and AWS Cloud; disaster recovery (DR) strategies; and ServiceNow workflows for incident management. It also includes troubleshooting production issues during on-call rotations, coaching delivery teams on SRE best practices, and running blameless postmortems for continuous learning.