Staff Platform Site Reliability Specialist, Observability – Kubernetes

Posted 2 weeks ago

About the role

  • Staff Platform Site Reliability Specialist at Everbridge, managing the observability stack and ensuring system reliability. The role collaborates on cloud technologies within EKS and GCP environments.

Responsibilities

  • Lead the design, operation, and evolution of Everbridge’s observability stack
  • Build and maintain a highly available, scalable observability platform
  • Standardize instrumentation, dashboards, alerts, and SLOs
  • Support incident response, root cause analysis, and capacity planning
  • Operate and scale Grafana and related technology
  • Maintain reliability and security of EKS clusters running observability
  • Manage cluster lifecycle and upgrades
  • Use Terraform for infrastructure provisioning
  • Run GitLab CI/CD at scale

Requirements

  • 6+ years in SRE / Platform Engineering
  • Strong Grafana ecosystem experience
  • Kubernetes and Amazon EKS expertise
  • Terraform proficiency

Benefits

  • healthcare
  • dental care
  • mental health benefits
  • disability income benefits
  • life and AD&D insurance
  • retirement savings plan with employer match
  • paid time off

Job type

Full Time

Experience level

Lead

Salary

CA$135,000 - CA$165,000 per year

Degree requirement

Bachelor's Degree

Tech skills

Grafana, Kubernetes, Terraform

Location requirements

Remote, Canada
