Senior Site Reliability Engineer

Posted 15 hours ago

About the role

  • As a Senior Site Reliability Engineer, you will establish the infrastructure that supports Thunderbird’s privacy-respecting tools, collaborating remotely with a team distributed across multiple time zones.

Responsibilities

  • Operate and evolve our EKS-based Kubernetes platform, supporting service migrations, platform improvements, and reliability initiatives.
  • Design and develop CI/CD systems supporting websites, services, and Thunderbird desktop releases, contributing to pipeline reliability and OIDC-based authentication across GitHub Actions workflows.
  • Write and maintain infrastructure in Pulumi and/or Terraform/OpenTofu across multiple AWS accounts.
  • Operate and evolve our observability stack (VictoriaMetrics, VictoriaLogs, Grafana, Vector) and partner with engineering teams to incorporate instrumentation and monitoring into service design.
  • Apply security-conscious infrastructure practices, including least-privilege IAM, secrets management via AWS Secrets Manager and External Secrets Operator, and network segmentation.
  • Diagnose and debug production incidents; drive root-cause analysis and post-incident improvements to prevent recurring problems.
  • Participate in the on-call rotation, and collaborate with SDEs and fellow SREs to ship, maintain, and monitor new builds and to support service onboarding.
  • Contribute to runbooks, architecture documentation, and team processes.

Requirements

  • 7+ years of experience in infrastructure, platform engineering, or site reliability roles, including hands-on production Kubernetes experience in workload operations, troubleshooting, and cluster management.
  • Hands-on experience with infrastructure-as-code on AWS using Terraform, OpenTofu, or Pulumi.
  • Security awareness in day-to-day infrastructure work: identity, least privilege, secrets hygiene, and network controls.
  • Demonstrated ownership mindset with the ability to proactively identify issues, drive work to completion, and communicate risks early.
  • Excellent async written communication skills; comfortable working with a geographically distributed team.
  • Ability to collaborate effectively with software engineers and non-engineering stakeholders to improve platform reliability and operational efficiency.
  • Ability to learn, evaluate, and responsibly use emerging technologies, including AI-enabled tools, to improve work processes.

Benefits

  • Fully remote work & schedule flexibility
  • Company-provided laptop
  • Annual bonus program
  • Monthly remote work stipend
  • Annual professional development stipend
  • Industry conferences
  • Company all-hands and team gatherings
  • 24 days PTO per year (prorated)
  • Your birthday off
  • Year-end company shutdown
  • 9 wellbeing days
  • Public holidays
  • Other paid leave
  • Quarterly wellbeing stipend for personal / family activities
  • RRSP contributions
  • Health, dental, & vision insurance
  • Disability insurance
  • Life insurance
  • Employee assistance program
  • Paid parental leave
  • Paid sick days

Job type

Full Time

Experience level

Senior

Salary

CA$108,000 - CA$125,000 per year

Degree requirement

Bachelor's Degree

Tech skills

AWS, Grafana, Kubernetes, Terraform

Location requirements

Remote, Canada
