Site Reliability Engineer

Posted last week

About the role

  • As a Site Reliability Engineer at Newton, you will enhance the reliability and operational readiness of our services, collaborating with engineering teams on system design and incident management.

Responsibilities

  • Improve the reliability, resilience, and operational readiness of our services
  • Work closely with engineering teams to improve system design and operational excellence
  • Prevent incidents, lead response efforts, and drive improvements through post-mortems
  • Implement improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
  • Manage incidents using your technical know-how to involve the appropriate teams and automate away manual practices
  • Provide support to our critical services by responding to automated alerts through our on-call rotation
  • Define and maintain SLIs, SLOs, SLAs, and error budgets to guide reliability decisions
  • Improve observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution
  • Make production issues easier to detect, troubleshoot, and resolve
  • Improve monitoring, alerting, dashboards, tracing, and runbooks for critical services
  • Lead postmortems and follow-up actions to reduce repeat incidents

Requirements

  • You have experience designing and operating scalable, reliable systems in AWS or a similar cloud environment
  • You have handled on-call shifts for critical systems
  • You are experienced with chaos engineering (e.g., Gremlin)
  • You are able to dive in and debug live production systems
  • You enjoy working in a growing system and writing and deploying code without any downtime
  • You have experience with scripting and/or development (e.g., Linux shell, Python, JavaScript, Java)
  • You are a self-starter who takes initiative in an ambiguous space, preferably within a start-up environment

Benefits

  • A commitment to integrity and transparency toward our users
  • A dynamic team fueled by collaboration, uniting our strengths to overcome any obstacle
  • A culture of continuous improvement that embraces creativity and encourages experimentation

Job type

Full Time

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree

Tech skills

AWS, Cloud, Java, JavaScript, Linux, Python

Location requirements

Remote, Canada
