Site Reliability Engineer, Production Reliability

Posted 5 hours ago

About the role

  • As a Site Reliability Engineer at Yelp, you will manage scalable, self-healing systems, collaborate with global teams, and ensure platform reliability for thousands of users.

Responsibilities

  • Bring your curiosity, tenacity and experience
  • Work with engineers across Yelp to support new features and services
  • Integrate tools to monitor platform stability and performance
  • Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs
  • Ensure the reliability of Yelp’s primary datastores (MySQL and Cassandra)
  • Troubleshoot site issues using industry-leading tools like Splunk, Grafana, and Prometheus
  • Automate everything with Python, Puppet, Git, Jenkins, Terraform and more!
  • Develop custom tools when off-the-shelf solutions don’t work at our scale, and contribute upstream to open source projects
  • Design and implement new systems, tests, and procedures
  • Participate in light on-call rotations

Requirements

  • Mastery of Linux (we use Ubuntu but any distro is fine)
  • Command of your favorite modern programming language (Python, TypeScript, Ruby, Go, Rust, Java, C++, etc.) and an appreciation for delivering safe and secure services.
  • A solid understanding of the fundamental technologies for delivering services on the Internet (TCP/IP, HTTP, DNS, etc.).
  • Experience with public cloud platforms (we use AWS and GCP, but others are also fine) and related tooling (Terraform, Puppet, Chef, Ansible, etc.).
  • Experience with Linux containerisation and orchestration (e.g., Docker, Podman and Kubernetes).
  • Self-motivation to investigate, fix and improve Yelp in an ever-changing environment.
  • Experience leading, collaborating on, and sharing technical activities with global teams.
  • Ownership of the full lifecycle of a system.

Benefits

  • Health insurance
  • Flexible work arrangements
  • Paid time off

Job type

Full Time

Experience level

Mid level, Senior

Salary

CA$135,000 - CA$185,000 per year

Degree requirement

Bachelor's Degree

Tech skills

Ansible, AWS, Cassandra, Chef, Cloud, DNS, Docker, Google Cloud Platform, Grafana, Java, Jenkins, Kubernetes, Linux, MySQL, Open Source, Prometheus, Puppet, Python, Ruby, Rust, Splunk, TCP/IP, Terraform, TypeScript, Go

Location requirements

Remote, Canada
