Staff Software Engineer – Grafana Cloud k6

Posted last month

About the role

  • As a Staff Software Engineer, you will be responsible for strengthening operational excellence in the Grafana Cloud k6 product, leading reliability engineering practices and contributing to product development at Grafana Labs.

Responsibilities

  • Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
  • Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
  • Establish reliability frameworks such as SLIs/SLOs and error budgets, and use them to guide prioritization and engineering trade-offs.
  • Provide visibility into system health through clear operational metrics and reliability reporting.
  • Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.
  • Influence product and system direction through design reviews, architectural discussions, and cross-team collaboration.
  • Share knowledge through clear, high-quality documentation and technical communication—internally and, where appropriate, externally—to help teams build and operate systems more effectively.
  • As the reliability foundation matures, grow into broader application and product development leadership, contributing architectural and technical depth beyond operations.

Requirements

  • Strong experience with DevOps/SRE practices, including operating and evolving production systems at scale
  • Strong programming background in a modern language (Python and Go are the primary languages for this role, but prior experience with them is not required)
  • Experience designing, building, and operating large-scale distributed systems
  • Strong understanding of reliability engineering concepts (e.g. incident management, observability, and failure modes)
  • Experience with test automation, including performance and functional testing
  • Ability to influence engineering practices through clear technical communication, reviews, and collaboration
  • Strong interpersonal skills and ability to work effectively across teams
  • Familiarity with modern software engineering processes and delivery practices
  • Self-driven and comfortable operating with a high degree of autonomy and ambiguity

Bonus points for

  • Experience with containerized and cloud-native systems (Docker, Kubernetes, AWS)
  • Familiarity with observability tooling and platforms (e.g. the Grafana stack)
  • Experience working with Python, Go, JavaScript and/or Jsonnet
  • Experience building or operating event-driven or asynchronous systems
  • Experience defining or applying SLIs/SLOs, error budgets, or reliability metrics
  • Interest in, or experience with, building testing frameworks or developer tooling

Benefits

  • Equity
  • Bonus (if applicable)
  • 30 days of annual leave, covering Grafana Shutdown Days

Job type

Full Time

Experience level

Lead

Salary

CA$186,368 - CA$223,642 per year

Degree requirement

Bachelor's Degree

Tech skills

AWS, Cloud, Distributed Systems, Docker, Grafana, JavaScript, Kubernetes, Python, Go

Location requirements

Remote, Canada
