About the role

As a Site Reliability Engineer, you will oversee Akur8's cloud solutions with a focus on reliability and automation, collaborating across teams to enhance our cloud infrastructure and maintain SLOs.

Responsibilities

Maintain and improve our infrastructure-as-code repositories (Terraform) to ensure the reliability and resilience of Akur8's cloud products.
Contribute to expanding Akur8's product offerings while maintaining our SLOs.
Strengthen automation and orchestration of pipelines to reduce repetitive manual tasks.
Train and support teams on DevOps best practices across the organization.
Contribute to the design of our AWS and Azure platform architectures, in collaboration with product and development teams, to improve performance, reliability, and cost control, and to support new product features.
Help continuously improve monitoring and observability, primarily using Datadog.
Contribute to our CI pipelines (GitHub Actions), ensuring best practices are consistently applied when using containers (Docker).
Work closely with our Security team to secure workloads, maintain IT security standards and best practices, and participate in implementing infrastructure scanning.
Contribute to open-source projects where appropriate.
Actively participate in the on-call rotation (1 week every 4 to 6 weeks).

Requirements

Degree in Computer Science, Information Technology, or a related field, or equivalent experience.
At least 5 years of professional experience configuring, monitoring, and maintaining AWS and/or Azure production systems across the full software development lifecycle.
Strong hands-on experience with Terraform in AWS and/or Azure environments.
Recent experience deploying, monitoring, optimizing, and maintaining Kubernetes and/or Nomad workloads.
Strong proficiency with TeamCity and GitHub, including workflows and GitHub Actions.
Solid experience with AWS CloudFormation.
Solid understanding of cybersecurity concepts and best practices (e.g., network security, firewalls, encrypted traffic, secrets management, security patching, and identity management).
Strong interest in collaborating with various internal teams, including software engineering, IT security, IT operations, and platform teams.
Willingness both to be highly hands-on and to advocate for DevOps best practices across the organization.
Curiosity and an ability to challenge the status quo while remaining pragmatic.
Ability to work independently and adapt to change.
Excellent communication skills, both verbal and written.
Excellent command of English, spoken and written; French is a plus.