Site Reliability Engineer overseeing cloud solutions for Akur8, focusing on reliability and automation. Collaborating across teams to enhance cloud infrastructure and maintain SLOs.
Responsibilities
Maintain and improve our infrastructure-as-code repositories (Terraform) to ensure the reliability and resilience of Akur8's cloud products.
Contribute to expanding Akur8's product offerings while maintaining our SLOs.
Strengthen automation and orchestration of pipelines to reduce repetitive manual tasks.
Train and support teams on DevOps best practices across the organization.
Contribute to the design of our AWS and Azure platform architectures, in collaboration with product and development teams, to improve performance, reliability, and cost control, and to support new product features.
Help continuously improve monitoring and observability, primarily using Datadog.
Contribute to our CI pipelines (GitHub Actions), ensuring best practices are consistently applied when using containers (Docker).
Work closely with our Security team to secure workloads, maintain IT security standards and best practices, and participate in implementing infrastructure scanning.
Contribute to open-source projects where appropriate.
Actively participate in the on-call rotation (1 week every 4 to 6 weeks).
Requirements
Degree in Computer Science, Information Technology, or a related field, or equivalent experience.
At least 5 years of professional experience configuring, monitoring, and maintaining AWS and/or Azure production systems across the full software development lifecycle.
Strong hands-on experience with Terraform in AWS and/or Azure environments.
Platform DevOps managing the Enterprise Data and AI Platform across AWS and Kubernetes. Implementing Infrastructure as Code with Terraform and maintaining CI/CD pipelines for secure solutions.
Lead DevOps specialized in AWS/GCP Cloud solutions for FinOps team. Driving cross-functional activation and managing cloud environments, data integrations, and automation strategies.
Skilled DevOps Engineer providing expertise in deployment automation for TD's technology solutions team. Engaging in improving development and release processes while ensuring security and system integrity.
Infrastructure Reliability Engineer supporting critical SaaS services. Collaborating, innovating, and optimizing the reliability and performance of cloud systems on AWS and Kubernetes.
DevOps Engineer to help scale cloud and on-prem environments, automating deployments and enhancing security posture for energy-intelligent compute applications.
Reliability Engineering Architect at Carbon60 managing a team to deliver AWS cloud solutions. Focus on mentoring engineers and integrating AI tools into automated systems.
DevOps Specialist taking over build, release, and environments for Sparrow’s product team. Leading DevOps practices while collaborating with CTO and senior developers in an agile setting.
Developer Advocate championing security in cloud-native infrastructure within a global leader in recruitment. Collaborating with thought leaders and driving awareness through various channels.
Senior Site Reliability Engineer at Rootly embedding with teams to enhance service performance and reliability. Own CI/CD pipelines and drive capacity planning efforts in a fast-paced environment.
Site Reliability Engineer maintaining and optimizing cloud infrastructure for Tecsys. Collaborating with engineering teams to drive reliability and performance in mission-critical SaaS environments.